Class HuggingFaceTokenCountEstimator

java.lang.Object
dev.langchain4j.model.embedding.onnx.HuggingFaceTokenCountEstimator
All Implemented Interfaces:
dev.langchain4j.model.TokenCountEstimator

public class HuggingFaceTokenCountEstimator extends Object implements dev.langchain4j.model.TokenCountEstimator
A token count estimator for models that can be found on HuggingFace.
Uses DJL's HuggingFaceTokenizer under the hood.
Requires a tokenizer.json file to instantiate.
  • Constructor Details

    • HuggingFaceTokenCountEstimator

      public HuggingFaceTokenCountEstimator()
      Creates an instance of a HuggingFaceTokenCountEstimator using a built-in tokenizer.json file.
    • HuggingFaceTokenCountEstimator

      public HuggingFaceTokenCountEstimator(Path pathToTokenizer)
      Creates an instance of a HuggingFaceTokenCountEstimator using a provided tokenizer.json file.
      Parameters:
      pathToTokenizer - The path to the tokenizer file (e.g., "/path/to/tokenizer.json")
    • HuggingFaceTokenCountEstimator

      public HuggingFaceTokenCountEstimator(Path pathToTokenizer, Map<String,String> options)
      Creates an instance of a HuggingFaceTokenCountEstimator using a provided tokenizer.json file and a map of DJL's tokenizer options.
      Parameters:
      pathToTokenizer - The path to the tokenizer file (e.g., "/path/to/tokenizer.json")
      options - DJL tokenizer options
    • HuggingFaceTokenCountEstimator

      public HuggingFaceTokenCountEstimator(String pathToTokenizer)
      Creates an instance of a HuggingFaceTokenCountEstimator using a provided tokenizer.json file.
      Parameters:
      pathToTokenizer - The path to the tokenizer file (e.g., "/path/to/tokenizer.json")
    • HuggingFaceTokenCountEstimator

      public HuggingFaceTokenCountEstimator(String pathToTokenizer, Map<String,String> options)
      Creates an instance of a HuggingFaceTokenCountEstimator using a provided tokenizer.json file and a map of DJL's tokenizer options.
      Parameters:
      pathToTokenizer - The path to the tokenizer file (e.g., "/path/to/tokenizer.json")
      options - DJL tokenizer options
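
The two-argument constructors accept tokenizer options as a plain string-to-string map. A minimal sketch of building such a map follows. The keys "padding" and "truncation" are assumed from DJL's tokenizer option names and should be checked against the DJL version in use. The class TokenizerOptionsExample and its method are hypothetical, and the constructor call appears only in a comment so the block compiles without langchain4j on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of langchain4j or DJL.
class TokenizerOptionsExample {

    // Builds an options map that asks the tokenizer not to pad or truncate,
    // so that a token count would reflect the full input text.
    // Assumption: "padding" and "truncation" are option keys recognized by DJL.
    static Map<String, String> countingOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("padding", "false");
        options.put("truncation", "false");
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = countingOptions();
        System.out.println(options);
        // With langchain4j on the classpath, the map would be passed as:
        // new HuggingFaceTokenCountEstimator("/path/to/tokenizer.json", options);
    }
}
```

Values are strings even for boolean or numeric settings, because the constructor signature is Map<String,String>.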
  • Method Details

    • estimateTokenCountInText

      public int estimateTokenCountInText(String text)
      Specified by:
      estimateTokenCountInText in interface dev.langchain4j.model.TokenCountEstimator
    • estimateTokenCountInMessage

      public int estimateTokenCountInMessage(dev.langchain4j.data.message.ChatMessage message)
      Specified by:
      estimateTokenCountInMessage in interface dev.langchain4j.model.TokenCountEstimator
    • estimateTokenCountInMessages

      public int estimateTokenCountInMessages(Iterable<dev.langchain4j.data.message.ChatMessage> messages)
      Specified by:
      estimateTokenCountInMessages in interface dev.langchain4j.model.TokenCountEstimator
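
The three methods above return plain int estimates, which a caller would typically compare against a token budget. The sketch below shows one such check. TokenBudget, fitsBudget, and the whitespace-based stand-in counter are hypothetical and exist only so the block runs without langchain4j. In real use the counter would presumably be a method reference such as estimator::estimateTokenCountInText.

```java
import java.util.function.ToIntFunction;

// Hypothetical helper for illustration; not part of langchain4j.
class TokenBudget {

    // Returns true when the estimated token count of the text is within the budget.
    // The counter is any String -> int function, for example
    // estimator::estimateTokenCountInText on a HuggingFaceTokenCountEstimator.
    static boolean fitsBudget(ToIntFunction<String> counter, String text, int maxTokens) {
        return counter.applyAsInt(text) <= maxTokens;
    }

    // Stand-in counter: counts whitespace-separated words.
    // This is NOT how a HuggingFace tokenizer counts tokens; it only keeps the sketch runnable.
    static int whitespaceCount(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        String text = "hello token world";
        System.out.println(fitsBudget(TokenBudget::whitespaceCount, text, 8));
    }
}
```

Passing the counter as a ToIntFunction keeps the budget check independent of any particular estimator, so the same helper could be used with other TokenCountEstimator implementations.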