Class HuggingFaceTokenizer
java.lang.Object
dev.langchain4j.model.embedding.onnx.HuggingFaceTokenizer
- All Implemented Interfaces:
dev.langchain4j.model.Tokenizer
A HuggingFace tokenizer.
Uses DJL's HuggingFaceTokenizer under the hood.
Requires tokenizer.json to instantiate.
An example.
-
Constructor Summary
Constructors
Constructor / Description
HuggingFaceTokenizer() - Creates an instance of a HuggingFaceTokenizer using a built-in tokenizer.json file.
HuggingFaceTokenizer(String pathToTokenizer) - Creates an instance of a HuggingFaceTokenizer using a provided tokenizer.json file.
HuggingFaceTokenizer(String pathToTokenizer, Map<String, String> options) - Creates an instance of a HuggingFaceTokenizer using a provided tokenizer.json file and a map of DJL's tokenizer options.
HuggingFaceTokenizer(Path pathToTokenizer) - Creates an instance of a HuggingFaceTokenizer using a provided tokenizer.json file.
HuggingFaceTokenizer(Path pathToTokenizer, Map<String, String> options) - Creates an instance of a HuggingFaceTokenizer using a provided tokenizer.json file and a map of DJL's tokenizer options.
-
Method Summary
Modifier and Type / Method / Description
int estimateTokenCountInMessage(dev.langchain4j.data.message.ChatMessage message)
int estimateTokenCountInMessages(Iterable<dev.langchain4j.data.message.ChatMessage> messages)
int estimateTokenCountInText(String text)
int estimateTokenCountInToolExecutionRequests(Iterable<dev.langchain4j.agent.tool.ToolExecutionRequest> toolExecutionRequests)
int estimateTokenCountInToolSpecifications(Iterable<dev.langchain4j.agent.tool.ToolSpecification> toolSpecifications)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface dev.langchain4j.model.Tokenizer
estimateTokenCountInForcefulToolExecutionRequest, estimateTokenCountInForcefulToolSpecification, estimateTokenCountInTools, estimateTokenCountInTools
-
Constructor Details
-
HuggingFaceTokenizer
public HuggingFaceTokenizer()
Creates an instance of a HuggingFaceTokenizer using a built-in tokenizer.json file.
-
HuggingFaceTokenizer
public HuggingFaceTokenizer(String pathToTokenizer)
Creates an instance of a HuggingFaceTokenizer using a provided tokenizer.json file.
- Parameters:
pathToTokenizer - The path to the tokenizer file (e.g., "/path/to/tokenizer.json")
-
HuggingFaceTokenizer
public HuggingFaceTokenizer(String pathToTokenizer, Map<String, String> options)
Creates an instance of a HuggingFaceTokenizer using a provided tokenizer.json file and a map of DJL's tokenizer options.
- Parameters:
pathToTokenizer - The path to the tokenizer file (e.g., "/path/to/tokenizer.json")
options - DJL's tokenizer options
-
HuggingFaceTokenizer
public HuggingFaceTokenizer(Path pathToTokenizer)
Creates an instance of a HuggingFaceTokenizer using a provided tokenizer.json file.
- Parameters:
pathToTokenizer - The path to the tokenizer file (e.g., "/path/to/tokenizer.json")
-
HuggingFaceTokenizer
public HuggingFaceTokenizer(Path pathToTokenizer, Map<String, String> options)
Creates an instance of a HuggingFaceTokenizer using a provided tokenizer.json file and a map of DJL's tokenizer options.
- Parameters:
pathToTokenizer - The path to the tokenizer file (e.g., "/path/to/tokenizer.json")
options - DJL's tokenizer options
-
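The two-argument constructors take a path plus a map of DJL tokenizer options. The sketch below only prepares those two arguments, so it runs without langchain4j on the classpath. The class and method names (`TokenizerArguments`, `truncatingOptions`) are hypothetical, and the option keys (`truncation`, `maxLength`, `padding`) are assumptions based on DJL's tokenizer builder; check them against the DJL version in use.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

public class TokenizerArguments {

    // Hypothetical helper: builds an options map for the (path, options) constructors.
    // The key names are assumed from DJL's tokenizer options and are not confirmed by this page.
    public static Map<String, String> truncatingOptions(int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        Map<String, String> options = new LinkedHashMap<>();
        options.put("truncation", "true");                   // assumed key: enable truncation
        options.put("maxLength", String.valueOf(maxLength)); // assumed key: token limit
        options.put("padding", "false");                     // assumed key: disable padding
        return options;
    }

    public static void main(String[] args) {
        // Placeholder path taken from the parameter description above.
        Path pathToTokenizer = Paths.get("/path/to/tokenizer.json");
        Map<String, String> options = truncatingOptions(512);
        System.out.println(pathToTokenizer + " " + options);
    }
}
```

With langchain4j available, these values would be passed as `new HuggingFaceTokenizer(pathToTokenizer, options)`. All values in the map are strings, which matches the `Map<String, String>` parameter type.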
-
Method Details
-
estimateTokenCountInText
public int estimateTokenCountInText(String text)
- Specified by:
estimateTokenCountInText in interface dev.langchain4j.model.Tokenizer
-
estimateTokenCountInMessage
public int estimateTokenCountInMessage(dev.langchain4j.data.message.ChatMessage message)
- Specified by:
estimateTokenCountInMessage in interface dev.langchain4j.model.Tokenizer
-
estimateTokenCountInMessages
public int estimateTokenCountInMessages(Iterable<dev.langchain4j.data.message.ChatMessage> messages)
- Specified by:
estimateTokenCountInMessages in interface dev.langchain4j.model.Tokenizer
-
estimateTokenCountInToolSpecifications
public int estimateTokenCountInToolSpecifications(Iterable<dev.langchain4j.agent.tool.ToolSpecification> toolSpecifications)
- Specified by:
estimateTokenCountInToolSpecifications in interface dev.langchain4j.model.Tokenizer
-
estimateTokenCountInToolExecutionRequests
public int estimateTokenCountInToolExecutionRequests(Iterable<dev.langchain4j.agent.tool.ToolExecutionRequest> toolExecutionRequests)
- Specified by:
estimateTokenCountInToolExecutionRequests in interface dev.langchain4j.model.Tokenizer
-
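A common use of the estimate methods is keeping input within a token budget. The following is a minimal sketch of that pattern, not part of this class: `TokenBudget`, `takeWithinBudget`, and `countWords` are hypothetical names, and the word-counting estimator is only a stand-in so the example runs without langchain4j. It does not reproduce how HuggingFaceTokenizer counts tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

public class TokenBudget {

    // Keeps leading segments while the running estimated total stays within maxTokens.
    // Stops at the first segment that would exceed the budget.
    public static List<String> takeWithinBudget(
            List<String> segments, ToIntFunction<String> estimator, int maxTokens) {
        List<String> kept = new ArrayList<>();
        int used = 0;
        for (String segment : segments) {
            int cost = estimator.applyAsInt(segment);
            if (used + cost > maxTokens) {
                break;
            }
            kept.add(segment);
            used += cost;
        }
        return kept;
    }

    // Stand-in estimator: counts whitespace-separated words.
    // A real tokenizer will usually report different counts.
    public static int countWords(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        List<String> segments = Arrays.asList("first short segment", "a second segment", "third");
        System.out.println(takeWithinBudget(segments, TokenBudget::countWords, 6));
    }
}
```

With a tokenizer instance available, the stand-in could be replaced by a method reference such as `tokenizer::estimateTokenCountInText`, since that method maps a string to an int.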