Package org.apache.camel.spi
Interface Tokenizer
public interface Tokenizer
An interface for tokenizing text data. It is typically used in machine learning and artificial intelligence workloads,
and when interacting with vector databases.
Implementations of this interface should provide a way to configure the tokenizer, and then use that configuration to
tokenize the data in the Exchange.
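The configure-then-tokenize contract described above could be implemented roughly as in the following sketch. It is self-contained and does not use Camel: the `Exchange` and `Tokenizer` types are simplified stand-ins that only mirror the four methods documented on this page, and `RegexSplitTokenizer`, its `Config` class, and the `delimiterRegex` option are hypothetical names invented for illustration. The real `Tokenizer.Configuration` options are not listed here, so the stand-in is left empty.

```java
import java.util.Objects;

// Self-contained sketch. Exchange and Tokenizer are simplified stand-ins,
// NOT the real org.apache.camel types.
class TokenizerSketch {

    /** Stand-in for Camel's Exchange; here it only carries a text body. */
    static final class Exchange {
        private final String body;
        Exchange(String body) { this.body = body; }
        String getBody() { return body; }
    }

    /** Mirrors the four methods documented on this page. */
    interface Tokenizer {
        /** Empty stand-in; the real options are not shown on this page. */
        interface Configuration { }

        Configuration newConfiguration();
        void configure(Configuration configuration);
        String name();
        String[] tokenize(Exchange exchange);
    }

    /** Hypothetical implementation that splits the body on a configurable regex. */
    static final class RegexSplitTokenizer implements Tokenizer {

        /** Hypothetical option set; delimiterRegex is invented for this sketch. */
        static final class Config implements Tokenizer.Configuration {
            private String delimiterRegex = "\\s+"; // default: split on whitespace
            void setDelimiterRegex(String regex) {
                this.delimiterRegex = Objects.requireNonNull(regex);
            }
            String getDelimiterRegex() { return delimiterRegex; }
        }

        // Start with defaults so tokenize() also works before configure() is called.
        private Config config = new Config();

        @Override
        public Tokenizer.Configuration newConfiguration() {
            return new Config();
        }

        @Override
        public void configure(Tokenizer.Configuration configuration) {
            if (!(configuration instanceof Config)) {
                throw new IllegalArgumentException("Unsupported configuration type");
            }
            this.config = (Config) configuration;
        }

        @Override
        public String name() {
            return "regex-split";
        }

        @Override
        public String[] tokenize(Exchange exchange) {
            String text = exchange.getBody().trim();
            // Return an empty array rather than a single empty token.
            return text.isEmpty() ? new String[0] : text.split(config.getDelimiterRegex());
        }
    }

    public static void main(String[] args) {
        RegexSplitTokenizer tokenizer = new RegexSplitTokenizer();
        RegexSplitTokenizer.Config config =
                (RegexSplitTokenizer.Config) tokenizer.newConfiguration();
        config.setDelimiterRegex(",\\s*");
        tokenizer.configure(config);

        String[] tokens = tokenizer.tokenize(new Exchange("alpha, beta,gamma"));
        System.out.println(tokenizer.name() + ": " + String.join("|", tokens));
        // prints "regex-split: alpha|beta|gamma"
    }
}
```

Keeping a default configuration in the field is a choice made for this sketch; an implementation could equally reject `tokenize` calls made before `configure`.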
-
Nested Class Summary
Nested Classes
Modifier and Type | Interface | Description
static interface | Tokenizer.Configuration | A nested interface representing the configuration options for this tokenizer.
-
Method Summary
Modifier and Type | Method | Description
void | configure(Tokenizer.Configuration configuration) | Configures this tokenizer using the provided configuration options.
String | name() | Returns the name of this tokenizer, which can be used for identification or logging purposes.
Tokenizer.Configuration | newConfiguration() | Creates a new configuration for this tokenizer, with default values.
String[] | tokenize(Exchange exchange) | Tokenizes the data in the provided Exchange using the current configuration options.
-
Method Details
-
newConfiguration
Tokenizer.Configuration newConfiguration()
Creates a new configuration for this tokenizer, with default values.
Returns:
a new Configuration object
-
configure
void configure(Tokenizer.Configuration configuration)
Configures this tokenizer using the provided configuration options.
Parameters:
configuration - the configuration to use
-
name
String name()
Returns the name of this tokenizer, which can be used for identification or logging purposes.
Returns:
the name of this tokenizer
-
tokenize
String[] tokenize(Exchange exchange)
Tokenizes the data in the provided Exchange using the current configuration options.
Parameters:
exchange - the Exchange to tokenize
Returns:
an array of tokens produced by the tokenizer
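From the caller's side, the documented methods suggest a simple call order: obtain a configuration, apply it, then tokenize. The sketch below illustrates that order against the interface only. As before, `Exchange` and `Tokenizer` are simplified stand-ins rather than the real Camel types, and `tokenizeWithDefaults` and `LineTokenizer` are hypothetical names used purely for illustration.

```java
// Self-contained sketch. Exchange and Tokenizer are simplified stand-ins,
// NOT the real org.apache.camel types.
class TokenizeCallSketch {

    /** Stand-in for Camel's Exchange; here it only carries a text body. */
    static final class Exchange {
        private final String body;
        Exchange(String body) { this.body = body; }
        String getBody() { return body; }
    }

    /** Mirrors the four methods documented on this page. */
    interface Tokenizer {
        /** Empty stand-in; the real options are not shown on this page. */
        interface Configuration { }

        Configuration newConfiguration();
        void configure(Configuration configuration);
        String name();
        String[] tokenize(Exchange exchange);
    }

    /**
     * Hypothetical helper: applies the tokenizer's default configuration,
     * tokenizes the exchange, and logs the tokenizer name with the token count.
     */
    static String[] tokenizeWithDefaults(Tokenizer tokenizer, Exchange exchange) {
        tokenizer.configure(tokenizer.newConfiguration());
        String[] tokens = tokenizer.tokenize(exchange);
        System.out.println(tokenizer.name() + " produced " + tokens.length + " token(s)");
        return tokens;
    }

    /** Hypothetical tokenizer that emits one token per line of the body. */
    static final class LineTokenizer implements Tokenizer {
        private Configuration config;

        @Override
        public Configuration newConfiguration() {
            return new Configuration() { }; // no options in this sketch
        }

        @Override
        public void configure(Configuration configuration) {
            this.config = configuration;
        }

        @Override
        public String name() {
            return "line";
        }

        @Override
        public String[] tokenize(Exchange exchange) {
            String body = exchange.getBody();
            // \R matches any line break, including \r\n as a single unit.
            return body.isEmpty() ? new String[0] : body.split("\\R");
        }
    }

    public static void main(String[] args) {
        String[] tokens = tokenizeWithDefaults(
                new LineTokenizer(), new Exchange("one\ntwo\r\nthree"));
        System.out.println(String.join("|", tokens));
    }
}
```

Because the helper depends only on the interface, any implementation can be passed in, and `name()` identifies which tokenizer produced a given log line.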
-