Interface TextTokenizerRegistry
-
- All Known Implementing Classes:
TextTokenizerRegistryImpl
@API(EXPERIMENTAL) public interface TextTokenizerRegistry
Registry for TextTokenizers. This registry allows full-text indexes to specify their tokenizer type through the "textTokenizerName" index option. The registry is then queried for the tokenizer with that name at both index time and query time.

There are two ways of adding elements to the tokenizer registry. The first is to use the AutoService annotation to mark a TextTokenizerFactory implementation as one that should be loaded into the registry. The other is to call register() on this interface to register that tokenizer manually. The second way is useful for tokenizers that are built on the fly from configuration parameters, for example.
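The manual registration path can be sketched as follows. This is a minimal, self-contained model rather than the library's implementation: SketchTokenizer, SketchTokenizerFactory, SketchRegistry, and ConfiguredTokenizers are hypothetical stand-ins, and the factory accessors getName() and getTokenizer() are assumptions. The model also overwrites on duplicate names, which the real register() does not do.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for TextTokenizer and TextTokenizerFactory (not the library types).
interface SketchTokenizer {
    List<String> tokenize(String text);
}

interface SketchTokenizerFactory {
    String getName();               // assumed accessor: the key used by the registry
    SketchTokenizer getTokenizer(); // assumed accessor: the tokenizer itself
}

// Minimal in-memory model of a registry: a name-to-factory map.
// Simplification: a duplicate name silently overwrites the earlier entry.
class SketchRegistry {
    private final Map<String, SketchTokenizerFactory> factories = new HashMap<>();

    void register(SketchTokenizerFactory factory) {
        factories.put(factory.getName(), factory);
    }

    SketchTokenizer getTokenizer(String name) {
        return factories.get(name).getTokenizer();
    }
}

class ConfiguredTokenizers {
    // Builds a factory on the fly from two configuration parameters.
    static SketchTokenizerFactory fromConfig(String name, String delimiterRegex) {
        SketchTokenizer tokenizer = text -> Arrays.asList(text.split(delimiterRegex));
        return new SketchTokenizerFactory() {
            @Override public String getName() { return name; }
            @Override public SketchTokenizer getTokenizer() { return tokenizer; }
        };
    }

    public static void main(String[] args) {
        SketchRegistry registry = new SketchRegistry();
        registry.register(fromConfig("comma", ","));

        // An index names its tokenizer through the "textTokenizerName" option.
        Map<String, String> indexOptions = Map.of("textTokenizerName", "comma");
        SketchTokenizer tokenizer = registry.getTokenizer(indexOptions.get("textTokenizerName"));
        System.out.println(tokenizer.tokenize("a,b,c")); // prints [a, b, c]
    }
}
```

The same name has to resolve to the same tokenizer at index time and at query time, which is why a tokenizer built from configuration needs to be registered before either happens.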
-
-
Method Summary
Modifier and Type                   Method                                             Description
Map<String,TextTokenizerFactory>    getRegistry()                                      Returns all registered tokenizers.
TextTokenizer                       getTokenizer(String name)                          Gets the tokenizer of the given name.
void                                register(TextTokenizerFactory tokenizerFactory)    Registers a new tokenizer in this registry.
void                                reset()                                            Clears the registry and reloads tokenizers from the classpath.
-
-
-
Method Detail
-
getTokenizer
@Nonnull TextTokenizer getTokenizer(@Nullable String name)
Gets the tokenizer of the given name. If name is null, it returns an instance of the DefaultTextTokenizer.
- Parameters:
name - the name of the tokenizer to retrieve
- Returns:
the tokenizer registered with the given name
- Throws:
MetaDataException - if no such tokenizer exists
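The three lookup outcomes described above (null name, known name, unknown name) can be illustrated with a small model. Everything here is a hypothetical stand-in: SketchMetaDataException plays the role of MetaDataException, DEFAULT plays the role of the DefaultTextTokenizer instance, and the name "default" is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for MetaDataException; the real class belongs to the library.
class SketchMetaDataException extends RuntimeException {
    SketchMetaDataException(String message) {
        super(message);
    }
}

// Simplified model in which a tokenizer is identified only by its name.
class NamedTokenizer {
    final String name;

    NamedTokenizer(String name) {
        this.name = name;
    }
}

class LookupSketch {
    // Plays the role of the DefaultTextTokenizer instance (the name is an assumption).
    static final NamedTokenizer DEFAULT = new NamedTokenizer("default");

    private final Map<String, NamedTokenizer> tokenizers = new HashMap<>();

    void add(NamedTokenizer tokenizer) {
        tokenizers.put(tokenizer.name, tokenizer);
    }

    NamedTokenizer getTokenizer(String name) {
        if (name == null) {
            return DEFAULT; // null name: fall back to the default tokenizer
        }
        NamedTokenizer found = tokenizers.get(name);
        if (found == null) {
            throw new SketchMetaDataException("no tokenizer named " + name);
        }
        return found;
    }

    public static void main(String[] args) {
        LookupSketch registry = new LookupSketch();
        registry.add(new NamedTokenizer("whitespace"));

        System.out.println(registry.getTokenizer(null).name);         // prints default
        System.out.println(registry.getTokenizer("whitespace").name); // prints whitespace
        try {
            registry.getTokenizer("missing");
        } catch (SketchMetaDataException e) {
            System.out.println(e.getMessage()); // prints no tokenizer named missing
        }
    }
}
```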
-
register
void register(@Nonnull TextTokenizerFactory tokenizerFactory)
Registers a new tokenizer in this registry. The tokenizer should have a different name from all tokenizers that are currently registered. This throws an error if a tokenizer of the same name is already present and is not pointer-equal to the tokenizerFactory parameter given.
- Parameters:
tokenizerFactory - new tokenizer to register
- Throws:
RecordCoreArgumentException - if there is a tokenizer of the same name already registered
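One reading of the pointer-equality rule is that registering the very same factory instance twice is accepted, while a different instance with the same name is rejected. The sketch below models that reading under stated assumptions: NamedFactory, RegisterSketch, and SketchArgumentException (standing in for RecordCoreArgumentException) are hypothetical, and the real implementation may differ in detail.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for RecordCoreArgumentException.
class SketchArgumentException extends RuntimeException {
    SketchArgumentException(String message) {
        super(message);
    }
}

// Simplified factory: only the name matters for this example.
class NamedFactory {
    final String name;

    NamedFactory(String name) {
        this.name = name;
    }
}

class RegisterSketch {
    private final Map<String, NamedFactory> factories = new HashMap<>();

    void register(NamedFactory factory) {
        NamedFactory existing = factories.putIfAbsent(factory.name, factory);
        // Reference comparison (!=), not equals(): only the same instance may be registered again.
        if (existing != null && existing != factory) {
            throw new SketchArgumentException("tokenizer already registered: " + factory.name);
        }
    }

    int size() {
        return factories.size();
    }

    public static void main(String[] args) {
        RegisterSketch registry = new RegisterSketch();
        NamedFactory first = new NamedFactory("prefix");

        registry.register(first);
        registry.register(first); // same instance: accepted, nothing changes
        try {
            registry.register(new NamedFactory("prefix")); // same name, different instance
        } catch (SketchArgumentException e) {
            System.out.println(e.getMessage()); // prints tokenizer already registered: prefix
        }
        System.out.println(registry.size()); // prints 1
    }
}
```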
-
getRegistry
@Nonnull Map<String,TextTokenizerFactory> getRegistry()
Returns all registered tokenizers.
- Returns:
a map from tokenizer name to TextTokenizerFactory
-
reset
void reset()
Clears the registry and reloads tokenizers from the classpath. This is intended mainly for testing purposes (to avoid having one test add a tokenizer to the registry that another test cannot override).
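The effect of a reset between tests can be modeled as below. This is a sketch under assumptions, not the library's code: ResettableRegistrySketch is hypothetical, tokenizers are reduced to their names, and the classpath scan is replaced by a Supplier so the example stays self-contained.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

// Simplified model: tokenizers are represented by their names only.
final class ResettableRegistrySketch {
    private final Supplier<List<String>> discovery; // hypothetical stand-in for the classpath scan
    private final Set<String> names = new TreeSet<>();

    ResettableRegistrySketch(Supplier<List<String>> discovery) {
        this.discovery = discovery;
        reset();
    }

    void register(String name) {
        names.add(name);
    }

    // Drops every entry, then reloads only what discovery reports.
    void reset() {
        names.clear();
        names.addAll(discovery.get());
    }

    Set<String> names() {
        return names;
    }

    public static void main(String[] args) {
        ResettableRegistrySketch registry = new ResettableRegistrySketch(() -> List.of("default"));

        registry.register("test-only");       // a test adds its own tokenizer
        System.out.println(registry.names()); // prints [default, test-only]

        registry.reset();                     // for example, in test teardown
        System.out.println(registry.names()); // prints [default]
    }
}
```

After the reset, a later test is free to register its own tokenizer under the name "test-only" without colliding with the earlier one.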