Package opennlp.tools.tokenize
Class DictionaryDetokenizer
java.lang.Object
  opennlp.tools.tokenize.DictionaryDetokenizer

All Implemented Interfaces:
Detokenizer
public class DictionaryDetokenizer extends java.lang.Object implements Detokenizer
A rule-based detokenizer. Simple rules, which indicate the direction in which a token should be moved, are looked up in a DetokenizationDictionary object.
See Also:
Detokenizer, DetokenizationDictionary
Nested Class Summary
Nested classes/interfaces inherited from interface opennlp.tools.tokenize.Detokenizer
Detokenizer.DetokenizationOperation
Constructor Summary
Constructors
DictionaryDetokenizer(DetokenizationDictionary dict)
Method Summary
Modifier and Type                      Method                                                               Description
Detokenizer.DetokenizationOperation[]  detokenize(java.lang.String[] tokens)                                Detokenize the input tokens.
java.lang.String                       detokenize(java.lang.String[] tokens, java.lang.String splitMarker)  Detokenize the input tokens into a String.
Constructor Detail
DictionaryDetokenizer
public DictionaryDetokenizer(DetokenizationDictionary dict)
Method Detail
detokenize
public Detokenizer.DetokenizationOperation[] detokenize(java.lang.String[] tokens)
Description copied from interface: Detokenizer
Detokenize the input tokens.
Specified by:
detokenize in interface Detokenizer
Parameters:
tokens - the tokens to detokenize.
Returns:
the merge operations to detokenize the input tokens.
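To illustrate how a rule lookup can yield one merge operation per token, here is a minimal, self-contained sketch. It does not call OpenNLP: the `Op` enum, the `RULES` map, and `operationsFor` are hypothetical stand-ins for `Detokenizer.DetokenizationOperation`, `DetokenizationDictionary`, and this method. The actual class may apply additional rules that are not modeled here.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the OpenNLP implementation.
final class MergeOperationSketch {

    // Hypothetical per-token result: attach to the left neighbour,
    // attach to the right neighbour, or leave the token alone.
    enum Op { MERGE_TO_LEFT, MERGE_TO_RIGHT, NO_OPERATION }

    // Hypothetical rule table standing in for a detokenization dictionary.
    static final Map<String, Op> RULES = new HashMap<>();
    static {
        RULES.put(".", Op.MERGE_TO_LEFT);
        RULES.put(",", Op.MERGE_TO_LEFT);
        RULES.put(")", Op.MERGE_TO_LEFT);
        RULES.put("(", Op.MERGE_TO_RIGHT);
    }

    // Looks up each token; tokens without a rule get NO_OPERATION.
    static Op[] operationsFor(String[] tokens) {
        Op[] ops = new Op[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            ops[i] = RULES.getOrDefault(tokens[i], Op.NO_OPERATION);
        }
        return ops;
    }
}
```

The returned array always has the same length as the input, so position `i` in the result describes token `i`.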
detokenize
public java.lang.String detokenize(java.lang.String[] tokens, java.lang.String splitMarker)
Description copied from interface: Detokenizer
Detokenize the input tokens into a String. Tokens which are joined without a space in between can be separated by a split marker.
Specified by:
detokenize in interface Detokenizer
Parameters:
tokens - the tokens which should be concatenated
splitMarker - the split marker or null
Returns:
the concatenated tokens
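The role of the split marker can be sketched as follows. This is a simplified model under stated assumptions, not the library's code: the `Op` enum and the `join` method are hypothetical names, and the merge operations are passed in explicitly rather than looked up in a dictionary. Two adjacent tokens are treated as connected when the left one merges to the right or the right one merges to the left. Connected tokens get the marker (if it is not null) instead of a space.

```java
// Illustrative sketch only; not the OpenNLP implementation.
final class SplitMarkerJoinSketch {

    // Hypothetical per-token merge operation.
    enum Op { MERGE_TO_LEFT, MERGE_TO_RIGHT, NO_OPERATION }

    // Concatenates tokens; connected tokens are separated by splitMarker
    // (or by nothing when splitMarker is null), all others by a space.
    static String join(String[] tokens, Op[] ops, String splitMarker) {
        if (tokens.length != ops.length) {
            throw new IllegalArgumentException("tokens and ops must have the same length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            sb.append(tokens[i]);
            if (i + 1 == tokens.length) {
                break; // nothing follows the last token
            }
            boolean connected = ops[i] == Op.MERGE_TO_RIGHT
                    || ops[i + 1] == Op.MERGE_TO_LEFT;
            if (!connected) {
                sb.append(' ');
            } else if (splitMarker != null) {
                sb.append(splitMarker);
            }
        }
        return sb.toString();
    }
}
```

With the tokens `He`, `said`, `,`, `(`, `yes`, `)`, `.` and the operations none, none, left, right, none, left, left, a null marker gives `He said, (yes).`, while the marker `|` gives `He said|, (|yes|)|.`, which makes the original token boundaries recoverable.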