public class DictionaryCompoundWordTokenFilter
extends CompoundWordTokenFilterBase

A TokenFilter that decomposes compound words found in many Germanic languages.

"Donaudampfschiff" becomes Donau, dampf, schiff so that you can find "Donaudampfschiff" even when you only enter "schiff". It uses a brute-force algorithm to achieve this.

You must specify the required Version compatibility when creating CompoundWordTokenFilterBase.

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource:
AttributeSource.AttributeFactory, AttributeSource.State

Field Summary

Fields inherited from class org.apache.lucene.analysis.compound.CompoundWordTokenFilterBase:
DEFAULT_MAX_SUBWORD_SIZE, DEFAULT_MIN_SUBWORD_SIZE, DEFAULT_MIN_WORD_SIZE
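The decomposition described above can be sketched as follows. This is a minimal example, assuming a Lucene 4.x classpath (lucene-core plus lucene-analyzers-common); the `Version.LUCENE_47` constant and the class name `CompoundDemo` are illustrative choices, not part of this API page.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.compound.DictionaryCompoundWordTokenFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.CharArraySet;
import org.apache.lucene.util.Version;

public class CompoundDemo {

    // Decompose the given text against a dictionary of known word parts.
    static List<String> decompose(String text, CharArraySet dictionary) throws Exception {
        List<String> tokens = new ArrayList<>();
        TokenStream ts = new WhitespaceTokenizer(Version.LUCENE_47, new StringReader(text));
        ts = new DictionaryCompoundWordTokenFilter(Version.LUCENE_47, ts, dictionary);
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        ts.reset();                         // mandatory before consuming the stream
        while (ts.incrementToken()) {
            tokens.add(term.toString());
        }
        ts.end();
        ts.close();
        return tokens;
    }

    public static void main(String[] args) throws Exception {
        // Dictionary of word parts; ignoreCase = true so "Donau" also matches "donau".
        CharArraySet dict = new CharArraySet(Version.LUCENE_47,
                Arrays.asList("Donau", "Dampf", "Schiff"), true);
        System.out.println(decompose("Donaudampfschiff", dict));
    }
}
```

The filter keeps the original token and adds the dictionary subwords at the same position, so the stream should contain Donaudampfschiff plus Donau, dampf, and schiff; this is what lets a search for "schiff" find the compound.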
Constructor Summary

DictionaryCompoundWordTokenFilter(Version matchVersion, TokenStream input, CharArraySet dictionary)
    Creates a new DictionaryCompoundWordTokenFilter.

DictionaryCompoundWordTokenFilter(Version matchVersion, TokenStream input, CharArraySet dictionary, int minWordSize, int minSubwordSize, int maxSubwordSize, boolean onlyLongestMatch)
    Creates a new DictionaryCompoundWordTokenFilter.
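The seven-argument constructor above can be exercised as in the sketch below, which passes the size limits explicitly and turns on onlyLongestMatch. This assumes a Lucene 4.x classpath; the `Version.LUCENE_47` constant, the sample dictionary, and the class name `CompoundConfigDemo` are illustrative, and the explicit limit values simply reuse the inherited DEFAULT_* constants.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.compound.CompoundWordTokenFilterBase;
import org.apache.lucene.analysis.compound.DictionaryCompoundWordTokenFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.CharArraySet;
import org.apache.lucene.util.Version;

public class CompoundConfigDemo {

    // Decompose text with explicit size limits and an onlyLongestMatch flag.
    static List<String> decompose(String text, CharArraySet dict,
                                  int minWord, int minSub, int maxSub,
                                  boolean onlyLongest) throws Exception {
        TokenStream ts = new WhitespaceTokenizer(Version.LUCENE_47, new StringReader(text));
        ts = new DictionaryCompoundWordTokenFilter(Version.LUCENE_47, ts, dict,
                minWord, minSub, maxSub, onlyLongest);
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        List<String> tokens = new ArrayList<>();
        ts.reset();
        while (ts.incrementToken()) {
            tokens.add(term.toString());
        }
        ts.end();
        ts.close();
        return tokens;
    }

    public static void main(String[] args) throws Exception {
        CharArraySet dict = new CharArraySet(Version.LUCENE_47,
                Arrays.asList("schiff", "fahrt"), true);
        List<String> tokens = decompose("Schifffahrt", dict,
                CompoundWordTokenFilterBase.DEFAULT_MIN_WORD_SIZE,    // words shorter than this pass through unsplit
                CompoundWordTokenFilterBase.DEFAULT_MIN_SUBWORD_SIZE, // discard very short fragments
                CompoundWordTokenFilterBase.DEFAULT_MAX_SUBWORD_SIZE, // discard overly long fragments
                true);                                                // keep only the longest match per start offset
        System.out.println(tokens);
    }
}
```

Using the DEFAULT_* constants instead of bare numbers keeps the call in sync with the three-argument constructor's behavior while still letting you flip onlyLongestMatch.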
Method Summary

Methods inherited from class org.apache.lucene.analysis.compound.CompoundWordTokenFilterBase:
incrementToken, reset

Methods inherited from class org.apache.lucene.analysis.TokenFilter:
close, end

Methods inherited from class org.apache.lucene.util.AttributeSource:
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
Constructor Detail

public DictionaryCompoundWordTokenFilter(Version matchVersion, TokenStream input, CharArraySet dictionary)

Creates a new DictionaryCompoundWordTokenFilter.

Parameters:
    matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the dictionaries if Version > 3.0. See CompoundWordTokenFilterBase for details.
    input - the TokenStream to process
    dictionary - the word dictionary to match against

public DictionaryCompoundWordTokenFilter(Version matchVersion, TokenStream input, CharArraySet dictionary, int minWordSize, int minSubwordSize, int maxSubwordSize, boolean onlyLongestMatch)

Creates a new DictionaryCompoundWordTokenFilter.

Parameters:
    matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the dictionaries if Version > 3.0. See CompoundWordTokenFilterBase for details.
    input - the TokenStream to process
    dictionary - the word dictionary to match against
    minWordSize - only words longer than this get processed
    minSubwordSize - only subwords longer than this get to the output stream
    maxSubwordSize - only subwords shorter than this get to the output stream
    onlyLongestMatch - Add only the longest matching subword to the stream

Copyright © 2010 - 2020 Adobe. All Rights Reserved