Text (fdb-record-layer-core 2.8.88.0 API)

java.lang.Object
- com.apple.foundationdb.record.query.expressions.Text

```
@API(value=EXPERIMENTAL)
public abstract class Text
extends Object
```
Predicates that can be applied to a field that has been indexed with a full-text index. These allow for querying on properties of the text contents, e.g., whether the text contains a given token, token list, or phrase. Most of the methods here that allow for multiple tokens to be supplied can either be given a single string or a list. If a single string is given, then the string will be tokenized later using an appropriate tokenizer. If a list is given, then the assumption is that the user has already tokenized the string and the list is the result of that tokenization.
This type allows the user to specify a "tokenizer name". If one is given, then it will use this tokenizer to tokenize the query string (if not pre-tokenized) and will require that if an index is used, it uses the tokenizer provided. If no tokenizer is specified, then it will allow itself to be matched against any text index on the field and apply the index's tokenizer to the query string. If no suitable index can be found and a full scan with a post-filter has to be done, then a fallback tokenizer will be used both to tokenize the query string as well as to tokenize the record's text. By default, this is the DefaultTextTokenizer (with name ""), but one can specify a different one if one wishes.

Instances of this class should be created by calling the text() method on a query Field or OneOfThem instance. For example, one might call Query.field("text").text() to create a predicate on the contents of the field named text.
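The difference between the String and List overloads described above can be sketched without the Record Layer itself. The following is a minimal illustration, not the library's implementation: `TokenizationSketch` and its stand-in tokenizer (lower-casing and splitting on whitespace) are assumptions made for this example, and the real DefaultTextTokenizer may normalize text differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative only: a stand-in tokenizer, not the Record Layer's DefaultTextTokenizer.
final class TokenizationSketch {
    // Assumed normalization: split on runs of whitespace, then lower-case each piece.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // String overload: the query string is tokenized the same way as the text.
    static boolean containsAll(String text, String query) {
        return tokenize(text).containsAll(tokenize(query));
    }

    // List overload: the caller's tokens are used exactly as given, with no normalization.
    static boolean containsAll(String text, List<String> queryTokens) {
        return tokenize(text).containsAll(queryTokens);
    }

    public static void main(String[] args) {
        String text = "The Quick brown Fox";
        System.out.println(containsAll(text, "QUICK fox"));                    // true
        System.out.println(containsAll(text, Arrays.asList("quick", "fox"))); // true
        System.out.println(containsAll(text, Arrays.asList("Quick", "Fox"))); // false
    }
}
```

The last call fails to match because the pre-tokenized list is taken as-is, so callers who pass a list are responsible for producing tokens the tokenizer could plausibly have generated.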

See Also:

TextIndexMaintainer, TextTokenizer, DefaultTextTokenizer

Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| `QueryComponent` | `contains(String token)` Checks if the field contains a token. |
| `QueryComponent` | `containsAll(List<String> tokens)` Checks if the field contains all of the provided tokens. |
| `QueryComponent` | `containsAll(List<String> tokens, int maxDistance)` Checks if the field text contains all of the provided tokens within a given number of tokens. |
| `QueryComponent` | `containsAll(String tokens)` Checks if the field contains all of the provided tokens. |
| `QueryComponent` | `containsAll(String tokens, int maxDistance)` Checks if the field text contains all of the provided tokens within a given number of tokens. |
| `QueryComponent` | `containsAllPrefixes(List<String> tokenPrefixes)` Checks if the field contains tokens matching all of the given prefixes. |
| `QueryComponent` | `containsAllPrefixes(List<String> tokenPrefixes, boolean strict)` Checks if the field contains tokens matching all of the given prefixes. |
| `QueryComponent` | `containsAllPrefixes(List<String> tokenPrefixes, boolean strict, long expectedRecords, double falsePositivePercentage)` Checks if the field contains tokens matching all of the given prefixes. |
| `QueryComponent` | `containsAllPrefixes(String tokenPrefixes)` Checks if the field contains tokens matching all of the given prefixes. |
| `QueryComponent` | `containsAllPrefixes(String tokenPrefixes, boolean strict)` Checks if the field contains tokens matching all of the given prefixes. |
| `QueryComponent` | `containsAllPrefixes(String tokenPrefixes, boolean strict, long expectedRecords, double falsePositivePercentage)` Checks if the field contains tokens matching all of the given prefixes. |
| `QueryComponent` | `containsAny(List<String> tokens)` Checks if the field contains any of the provided tokens. |
| `QueryComponent` | `containsAny(String tokens)` Checks if the field contains any of the provided tokens. |
| `QueryComponent` | `containsAnyPrefix(List<String> tokenPrefixes)` Checks if the field contains a token that matches any of the given prefixes. |
| `QueryComponent` | `containsAnyPrefix(String tokenPrefixes)` Checks if the field contains a token that matches any of the given prefixes. |
| `QueryComponent` | `containsPhrase(List<String> phraseTokens)` Checks if the field text contains the given phrase. |
| `QueryComponent` | `containsPhrase(String phrase)` Checks if the field contains the given phrase. |
| `QueryComponent` | `containsPrefix(String prefix)` Checks if the field contains any token matching the provided prefix. |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Method Detail
  - contains
```
@Nonnull
public QueryComponent contains(@Nonnull
                                        String token)
```
    Checks if the field contains a token. This token should either be generated by the tokenizer associated with this text predicate or should be a plausible token that the tokenizer could have generated. This token will not be further sanitized or normalized before searching for it in the text.
    
    Parameters:
    
    token - the token to search for
    
    Returns:
    
    a new component for doing the actual evaluation
  - containsPrefix
```
@Nonnull
public QueryComponent containsPrefix(@Nonnull
                                              String prefix)
```
    Checks if the field contains any token matching the provided prefix. This should be the beginning of a token that could be generated by the tokenizer associated with this text predicate. No additional sanitization or normalization of this prefix will be performed before searching for it in the text.
    
    Parameters:
    
    prefix - the prefix to search for
    
    Returns:
    
    a new component for doing the actual evaluation
  - containsAll
```
@Nonnull
public QueryComponent containsAll(@Nonnull
                                           String tokens)
```
    Checks if the field contains all of the provided tokens. At query evaluation time, the string provided here will be tokenized into a list of tokens. This predicate will then return Boolean.TRUE if all of the tokens (except stop words) are present in the text field, Boolean.FALSE if any of them are not, and null if either the field is null or if the token list contains only stop words or is empty. If the same token appears multiple times in the token list, it need only appear once in the searched text to satisfy the filter (i.e., it is not required to appear as many times in the text as in the token list).
    
    Parameters:
    
    tokens - the tokens to search for
    
    Returns:
    
    a new component for doing the actual evaluation
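    The three-valued result described for containsAll(String) can be sketched in plain Java. This is a hedged illustration of the documented semantics only, not the Record Layer's evaluation code: the class name `ContainsAllSketch`, the whitespace tokenizer, and the stop-word list are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative model of the TRUE / FALSE / null outcomes; not library code.
final class ContainsAllSketch {
    // Hypothetical stop-word list; a real tokenizer defines its own.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "a", "of"));

    // Assumed tokenizer: lower-case, split on whitespace, drop stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!piece.isEmpty() && !STOP_WORDS.contains(piece)) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    static Boolean containsAll(String fieldText, String query) {
        if (fieldText == null) {
            return null;                      // null field
        }
        List<String> queryTokens = tokenize(query);
        if (queryTokens.isEmpty()) {
            return null;                      // empty query or only stop words
        }
        // A set is enough: a repeated query token only needs to occur once in the text.
        Set<String> textTokens = new HashSet<>(tokenize(fieldText));
        return textTokens.containsAll(queryTokens);
    }

    public static void main(String[] args) {
        String text = "the history of databases";
        System.out.println(containsAll(text, "history databases"));   // true
        System.out.println(containsAll(text, "history the history")); // true
        System.out.println(containsAll(text, "history queues"));      // false
        System.out.println(containsAll(text, "the of"));              // null
    }
}
```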
  - containsAll
```
@Nonnull
public QueryComponent containsAll(@Nonnull
                                           List<String> tokens)
```
    Checks if the field contains all of the provided tokens. This behaves like containsAll(String), except that the token list is assumed to have already been tokenized with an appropriate tokenizer. No further sanitization or normalization is performed on the tokens before searching for them in the text.
    
    Parameters:
    
    tokens - the tokens to search for
    
    Returns:
    
    a new component for doing the actual evaluation
  - containsAll
```
@Nonnull
public QueryComponent containsAll(@Nonnull
                                           String tokens,
                                           int maxDistance)
```
    Checks if the field text contains all of the provided tokens within a given number of tokens. For example, in the string "a b c" (tokenized by whitespace), the tokens "a" and "c" are two tokens apart, so containsAll("a c", 2) evaluated against that string returns Boolean.TRUE, but containsAll("a c", 1) returns Boolean.FALSE. Stop words in the query string are ignored, and if there are no tokens in the string (or all tokens are stop words), this evaluates to null. It also evaluates to null if the field is null. If the same token appears multiple times in the token list, it need only appear once in the searched text to satisfy the filter (i.e., it is not required to appear as many times in the text as in the token list).
    
    Parameters:
    
    tokens - the tokens to search for
    
    maxDistance - the maximum distance (expressed in number of tokens) to allow between found tokens
    
    Returns:
    
    a new component for doing the actual evaluation
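    The "a b c" example above can be reproduced with a small window search. This is a sketch under stated assumptions rather than the library's algorithm: `WithinDistanceSketch` is a hypothetical name, the tokenizer only splits on whitespace (stop words are not modeled), and "within maxDistance" is interpreted as the first and last matched positions being at most maxDistance apart.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative model of containsAll(tokens, maxDistance); not library code.
final class WithinDistanceSketch {
    // Assumed tokenizer: split on whitespace only; stop words are not modeled here.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    static Boolean containsAll(String fieldText, String query, int maxDistance) {
        if (fieldText == null) {
            return null;                       // null field
        }
        Set<String> wanted = new HashSet<>(tokenize(query));
        if (wanted.isEmpty()) {
            return null;                       // no tokens in the query
        }
        List<String> text = tokenize(fieldText);
        // Try every position holding a wanted token as the left edge of a window.
        for (int start = 0; start < text.size(); start++) {
            if (!wanted.contains(text.get(start))) {
                continue;
            }
            Set<String> found = new HashSet<>();
            // The window spans positions start .. start + maxDistance inclusive.
            for (int end = start; end < text.size() && end - start <= maxDistance; end++) {
                if (wanted.contains(text.get(end))) {
                    found.add(text.get(end));
                }
                if (found.size() == wanted.size()) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsAll("a b c", "a c", 2)); // true
        System.out.println(containsAll("a b c", "a c", 1)); // false
    }
}
```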
  - containsAll
```
@Nonnull
public QueryComponent containsAll(@Nonnull
                                           List<String> tokens,
                                           int maxDistance)
```
    Checks if the field text contains all of the provided tokens within a given number of tokens. This behaves like containsAll(String, int) except that the token list is assumed to have already been tokenized with an appropriate tokenizer. No further sanitization or normalization is performed on the tokens before searching for them in the text.
    
    Parameters:
    
    tokens - the tokens to search for
    
    maxDistance - the maximum distance (expressed in number of tokens) to allow between found tokens
    
    Returns:
    
    a new component for doing the actual evaluation
  - containsAllPrefixes
```
@Nonnull
public QueryComponent containsAllPrefixes(@Nonnull
                                                   String tokenPrefixes)
```
    Checks if the field contains tokens matching all of the given prefixes. The given String will be tokenized into multiple tokens using an appropriate tokenizer. This variant of containsAllPrefixes is strict, i.e., the planner will ensure that it does not return any false positives when evaluated with an index scan. However, the scan can be made more efficient (if false positives are acceptable) by using one of the other variants of this function and supplying false for the strict parameter.
    
    Parameters:
    
    tokenPrefixes - the token prefixes to search for
    
    Returns:
    
    a new component for doing the actual evaluation
    
    See Also:
    
    containsAllPrefixes(String, boolean)
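    As a rough model of what a strict prefix predicate accepts, the sketch below checks already-tokenized text against a list of prefixes, with the any-prefix form alongside for contrast. It is an assumption-laden illustration, not the planner's index-scan logic: `PrefixSketch` is a hypothetical name, and null fields, stop words, and empty prefix lists are not modeled.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model of the prefix predicates over pre-tokenized text; not library code.
final class PrefixSketch {
    // Every prefix must be the start of at least one token in the text.
    static boolean containsAllPrefixes(List<String> textTokens, List<String> prefixes) {
        return prefixes.stream()
                .allMatch(prefix -> textTokens.stream().anyMatch(token -> token.startsWith(prefix)));
    }

    // At least one prefix must be the start of some token in the text.
    static boolean containsAnyPrefix(List<String> textTokens, List<String> prefixes) {
        return prefixes.stream()
                .anyMatch(prefix -> textTokens.stream().anyMatch(token -> token.startsWith(prefix)));
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList("record", "layer", "query");
        System.out.println(containsAllPrefixes(text, Arrays.asList("rec", "lay"))); // true
        System.out.println(containsAllPrefixes(text, Arrays.asList("rec", "ind"))); // false
        System.out.println(containsAnyPrefix(text, Arrays.asList("rec", "ind")));   // true
    }
}
```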
  - containsAllPrefixes
```
@Nonnull
public QueryComponent containsAllPrefixes(@Nonnull
                                                   String tokenPrefixes,
                                                   boolean strict)
```
    Checks if the field contains tokens matching all of the given prefixes. The given String will be tokenized into multiple tokens using an appropriate tokenizer. The strict parameter determines whether this comparison is strictly evaluated against an index. If the parameter is set to true, then this will return no false positives, but it may require that there are additional reads performed to filter out any false positives that occur internally during query execution.
    
    Parameters:
    
    tokenPrefixes - the token prefixes to search for
    
    strict - true if this should not return false positives
    
    Returns:
    
    a new component for doing the actual evaluation
  - containsAllPrefixes
```
@Nonnull
public QueryComponent containsAllPrefixes(@Nonnull
                                                   String tokenPrefixes,
                                                   boolean strict,
                                                   long expectedRecords,
                                                   double falsePositivePercentage)
```
    Checks if the field contains tokens matching all of the given prefixes. The given String will be tokenized into multiple tokens using an appropriate tokenizer. The strict parameter behaves the same way here as it does in the other overload of containsAllPrefixes(). The expectedRecords and falsePositivePercentage parameters can be used to tweak the behavior of underlying probabilistic data structures used during query execution. See the Comparisons.TextContainsAllPrefixesComparison class for more details.
    
    Parameters:
    
    tokenPrefixes - the token prefixes to search for
    
    strict - true if this should not return any false positives
    
    expectedRecords - the expected number of records read for each prefix
    
    falsePositivePercentage - an acceptable percentage of false positives for each token prefix
    
    Returns:
    
    a new component for doing the actual evaluation
    
    See Also:
    
    Comparisons.TextContainsAllPrefixesComparison, containsAllPrefixes(String, boolean)
  - containsAllPrefixes
```
@Nonnull
public QueryComponent containsAllPrefixes(@Nonnull
                                                   List<String> tokenPrefixes)
```
    Checks if the field contains tokens matching all of the given prefixes. This will produce a component that behaves exactly like the component returned by the variant of containsAllPrefixes(String) that takes a single String, but this method assumes the token prefixes given are already tokenized and normalized.
    
    Parameters:
    
    tokenPrefixes - the token prefixes to search for
    
    Returns:
    
    a new component for doing the actual evaluation
    
    See Also:
    
    containsAllPrefixes(String)
  - containsAllPrefixes
```
@Nonnull
public QueryComponent containsAllPrefixes(@Nonnull
                                                   List<String> tokenPrefixes,
                                                   boolean strict)
```
    Checks if the field contains tokens matching all of the given prefixes. This will produce a component that behaves exactly like the component returned by the variant of containsAllPrefixes(String, boolean) that takes a single String, but this method assumes the token prefixes given are already tokenized and normalized.
    
    Parameters:
    
    tokenPrefixes - the token prefixes to search for
    
    strict - true if this should not return any false positives
    
    Returns:
    
    a new component for doing the actual evaluation
    
    See Also:
    
    containsAllPrefixes(String, boolean)
  - containsAllPrefixes
```
@Nonnull
public QueryComponent containsAllPrefixes(@Nonnull
                                                   List<String> tokenPrefixes,
                                                   boolean strict,
                                                   long expectedRecords,
                                                   double falsePositivePercentage)
```
    Checks if the field contains tokens matching all of the given prefixes. This will produce a component that behaves exactly like the component returned by the variant of containsAllPrefixes(String, boolean, long, double) that takes a single String, but this method assumes the token prefixes given are already tokenized and normalized.
    
    Parameters:
    
    tokenPrefixes - the token prefixes to search for
    
    strict - true if this should not return any false positives
    
    expectedRecords - the expected number of records read for each prefix
    
    falsePositivePercentage - an acceptable percentage of false positives for each token prefix
    
    Returns:
    
    a new component for doing the actual evaluation
    
    See Also:
    
    containsAllPrefixes(String, boolean, long, double)
  - containsPhrase
```
@Nonnull
public QueryComponent containsPhrase(@Nonnull
                                              String phrase)
```
    Checks if the field contains the given phrase. This will match the given field if the given phrase (when tokenized) forms a sublist of the original text. If the tokenization process removes any stop words from the phrase, this will match documents that contain any token in the place of the stop word. This will return Boolean.TRUE if all of the tokens (except stop words) can be found in the given document in the correct order, Boolean.FALSE if any cannot, and null if the phrase is empty or contains only stop words or if the field itself is null.
    
    Parameters:
    
    phrase - the phrase to search for
    
    Returns:
    
    a new component for doing the actual evaluation
  - containsPhrase
```
@Nonnull
public QueryComponent containsPhrase(@Nonnull
                                              List<String> phraseTokens)
```
    Checks if the field text contains the given phrase. This behaves like containsPhrase(String) except that the token list is assumed to have already been tokenized with an appropriate tokenizer. No further sanitization or normalization is performed on the tokens before searching for them in the text. It is assumed that the order of the tokens in the list is the same as the order of the tokens in the original phrase and that there are no gaps (except as indicated by including the empty string to indicate that there was a stop word in the original phrase).
    
    Parameters:
    
    phraseTokens - the tokens to search for in the order they appear in the phrase
    
    Returns:
    
    a new component for doing the actual evaluation
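    The sublist matching described for containsPhrase, including the empty string standing in for a removed stop word, can be sketched as follows. This is a simplified model, not the library's implementation: `PhraseSketch` is a hypothetical name, both inputs are assumed to be already tokenized, and an empty-string entry at the start or end of the phrase still has to line up with an actual token position, which the real predicate may treat differently.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model of containsPhrase(List<String>); not library code.
final class PhraseSketch {
    static Boolean containsPhrase(List<String> textTokens, List<String> phraseTokens) {
        if (textTokens == null) {
            return null;                        // null field
        }
        boolean onlyStopWords = true;
        for (String token : phraseTokens) {
            if (!token.isEmpty()) {
                onlyStopWords = false;
                break;
            }
        }
        if (onlyStopWords) {
            return null;                        // empty phrase or only stop-word placeholders
        }
        int n = textTokens.size();
        int m = phraseTokens.size();
        // Slide the phrase over the text; "" acts as a wildcard for exactly one token.
        for (int start = 0; start + m <= n; start++) {
            boolean match = true;
            for (int i = 0; i < m; i++) {
                String expected = phraseTokens.get(i);
                if (!expected.isEmpty() && !expected.equals(textTokens.get(start + i))) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList("the", "quick", "brown", "fox");
        System.out.println(containsPhrase(text, Arrays.asList("quick", "brown")));   // true
        System.out.println(containsPhrase(text, Arrays.asList("quick", "", "fox"))); // true
        System.out.println(containsPhrase(text, Arrays.asList("quick", "fox")));     // false
    }
}
```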
  - containsAny
```
@Nonnull
public QueryComponent containsAny(@Nonnull
                                           String tokens)
```
    Checks if the field contains any of the provided tokens. At query evaluation time, the string provided here will be tokenized into a list of tokens. This predicate will then return Boolean.TRUE if any of the tokens (not counting stop words) are present, Boolean.FALSE if none of them are, and null if either the field is null or if the token list contains only stop words or is empty.
    
    Parameters:
    
    tokens - the tokens to search for
    
    Returns:
    
    a new component for doing the actual evaluation
  - containsAny
```
@Nonnull
public QueryComponent containsAny(@Nonnull
                                           List<String> tokens)
```
    Checks if the field contains any of the provided tokens. This behaves like containsAny(String), except that the token list is assumed to have already been tokenized with an appropriate tokenizer. No further sanitization or normalization is performed on the tokens before searching for them in the text.
    
    Parameters:
    
    tokens - the tokens to search for
    
    Returns:
    
    a new component for doing the actual evaluation
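    For pre-tokenized input, the any-token check reduces to asking whether two collections share an element. The sketch below models that with `Collections.disjoint`; `ContainsAnySketch` is a hypothetical name, both arguments are assumed to be already tokenized, and stop-word handling is left out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative model of containsAny(List<String>); not library code.
final class ContainsAnySketch {
    static Boolean containsAny(List<String> textTokens, List<String> queryTokens) {
        if (textTokens == null || queryTokens.isEmpty()) {
            return null;    // null field or empty token list
        }
        // The predicate holds when the two token collections have an element in common.
        return !Collections.disjoint(textTokens, queryTokens);
    }

    public static void main(String[] args) {
        List<String> text = Arrays.asList("quick", "brown", "fox");
        System.out.println(containsAny(text, Arrays.asList("dog", "fox"))); // true
        System.out.println(containsAny(text, Arrays.asList("dog", "cat"))); // false
    }
}
```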
  - containsAnyPrefix
```
@Nonnull
public QueryComponent containsAnyPrefix(@Nonnull
                                                 String tokenPrefixes)
```
    Checks if the field contains a token that matches any of the given prefixes. At query evaluation time, the string given is tokenized using an appropriate tokenizer.
    
    Parameters:
    
    tokenPrefixes - the token prefixes to search for
    
    Returns:
    
    a new component for doing the actual evaluation
  - containsAnyPrefix
```
@Nonnull
public QueryComponent containsAnyPrefix(@Nonnull
                                                 List<String> tokenPrefixes)
```
    Checks if the field contains a token that matches any of the given prefixes. This behaves like the variant of containsAnyPrefix(String) that takes a single String except that it assumes the token prefix list has already been tokenized and normalized.
    
    Parameters:
    
    tokenPrefixes - the token prefixes to search for
    
    Returns:
    
    a new component for doing the actual evaluation
    
    See Also:
    
    containsAnyPrefix(String)
