Class TermvectorsRequest<TDocument>

java.lang.Object
co.elastic.clients.elasticsearch._types.RequestBase
co.elastic.clients.elasticsearch.core.TermvectorsRequest<TDocument>
All Implemented Interfaces:
JsonpSerializable

@JsonpDeserializable public class TermvectorsRequest<TDocument> extends RequestBase implements JsonpSerializable
Get term vector information.

Get information and statistics about terms in the fields of a particular document.

You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request. You can specify the fields you are interested in through the fields parameter or by adding the fields to the request body. For example:

 GET /my-index-000001/_termvectors/1?fields=message
 
 

Fields can be specified using wildcards, similar to the multi match query.
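
As a rough illustration of how the path and `fields` query parameter above fit together, the following stdlib-only sketch assembles them from an index name, a document ID, and a list of field patterns. The class `TermvectorsPath`, its `build` method, and the `user.*` pattern are hypothetical and are not part of the Java client, which builds the endpoint itself. The sketch assumes the index name and ID need no escaping.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of the Elasticsearch Java client.
final class TermvectorsPath {

    // Builds "/{index}/_termvectors/{id}?fields=a,b" from its parts.
    // Assumption: index and id contain no characters that need escaping.
    static String build(String index, String id, List<String> fields) {
        String joined = fields.stream()
                .map(f -> URLEncoder.encode(f, StandardCharsets.UTF_8))
                .collect(Collectors.joining(","));
        String path = "/" + index + "/_termvectors/" + id;
        return joined.isEmpty() ? path : path + "?fields=" + joined;
    }

    public static void main(String[] args) {
        System.out.println(build("my-index-000001", "1", List.of("message")));
        // "user.*" is a made-up wildcard pattern.
        System.out.println(build("my-index-000001", "1", List.of("message", "user.*")));
    }
}
```

The first call reproduces the request line from the example above; the second shows a wildcard pattern passed through unchanged.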

Term vectors are real-time by default, not near real-time. This can be changed by setting the realtime parameter to false.

You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.

Term information

  • term frequency in the field (always returned)
  • term positions (positions: true)
  • start and end offsets (offsets: true)
  • term payloads (payloads: true), as base64 encoded bytes
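
Because payloads arrive as base64-encoded bytes, a caller has to decode them before use. The sketch below shows one way to do that with the JDK's `java.util.Base64`. The class name `PayloadDecoder` and the sample value are illustrative, and treating the decoded bytes as UTF-8 text is an assumption: a payload can hold arbitrary binary data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only.
final class PayloadDecoder {

    // Decodes a base64 payload string into its raw bytes.
    static byte[] decode(String base64Payload) {
        return Base64.getDecoder().decode(base64Payload);
    }

    public static void main(String[] args) {
        // "d29yZA==" is a made-up sample value, not taken from a real response.
        byte[] raw = decode("d29yZA==");
        // Assumption: this particular payload happens to contain UTF-8 text.
        System.out.println(new String(raw, StandardCharsets.UTF_8)); // prints "word"
    }
}
```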

If the requested information wasn't stored in the index, it is computed on the fly if possible. Term vectors can also be computed for documents that do not exist in the index but are supplied by the user.

Warning: Start and end offsets assume UTF-16 encoding. To use these offsets to recover the original text that produced a token, make sure the string you take the substring from is also encoded as UTF-16.
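
In Java this usually works out naturally, since `String` is indexed in UTF-16 code units. The following sketch uses made-up offsets (not from a real response) to show that `substring` recovers a token even when the text contains a character outside the Basic Multilingual Plane. The class name `OffsetDemo` is hypothetical.

```java
// Hypothetical demo class; the offsets below are illustrative, not from a real response.
final class OffsetDemo {

    // Java strings are UTF-16, so UTF-16 offsets map directly onto substring indices.
    static String tokenAt(String text, int startOffset, int endOffset) {
        return text.substring(startOffset, endOffset);
    }

    public static void main(String[] args) {
        // The emoji U+1F600 takes two UTF-16 code units (a surrogate pair).
        String text = "\uD83D\uDE00 quick fox";
        System.out.println(tokenAt(text, 3, 8)); // prints "quick"

        // Counting in code points instead gives a different index for the same position.
        System.out.println(text.codePointCount(0, 3)); // prints 2
    }
}
```

If the text is handled in a language or buffer that indexes by bytes or code points, the offsets need converting first.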

Behaviour

The term and field statistics are not accurate. Deleted documents are not taken into account, and the information is retrieved only for the shard the requested document resides in. The term and field statistics are therefore useful only as relative measures; the absolute numbers have no meaning in this context. By default, when requesting term vectors of artificial documents, the shard to get the statistics from is selected at random. Use routing only to hit a particular shard.

  • Field Details

  • Method Details

    • of

      public static <TDocument> TermvectorsRequest<TDocument> of(Function<TermvectorsRequest.Builder<TDocument>,ObjectBuilder<TermvectorsRequest<TDocument>>> fn)
    • doc

      @Nullable public final TDocument doc()
      An artificial document (a document not present in the index) for which you want to retrieve term vectors.

      API name: doc

    • fieldStatistics

      @Nullable public final Boolean fieldStatistics()
      If true, the response includes:
      • The document count (how many documents contain this field).
      • The sum of document frequencies (the sum of document frequencies for all terms in this field).
      • The sum of total term frequencies (the sum of total term frequencies of each term in this field).

      API name: field_statistics
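
As a small worked example of how these three numbers relate, the sketch below derives two per-document averages from them. The class, method, and parameter names are hypothetical, and the input values are invented. Since the statistics are shard-local, the results are best read as relative measures.

```java
// Hypothetical helper for illustration only; input values are invented.
final class FieldStatsSketch {

    // Average number of tokens per document that contains the field:
    // sum of total term frequencies divided by the document count.
    static double avgTokensPerDoc(long sumTtf, long docCount) {
        return docCount == 0 ? 0.0 : (double) sumTtf / docCount;
    }

    // Average number of distinct terms per document that contains the field:
    // sum of document frequencies divided by the document count.
    static double avgDistinctTermsPerDoc(long sumDocFreq, long docCount) {
        return docCount == 0 ? 0.0 : (double) sumDocFreq / docCount;
    }

    public static void main(String[] args) {
        System.out.println(avgTokensPerDoc(8, 2));        // prints 4.0
        System.out.println(avgDistinctTermsPerDoc(6, 2)); // prints 3.0
    }
}
```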

    • fields

      public final List<String> fields()
      A list of fields for which to retrieve term vectors. Field names can be specified using wildcards.

      API name: fields

    • filter

      @Nullable public final Filter filter()
      Filter terms based on their tf-idf scores. This can be useful for finding a good characteristic vector of a document. This feature works in a similar manner to the second phase of the More Like This Query.

      API name: filter
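
To make the idea of tf-idf filtering concrete, here is a minimal client-side sketch that scores terms and keeps the highest-scoring ones. It uses one common tf-idf formulation, sqrt(tf) * (1 + ln((docCount + 1) / (docFreq + 1))), which is an assumption and not necessarily the formula the server applies. The class `TfIdfSketch`, its methods, and the sample statistics are all hypothetical.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; the scoring formula is one common tf-idf variant,
// not necessarily the one Elasticsearch uses.
final class TfIdfSketch {

    // tf-idf = sqrt(termFreq) * (1 + ln((docCount + 1) / (docFreq + 1)))
    static double score(long termFreq, long docFreq, long docCount) {
        double tf = Math.sqrt(termFreq);
        double idf = 1.0 + Math.log((docCount + 1.0) / (docFreq + 1.0));
        return tf * idf;
    }

    // Returns the maxTerms highest-scoring terms; each map value is {termFreq, docFreq}.
    static List<String> topTerms(Map<String, long[]> stats, long docCount, int maxTerms) {
        return stats.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, long[]> e) ->
                                score(e.getValue()[0], e.getValue()[1], docCount))
                        .reversed())
                .limit(maxTerms)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented statistics: term -> {term frequency in the document, document frequency}.
        Map<String, long[]> stats = Map.of(
                "the", new long[] {4, 100},
                "elasticsearch", new long[] {2, 3},
                "vector", new long[] {1, 10});
        System.out.println(topTerms(stats, 100, 2)); // prints [elasticsearch, vector]
    }
}
```

A frequent but ubiquitous term such as "the" scores low because its inverse document frequency is small, which is why the rarer terms end up in the characteristic vector.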

    • id

      @Nullable public final String id()
      A unique identifier for the document.

      API name: id

    • index

      public final String index()
      Required - The name of the index that contains the document.

      API name: index

    • offsets

      @Nullable public final Boolean offsets()
      If true, the response includes term offsets.

      API name: offsets

    • payloads

      @Nullable public final Boolean payloads()
      If true, the response includes term payloads.

      API name: payloads

    • perFieldAnalyzer

      public final Map<String,String> perFieldAnalyzer()
      Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.

      API name: per_field_analyzer

    • positions

      @Nullable public final Boolean positions()
      If true, the response includes term positions.

      API name: positions

    • preference

      @Nullable public final String preference()
      The node or shard the operation should be performed on. It is random by default.

      API name: preference

    • realtime

      @Nullable public final Boolean realtime()
      If true, the request is real-time as opposed to near-real-time.

      API name: realtime

    • routing

      @Nullable public final String routing()
      A custom value that is used to route operations to a specific shard.

      API name: routing

    • termStatistics

      @Nullable public final Boolean termStatistics()
      If true, the response includes:
      • The total term frequency (how often a term occurs in all documents).
      • The document frequency (the number of documents containing the current term).

      By default these values are not returned since term statistics can have a serious performance impact.

      API name: term_statistics

    • version

      @Nullable public final Long version()
      An explicit document version number used for concurrency control. Note that the accessor returns a Long, not a Boolean.

      API name: version

    • versionType

      @Nullable public final VersionType versionType()
      The version type.

      API name: version_type

    • serialize

      public void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
      Serialize this object to JSON.
      Specified by:
      serialize in interface JsonpSerializable
    • serializeInternal

      protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
    • createTermvectorsRequestDeserializer

      public static <TDocument> JsonpDeserializer<TermvectorsRequest<TDocument>> createTermvectorsRequestDeserializer(JsonpDeserializer<TDocument> tDocumentDeserializer)
      Create a JSON deserializer for TermvectorsRequest.
    • setupTermvectorsRequestDeserializer

      protected static <TDocument> void setupTermvectorsRequestDeserializer(ObjectDeserializer<TermvectorsRequest.Builder<TDocument>> op, JsonpDeserializer<TDocument> tDocumentDeserializer)