Class TermvectorsRequest<TDocument>
- All Implemented Interfaces: JsonpSerializable
Get information and statistics about terms in the fields of a particular document.
You can retrieve term vectors for documents stored in the index or for
artificial documents passed in the body of the request. You can specify the
fields you are interested in through the fields parameter or by
adding the fields to the request body. For example:
GET /my-index-000001/_termvectors/1?fields=message
Fields can be specified using wildcards, similar to the multi match query.
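As a rough illustration of how the index name, document ID, and fields parameter combine into the request line shown above, the sketch below assembles the path with plain string handling. The class and method names are hypothetical and are not part of the client; the sketch also assumes that the index, ID, and field names need no URL escaping.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of the Elasticsearch Java client.
final class TermvectorsPathSketch {

    // Builds "/{index}/_termvectors/{id}" and appends "?fields=a,b" when fields are given.
    // Assumption: all inputs are already URL-safe, so no escaping is applied.
    static String path(String index, String id, List<String> fields) {
        String base = "/" + index + "/_termvectors/" + id;
        if (fields.isEmpty()) {
            return base;
        }
        return base + "?fields=" + String.join(",", fields);
    }

    public static void main(String[] args) {
        // Reproduces the request line from the example above.
        System.out.println("GET " + path("my-index-000001", "1", List.of("message")));
    }
}
```

A wildcard pattern can be passed as a field name in the same way as a literal name.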
Term vectors are real-time by default, not near real-time. This can be
changed by setting the realtime parameter to false.
You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields, but term statistics are excluded.
Term information
- term frequency in the field (always returned)
- term positions (positions: true)
- start and end offsets (offsets: true)
- term payloads (payloads: true), as base64 encoded bytes
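Because payloads arrive as base64 encoded bytes, a caller usually decodes them before use. The following is a minimal sketch using only the JDK; the class name and the sample value are made up for illustration, and how the decoded bytes should be interpreted depends on how the payload was written at index time.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only.
final class PayloadDecodeSketch {

    // Decodes a base64-encoded payload string into its raw bytes.
    static byte[] decode(String base64Payload) {
        return Base64.getDecoder().decode(base64Payload);
    }

    public static void main(String[] args) {
        // "d29yZA==" is an illustrative value, not taken from a real response.
        byte[] raw = decode("d29yZA==");
        // Assumption: this particular payload holds UTF-8 text.
        System.out.println(new String(raw, StandardCharsets.UTF_8)); // prints "word"
    }
}
```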
If the requested information wasn't stored in the index, it is computed on the fly if possible. Term vectors can also be computed for documents that do not exist in the index but are instead provided by the user.
Warning: Start and end offsets assume UTF-16 encoding. To use these offsets to recover the original text that produced a token, make sure the string you take the substring of is also encoded in UTF-16.
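In Java this requirement is met naturally, since String indices count UTF-16 code units. The sketch below shows the effect with a character outside the Basic Multilingual Plane; the class name and the offsets are hypothetical values chosen for illustration, not output from a real request.

```java
// Hypothetical helper for illustration only.
final class OffsetSubstringSketch {

    // Extracts the token text using UTF-16 code unit offsets (end offset is exclusive).
    static String tokenText(String source, int startOffset, int endOffset) {
        return source.substring(startOffset, endOffset);
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so it occupies two UTF-16 code units.
        String text = "\uD83D\uDE00 quick fox";

        // Assumed offsets for the token "quick": 2 units (emoji) + 1 (space) = 3, end = 8.
        System.out.println(tokenText(text, 3, 8)); // prints "quick"

        // Counting code points instead of code units gives a different start index.
        System.out.println(text.codePointCount(0, 3)); // prints 2
    }
}
```

A string held in another encoding, such as a UTF-8 byte array, would need to be converted before these offsets are applied.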
Behaviour
The term and field statistics are not accurate. Deleted documents are not
taken into account. The information is only retrieved for the shard the
requested document resides in. The term and field statistics are therefore
only useful as relative measures whereas the absolute numbers have no meaning
in this context. By default, when requesting term vectors of artificial
documents, a shard to get the statistics from is randomly selected. Use
routing only to hit a particular shard. Refer to the linked
documentation for detailed examples of how to use this API.
-
Nested Class Summary
Nested Classes
Nested classes/interfaces inherited from class co.elastic.clients.elasticsearch._types.RequestBase:
RequestBase.AbstractBuilder<BuilderT extends RequestBase.AbstractBuilder<BuilderT>>
-
Field Summary
Fields
Modifier and Type | Field | Description
static final JsonpDeserializer<TermvectorsRequest<Object>> | _DESERIALIZER | JSON deserializer for TermvectorsRequest based on named deserializers provided by the calling JsonMapper.
static final Endpoint<TermvectorsRequest<?>,TermvectorsResponse,ErrorResponse> | _ENDPOINT | Endpoint "termvectors".
-
Method Summary
Modifier and Type | Method | Description
static <TDocument> JsonpDeserializer<TermvectorsRequest<TDocument>> | createTermvectorsRequestDeserializer(JsonpDeserializer<TDocument> tDocumentDeserializer) | Create a JSON deserializer for TermvectorsRequest.
final TDocument | doc() | An artificial document (a document not present in the index) for which you want to retrieve term vectors.
final List<String> | fields() | A list of fields to include in the statistics.
final Boolean | fieldStatistics() | If true, the response includes: the document count (how many documents contain this field); the sum of document frequencies (the sum of document frequencies for all terms in this field); the sum of total term frequencies (the sum of total term frequencies of each term in this field).
final Filter | filter() | Filter terms based on their tf-idf scores.
final String | id() | A unique identifier for the document.
final String | index() | Required - The name of the index that contains the document.
static <TDocument> TermvectorsRequest<TDocument> | of(Function<TermvectorsRequest.Builder<TDocument>, ObjectBuilder<TermvectorsRequest<TDocument>>> fn) |
final Boolean | offsets() | If true, the response includes term offsets.
final Boolean | payloads() | If true, the response includes term payloads.
final Map<String,String> | perFieldAnalyzer() | Override the default per-field analyzer.
final Boolean | positions() | If true, the response includes term positions.
final String | preference() | The node or shard the operation should be performed on.
final Boolean | realtime() | If true, the request is real-time as opposed to near-real-time.
final String | routing() | A custom value that is used to route operations to a specific shard.
void | serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper) | Serialize this object to JSON.
protected void | serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper) |
protected static <TDocument> void | setupTermvectorsRequestDeserializer(ObjectDeserializer<TermvectorsRequest.Builder<TDocument>> op, JsonpDeserializer<TDocument> tDocumentDeserializer) |
final Boolean | termStatistics() | If true, the response includes: the total term frequency (how often a term occurs in all documents); the document frequency (the number of documents containing the current term).
final Long | version() | If true, returns the document version as part of a hit.
final VersionType | versionType() | The version type.
Methods inherited from class co.elastic.clients.elasticsearch._types.RequestBase
toString
-
Field Details
-
_DESERIALIZER
JSON deserializer for TermvectorsRequest based on named deserializers provided by the calling JsonMapper.
-
_ENDPOINT
Endpoint "termvectors".
-
-
Method Details
-
of
public static <TDocument> TermvectorsRequest<TDocument> of(Function<TermvectorsRequest.Builder<TDocument>, ObjectBuilder<TermvectorsRequest<TDocument>>> fn) -
doc
An artificial document (a document not present in the index) for which you want to retrieve term vectors.
API name: doc
-
fieldStatistics
If true, the response includes:
- The document count (how many documents contain this field).
- The sum of document frequencies (the sum of document frequencies for all terms in this field).
- The sum of total term frequencies (the sum of total term frequencies of each term in this field).
API name: field_statistics
-
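To make the three field statistics listed for fieldStatistics concrete, the sketch below computes them over a tiny in-memory corpus. It is a toy model under stated assumptions, not how Elasticsearch computes them: the class name is hypothetical, each document is represented as the list of tokens in one field, and an empty list stands for a document without that field.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical toy model for illustration only.
final class FieldStatisticsSketch {

    // Returns {docCount, sumDocFreq, sumTtf} for a single field.
    static long[] fieldStats(List<List<String>> tokensPerDoc) {
        long docCount = 0;
        long sumDocFreq = 0;
        long sumTtf = 0;
        for (List<String> tokens : tokensPerDoc) {
            if (tokens.isEmpty()) {
                continue; // this document has no value for the field
            }
            docCount++;
            // Each distinct term in this document adds 1 to that term's document frequency.
            sumDocFreq += new HashSet<>(tokens).size();
            // Every token occurrence adds 1 to some term's total term frequency.
            sumTtf += tokens.size();
        }
        return new long[] {docCount, sumDocFreq, sumTtf};
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
            List.of("quick", "brown", "fox"),
            List.of("quick", "quick", "dog"),
            List.of());
        System.out.println(Arrays.toString(fieldStats(docs))); // prints "[2, 5, 6]"
    }
}
```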
fields
A list of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.
API name: fields
-
filter
Filter terms based on their tf-idf scores. This can be useful for finding a good characteristic vector of a document. This feature works in a similar manner to the second phase of the More Like This Query.
API name: filter
-
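The idea behind the filter property, keeping only the terms that best characterize a document, can be sketched with a textbook tf-idf weighting. This is an illustration under stated assumptions rather than the scoring Elasticsearch applies: the formula tf * ln(N / df), the class name, and the maxTerms parameter are all chosen for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch for illustration only; Elasticsearch's actual scoring may differ.
final class TfIdfFilterSketch {

    // Textbook tf-idf: term frequency times the log of inverse document frequency.
    static double tfIdf(int termFreq, int docFreq, int docCount) {
        return termFreq * Math.log((double) docCount / docFreq);
    }

    // stats maps each term to {termFreq, docFreq}; returns the maxTerms best terms, highest score first.
    static List<String> topTerms(Map<String, int[]> stats, int docCount, int maxTerms) {
        List<String> terms = new ArrayList<>(stats.keySet());
        terms.sort(Comparator.comparingDouble(
            (String t) -> tfIdf(stats.get(t)[0], stats.get(t)[1], docCount)).reversed());
        return new ArrayList<>(terms.subList(0, Math.min(maxTerms, terms.size())));
    }

    public static void main(String[] args) {
        // Made-up statistics: a term found in every document scores 0 and is dropped first.
        Map<String, int[]> stats = Map.of(
            "the", new int[] {10, 100},
            "elasticsearch", new int[] {3, 5},
            "vector", new int[] {2, 20});
        System.out.println(topTerms(stats, 100, 2)); // prints "[elasticsearch, vector]"
    }
}
```

Terms that are frequent in the document but rare across the corpus rise to the top, which is what makes the surviving terms a compact description of the document.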
id
A unique identifier for the document.
API name: id
-
index
Required - The name of the index that contains the document.
API name: index
-
offsets
If true, the response includes term offsets.
API name: offsets
-
payloads
If true, the response includes term payloads.
API name: payloads
-
perFieldAnalyzer
Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.
API name: per_field_analyzer
-
positions
If true, the response includes term positions.
API name: positions
-
preference
The node or shard the operation should be performed on. It is random by default.
API name: preference
-
realtime
If true, the request is real-time as opposed to near-real-time.
API name: realtime
-
routing
A custom value that is used to route operations to a specific shard.
API name: routing
-
termStatistics
If true, the response includes:
- The total term frequency (how often a term occurs in all documents).
- The document frequency (the number of documents containing the current term).
By default these values are not returned since term statistics can have a serious performance impact.
API name: term_statistics
-
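The two term statistics described for termStatistics can be illustrated the same way, by counting over a small in-memory corpus. This is a toy model with a hypothetical class name, where each document is a list of tokens; it only clarifies the definitions and does not reflect how Elasticsearch gathers the values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model for illustration only.
final class TermStatisticsSketch {

    // Returns {totalTermFreq, docFreq} for one term across all documents.
    static long[] termStats(List<List<String>> tokensPerDoc, String term) {
        long totalTermFreq = 0;
        long docFreq = 0;
        for (List<String> tokens : tokensPerDoc) {
            long occurrences = tokens.stream().filter(term::equals).count();
            totalTermFreq += occurrences; // every occurrence counts
            if (occurrences > 0) {
                docFreq++; // the document counts once, however often the term occurs
            }
        }
        return new long[] {totalTermFreq, docFreq};
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
            List.of("quick", "brown", "fox"),
            List.of("quick", "quick", "dog"));
        System.out.println(Arrays.toString(termStats(docs, "quick"))); // prints "[3, 2]"
    }
}
```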
version
If true, returns the document version as part of a hit.
API name: version
-
versionType
The version type.
API name: version_type
-
serialize
Serialize this object to JSON.
- Specified by: serialize in interface JsonpSerializable
-
serializeInternal
-
createTermvectorsRequestDeserializer
public static <TDocument> JsonpDeserializer<TermvectorsRequest<TDocument>> createTermvectorsRequestDeserializer(JsonpDeserializer<TDocument> tDocumentDeserializer)
Create a JSON deserializer for TermvectorsRequest.
-
setupTermvectorsRequestDeserializer
protected static <TDocument> void setupTermvectorsRequestDeserializer(ObjectDeserializer<TermvectorsRequest.Builder<TDocument>> op, JsonpDeserializer<TDocument> tDocumentDeserializer)
-