Class TermvectorsRequest<TDocument>
- All Implemented Interfaces:
JsonpSerializable
Get information and statistics about terms in the fields of a particular document.
You can retrieve term vectors for documents stored in the index or for
artificial documents passed in the body of the request. You can specify the
fields you are interested in through the fields parameter or by
adding the fields to the request body. For example:
GET /my-index-000001/_termvectors/1?fields=message
Fields can be specified using wildcards, similar to the multi match query.
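As a rough illustration of the request line shown above, the following sketch assembles the path and the fields query parameter with plain string handling. TermvectorsPath and build are hypothetical names, not part of the Elasticsearch Java client, and the sketch assumes that index names, ids and field names need no URL escaping.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of the Elasticsearch Java client.
class TermvectorsPath {
    // Builds "/<index>/_termvectors/<id>?fields=a,b".
    // Assumption: index, id and field names contain no characters that need URL escaping.
    static String build(String index, String id, List<String> fields) {
        StringBuilder path = new StringBuilder("/").append(index).append("/_termvectors");
        if (id != null) {
            // The id is left out when the document is artificial and sent in the request body.
            path.append('/').append(id);
        }
        if (!fields.isEmpty()) {
            path.append("?fields=").append(String.join(",", fields));
        }
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("my-index-000001", "1", List.of("message")));
    }
}
```

Passing null as the id yields the path form without a document id, which is the shape that fits a request carrying an artificial document in its body.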
Term vectors are real-time by default, not near real-time. This can be
changed by setting the realtime parameter to false.
You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields, but term statistics are excluded.
Term information
- term frequency in the field (always returned)
- term positions (positions: true)
- start and end offsets (offsets: true)
- term payloads (payloads: true), as base64 encoded bytes
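Because payloads come back as base64 encoded bytes, a caller has to decode them before use. The sketch below does this with the JDK's java.util.Base64. PayloadDecoder is a hypothetical name, and reading the decoded bytes as UTF-8 text is an assumption made only for this demonstration, since a payload can hold arbitrary bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only.
class PayloadDecoder {
    // Decodes a base64 payload string into its raw bytes.
    static byte[] decode(String base64Payload) {
        return Base64.getDecoder().decode(base64Payload);
    }

    public static void main(String[] args) {
        byte[] raw = decode("d29yZA==");
        // Assumption for the demo: this particular payload happens to be UTF-8 text.
        System.out.println(new String(raw, StandardCharsets.UTF_8));
    }
}
```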
If the requested information wasn't stored in the index, it will be computed on the fly if possible. Term vectors can also be computed for documents that do not exist in the index but are instead provided by the user.
Warning: Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets to recover the original text that produced a token, make sure that the string you take the substring of is also encoded in UTF-16.
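Java strings are indexed in UTF-16 code units, so offsets of this kind can be passed straight to String.substring. The sketch below uses a made-up text and made-up offsets (treating the end offset as exclusive) to show why this matters: a character outside the Basic Multilingual Plane occupies two UTF-16 code units, so a code-point based index would point one position too early. OffsetSlice and tokenText are hypothetical names.

```java
// Hypothetical helper for illustration only.
class OffsetSlice {
    // Returns the text between a start offset and an exclusive end offset,
    // both counted in UTF-16 code units (the unit Java strings use).
    static String tokenText(String source, int startOffset, int endOffset) {
        return source.substring(startOffset, endOffset);
    }

    public static void main(String[] args) {
        // "\uD83D\uDE00" is one emoji stored as a surrogate pair (two UTF-16 code units).
        String text = "a\uD83D\uDE00 quick";
        // Made-up offsets for the token "quick": 4 (start) and 9 (end, exclusive).
        System.out.println(tokenText(text, 4, 9));
        // The same token starts at code point index 3, not 4.
        System.out.println(text.codePointCount(0, 4));
    }
}
```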
Behaviour
The term and field statistics are not accurate. Deleted documents are not
taken into account. The information is only retrieved for the shard the
requested document resides in. The term and field statistics are therefore
only useful as relative measures whereas the absolute numbers have no meaning
in this context. By default, when requesting term vectors of artificial
documents, a shard to get the statistics from is randomly selected. Use
routing only to hit a particular shard.
Nested Class Summary
Nested classes/interfaces inherited from class co.elastic.clients.elasticsearch._types.RequestBase
RequestBase.AbstractBuilder<BuilderT extends RequestBase.AbstractBuilder<BuilderT>>
-
Field Summary
- static final JsonpDeserializer<TermvectorsRequest<Object>> _DESERIALIZER: Json deserializer for TermvectorsRequest based on named deserializers provided by the calling JsonMapper.
- static final Endpoint<TermvectorsRequest<?>, TermvectorsResponse, ErrorResponse> _ENDPOINT: Endpoint "termvectors".
-
Method Summary
- static <TDocument> JsonpDeserializer<TermvectorsRequest<TDocument>> createTermvectorsRequestDeserializer(JsonpDeserializer<TDocument> tDocumentDeserializer): Create a JSON deserializer for TermvectorsRequest
- final TDocument doc(): An artificial document (a document not present in the index) for which you want to retrieve term vectors.
- fields(): A list of fields to include in the statistics.
- final Boolean fieldStatistics(): If true, the response includes: The document count (how many documents contain this field). The sum of document frequencies (the sum of document frequencies for all terms in this field). The sum of total term frequencies (the sum of total term frequencies of each term in this field).
- final Filter filter(): Filter terms based on their tf-idf scores.
- final String id(): A unique identifier for the document.
- final String index(): Required - The name of the index that contains the document.
- static <TDocument> TermvectorsRequest<TDocument> of(Function<TermvectorsRequest.Builder<TDocument>, ObjectBuilder<TermvectorsRequest<TDocument>>> fn)
- final Boolean offsets(): If true, the response includes term offsets.
- final Boolean payloads(): If true, the response includes term payloads.
- perFieldAnalyzer(): Override the default per-field analyzer.
- final Boolean positions(): If true, the response includes term positions.
- final String preference(): The node or shard the operation should be performed on.
- final Boolean realtime(): If true, the request is real-time as opposed to near-real-time.
- final String routing(): A custom value that is used to route operations to a specific shard.
- void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper): Serialize this object to JSON.
- protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
- protected static <TDocument> void setupTermvectorsRequestDeserializer(ObjectDeserializer<TermvectorsRequest.Builder<TDocument>> op, JsonpDeserializer<TDocument> tDocumentDeserializer)
- final Boolean termStatistics(): If true, the response includes: The total term frequency (how often a term occurs in all documents). The document frequency (the number of documents containing the current term).
- final Long version(): If true, returns the document version as part of a hit.
- final VersionType versionType(): The version type.
Methods inherited from class co.elastic.clients.elasticsearch._types.RequestBase
toString
-
Field Details
-
_DESERIALIZER
Json deserializer for TermvectorsRequest based on named deserializers provided by the calling JsonMapper.
-
_ENDPOINT
Endpoint "termvectors".
-
-
Method Details
-
of
public static <TDocument> TermvectorsRequest<TDocument> of(Function<TermvectorsRequest.Builder<TDocument>, ObjectBuilder<TermvectorsRequest<TDocument>>> fn)
-
doc
An artificial document (a document not present in the index) for which you want to retrieve term vectors.
API name: doc
-
fieldStatistics
If true, the response includes:
- The document count (how many documents contain this field).
- The sum of document frequencies (the sum of document frequencies for all terms in this field).
- The sum of total term frequencies (the sum of total term frequencies of each term in this field).
API name: field_statistics
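To make the three field statistics concrete, the sketch below computes them over a tiny in-memory corpus. It is a toy model, not how Elasticsearch computes them: FieldStatsSketch is a hypothetical name, each inner list stands for the analyzed tokens of one field in one document, an empty list stands for a document without that field, and shards and deleted documents are ignored.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Toy model for illustration only; names and data layout are assumptions.
class FieldStatsSketch {
    // Returns {document count, sum of document frequencies, sum of total term frequencies}.
    static long[] compute(List<List<String>> fieldTokensPerDoc) {
        long docCount = 0;
        long sumTotalTermFreq = 0;
        Map<String, Integer> docFreqByTerm = new HashMap<>();
        for (List<String> tokens : fieldTokensPerDoc) {
            if (tokens.isEmpty()) {
                continue; // this document does not contain the field
            }
            docCount++;
            sumTotalTermFreq += tokens.size(); // every occurrence counts
            for (String term : new HashSet<>(tokens)) {
                docFreqByTerm.merge(term, 1, Integer::sum); // each term counts once per document
            }
        }
        long sumDocFreq = 0;
        for (int docFreq : docFreqByTerm.values()) {
            sumDocFreq += docFreq;
        }
        return new long[] {docCount, sumDocFreq, sumTotalTermFreq};
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
                List.of("a", "b", "a"),
                List.of("b", "c"),
                List.<String>of());
        System.out.println(Arrays.toString(compute(docs)));
    }
}
```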
-
fields
A list of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.
API name: fields
-
filter
Filter terms based on their tf-idf scores. This can be useful for finding a good characteristic vector of a document. This feature works in a similar manner to the second phase of the More Like This Query.
API name: filter
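The idea of keeping only the highest-scoring terms can be sketched as follows. The weighting used here, tf * ln(N / df), is a textbook tf-idf variant chosen purely for illustration; the text above does not say which formula Elasticsearch applies. TfIdfSketch, score, topTerms and the limit parameter are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only; the scoring formula is an assumption, not Elasticsearch's.
class TfIdfSketch {
    // Textbook tf-idf: term frequency times the natural log of (document count / document frequency).
    // Assumes docFreq >= 1.
    static double score(long termFreq, long docFreq, long docCount) {
        return termFreq * Math.log((double) docCount / docFreq);
    }

    // Keeps the `limit` terms with the highest scores, best first.
    static List<String> topTerms(Map<String, Double> scoreByTerm, int limit) {
        return scoreByTerm.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(limit)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Double> scores = Map.of(
                "the", score(5, 10, 10),            // appears in every document
                "vector", score(2, 4, 10),
                "elasticsearch", score(3, 2, 10));
        System.out.println(topTerms(scores, 2));
    }
}
```

A term that occurs in every document gets a score of zero under this weighting, which is why very common words drop out of the characteristic vector first.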
-
id
A unique identifier for the document.
API name: id
-
index
Required - The name of the index that contains the document.
API name: index
-
offsets
If true, the response includes term offsets.
API name: offsets
-
payloads
If true, the response includes term payloads.
API name: payloads
-
perFieldAnalyzer
Override the default per-field analyzer. This is useful for generating term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.
API name: per_field_analyzer
-
positions
If true, the response includes term positions.
API name: positions
-
preference
The node or shard the operation should be performed on. It is random by default.
API name: preference
-
realtime
If true, the request is real-time as opposed to near-real-time.
API name: realtime
-
routing
A custom value that is used to route operations to a specific shard.
API name: routing
-
termStatistics
If true, the response includes:
- The total term frequency (how often a term occurs in all documents).
- The document frequency (the number of documents containing the current term).
By default these values are not returned since term statistics can have a serious performance impact.
API name: term_statistics
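The difference between the two term statistics can be shown on a small in-memory corpus. This is a toy model under stated assumptions, not Elasticsearch's implementation: TermStatsSketch is a hypothetical name, each inner list stands for the analyzed tokens of one document, and shards and deleted documents are ignored.

```java
import java.util.Collections;
import java.util.List;

// Toy model for illustration only; names and data layout are assumptions.
class TermStatsSketch {
    // Total term frequency: how often the term occurs across all documents.
    static long totalTermFreq(List<List<String>> docs, String term) {
        long total = 0;
        for (List<String> tokens : docs) {
            total += Collections.frequency(tokens, term);
        }
        return total;
    }

    // Document frequency: how many documents contain the term at least once.
    static long docFreq(List<List<String>> docs, String term) {
        long count = 0;
        for (List<String> tokens : docs) {
            if (tokens.contains(term)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(List.of("a", "b", "a"), List.of("b", "c"));
        System.out.println(totalTermFreq(docs, "a") + " " + docFreq(docs, "a"));
    }
}
```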
-
version
If true, returns the document version as part of a hit.
API name: version
-
versionType
The version type.
API name: version_type
-
serialize
Serialize this object to JSON.
- Specified by:
serialize in interface JsonpSerializable
-
serializeInternal
-
createTermvectorsRequestDeserializer
public static <TDocument> JsonpDeserializer<TermvectorsRequest<TDocument>> createTermvectorsRequestDeserializer(JsonpDeserializer<TDocument> tDocumentDeserializer)
Create a JSON deserializer for TermvectorsRequest
-
setupTermvectorsRequestDeserializer
protected static <TDocument> void setupTermvectorsRequestDeserializer(ObjectDeserializer<TermvectorsRequest.Builder<TDocument>> op, JsonpDeserializer<TDocument> tDocumentDeserializer)
-