Class ElasticsearchServiceSettings
java.lang.Object
co.elastic.clients.elasticsearch.inference.ElasticsearchServiceSettings
- All Implemented Interfaces:
JsonpSerializable
@JsonpDeserializable
public class ElasticsearchServiceSettings
extends Object
implements JsonpSerializable
Nested Class Summary
Nested Classes
static class ElasticsearchServiceSettings.Builder
Builder for ElasticsearchServiceSettings.
Field Summary
Fields
static final JsonpDeserializer<ElasticsearchServiceSettings> _DESERIALIZER
Json deserializer for ElasticsearchServiceSettings
-
Method Summary
final AdaptiveAllocations adaptiveAllocations()
Adaptive allocations configuration details.
final String deploymentId()
The deployment identifier for a trained model deployment.
final String modelId()
Required - The name of the model to use for the inference task.
final Integer numAllocations()
The total number of allocations that are assigned to the model across machine learning nodes.
final int numThreads()
Required - The number of threads used by each model allocation during inference.
static ElasticsearchServiceSettings of(...)
void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
Serialize this object to JSON.
protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
protected static void setupElasticsearchServiceSettingsDeserializer(ObjectDeserializer<ElasticsearchServiceSettings.Builder> op)
String toString()
-
Field Details
-
_DESERIALIZER
Json deserializer for ElasticsearchServiceSettings
-
-
Method Details
-
of
-
adaptiveAllocations
Adaptive allocations configuration details. If enabled is true, the number of allocations of the model is set based on the current load the process gets. When the load is high, a new model allocation is automatically created, respecting the value of max_number_of_allocations if it's set. When the load is low, a model allocation is automatically removed, respecting the value of min_number_of_allocations if it's set. If enabled is true, do not set the number of allocations manually.
API name: adaptive_allocations
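The scaling rule described above can be sketched in plain Java. This is an illustrative sketch only: the real logic runs inside Elasticsearch, and the names here (desiredAllocations, the load-driven target) are invented for illustration.

```java
// Hypothetical sketch of the adaptive allocations decision rule: the
// load-driven target is clamped to the optional min/max bounds, mirroring
// how min_number_of_allocations and max_number_of_allocations are respected
// when they are set. Not part of the Elasticsearch Java client.
public class AdaptiveAllocationsSketch {

    static int desiredAllocations(int loadDrivenTarget, Integer min, Integer max) {
        int target = loadDrivenTarget;
        if (min != null) target = Math.max(target, min);  // keep at least min allocations
        if (max != null) target = Math.min(target, max);  // never exceed max allocations
        return target;
    }

    public static void main(String[] args) {
        // High load wants 8 allocations, but max_number_of_allocations = 4 caps it.
        System.out.println(desiredAllocations(8, 1, 4)); // prints 4
        // Low load wants 0 allocations, but min_number_of_allocations = 1 keeps one.
        System.out.println(desiredAllocations(0, 1, 4)); // prints 1
    }
}
```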
-
deploymentId
The deployment identifier for a trained model deployment. When deployment_id is used, the model_id is optional.
API name: deployment_id
-
modelId
Required - The name of the model to use for the inference task. It can be the ID of a built-in model (for example, .multilingual-e5-small for E5) or a text embedding model that was uploaded by using the Eland client.
API name: model_id
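Since this class implements JsonpSerializable, it serializes to the service_settings object of an inference endpoint request using the API names listed on this page. A minimal sketch of that serialized form, assuming the built-in E5 model mentioned above (field values are illustrative):

```json
{
  "service": "elasticsearch",
  "service_settings": {
    "model_id": ".multilingual-e5-small",
    "num_allocations": 1,
    "num_threads": 2
  }
}
```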
-
numAllocations
The total number of allocations that are assigned to the model across machine learning nodes. Increasing this value generally increases the throughput. If adaptive allocations are enabled, do not set this value because it's automatically set.
API name: num_allocations
-
numThreads
public final int numThreads()
Required - The number of threads used by each model allocation during inference. This setting generally increases the speed per inference request. The inference process is a compute-bound process; threads_per_allocations must not exceed the number of available allocated processors per node. The value must be a power of 2. The maximum value is 32.
API name: num_threads
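The two constraints stated above (a power of 2, at most 32) can be checked in a few lines of plain Java. This is a hypothetical client-side helper, not part of the Elasticsearch Java client; the server performs its own validation.

```java
// Illustrative check for the num_threads constraints: positive, a power of
// two, and no greater than 32. A positive power of two has exactly one bit
// set, which Integer.bitCount makes easy to test.
public class NumThreadsCheck {

    static boolean isValidNumThreads(int numThreads) {
        return numThreads > 0
            && numThreads <= 32
            && Integer.bitCount(numThreads) == 1;
    }

    public static void main(String[] args) {
        System.out.println(isValidNumThreads(2));  // prints true
        System.out.println(isValidNumThreads(6));  // prints false (not a power of 2)
        System.out.println(isValidNumThreads(64)); // prints false (exceeds 32)
    }
}
```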
-
serialize
Serialize this object to JSON.
Specified by: serialize in interface JsonpSerializable
-
serializeInternal
-
toString
-
setupElasticsearchServiceSettingsDeserializer
protected static void setupElasticsearchServiceSettingsDeserializer(ObjectDeserializer<ElasticsearchServiceSettings.Builder> op)
-