Class StartTrainedModelDeploymentRequest

java.lang.Object
co.elastic.clients.elasticsearch._types.RequestBase
co.elastic.clients.elasticsearch.ml.StartTrainedModelDeploymentRequest
All Implemented Interfaces:
JsonpSerializable

@JsonpDeserializable public class StartTrainedModelDeploymentRequest extends RequestBase implements JsonpSerializable
Start a trained model deployment. It allocates the model to every machine learning node.
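A minimal sketch of building and sending this request with the client's fluent `of()` factory. The model and deployment IDs are hypothetical, and an already-configured `ElasticsearchClient` named `client` is assumed:

```java
// Sketch: start a deployment of a (hypothetical) PyTorch model.
// Assumes imports from co.elastic.clients.elasticsearch.ml and a configured client.
StartTrainedModelDeploymentRequest request = StartTrainedModelDeploymentRequest.of(b -> b
    .modelId("my-pytorch-model")     // required; hypothetical model ID
    .deploymentId("my-deployment")   // optional unique deployment ID
    .numberOfAllocations(1)
    .threadsPerAllocation(2)
);

client.ml().startTrainedModelDeployment(request);
```

Each builder method below corresponds to one request property; unset properties take their server-side defaults.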
  • Method Details

    • of

      public static StartTrainedModelDeploymentRequest of(Function<StartTrainedModelDeploymentRequest.Builder,ObjectBuilder<StartTrainedModelDeploymentRequest>> fn)

    • adaptiveAllocations

      @Nullable public final AdaptiveAllocationsSettings adaptiveAllocations()
      Adaptive allocations configuration. When enabled, the number of allocations is set based on the current load. If adaptive_allocations is enabled, do not set the number of allocations manually.

      API name: adaptive_allocations
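A hedged sketch of enabling adaptive allocations via the nested builder; the min/max bounds shown are illustrative values, and when this is set the number of allocations should not be set manually:

```java
// Sketch: let the deployment scale allocations with load.
// Bounds are hypothetical; note number_of_allocations is deliberately omitted.
StartTrainedModelDeploymentRequest request = StartTrainedModelDeploymentRequest.of(b -> b
    .modelId("my-pytorch-model")     // hypothetical model ID
    .adaptiveAllocations(a -> a
        .enabled(true)
        .minNumberOfAllocations(1)
        .maxNumberOfAllocations(4)
    )
);
```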

    • cacheSize

      @Nullable public final String cacheSize()
      The inference cache size (in memory outside the JVM heap) per node for the model. The default value is the same as model_size_bytes. To disable the cache, specify 0b.

      API name: cache_size

    • deploymentId

      @Nullable public final String deploymentId()
      A unique identifier for the deployment of the model.

      API name: deployment_id

    • modelId

      public final String modelId()
      Required - The unique identifier of the trained model. Currently, only PyTorch models are supported.

      API name: model_id

    • numberOfAllocations

      @Nullable public final Integer numberOfAllocations()
      The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases the throughput. If this setting is greater than the number of hardware threads, it is automatically changed to a value less than the number of hardware threads. If adaptive_allocations is enabled, do not set this value, because it is set automatically.

      API name: number_of_allocations

    • priority

      @Nullable public final TrainingPriority priority()
      The deployment priority.

      API name: priority

    • queueCapacity

      @Nullable public final Integer queueCapacity()
      Specifies the number of inference requests that are allowed in the queue. When the number of queued requests exceeds this value, new requests are rejected with a 429 error.

      API name: queue_capacity

    • threadsPerAllocation

      @Nullable public final Integer threadsPerAllocation()
      Sets the number of threads used by each model allocation during inference. Increasing this value generally increases inference speed. Inference is a compute-bound process, so any number greater than the number of available hardware threads on the machine does not increase inference speed. If this setting is greater than the number of hardware threads, it is automatically changed to a value less than the number of hardware threads.

      API name: threads_per_allocation
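The two settings above trade off against each other: more allocations improves throughput under concurrent load, while more threads per allocation speeds up each individual inference. A hedged sketch of both configurations (model ID and values are hypothetical):

```java
// Throughput-oriented: several allocations, each single-threaded,
// so concurrent requests are served in parallel.
StartTrainedModelDeploymentRequest throughput = StartTrainedModelDeploymentRequest.of(b -> b
    .modelId("my-pytorch-model")     // hypothetical model ID
    .numberOfAllocations(4)
    .threadsPerAllocation(1)
);

// Latency-oriented: one allocation using several threads,
// so each single inference completes faster.
StartTrainedModelDeploymentRequest latency = StartTrainedModelDeploymentRequest.of(b -> b
    .modelId("my-pytorch-model")
    .numberOfAllocations(1)
    .threadsPerAllocation(4)
);
```

In both cases the product of allocations and threads should stay at or below the node's available hardware threads, since the server clamps oversized values anyway.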

    • timeout

      @Nullable public final Time timeout()
      Specifies the amount of time to wait for the model to deploy.

      API name: timeout

    • waitFor

      @Nullable public final DeploymentAllocationState waitFor()
      Specifies the allocation status to wait for before returning.

      API name: wait_for
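The timeout and wait_for settings combine naturally: a sketch, assuming a hypothetical model ID, that blocks until the deployment is fully allocated or the timeout elapses:

```java
// Sketch: wait up to 45s for the model to be fully allocated before returning.
StartTrainedModelDeploymentRequest request = StartTrainedModelDeploymentRequest.of(b -> b
    .modelId("my-pytorch-model")                        // hypothetical model ID
    .timeout(t -> t.time("45s"))                        // give the deployment up to 45s
    .waitFor(DeploymentAllocationState.FullyAllocated)  // return only once fully allocated
);
```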

    • serialize

      public void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
      Serialize this object to JSON.
      Specified by:
      serialize in interface JsonpSerializable
    • serializeInternal

      protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
    • setupStartTrainedModelDeploymentRequestDeserializer

      protected static void setupStartTrainedModelDeploymentRequestDeserializer(ObjectDeserializer<StartTrainedModelDeploymentRequest.Builder> op)