Class ElasticsearchInferenceClient
- All Implemented Interfaces:
Closeable, AutoCloseable
-
Field Summary
Fields inherited from class co.elastic.clients.ApiClient
transport, transportOptions
-
Constructor Summary
Constructors
ElasticsearchInferenceClient(ElasticsearchTransport transport, TransportOptions transportOptions)
-
Method Summary
BinaryResponse chatCompletionUnified(ChatCompletionUnifiedRequest request)
final BinaryResponse chatCompletionUnified(Function<ChatCompletionUnifiedRequest.Builder, ObjectBuilder<ChatCompletionUnifiedRequest>> fn)
    Perform chat completion inference
CompletionResponse completion(CompletionRequest request)
final CompletionResponse completion(Function<CompletionRequest.Builder, ObjectBuilder<CompletionRequest>> fn)
    Perform completion inference on the service
DeleteInferenceResponse delete(DeleteInferenceRequest request)
final DeleteInferenceResponse delete(Function<DeleteInferenceRequest.Builder, ObjectBuilder<DeleteInferenceRequest>> fn)
    Delete an inference endpoint
GetInferenceResponse get()
GetInferenceResponse get(GetInferenceRequest request)
final GetInferenceResponse get(Function<GetInferenceRequest.Builder, ObjectBuilder<GetInferenceRequest>> fn)
    Get an inference endpoint
PutResponse put(PutRequest request)
final PutResponse put(Function<PutRequest.Builder, ObjectBuilder<PutRequest>> fn)
    Create an inference endpoint.
PutAlibabacloudResponse putAlibabacloud(PutAlibabacloudRequest request)
final PutAlibabacloudResponse putAlibabacloud(Function<PutAlibabacloudRequest.Builder, ObjectBuilder<PutAlibabacloudRequest>> fn)
    Create an AlibabaCloud AI Search inference endpoint.
PutAmazonbedrockResponse putAmazonbedrock(PutAmazonbedrockRequest request)
final PutAmazonbedrockResponse putAmazonbedrock(Function<PutAmazonbedrockRequest.Builder, ObjectBuilder<PutAmazonbedrockRequest>> fn)
    Create an Amazon Bedrock inference endpoint.
PutAnthropicResponse putAnthropic(PutAnthropicRequest request)
final PutAnthropicResponse putAnthropic(Function<PutAnthropicRequest.Builder, ObjectBuilder<PutAnthropicRequest>> fn)
    Create an Anthropic inference endpoint.
PutAzureaistudioResponse putAzureaistudio(PutAzureaistudioRequest request)
final PutAzureaistudioResponse putAzureaistudio(Function<PutAzureaistudioRequest.Builder, ObjectBuilder<PutAzureaistudioRequest>> fn)
    Create an Azure AI studio inference endpoint.
PutAzureopenaiResponse putAzureopenai(PutAzureopenaiRequest request)
final PutAzureopenaiResponse putAzureopenai(Function<PutAzureopenaiRequest.Builder, ObjectBuilder<PutAzureopenaiRequest>> fn)
    Create an Azure OpenAI inference endpoint.
PutCohereResponse putCohere(PutCohereRequest request)
final PutCohereResponse putCohere(Function<PutCohereRequest.Builder, ObjectBuilder<PutCohereRequest>> fn)
    Create a Cohere inference endpoint.
PutElasticsearchResponse putElasticsearch(PutElasticsearchRequest request)
final PutElasticsearchResponse putElasticsearch(Function<PutElasticsearchRequest.Builder, ObjectBuilder<PutElasticsearchRequest>> fn)
    Create an Elasticsearch inference endpoint.
PutElserResponse putElser(PutElserRequest request)
final PutElserResponse putElser(Function<PutElserRequest.Builder, ObjectBuilder<PutElserRequest>> fn)
    Create an ELSER inference endpoint.
PutGoogleaistudioResponse putGoogleaistudio(PutGoogleaistudioRequest request)
final PutGoogleaistudioResponse putGoogleaistudio(Function<PutGoogleaistudioRequest.Builder, ObjectBuilder<PutGoogleaistudioRequest>> fn)
    Create a Google AI Studio inference endpoint.
PutGooglevertexaiResponse putGooglevertexai(PutGooglevertexaiRequest request)
final PutGooglevertexaiResponse putGooglevertexai(Function<PutGooglevertexaiRequest.Builder, ObjectBuilder<PutGooglevertexaiRequest>> fn)
    Create a Google Vertex AI inference endpoint.
PutHuggingFaceResponse putHuggingFace(PutHuggingFaceRequest request)
final PutHuggingFaceResponse putHuggingFace(Function<PutHuggingFaceRequest.Builder, ObjectBuilder<PutHuggingFaceRequest>> fn)
    Create a Hugging Face inference endpoint.
PutJinaaiResponse putJinaai(PutJinaaiRequest request)
final PutJinaaiResponse putJinaai(Function<PutJinaaiRequest.Builder, ObjectBuilder<PutJinaaiRequest>> fn)
    Create a JinaAI inference endpoint.
PutMistralResponse putMistral(PutMistralRequest request)
final PutMistralResponse putMistral(Function<PutMistralRequest.Builder, ObjectBuilder<PutMistralRequest>> fn)
    Create a Mistral inference endpoint.
PutOpenaiResponse putOpenai(PutOpenaiRequest request)
final PutOpenaiResponse putOpenai(Function<PutOpenaiRequest.Builder, ObjectBuilder<PutOpenaiRequest>> fn)
    Create an OpenAI inference endpoint.
PutVoyageaiResponse putVoyageai(PutVoyageaiRequest request)
final PutVoyageaiResponse putVoyageai(Function<PutVoyageaiRequest.Builder, ObjectBuilder<PutVoyageaiRequest>> fn)
    Create a VoyageAI inference endpoint.
PutWatsonxResponse putWatsonx(PutWatsonxRequest request)
final PutWatsonxResponse putWatsonx(Function<PutWatsonxRequest.Builder, ObjectBuilder<PutWatsonxRequest>> fn)
    Create a Watsonx inference endpoint.
RerankResponse rerank(RerankRequest request)
final RerankResponse rerank(Function<RerankRequest.Builder, ObjectBuilder<RerankRequest>> fn)
    Perform reranking inference on the service
SparseEmbeddingResponse sparseEmbedding(SparseEmbeddingRequest request)
final SparseEmbeddingResponse sparseEmbedding(Function<SparseEmbeddingRequest.Builder, ObjectBuilder<SparseEmbeddingRequest>> fn)
    Perform sparse embedding inference on the service
BinaryResponse streamCompletion(StreamCompletionRequest request)
final BinaryResponse streamCompletion(Function<StreamCompletionRequest.Builder, ObjectBuilder<StreamCompletionRequest>> fn)
    Perform streaming inference.
TextEmbeddingResponse textEmbedding(TextEmbeddingRequest request)
final TextEmbeddingResponse textEmbedding(Function<TextEmbeddingRequest.Builder, ObjectBuilder<TextEmbeddingRequest>> fn)
    Perform text embedding inference on the service
UpdateInferenceResponse update(UpdateInferenceRequest request)
final UpdateInferenceResponse update(Function<UpdateInferenceRequest.Builder, ObjectBuilder<UpdateInferenceRequest>> fn)
    Update an inference endpoint.
ElasticsearchInferenceClient withTransportOptions(TransportOptions transportOptions)
    Creates a new client with some request options
Methods inherited from class co.elastic.clients.ApiClient
_jsonpMapper, _transport, _transportOptions, close, getDeserializer, withTransportOptions
-
Constructor Details
-
ElasticsearchInferenceClient
-
ElasticsearchInferenceClient
public ElasticsearchInferenceClient(ElasticsearchTransport transport, @Nullable TransportOptions transportOptions)
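Example (non-normative): a minimal construction sketch using the standard low-level REST client and Jackson mapper from the Java client; the host is a placeholder.

    import co.elastic.clients.json.jackson.JacksonJsonpMapper;
    import co.elastic.clients.transport.ElasticsearchTransport;
    import co.elastic.clients.transport.rest_client.RestClientTransport;
    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;

    // Low-level REST client pointed at a placeholder local node.
    RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200)).build();
    // Transport pairing the REST client with a JSON mapper.
    ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
    // transportOptions is @Nullable; passing null uses the transport's defaults.
    ElasticsearchInferenceClient client = new ElasticsearchInferenceClient(transport, null);

If you already hold an ElasticsearchClient, its inference() accessor typically returns this same client type.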
-
-
Method Details
-
withTransportOptions
public ElasticsearchInferenceClient withTransportOptions(@Nullable TransportOptions transportOptions)
Description copied from class: ApiClient
Creates a new client with some request options
- Specified by:
withTransportOptions in class ApiClient<ElasticsearchTransport, ElasticsearchInferenceClient>
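Example (non-normative): deriving a sibling client that sends an extra header; the original client is unchanged. The toBuilder()/addHeader calls are assumed from the TransportOptions interface, so verify them against your client version.

    ElasticsearchInferenceClient traced = client.withTransportOptions(
        client._transportOptions().toBuilder()
            .addHeader("X-Opaque-Id", "inference-docs") // tag requests for audit logs
            .build());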
-
chatCompletionUnified
public BinaryResponse chatCompletionUnified(ChatCompletionUnifiedRequest request) throws IOException, ElasticsearchException
Perform chat completion inference
- Throws:
IOException
ElasticsearchException
- See Also:
-
chatCompletionUnified
public final BinaryResponse chatCompletionUnified(Function<ChatCompletionUnifiedRequest.Builder, ObjectBuilder<ChatCompletionUnifiedRequest>> fn) throws IOException, ElasticsearchException
Perform chat completion inference
- Parameters:
fn - a function that initializes a builder to create the ChatCompletionUnifiedRequest
- Throws:
IOException
ElasticsearchException
- See Also:
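Example (non-normative): a hedged sketch of the builder-lambda overload. The endpoint id is hypothetical, and the inferenceId/messages/content accessor names are assumptions based on the REST chat completion body; verify them on ChatCompletionUnifiedRequest.Builder.

    BinaryResponse rsp = client.chatCompletionUnified(r -> r
        .inferenceId("my-chat-endpoint")  // hypothetical endpoint id; accessor name assumed
        .messages(m -> m                  // list accessor assumed to add one message per call
            .role("user")
            .content(c -> c.textContent("Say hello in one sentence."))) // content variant assumed
    );
    // The API streams server-sent events, so drain the raw body yourself.
    try (java.io.InputStream sse = rsp.content()) {
        System.out.println(new String(sse.readAllBytes(), java.nio.charset.StandardCharsets.UTF_8));
    }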
-
completion
public CompletionResponse completion(CompletionRequest request) throws IOException, ElasticsearchException
Perform completion inference on the service
- Throws:
IOException
ElasticsearchException
- See Also:
-
completion
public final CompletionResponse completion(Function<CompletionRequest.Builder, ObjectBuilder<CompletionRequest>> fn) throws IOException, ElasticsearchException
Perform completion inference on the service
- Parameters:
fn - a function that initializes a builder to create the CompletionRequest
- Throws:
IOException
ElasticsearchException
- See Also:
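Example (non-normative): a minimal sketch, assuming CompletionRequest exposes inferenceId and input accessors mirroring POST /_inference/completion/{inference_id}; the endpoint id is hypothetical.

    CompletionResponse completion = client.completion(c -> c
        .inferenceId("my-completion-endpoint")  // hypothetical endpoint id
        .input("Summarize the plot of Hamlet in one sentence.")
    );
    // Response accessors assumed from the REST shape {"completion":[{"result": ...}]}.
    completion.completion().forEach(part -> System.out.println(part.result()));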
-
delete
public DeleteInferenceResponse delete(DeleteInferenceRequest request) throws IOException, ElasticsearchException
Delete an inference endpoint
- Throws:
IOException
ElasticsearchException
- See Also:
-
delete
public final DeleteInferenceResponse delete(Function<DeleteInferenceRequest.Builder, ObjectBuilder<DeleteInferenceRequest>> fn) throws IOException, ElasticsearchException
Delete an inference endpoint
- Parameters:
fn - a function that initializes a builder to create the DeleteInferenceRequest
- Throws:
IOException
ElasticsearchException
- See Also:
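Example (non-normative): a minimal deletion sketch; the endpoint id is hypothetical and the inferenceId/acknowledged accessor names are assumed from the REST API shape.

    DeleteInferenceResponse deleted = client.delete(d -> d
        .inferenceId("my-completion-endpoint")  // endpoint to remove; accessor name assumed
    );
    System.out.println("acknowledged: " + deleted.acknowledged());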
-
get
public GetInferenceResponse get(GetInferenceRequest request) throws IOException, ElasticsearchException
Get an inference endpoint
- Throws:
IOException
ElasticsearchException
- See Also:
-
get
public final GetInferenceResponse get(Function<GetInferenceRequest.Builder, ObjectBuilder<GetInferenceRequest>> fn) throws IOException, ElasticsearchException
Get an inference endpoint
- Parameters:
fn - a function that initializes a builder to create the GetInferenceRequest
- Throws:
IOException
ElasticsearchException
- See Also:
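Example (non-normative): a lookup sketch; get() without arguments lists all endpoints, while an id-scoped request returns a single one. Accessor names are assumed from the REST shape {"endpoints":[...]}.

    GetInferenceResponse found = client.get(g -> g
        .inferenceId("my-completion-endpoint")  // hypothetical endpoint id; accessor name assumed
    );
    found.endpoints().forEach(e -> System.out.println(e.inferenceId() + " uses " + e.service()));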
-
get
public GetInferenceResponse get() throws IOException, ElasticsearchException
Get an inference endpoint
- Throws:
IOException
ElasticsearchException
- See Also:
-
put
public PutResponse put(PutRequest request) throws IOException, ElasticsearchException
Create an inference endpoint.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- Throws:
IOException
ElasticsearchException
- See Also:
-
put
public final PutResponse put(Function<PutRequest.Builder, ObjectBuilder<PutRequest>> fn) throws IOException, ElasticsearchException
Create an inference endpoint.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- Parameters:
fn - a function that initializes a builder to create the PutRequest
- Throws:
IOException
ElasticsearchException
- See Also:
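Example (non-normative): a generic creation sketch. The builder is assumed to mirror PUT /_inference/{task_type}/{inference_id}: a task type, an endpoint id, and an inference config carrying service and service settings. The taskType constant, inferenceConfig accessor, and the ELSER settings below are assumptions; check PutRequest.Builder for the real names.

    import co.elastic.clients.json.JsonData;

    PutResponse created = client.put(p -> p
        .taskType(TaskType.SparseEmbedding)   // enum constant name assumed
        .inferenceId("my-elser-endpoint")     // hypothetical endpoint id
        .inferenceConfig(cfg -> cfg           // body accessor assumed to mirror inference_config
            .service("elser")
            .serviceSettings(JsonData.fromJson("{\"num_allocations\": 1, \"num_threads\": 1}")))
    );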
-
putAlibabacloud
public PutAlibabacloudResponse putAlibabacloud(PutAlibabacloudRequest request) throws IOException, ElasticsearchException
Create an AlibabaCloud AI Search inference endpoint.
Create an inference endpoint to perform an inference task with the alibabacloud-ai-search service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putAlibabacloud
public final PutAlibabacloudResponse putAlibabacloud(Function<PutAlibabacloudRequest.Builder, ObjectBuilder<PutAlibabacloudRequest>> fn) throws IOException, ElasticsearchException
Create an AlibabaCloud AI Search inference endpoint.
Create an inference endpoint to perform an inference task with the alibabacloud-ai-search service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutAlibabacloudRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putAmazonbedrock
public PutAmazonbedrockResponse putAmazonbedrock(PutAmazonbedrockRequest request) throws IOException, ElasticsearchException
Create an Amazon Bedrock inference endpoint.
Creates an inference endpoint to perform an inference task with the amazonbedrock service.
info: You need to provide the access and secret keys only once, during the inference model creation. The get inference API does not retrieve your access or secret keys. After creating the inference model, you cannot change the associated key pairs. If you want to use a different access and secret key pair, delete the inference model and recreate it with the same name and the updated keys.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putAmazonbedrock
public final PutAmazonbedrockResponse putAmazonbedrock(Function<PutAmazonbedrockRequest.Builder, ObjectBuilder<PutAmazonbedrockRequest>> fn) throws IOException, ElasticsearchException
Create an Amazon Bedrock inference endpoint.
Creates an inference endpoint to perform an inference task with the amazonbedrock service.
info: You need to provide the access and secret keys only once, during the inference model creation. The get inference API does not retrieve your access or secret keys. After creating the inference model, you cannot change the associated key pairs. If you want to use a different access and secret key pair, delete the inference model and recreate it with the same name and the updated keys.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutAmazonbedrockRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putAnthropic
public PutAnthropicResponse putAnthropic(PutAnthropicRequest request) throws IOException, ElasticsearchException
Create an Anthropic inference endpoint.
Create an inference endpoint to perform an inference task with the anthropic service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putAnthropic
public final PutAnthropicResponse putAnthropic(Function<PutAnthropicRequest.Builder, ObjectBuilder<PutAnthropicRequest>> fn) throws IOException, ElasticsearchException
Create an Anthropic inference endpoint.
Create an inference endpoint to perform an inference task with the anthropic service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutAnthropicRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putAzureaistudio
public PutAzureaistudioResponse putAzureaistudio(PutAzureaistudioRequest request) throws IOException, ElasticsearchException
Create an Azure AI studio inference endpoint.
Create an inference endpoint to perform an inference task with the azureaistudio service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putAzureaistudio
public final PutAzureaistudioResponse putAzureaistudio(Function<PutAzureaistudioRequest.Builder, ObjectBuilder<PutAzureaistudioRequest>> fn) throws IOException, ElasticsearchException
Create an Azure AI studio inference endpoint.
Create an inference endpoint to perform an inference task with the azureaistudio service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutAzureaistudioRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putAzureopenai
public PutAzureopenaiResponse putAzureopenai(PutAzureopenaiRequest request) throws IOException, ElasticsearchException
Create an Azure OpenAI inference endpoint.
Create an inference endpoint to perform an inference task with the azureopenai service.
The list of chat completion models that you can choose from in your Azure OpenAI deployment includes:
The list of embeddings models that you can choose from in your deployment can be found in the Azure models documentation.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putAzureopenai
public final PutAzureopenaiResponse putAzureopenai(Function<PutAzureopenaiRequest.Builder, ObjectBuilder<PutAzureopenaiRequest>> fn) throws IOException, ElasticsearchException
Create an Azure OpenAI inference endpoint.
Create an inference endpoint to perform an inference task with the azureopenai service.
The list of chat completion models that you can choose from in your Azure OpenAI deployment includes:
The list of embeddings models that you can choose from in your deployment can be found in the Azure models documentation.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutAzureopenaiRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putCohere
public PutCohereResponse putCohere(PutCohereRequest request) throws IOException, ElasticsearchException
Create a Cohere inference endpoint.
Create an inference endpoint to perform an inference task with the cohere service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putCohere
public final PutCohereResponse putCohere(Function<PutCohereRequest.Builder, ObjectBuilder<PutCohereRequest>> fn) throws IOException, ElasticsearchException
Create a Cohere inference endpoint.
Create an inference endpoint to perform an inference task with the cohere service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutCohereRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putElasticsearch
public PutElasticsearchResponse putElasticsearch(PutElasticsearchRequest request) throws IOException, ElasticsearchException
Create an Elasticsearch inference endpoint.
Create an inference endpoint to perform an inference task with the elasticsearch service.
info: Your Elasticsearch deployment contains preconfigured ELSER and E5 inference endpoints; you only need to create the endpoints using the API if you want to customize the settings.
If you use the ELSER or the E5 model through the elasticsearch service, the API request will automatically download and deploy the model if it isn't downloaded yet.
info: You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.
After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putElasticsearch
public final PutElasticsearchResponse putElasticsearch(Function<PutElasticsearchRequest.Builder, ObjectBuilder<PutElasticsearchRequest>> fn) throws IOException, ElasticsearchException
Create an Elasticsearch inference endpoint.
Create an inference endpoint to perform an inference task with the elasticsearch service.
info: Your Elasticsearch deployment contains preconfigured ELSER and E5 inference endpoints; you only need to create the endpoints using the API if you want to customize the settings.
If you use the ELSER or the E5 model through the elasticsearch service, the API request will automatically download and deploy the model if it isn't downloaded yet.
info: You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.
After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutElasticsearchRequest
- Throws:
IOException
ElasticsearchException
- See Also:
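Example (non-normative): a sketch for the elasticsearch service deploying the built-in E5 model. The model id comes from the Elastic docs, but the path-parameter accessor, enum constant, and typed service-settings builder names are assumptions; check PutElasticsearchRequest.Builder for the real names.

    PutElasticsearchResponse e5 = client.putElasticsearch(r -> r
        .elasticsearchInferenceId("my-e5-endpoint")     // path-parameter accessor name assumed
        .taskType(ElasticsearchTaskType.TextEmbedding)  // enum constant name assumed
        .service("elasticsearch")
        .serviceSettings(s -> s
            .modelId(".multilingual-e5-small")          // built-in E5 model
            .numAllocations(1)
            .numThreads(1))
    );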
-
putElser
public PutElserResponse putElser(PutElserRequest request) throws IOException, ElasticsearchException
Create an ELSER inference endpoint.
Create an inference endpoint to perform an inference task with the elser service. You can also deploy ELSER by using the Elasticsearch inference integration.
info: Your Elasticsearch deployment contains a preconfigured ELSER inference endpoint; you only need to create the endpoint using the API if you want to customize the settings.
The API request will automatically download and deploy the ELSER model if it isn't already downloaded.
info: You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.
After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putElser
public final PutElserResponse putElser(Function<PutElserRequest.Builder, ObjectBuilder<PutElserRequest>> fn) throws IOException, ElasticsearchException
Create an ELSER inference endpoint.
Create an inference endpoint to perform an inference task with the elser service. You can also deploy ELSER by using the Elasticsearch inference integration.
info: Your Elasticsearch deployment contains a preconfigured ELSER inference endpoint; you only need to create the endpoint using the API if you want to customize the settings.
The API request will automatically download and deploy the ELSER model if it isn't already downloaded.
info: You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.
After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutElserRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putGoogleaistudio
public PutGoogleaistudioResponse putGoogleaistudio(PutGoogleaistudioRequest request) throws IOException, ElasticsearchException
Create a Google AI Studio inference endpoint.
Create an inference endpoint to perform an inference task with the googleaistudio service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putGoogleaistudio
public final PutGoogleaistudioResponse putGoogleaistudio(Function<PutGoogleaistudioRequest.Builder, ObjectBuilder<PutGoogleaistudioRequest>> fn) throws IOException, ElasticsearchException
Create a Google AI Studio inference endpoint.
Create an inference endpoint to perform an inference task with the googleaistudio service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutGoogleaistudioRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putGooglevertexai
public PutGooglevertexaiResponse putGooglevertexai(PutGooglevertexaiRequest request) throws IOException, ElasticsearchException
Create a Google Vertex AI inference endpoint.
Create an inference endpoint to perform an inference task with the googlevertexai service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putGooglevertexai
public final PutGooglevertexaiResponse putGooglevertexai(Function<PutGooglevertexaiRequest.Builder, ObjectBuilder<PutGooglevertexaiRequest>> fn) throws IOException, ElasticsearchException
Create a Google Vertex AI inference endpoint.
Create an inference endpoint to perform an inference task with the googlevertexai service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutGooglevertexaiRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putHuggingFace
public PutHuggingFaceResponse putHuggingFace(PutHuggingFaceRequest request) throws IOException, ElasticsearchException
Create a Hugging Face inference endpoint.
Create an inference endpoint to perform an inference task with the hugging_face service.
You must first create an inference endpoint on the Hugging Face endpoint page to get an endpoint URL. Select the model you want to use on the new endpoint creation page (for example intfloat/e5-small-v2), then select the sentence embeddings task under the advanced configuration section. Create the endpoint and copy the URL after endpoint initialization has finished.
The following models are recommended for the Hugging Face service:
all-MiniLM-L6-v2
all-MiniLM-L12-v2
all-mpnet-base-v2
e5-base-v2
e5-small-v2
multilingual-e5-base
multilingual-e5-small
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putHuggingFace
public final PutHuggingFaceResponse putHuggingFace(Function<PutHuggingFaceRequest.Builder, ObjectBuilder<PutHuggingFaceRequest>> fn) throws IOException, ElasticsearchException
Create a Hugging Face inference endpoint.
Create an inference endpoint to perform an inference task with the hugging_face service.
You must first create an inference endpoint on the Hugging Face endpoint page to get an endpoint URL. Select the model you want to use on the new endpoint creation page (for example intfloat/e5-small-v2), then select the sentence embeddings task under the advanced configuration section. Create the endpoint and copy the URL after endpoint initialization has finished.
The following models are recommended for the Hugging Face service:
all-MiniLM-L6-v2
all-MiniLM-L12-v2
all-mpnet-base-v2
e5-base-v2
e5-small-v2
multilingual-e5-base
multilingual-e5-small
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutHuggingFaceRequest
- Throws:
IOException
ElasticsearchException
- See Also:
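Example (non-normative): a sketch wiring in a Hugging Face endpoint URL copied from the endpoint page, per the steps above. The URL and API key are placeholders, and the path-parameter accessor, enum constant, and service-settings builder names are assumptions; verify them on PutHuggingFaceRequest.Builder.

    PutHuggingFaceResponse hf = client.putHuggingFace(r -> r
        .huggingfaceInferenceId("my-hf-embeddings")   // path-parameter accessor name assumed
        .taskType(HuggingFaceTaskType.TextEmbedding)  // enum constant name assumed
        .service("hugging_face")
        .serviceSettings(s -> s
            .url("https://my-endpoint.endpoints.huggingface.cloud") // placeholder endpoint URL
            .apiKey("hf-api-key-placeholder"))                      // placeholder secret
    );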
-
putJinaai
public PutJinaaiResponse putJinaai(PutJinaaiRequest request) throws IOException, ElasticsearchException
Create a JinaAI inference endpoint.
Create an inference endpoint to perform an inference task with the jinaai service.
To review the available rerank models, refer to https://jina.ai/reranker. To review the available text_embedding models, refer to https://jina.ai/embeddings/.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putJinaai
public final PutJinaaiResponse putJinaai(Function<PutJinaaiRequest.Builder, ObjectBuilder<PutJinaaiRequest>> fn) throws IOException, ElasticsearchException
Create a JinaAI inference endpoint.
Create an inference endpoint to perform an inference task with the jinaai service.
To review the available rerank models, refer to https://jina.ai/reranker. To review the available text_embedding models, refer to https://jina.ai/embeddings/.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutJinaaiRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putMistral
public PutMistralResponse putMistral(PutMistralRequest request) throws IOException, ElasticsearchException
Create a Mistral inference endpoint.
Creates an inference endpoint to perform an inference task with the mistral service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putMistral
public final PutMistralResponse putMistral(Function<PutMistralRequest.Builder, ObjectBuilder<PutMistralRequest>> fn) throws IOException, ElasticsearchException
Create a Mistral inference endpoint.
Creates an inference endpoint to perform an inference task with the mistral service.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutMistralRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putOpenai
public PutOpenaiResponse putOpenai(PutOpenaiRequest request) throws IOException, ElasticsearchException
Create an OpenAI inference endpoint.
Create an inference endpoint to perform an inference task with the openai service or openai-compatible APIs.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putOpenai
public final PutOpenaiResponse putOpenai(Function<PutOpenaiRequest.Builder, ObjectBuilder<PutOpenaiRequest>> fn) throws IOException, ElasticsearchException
Create an OpenAI inference endpoint.
Create an inference endpoint to perform an inference task with the openai service or openai-compatible APIs.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutOpenaiRequest
- Throws:
IOException
ElasticsearchException
- See Also:
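Example (non-normative): a sketch for the openai service. The model id and API key are placeholders, and the path-parameter accessor, enum constant, and service-settings builder names are assumptions; verify them on PutOpenaiRequest.Builder.

    PutOpenaiResponse oa = client.putOpenai(r -> r
        .openaiInferenceId("my-openai-completion")  // path-parameter accessor name assumed
        .taskType(OpenaiTaskType.Completion)        // enum constant name assumed
        .service("openai")
        .serviceSettings(s -> s
            .modelId("gpt-4o-mini")                 // placeholder model id
            .apiKey("openai-api-key-placeholder"))  // placeholder secret
    );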
-
putVoyageai
public PutVoyageaiResponse putVoyageai(PutVoyageaiRequest request) throws IOException, ElasticsearchException
Create a VoyageAI inference endpoint.
Create an inference endpoint to perform an inference task with the voyageai service.
Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putVoyageai
public final PutVoyageaiResponse putVoyageai(Function<PutVoyageaiRequest.Builder, ObjectBuilder<PutVoyageaiRequest>> fn) throws IOException, ElasticsearchException
Create a VoyageAI inference endpoint.
Create an inference endpoint to perform an inference task with the voyageai service.
Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutVoyageaiRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
putWatsonx
public PutWatsonxResponse putWatsonx(PutWatsonxRequest request) throws IOException, ElasticsearchException
Create a Watsonx inference endpoint.
Create an inference endpoint to perform an inference task with the watsonxai service. You need an IBM Cloud Databases for Elasticsearch deployment to use the watsonxai inference service. You can provision one through the IBM catalog, the Cloud Databases CLI plug-in, the Cloud Databases API, or Terraform.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Throws:
IOException
ElasticsearchException
- See Also:
-
putWatsonx
public final PutWatsonxResponse putWatsonx(Function<PutWatsonxRequest.Builder, ObjectBuilder<PutWatsonxRequest>> fn) throws IOException, ElasticsearchException
Create a Watsonx inference endpoint.
Create an inference endpoint to perform an inference task with the watsonxai service. You need an IBM Cloud Databases for Elasticsearch deployment to use the watsonxai inference service. You can provision one through the IBM catalog, the Cloud Databases CLI plug-in, the Cloud Databases API, or Terraform.
When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutWatsonxRequest
- Throws:
IOException
ElasticsearchException
- See Also:
-
rerank
public RerankResponse rerank(RerankRequest request) throws IOException, ElasticsearchException
Perform reranking inference on the service
- Throws:
IOException
ElasticsearchException
- See Also:
-
rerank
public final RerankResponse rerank(Function<RerankRequest.Builder, ObjectBuilder<RerankRequest>> fn) throws IOException, ElasticsearchException
Perform reranking inference on the service
- Parameters:
fn - a function that initializes a builder to create the RerankRequest
- Throws:
IOException
ElasticsearchException
- See Also:
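Example (non-normative): a reranking sketch; the endpoint id is hypothetical and the request/response accessors are assumed from the REST shape {"rerank":[{"index":0,"relevance_score":...}]}.

    RerankResponse reranked = client.rerank(r -> r
        .inferenceId("my-rerank-endpoint")  // hypothetical endpoint id
        .query("indoor plants that tolerate low light")
        .input(java.util.List.of("Ficus care guide", "Snake plant overview", "Tomato growing tips"))
    );
    reranked.rerank().forEach(doc -> System.out.println(doc.index() + " -> " + doc.relevanceScore()));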
-
sparseEmbedding
public SparseEmbeddingResponse sparseEmbedding(SparseEmbeddingRequest request) throws IOException, ElasticsearchException
Perform sparse embedding inference on the service
- Throws:
IOException
ElasticsearchException
- See Also:
-
sparseEmbedding
public final SparseEmbeddingResponse sparseEmbedding(Function<SparseEmbeddingRequest.Builder, ObjectBuilder<SparseEmbeddingRequest>> fn) throws IOException, ElasticsearchException
Perform sparse embedding inference on the service
- Parameters:
fn - a function that initializes a builder to create the SparseEmbeddingRequest
- Throws:
IOException
ElasticsearchException
- See Also:
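Example (non-normative): a sparse embedding sketch; the endpoint id is hypothetical and the accessor names are assumed from the REST shape, where each result holds token-to-weight pairs.

    SparseEmbeddingResponse sparse = client.sparseEmbedding(s -> s
        .inferenceId("my-elser-endpoint")  // hypothetical endpoint id
        .input("The quick brown fox")
    );
    // Token -> weight map for the first input; accessor names assumed.
    System.out.println(sparse.sparseEmbedding().get(0).embedding());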
-
streamCompletion
public BinaryResponse streamCompletion(StreamCompletionRequest request) throws IOException, ElasticsearchException
Perform streaming inference. Get real-time responses for completion tasks by delivering answers incrementally, reducing response times during computation. This API works only with the completion task type.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
This API requires the monitor_inference cluster privilege (the built-in inference_admin and inference_user roles grant this privilege). You must use a client that supports streaming.
- Throws:
IOException
ElasticsearchException
- See Also:
-
streamCompletion
public final BinaryResponse streamCompletion(Function<StreamCompletionRequest.Builder, ObjectBuilder<StreamCompletionRequest>> fn) throws IOException, ElasticsearchException
Perform streaming inference. Get real-time responses for completion tasks by delivering answers incrementally, reducing response times during computation. This API works only with the completion task type.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
This API requires the monitor_inference cluster privilege (the built-in inference_admin and inference_user roles grant this privilege). You must use a client that supports streaming.
- Parameters:
fn - a function that initializes a builder to create the StreamCompletionRequest
- Throws:
IOException
ElasticsearchException
- See Also:
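Example (non-normative): BinaryResponse hands back the raw server-sent-event stream, so the caller drains content() itself. The endpoint id and the inferenceId/input accessors are assumptions from the REST API shape.

    BinaryResponse stream = client.streamCompletion(r -> r
        .inferenceId("my-completion-endpoint")  // hypothetical endpoint id
        .input("Write a haiku about search engines.")
    );
    // Drain the server-sent-event stream line by line; each data: line is one chunk.
    try (java.io.BufferedReader reader = new java.io.BufferedReader(
            new java.io.InputStreamReader(stream.content(), java.nio.charset.StandardCharsets.UTF_8))) {
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith("data:")) System.out.println(line.substring(5).trim());
        }
    }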
-
textEmbedding
public TextEmbeddingResponse textEmbedding(TextEmbeddingRequest request) throws IOException, ElasticsearchException
Perform text embedding inference on the service
- Throws:
IOException
ElasticsearchException
- See Also:
-
textEmbedding
public final TextEmbeddingResponse textEmbedding(Function<TextEmbeddingRequest.Builder, ObjectBuilder<TextEmbeddingRequest>> fn) throws IOException, ElasticsearchException
Perform text embedding inference on the service
- Parameters:
fn - a function that initializes a builder to create the TextEmbeddingRequest
- Throws:
IOException
ElasticsearchException
- See Also:
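Example (non-normative): a dense embedding sketch; the endpoint id is hypothetical and the response accessors are assumed from the REST shape {"text_embedding":[{"embedding":[...]}]}.

    TextEmbeddingResponse embedded = client.textEmbedding(t -> t
        .inferenceId("my-e5-endpoint")  // hypothetical endpoint id
        .input("vector search basics")
    );
    // Dense vector of the first input; accessor names assumed.
    System.out.println(embedded.textEmbedding().get(0).embedding().size());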
-
update
public UpdateInferenceResponse update(UpdateInferenceRequest request) throws IOException, ElasticsearchException
Update an inference endpoint.
Modify task_settings, secrets (within service_settings), or num_allocations for an inference endpoint, depending on the specific endpoint service and task_type.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- Throws:
IOException
ElasticsearchException
- See Also:
-
update
public final UpdateInferenceResponse update(Function<UpdateInferenceRequest.Builder, ObjectBuilder<UpdateInferenceRequest>> fn) throws IOException, ElasticsearchException
Update an inference endpoint.
Modify task_settings, secrets (within service_settings), or num_allocations for an inference endpoint, depending on the specific endpoint service and task_type.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- Parameters:
fn - a function that initializes a builder to create the UpdateInferenceRequest
- Throws:
IOException
ElasticsearchException
- See Also:
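Example (non-normative): a sketch that raises num_allocations on an existing endpoint, using JsonData as imported in the put sketch above. The inferenceConfig accessor is an assumption, whether the service name must be repeated is version-dependent, and the endpoint id is hypothetical; treat this purely as a shape sketch against UpdateInferenceRequest.Builder.

    UpdateInferenceResponse updated = client.update(u -> u
        .inferenceId("my-elser-endpoint")  // hypothetical endpoint id
        .inferenceConfig(cfg -> cfg        // body accessor assumed; only changed parts are sent
            .service("elser")              // repeating the service may be optional per version
            .serviceSettings(JsonData.fromJson("{\"num_allocations\": 2}")))
    );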
-