Class ElasticsearchInferenceAsyncClient
- All Implemented Interfaces:
Closeable, AutoCloseable
-
Field Summary
Fields inherited from class co.elastic.clients.ApiClient
transport, transportOptions
-
Constructor Summary
ElasticsearchInferenceAsyncClient(ElasticsearchTransport transport, TransportOptions transportOptions)
-
Method Summary
chatCompletionUnified(ChatCompletionUnifiedRequest request) - Perform chat completion inference
chatCompletionUnified(Function<ChatCompletionUnifiedRequest.Builder, ObjectBuilder<ChatCompletionUnifiedRequest>> fn) - Perform chat completion inference
completion(CompletionRequest request) - Perform completion inference on the service
completion(Function<CompletionRequest.Builder, ObjectBuilder<CompletionRequest>> fn) - Perform completion inference on the service
delete(DeleteInferenceRequest request) - Delete an inference endpoint
delete(Function<DeleteInferenceRequest.Builder, ObjectBuilder<DeleteInferenceRequest>> fn) - Delete an inference endpoint
get() - Get an inference endpoint
get(GetInferenceRequest request) - Get an inference endpoint
get(Function<GetInferenceRequest.Builder, ObjectBuilder<GetInferenceRequest>> fn) - Get an inference endpoint
inference(InferenceRequest request) - Perform inference on the service.
inference(Function<InferenceRequest.Builder, ObjectBuilder<InferenceRequest>> fn) - Perform inference on the service.
put(PutRequest request) - Create an inference endpoint.
put(Function<PutRequest.Builder, ObjectBuilder<PutRequest>> fn) - Create an inference endpoint.
putAlibabacloud(PutAlibabacloudRequest request) - Create an AlibabaCloud AI Search inference endpoint.
putAlibabacloud(Function<PutAlibabacloudRequest.Builder, ObjectBuilder<PutAlibabacloudRequest>> fn) - Create an AlibabaCloud AI Search inference endpoint.
putAmazonbedrock(PutAmazonbedrockRequest request) - Create an Amazon Bedrock inference endpoint.
putAmazonbedrock(Function<PutAmazonbedrockRequest.Builder, ObjectBuilder<PutAmazonbedrockRequest>> fn) - Create an Amazon Bedrock inference endpoint.
putAnthropic(PutAnthropicRequest request) - Create an Anthropic inference endpoint.
putAnthropic(Function<PutAnthropicRequest.Builder, ObjectBuilder<PutAnthropicRequest>> fn) - Create an Anthropic inference endpoint.
putAzureaistudio(PutAzureaistudioRequest request) - Create an Azure AI Studio inference endpoint.
putAzureaistudio(Function<PutAzureaistudioRequest.Builder, ObjectBuilder<PutAzureaistudioRequest>> fn) - Create an Azure AI Studio inference endpoint.
putAzureopenai(PutAzureopenaiRequest request) - Create an Azure OpenAI inference endpoint.
putAzureopenai(Function<PutAzureopenaiRequest.Builder, ObjectBuilder<PutAzureopenaiRequest>> fn) - Create an Azure OpenAI inference endpoint.
putCohere(PutCohereRequest request) - Create a Cohere inference endpoint.
putCohere(Function<PutCohereRequest.Builder, ObjectBuilder<PutCohereRequest>> fn) - Create a Cohere inference endpoint.
putElasticsearch(PutElasticsearchRequest request) - Create an Elasticsearch inference endpoint.
putElasticsearch(Function<PutElasticsearchRequest.Builder, ObjectBuilder<PutElasticsearchRequest>> fn) - Create an Elasticsearch inference endpoint.
putElser(PutElserRequest request) - Create an ELSER inference endpoint.
putElser(Function<PutElserRequest.Builder, ObjectBuilder<PutElserRequest>> fn) - Create an ELSER inference endpoint.
putGoogleaistudio(PutGoogleaistudioRequest request) - Create a Google AI Studio inference endpoint.
putGoogleaistudio(Function<PutGoogleaistudioRequest.Builder, ObjectBuilder<PutGoogleaistudioRequest>> fn) - Create a Google AI Studio inference endpoint.
putGooglevertexai(PutGooglevertexaiRequest request) - Create a Google Vertex AI inference endpoint.
putGooglevertexai(Function<PutGooglevertexaiRequest.Builder, ObjectBuilder<PutGooglevertexaiRequest>> fn) - Create a Google Vertex AI inference endpoint.
putHuggingFace(PutHuggingFaceRequest request) - Create a Hugging Face inference endpoint.
putHuggingFace(Function<PutHuggingFaceRequest.Builder, ObjectBuilder<PutHuggingFaceRequest>> fn) - Create a Hugging Face inference endpoint.
putJinaai(PutJinaaiRequest request) - Create a JinaAI inference endpoint.
putJinaai(Function<PutJinaaiRequest.Builder, ObjectBuilder<PutJinaaiRequest>> fn) - Create a JinaAI inference endpoint.
putMistral(PutMistralRequest request) - Create a Mistral inference endpoint.
putMistral(Function<PutMistralRequest.Builder, ObjectBuilder<PutMistralRequest>> fn) - Create a Mistral inference endpoint.
putOpenai(PutOpenaiRequest request) - Create an OpenAI inference endpoint.
putOpenai(Function<PutOpenaiRequest.Builder, ObjectBuilder<PutOpenaiRequest>> fn) - Create an OpenAI inference endpoint.
putVoyageai(PutVoyageaiRequest request) - Create a VoyageAI inference endpoint.
putVoyageai(Function<PutVoyageaiRequest.Builder, ObjectBuilder<PutVoyageaiRequest>> fn) - Create a VoyageAI inference endpoint.
putWatsonx(PutWatsonxRequest request) - Create a Watsonx inference endpoint.
putWatsonx(Function<PutWatsonxRequest.Builder, ObjectBuilder<PutWatsonxRequest>> fn) - Create a Watsonx inference endpoint.
rerank(RerankRequest request) - Perform reranking inference on the service
rerank(Function<RerankRequest.Builder, ObjectBuilder<RerankRequest>> fn) - Perform reranking inference on the service
sparseEmbedding(SparseEmbeddingRequest request) - Perform sparse embedding inference on the service
sparseEmbedding(Function<SparseEmbeddingRequest.Builder, ObjectBuilder<SparseEmbeddingRequest>> fn) - Perform sparse embedding inference on the service
streamCompletion(StreamCompletionRequest request) - Perform streaming inference.
streamCompletion(Function<StreamCompletionRequest.Builder, ObjectBuilder<StreamCompletionRequest>> fn) - Perform streaming inference.
textEmbedding(TextEmbeddingRequest request) - Perform text embedding inference on the service
textEmbedding(Function<TextEmbeddingRequest.Builder, ObjectBuilder<TextEmbeddingRequest>> fn) - Perform text embedding inference on the service
update(UpdateInferenceRequest request) - Update an inference endpoint.
update(Function<UpdateInferenceRequest.Builder, ObjectBuilder<UpdateInferenceRequest>> fn) - Update an inference endpoint.
withTransportOptions(TransportOptions transportOptions) - Creates a new client with some request options
Methods inherited from class co.elastic.clients.ApiClient
_jsonpMapper, _transport, _transportOptions, close, getDeserializer, withTransportOptions
-
Constructor Details
-
ElasticsearchInferenceAsyncClient
-
ElasticsearchInferenceAsyncClient
public ElasticsearchInferenceAsyncClient(ElasticsearchTransport transport, @Nullable TransportOptions transportOptions)
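The following is a minimal, hypothetical bootstrap sketch showing how an instance is typically obtained. The host, port, and variable names are placeholders; in most applications you would reach this client through ElasticsearchAsyncClient.inference() rather than calling the constructor directly. Later sketches on this page reuse the inferenceClient variable defined here.

    import co.elastic.clients.elasticsearch.ElasticsearchAsyncClient;
    import co.elastic.clients.json.jackson.JacksonJsonpMapper;
    import co.elastic.clients.transport.ElasticsearchTransport;
    import co.elastic.clients.transport.rest_client.RestClientTransport;
    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;

    // Low-level REST client pointed at a placeholder local node.
    RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200)).build();
    ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());

    // Direct construction; transportOptions is @Nullable, so null falls back to defaults.
    ElasticsearchInferenceAsyncClient inferenceClient =
        new ElasticsearchInferenceAsyncClient(transport, null);

    // Equivalent, more common route via the top-level async client.
    ElasticsearchInferenceAsyncClient viaParent = new ElasticsearchAsyncClient(transport).inference();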
-
-
Method Details
-
withTransportOptions
public ElasticsearchInferenceAsyncClient withTransportOptions(@Nullable TransportOptions transportOptions)
Description copied from class: ApiClient
Creates a new client with some request options
- Specified by: withTransportOptions in class ApiClient<ElasticsearchTransport, ElasticsearchInferenceAsyncClient>
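A short sketch of deriving a client with per-request options, reusing the inferenceClient from the constructor example. It assumes TransportOptions.toBuilder() and Builder.addHeader() are available in your client version; the header value is a placeholder.

    // _transportOptions() is inherited from ApiClient; the derived client adds a
    // header to every request while the original client is left unchanged.
    TransportOptions tagged = inferenceClient._transportOptions().toBuilder()
        .addHeader("X-Opaque-Id", "inference-demo")   // placeholder header value
        .build();
    ElasticsearchInferenceAsyncClient taggedClient = inferenceClient.withTransportOptions(tagged);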
-
chatCompletionUnified
public CompletableFuture<BinaryResponse> chatCompletionUnified(ChatCompletionUnifiedRequest request)
Perform chat completion inference
The chat completion inference API enables real-time responses for chat completion tasks by delivering answers incrementally, reducing response times during computation. It only works with the chat_completion task type for the openai and elastic inference services.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
NOTE: The chat_completion task type is only available within the _stream API and only supports streaming. The Chat completion inference API and the Stream inference API differ in their response structure and capabilities. The Chat completion inference API provides more comprehensive customization options through more fields and function calling support. If you use the openai service or the elastic service, use the Chat completion inference API.
- See Also:
-
chatCompletionUnified
public final CompletableFuture<BinaryResponse> chatCompletionUnified(Function<ChatCompletionUnifiedRequest.Builder, ObjectBuilder<ChatCompletionUnifiedRequest>> fn)
Perform chat completion inference
The chat completion inference API enables real-time responses for chat completion tasks by delivering answers incrementally, reducing response times during computation. It only works with the chat_completion task type for the openai and elastic inference services.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
NOTE: The chat_completion task type is only available within the _stream API and only supports streaming. The Chat completion inference API and the Stream inference API differ in their response structure and capabilities. The Chat completion inference API provides more comprehensive customization options through more fields and function calling support. If you use the openai service or the elastic service, use the Chat completion inference API.
- Parameters:
fn - a function that initializes a builder to create the ChatCompletionUnifiedRequest
- See Also:
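A hypothetical fragment using the builder overload, reusing the assumed inferenceClient. The builder names (inferenceId, messages, role, content) mirror the REST request body and should be checked against your client version; the endpoint id is a placeholder.

    inferenceClient.chatCompletionUnified(r -> r
            .inferenceId("my-chat-endpoint")                 // placeholder endpoint id
            .messages(m -> m
                .role("user")
                .content(c -> c.string("Say hello in one short sentence.")))
        )
        .thenAccept(binary -> {
            // The unified API streams server-sent events; read them line by line
            // from the raw BinaryResponse content.
            try (BufferedReader events = new BufferedReader(
                    new InputStreamReader(binary.content(), StandardCharsets.UTF_8))) {
                events.lines().forEach(System.out::println);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });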
-
completion
public CompletableFuture<CompletionResponse> completion(CompletionRequest request)
Perform completion inference on the service
- See Also:
-
completion
public final CompletableFuture<CompletionResponse> completion(Function<CompletionRequest.Builder, ObjectBuilder<CompletionRequest>> fn)
Perform completion inference on the service
- Parameters:
fn - a function that initializes a builder to create the CompletionRequest
- See Also:
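A fragment sketching the non-streaming variant against an assumed existing endpoint; inferenceId/input and the completion()/result() accessors are assumptions derived from the REST request and response shapes.

    inferenceClient.completion(c -> c
            .inferenceId("my-completion-endpoint")           // placeholder endpoint id
            .input("Summarize what an inference endpoint is.")
        )
        .thenAccept(resp -> resp.completion()
            .forEach(r -> System.out.println(r.result())));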
-
delete
public CompletableFuture<DeleteInferenceResponse> delete(DeleteInferenceRequest request)
Delete an inference endpoint
- See Also:
-
delete
public final CompletableFuture<DeleteInferenceResponse> delete(Function<DeleteInferenceRequest.Builder, ObjectBuilder<DeleteInferenceRequest>> fn)
Delete an inference endpoint
- Parameters:
fn - a function that initializes a builder to create the DeleteInferenceRequest
- See Also:
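A fragment sketching endpoint deletion; the acknowledged() accessor is assumed from the standard acknowledged-response shape.

    inferenceClient.delete(d -> d.inferenceId("my-completion-endpoint"))   // placeholder id
        .thenAccept(resp -> System.out.println("acknowledged: " + resp.acknowledged()));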
-
get
public CompletableFuture<GetInferenceResponse> get(GetInferenceRequest request)
Get an inference endpoint
- See Also:
-
get
public final CompletableFuture<GetInferenceResponse> get(Function<GetInferenceRequest.Builder, ObjectBuilder<GetInferenceRequest>> fn)
Get an inference endpoint
- Parameters:
fn - a function that initializes a builder to create the GetInferenceRequest
- See Also:
-
get
public CompletableFuture<GetInferenceResponse> get()
Get an inference endpoint
- See Also:
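A fragment sketching the no-argument overload, which lists all endpoints; the endpoints()/inferenceId() accessors are assumptions based on the REST response's endpoints array.

    inferenceClient.get()
        .thenAccept(resp -> resp.endpoints()
            .forEach(e -> System.out.println(e.inferenceId())));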
-
inference
public CompletableFuture<InferenceResponse> inference(InferenceRequest request)
Perform inference on the service.
This API enables you to use machine learning models to perform specific tasks on data that you provide as an input. It returns a response with the results of the tasks. The inference endpoint you use can perform one specific task that has been defined when the endpoint was created with the create inference API.
For details about using this API with a service, such as Amazon Bedrock, Anthropic, or HuggingFace, refer to the service-specific documentation.
NOTE: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- See Also:
-
inference
public final CompletableFuture<InferenceResponse> inference(Function<InferenceRequest.Builder, ObjectBuilder<InferenceRequest>> fn)
Perform inference on the service.
This API enables you to use machine learning models to perform specific tasks on data that you provide as an input. It returns a response with the results of the tasks. The inference endpoint you use can perform one specific task that has been defined when the endpoint was created with the create inference API.
For details about using this API with a service, such as Amazon Bedrock, Anthropic, or HuggingFace, refer to the service-specific documentation.
NOTE: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- Parameters:
fn - a function that initializes a builder to create the InferenceRequest
- See Also:
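A fragment sketching a generic call routed purely by endpoint id; the builder names are assumptions mirroring the REST body, and the response is printed wholesale since its shape depends on the endpoint's task type.

    inferenceClient.inference(i -> i
            .inferenceId("my-text-embedding-endpoint")       // placeholder endpoint id
            .input("The quick brown fox")
        )
        .thenAccept(resp -> System.out.println(resp));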
-
put
public CompletableFuture<PutResponse> put(PutRequest request)
Create an inference endpoint.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- See Also:
-
put
public final CompletableFuture<PutResponse> put(Function<PutRequest.Builder, ObjectBuilder<PutRequest>> fn)
Create an inference endpoint.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- Parameters:
fn - a function that initializes a builder to create the PutRequest
- See Also:
-
putAlibabacloud
public CompletableFuture<PutAlibabacloudResponse> putAlibabacloud(PutAlibabacloudRequest request)
Create an AlibabaCloud AI Search inference endpoint.
Create an inference endpoint to perform an inference task with the alibabacloud-ai-search service.
- See Also:
-
putAlibabacloud
public final CompletableFuture<PutAlibabacloudResponse> putAlibabacloud(Function<PutAlibabacloudRequest.Builder, ObjectBuilder<PutAlibabacloudRequest>> fn)
Create an AlibabaCloud AI Search inference endpoint.
Create an inference endpoint to perform an inference task with the alibabacloud-ai-search service.
- Parameters:
fn - a function that initializes a builder to create the PutAlibabacloudRequest
- See Also:
-
putAmazonbedrock
public CompletableFuture<PutAmazonbedrockResponse> putAmazonbedrock(PutAmazonbedrockRequest request)
Create an Amazon Bedrock inference endpoint.
Creates an inference endpoint to perform an inference task with the amazonbedrock service.
NOTE: You need to provide the access and secret keys only once, during the inference model creation. The get inference API does not retrieve your access or secret keys. After creating the inference model, you cannot change the associated key pairs. If you want to use a different access and secret key pair, delete the inference model and recreate it with the same name and the updated keys.
- See Also:
-
putAmazonbedrock
public final CompletableFuture<PutAmazonbedrockResponse> putAmazonbedrock(Function<PutAmazonbedrockRequest.Builder, ObjectBuilder<PutAmazonbedrockRequest>> fn)
Create an Amazon Bedrock inference endpoint.
Creates an inference endpoint to perform an inference task with the amazonbedrock service.
NOTE: You need to provide the access and secret keys only once, during the inference model creation. The get inference API does not retrieve your access or secret keys. After creating the inference model, you cannot change the associated key pairs. If you want to use a different access and secret key pair, delete the inference model and recreate it with the same name and the updated keys.
- Parameters:
fn - a function that initializes a builder to create the PutAmazonbedrockRequest
- See Also:
-
putAnthropic
public CompletableFuture<PutAnthropicResponse> putAnthropic(PutAnthropicRequest request)
Create an Anthropic inference endpoint.
Create an inference endpoint to perform an inference task with the anthropic service.
- See Also:
-
putAnthropic
public final CompletableFuture<PutAnthropicResponse> putAnthropic(Function<PutAnthropicRequest.Builder, ObjectBuilder<PutAnthropicRequest>> fn)
Create an Anthropic inference endpoint.
Create an inference endpoint to perform an inference task with the anthropic service.
- Parameters:
fn - a function that initializes a builder to create the PutAnthropicRequest
- See Also:
-
putAzureaistudio
public CompletableFuture<PutAzureaistudioResponse> putAzureaistudio(PutAzureaistudioRequest request)
Create an Azure AI Studio inference endpoint.
Create an inference endpoint to perform an inference task with the azureaistudio service.
- See Also:
-
putAzureaistudio
public final CompletableFuture<PutAzureaistudioResponse> putAzureaistudio(Function<PutAzureaistudioRequest.Builder, ObjectBuilder<PutAzureaistudioRequest>> fn)
Create an Azure AI Studio inference endpoint.
Create an inference endpoint to perform an inference task with the azureaistudio service.
- Parameters:
fn - a function that initializes a builder to create the PutAzureaistudioRequest
- See Also:
-
putAzureopenai
public CompletableFuture<PutAzureopenaiResponse> putAzureopenai(PutAzureopenaiRequest request)
Create an Azure OpenAI inference endpoint.
Create an inference endpoint to perform an inference task with the azureopenai service.
The lists of chat completion and embeddings models that you can choose from in your Azure OpenAI deployment can be found in the Azure models documentation.
- See Also:
-
putAzureopenai
public final CompletableFuture<PutAzureopenaiResponse> putAzureopenai(Function<PutAzureopenaiRequest.Builder, ObjectBuilder<PutAzureopenaiRequest>> fn)
Create an Azure OpenAI inference endpoint.
Create an inference endpoint to perform an inference task with the azureopenai service.
The lists of chat completion and embeddings models that you can choose from in your Azure OpenAI deployment can be found in the Azure models documentation.
- Parameters:
fn - a function that initializes a builder to create the PutAzureopenaiRequest
- See Also:
-
putCohere
public CompletableFuture<PutCohereResponse> putCohere(PutCohereRequest request)
Create a Cohere inference endpoint.
Create an inference endpoint to perform an inference task with the cohere service.
- See Also:
-
putCohere
public final CompletableFuture<PutCohereResponse> putCohere(Function<PutCohereRequest.Builder, ObjectBuilder<PutCohereRequest>> fn)
Create a Cohere inference endpoint.
Create an inference endpoint to perform an inference task with the cohere service.
- Parameters:
fn - a function that initializes a builder to create the PutCohereRequest
- See Also:
-
putElasticsearch
public CompletableFuture<PutElasticsearchResponse> putElasticsearch(PutElasticsearchRequest request)
Create an Elasticsearch inference endpoint.
Create an inference endpoint to perform an inference task with the elasticsearch service.
NOTE: Your Elasticsearch deployment contains preconfigured ELSER and E5 inference endpoints; you only need to create the endpoints using the API if you want to customize the settings.
If you use the ELSER or the E5 model through the elasticsearch service, the API request will automatically download and deploy the model if it isn't downloaded yet.
NOTE: You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.
After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- See Also:
-
putElasticsearch
public final CompletableFuture<PutElasticsearchResponse> putElasticsearch(Function<PutElasticsearchRequest.Builder, ObjectBuilder<PutElasticsearchRequest>> fn)
Create an Elasticsearch inference endpoint.
Create an inference endpoint to perform an inference task with the elasticsearch service.
NOTE: Your Elasticsearch deployment contains preconfigured ELSER and E5 inference endpoints; you only need to create the endpoints using the API if you want to customize the settings.
If you use the ELSER or the E5 model through the elasticsearch service, the API request will automatically download and deploy the model if it isn't downloaded yet.
NOTE: You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.
After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutElasticsearchRequest
- See Also:
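A fragment sketching a customized ELSER endpoint via the elasticsearch service, reusing the assumed inferenceClient. The builder and enum names (taskType, service, serviceSettings, modelId, numAllocations, numThreads, ElasticsearchTaskType) are assumptions patterned on the REST body; check the generated request classes for the exact methods, and treat the ids and sizing values as placeholders.

    inferenceClient.putElasticsearch(p -> p
            .inferenceId("my-elser-endpoint")                // placeholder endpoint id
            .taskType(ElasticsearchTaskType.SparseEmbedding) // enum name assumed
            .service("elasticsearch")
            .serviceSettings(ss -> ss
                .modelId(".elser_model_2")                   // placeholder model id
                .numAllocations(1)
                .numThreads(1))
        )
        .thenAccept(resp -> System.out.println("created: " + resp.inferenceId()));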
-
putElser
public CompletableFuture<PutElserResponse> putElser(PutElserRequest request)
Create an ELSER inference endpoint.
Create an inference endpoint to perform an inference task with the elser service. You can also deploy ELSER by using the Elasticsearch inference integration.
NOTE: Your Elasticsearch deployment contains a preconfigured ELSER inference endpoint; you only need to create the endpoint using the API if you want to customize the settings.
The API request will automatically download and deploy the ELSER model if it isn't already downloaded.
NOTE: You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.
After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- See Also:
-
putElser
public final CompletableFuture<PutElserResponse> putElser(Function<PutElserRequest.Builder, ObjectBuilder<PutElserRequest>> fn)
Create an ELSER inference endpoint.
Create an inference endpoint to perform an inference task with the elser service. You can also deploy ELSER by using the Elasticsearch inference integration.
NOTE: Your Elasticsearch deployment contains a preconfigured ELSER inference endpoint; you only need to create the endpoint using the API if you want to customize the settings.
The API request will automatically download and deploy the ELSER model if it isn't already downloaded.
NOTE: You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.
After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutElserRequest
- See Also:
-
putGoogleaistudio
public CompletableFuture<PutGoogleaistudioResponse> putGoogleaistudio(PutGoogleaistudioRequest request)
Create a Google AI Studio inference endpoint.
Create an inference endpoint to perform an inference task with the googleaistudio service.
- See Also:
-
putGoogleaistudio
public final CompletableFuture<PutGoogleaistudioResponse> putGoogleaistudio(Function<PutGoogleaistudioRequest.Builder, ObjectBuilder<PutGoogleaistudioRequest>> fn)
Create a Google AI Studio inference endpoint.
Create an inference endpoint to perform an inference task with the googleaistudio service.
- Parameters:
fn - a function that initializes a builder to create the PutGoogleaistudioRequest
- See Also:
-
putGooglevertexai
public CompletableFuture<PutGooglevertexaiResponse> putGooglevertexai(PutGooglevertexaiRequest request)
Create a Google Vertex AI inference endpoint.
Create an inference endpoint to perform an inference task with the googlevertexai service.
- See Also:
-
putGooglevertexai
public final CompletableFuture<PutGooglevertexaiResponse> putGooglevertexai(Function<PutGooglevertexaiRequest.Builder, ObjectBuilder<PutGooglevertexaiRequest>> fn)
Create a Google Vertex AI inference endpoint.
Create an inference endpoint to perform an inference task with the googlevertexai service.
- Parameters:
fn - a function that initializes a builder to create the PutGooglevertexaiRequest
- See Also:
-
putHuggingFace
public CompletableFuture<PutHuggingFaceResponse> putHuggingFace(PutHuggingFaceRequest request)
Create a Hugging Face inference endpoint.
Create an inference endpoint to perform an inference task with the hugging_face service.
You must first create an inference endpoint on the Hugging Face endpoint page to get an endpoint URL. Select the model you want to use on the new endpoint creation page (for example intfloat/e5-small-v2), then select the sentence embeddings task under the advanced configuration section. Create the endpoint and copy the URL after the endpoint initialization has been finished.
The following models are recommended for the Hugging Face service:
all-MiniLM-L6-v2
all-MiniLM-L12-v2
all-mpnet-base-v2
e5-base-v2
e5-small-v2
multilingual-e5-base
multilingual-e5-small
- See Also:
-
putHuggingFace
public final CompletableFuture<PutHuggingFaceResponse> putHuggingFace(Function<PutHuggingFaceRequest.Builder, ObjectBuilder<PutHuggingFaceRequest>> fn)
Create a Hugging Face inference endpoint.
Create an inference endpoint to perform an inference task with the hugging_face service.
You must first create an inference endpoint on the Hugging Face endpoint page to get an endpoint URL. Select the model you want to use on the new endpoint creation page (for example intfloat/e5-small-v2), then select the sentence embeddings task under the advanced configuration section. Create the endpoint and copy the URL after the endpoint initialization has been finished.
The following models are recommended for the Hugging Face service:
all-MiniLM-L6-v2
all-MiniLM-L12-v2
all-mpnet-base-v2
e5-base-v2
e5-small-v2
multilingual-e5-base
multilingual-e5-small
- Parameters:
fn - a function that initializes a builder to create the PutHuggingFaceRequest
- See Also:
-
putJinaai
public CompletableFuture<PutJinaaiResponse> putJinaai(PutJinaaiRequest request)
Create a JinaAI inference endpoint.
Create an inference endpoint to perform an inference task with the jinaai service.
To review the available rerank models, refer to https://jina.ai/reranker. To review the available text_embedding models, refer to https://jina.ai/embeddings/.
- See Also:
-
putJinaai
public final CompletableFuture<PutJinaaiResponse> putJinaai(Function<PutJinaaiRequest.Builder, ObjectBuilder<PutJinaaiRequest>> fn)
Create a JinaAI inference endpoint.
Create an inference endpoint to perform an inference task with the jinaai service.
To review the available rerank models, refer to https://jina.ai/reranker. To review the available text_embedding models, refer to https://jina.ai/embeddings/.
- Parameters:
fn - a function that initializes a builder to create the PutJinaaiRequest
- See Also:
-
putMistral
public CompletableFuture<PutMistralResponse> putMistral(PutMistralRequest request)
Create a Mistral inference endpoint.
Creates an inference endpoint to perform an inference task with the mistral service.
- See Also:
-
putMistral
public final CompletableFuture<PutMistralResponse> putMistral(Function<PutMistralRequest.Builder, ObjectBuilder<PutMistralRequest>> fn)
Create a Mistral inference endpoint.
Creates an inference endpoint to perform an inference task with the mistral service.
- Parameters:
fn - a function that initializes a builder to create the PutMistralRequest
- See Also:
-
putOpenai
public CompletableFuture<PutOpenaiResponse> putOpenai(PutOpenaiRequest request)
Create an OpenAI inference endpoint.
Create an inference endpoint to perform an inference task with the openai service or openai compatible APIs.
- See Also:
-
putOpenai
public final CompletableFuture<PutOpenaiResponse> putOpenai(Function<PutOpenaiRequest.Builder, ObjectBuilder<PutOpenaiRequest>> fn)
Create an OpenAI inference endpoint.
Create an inference endpoint to perform an inference task with the openai service or openai compatible APIs.
- Parameters:
fn - a function that initializes a builder to create the PutOpenaiRequest
- See Also:
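A fragment sketching an OpenAI text-embedding endpoint, reusing the assumed inferenceClient; the builder and enum names (OpenaiTaskType, apiKey, modelId) are assumptions patterned on the REST body, and the key and model values are placeholders.

    inferenceClient.putOpenai(p -> p
            .inferenceId("my-openai-embeddings")             // placeholder endpoint id
            .taskType(OpenaiTaskType.TextEmbedding)          // enum name assumed
            .service("openai")
            .serviceSettings(ss -> ss
                .apiKey("<OPENAI_API_KEY>")                  // placeholder secret
                .modelId("text-embedding-3-small"))
        )
        .thenAccept(resp -> System.out.println("created: " + resp.inferenceId()));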
-
putVoyageai
public CompletableFuture<PutVoyageaiResponse> putVoyageai(PutVoyageaiRequest request)
Create a VoyageAI inference endpoint.
Create an inference endpoint to perform an inference task with the voyageai service.
Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- See Also:
-
putVoyageai
public final CompletableFuture<PutVoyageaiResponse> putVoyageai(Function<PutVoyageaiRequest.Builder, ObjectBuilder<PutVoyageaiRequest>> fn)
Create a VoyageAI inference endpoint.
Create an inference endpoint to perform an inference task with the voyageai service.
Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.
- Parameters:
fn - a function that initializes a builder to create the PutVoyageaiRequest
- See Also:
-
putWatsonx
public CompletableFuture<PutWatsonxResponse> putWatsonx(PutWatsonxRequest request)
Create a Watsonx inference endpoint.
Create an inference endpoint to perform an inference task with the watsonxai service. You need an IBM Cloud Databases for Elasticsearch deployment to use the watsonxai inference service. You can provision one through the IBM catalog, the Cloud Databases CLI plug-in, the Cloud Databases API, or Terraform.
- See Also:
-
putWatsonx
public final CompletableFuture<PutWatsonxResponse> putWatsonx(Function<PutWatsonxRequest.Builder, ObjectBuilder<PutWatsonxRequest>> fn)
Create a Watsonx inference endpoint.
Create an inference endpoint to perform an inference task with the watsonxai service. You need an IBM Cloud Databases for Elasticsearch deployment to use the watsonxai inference service. You can provision one through the IBM catalog, the Cloud Databases CLI plug-in, the Cloud Databases API, or Terraform.
- Parameters:
fn - a function that initializes a builder to create the PutWatsonxRequest
- See Also:
-
rerank
public CompletableFuture<RerankResponse> rerank(RerankRequest request)
Perform reranking inference on the service
- See Also:
-
rerank
public final CompletableFuture<RerankResponse> rerank(Function<RerankRequest.Builder, ObjectBuilder<RerankRequest>> fn)
Perform reranking inference on the service
- Parameters:
fn - a function that initializes a builder to create the RerankRequest
- See Also:
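A fragment sketching a rerank call, reusing the assumed inferenceClient; the query/input builder names and the rerank()/index()/relevanceScore() accessors are assumptions based on the REST request and response shapes.

    inferenceClient.rerank(r -> r
            .inferenceId("my-rerank-endpoint")               // placeholder endpoint id
            .query("What is ELSER?")
            .input(List.of(
                "ELSER is Elastic's sparse retrieval model.",
                "Unrelated text about cooking."))
        )
        .thenAccept(resp -> resp.rerank()
            .forEach(doc -> System.out.println(doc.index() + " -> " + doc.relevanceScore())));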
-
sparseEmbedding
public CompletableFuture<SparseEmbeddingResponse> sparseEmbedding(SparseEmbeddingRequest request)
Perform sparse embedding inference on the service
- See Also:
-
sparseEmbedding
public final CompletableFuture<SparseEmbeddingResponse> sparseEmbedding(Function<SparseEmbeddingRequest.Builder, ObjectBuilder<SparseEmbeddingRequest>> fn)
Perform sparse embedding inference on the service
- Parameters:
fn - a function that initializes a builder to create the SparseEmbeddingRequest
- See Also:
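A fragment sketching sparse embedding against an assumed ELSER endpoint; the sparseEmbedding()/embedding() accessors are assumptions based on the REST response, which returns a token-to-weight map per input.

    inferenceClient.sparseEmbedding(s -> s
            .inferenceId("my-elser-endpoint")                // placeholder endpoint id
            .input("The quick brown fox")
        )
        .thenAccept(resp -> resp.sparseEmbedding()
            .forEach(e -> System.out.println(e.embedding())));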
-
streamCompletion
public CompletableFuture<BinaryResponse> streamCompletion(StreamCompletionRequest request)
Perform streaming inference. Get real-time responses for completion tasks by delivering answers incrementally, reducing response times during computation. This API works only with the completion task type.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
This API requires the monitor_inference cluster privilege (the built-in inference_admin and inference_user roles grant this privilege). You must use a client that supports streaming.
- See Also:
-
streamCompletion
public final CompletableFuture<BinaryResponse> streamCompletion(Function<StreamCompletionRequest.Builder, ObjectBuilder<StreamCompletionRequest>> fn)
Perform streaming inference. Get real-time responses for completion tasks by delivering answers incrementally, reducing response times during computation. This API works only with the completion task type.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
This API requires the monitor_inference cluster privilege (the built-in inference_admin and inference_user roles grant this privilege). You must use a client that supports streaming.
- Parameters:
fn - a function that initializes a builder to create the StreamCompletionRequest
- See Also:
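A fragment sketching consumption of the raw server-sent-events stream, reusing the assumed inferenceClient; inferenceId/input are assumptions mirroring the REST body, and BinaryResponse.content() exposes the underlying stream.

    inferenceClient.streamCompletion(s -> s
            .inferenceId("my-completion-endpoint")           // placeholder endpoint id
            .input("Write a haiku about search.")
        )
        .thenAccept(binary -> {
            try (BufferedReader events = new BufferedReader(
                    new InputStreamReader(binary.content(), StandardCharsets.UTF_8))) {
                events.lines().forEach(System.out::println); // one SSE line at a time
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });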
-
textEmbedding
public CompletableFuture<TextEmbeddingResponse> textEmbedding(TextEmbeddingRequest request)
Perform text embedding inference on the service
- See Also:
-
textEmbedding
public final CompletableFuture<TextEmbeddingResponse> textEmbedding(Function<TextEmbeddingRequest.Builder, ObjectBuilder<TextEmbeddingRequest>> fn)
Perform text embedding inference on the service
- Parameters:
fn - a function that initializes a builder to create the TextEmbeddingRequest
- See Also:
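A fragment sketching dense embedding of two inputs in one round trip; the textEmbedding()/embedding() accessors are assumptions based on the REST response shape.

    inferenceClient.textEmbedding(t -> t
            .inferenceId("my-text-embedding-endpoint")       // placeholder endpoint id
            .input(List.of("first passage", "second passage"))
        )
        .thenAccept(resp -> resp.textEmbedding()
            .forEach(e -> System.out.println(e.embedding().size() + " dimensions")));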
-
update
public CompletableFuture<UpdateInferenceResponse> update(UpdateInferenceRequest request)
Update an inference endpoint.
Modify task_settings, secrets (within service_settings), or num_allocations for an inference endpoint, depending on the specific endpoint service and task_type.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- See Also:
-
update
public final CompletableFuture<UpdateInferenceResponse> update(Function<UpdateInferenceRequest.Builder, ObjectBuilder<UpdateInferenceRequest>> fn)
Update an inference endpoint.
Modify task_settings, secrets (within service_settings), or num_allocations for an inference endpoint, depending on the specific endpoint service and task_type.
IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.
- Parameters:
fn - a function that initializes a builder to create the UpdateInferenceRequest
- See Also:
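A fragment sketching an allocation bump on an existing endpoint, reusing the assumed inferenceClient. The nested path (inferenceConfig, with serviceSettings carried as untyped JsonData) is an assumption about how the generated classes mirror the REST body; verify against your client version before use.

    import co.elastic.clients.json.JsonData;
    import java.util.Map;

    inferenceClient.update(u -> u
            .inferenceId("my-elser-endpoint")                // placeholder endpoint id
            .inferenceConfig(c -> c
                .serviceSettings(JsonData.of(Map.of("num_allocations", 2))))
        )
        .thenAccept(resp -> System.out.println("updated: " + resp.inferenceId()));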
-