Class PutAzureopenaiRequest

java.lang.Object
co.elastic.clients.elasticsearch._types.RequestBase
co.elastic.clients.elasticsearch.inference.PutAzureopenaiRequest
All Implemented Interfaces:
JsonpSerializable

@JsonpDeserializable public class PutAzureopenaiRequest extends RequestBase implements JsonpSerializable
Create an Azure OpenAI inference endpoint.

Create an inference endpoint to perform an inference task with the azureopenai service.

The list of chat completion models that you can choose from in your Azure OpenAI deployment include:

The list of embeddings models that you can choose from in your deployment can be found in the Azure models documentation.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

See Also: