Class PutHuggingFaceRequest
- All Implemented Interfaces:
JsonpSerializable

Create an inference endpoint to perform an inference task with the hugging_face service. Supported tasks include: text_embedding, completion, and chat_completion.
To configure the endpoint, first visit the Hugging Face Inference Endpoints page and create a new endpoint. Select a model that supports the task you intend to use.

For Elastic's text_embedding task: The selected model must support the Sentence Embeddings task. On the new endpoint creation page, select the Sentence Embeddings task under the Advanced Configuration section. After the endpoint has initialized, copy the generated endpoint URL. Recommended models for the text_embedding task:
- all-MiniLM-L6-v2
- all-MiniLM-L12-v2
- all-mpnet-base-v2
- e5-base-v2
- e5-small-v2
- multilingual-e5-base
- multilingual-e5-small
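Once the generated endpoint URL and a Hugging Face access token are available, the inference endpoint can be registered through the Java client. The snippet below is a sketch only: the endpoint id, URL, and token are placeholders, and the builder and enum names (`putHuggingFace`, `HuggingFaceTaskType.TextEmbedding`, `HuggingFaceServiceType.HuggingFace`, `apiKey`, `url`) are assumed from this request's fields rather than taken from a compiled example.

```java
// Sketch, not a verified example: ids, token, and URL are placeholders,
// and the builder/enum names are assumed from this request's fields.
ElasticsearchClient client = ...; // an already-configured client

client.inference().putHuggingFace(r -> r
    .taskType(HuggingFaceTaskType.TextEmbedding)
    .huggingfaceInferenceId("hugging-face-embeddings")
    .service(HuggingFaceServiceType.HuggingFace)
    .serviceSettings(s -> s
        .apiKey("<hugging-face-access-token>")   // token from your Hugging Face account
        .url("<generated-endpoint-url>")         // copied after the endpoint initialized
    )
);
```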
For Elastic's chat_completion and completion tasks: The selected model must support the Text Generation task and expose the OpenAI API. Hugging Face supports both serverless and dedicated endpoints for Text Generation. When creating a dedicated endpoint, select the Text Generation task. After the endpoint is initialized (for dedicated) or ready (for serverless), ensure that it supports the OpenAI API and that its URL includes the /v1/chat/completions path. Then copy the full endpoint URL for use. Recommended models for the chat_completion and completion tasks:
- Mistral-7B-Instruct-v0.2
- QwQ-32B
- Phi-3-mini-128k-instruct
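A chat_completion endpoint is registered the same way, with the task type changed and the copied URL containing the OpenAI-compatible path. As above, this is a sketch: identifiers and the URL are placeholders, and the builder and enum names are assumptions based on this request's fields.

```java
// Sketch only: values are placeholders; builder and enum names are assumed.
PutHuggingFaceRequest request = new PutHuggingFaceRequest.Builder()
    .taskType(HuggingFaceTaskType.ChatCompletion)
    .huggingfaceInferenceId("hugging-face-chat")
    .service(HuggingFaceServiceType.HuggingFace)
    .serviceSettings(s -> s
        .apiKey("<hugging-face-access-token>")
        // For chat_completion the URL must include the OpenAI-compatible path:
        .url("https://<dedicated-or-serverless-host>/v1/chat/completions")
    )
    .build();

client.inference().putHuggingFace(request);
```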
For Elastic's rerank task: The selected model must support the sentence-ranking task and expose the OpenAI API. Hugging Face so far supports only dedicated (not serverless) endpoints for Rerank. After the endpoint is initialized, copy the full endpoint URL for use. Tested models for the rerank task:
- bge-reranker-base
- jina-reranker-v1-turbo-en-GGUF
Nested Class Summary

Nested classes/interfaces inherited from class co.elastic.clients.elasticsearch._types.RequestBase:
RequestBase.AbstractBuilder<BuilderT extends RequestBase.AbstractBuilder<BuilderT>>
Field Summary

Fields:
- static final JsonpDeserializer<PutHuggingFaceRequest> _DESERIALIZER
  Json deserializer for PutHuggingFaceRequest
- static final Endpoint<PutHuggingFaceRequest, PutHuggingFaceResponse, ErrorResponse> _ENDPOINT
  Endpoint "inference.put_hugging_face".
Method Summary

- chunkingSettings()
  The chunking configuration object.
- final String huggingfaceInferenceId()
  Required - The unique identifier of the inference endpoint.
- static PutHuggingFaceRequest of(Function<PutHuggingFaceRequest.Builder, ObjectBuilder<PutHuggingFaceRequest>> fn)
- void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
  Serialize this object to JSON.
- protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
- final HuggingFaceServiceType service()
  Required - The type of service supported for the specified task type.
- serviceSettings()
  Required - Settings used to install the inference model.
- protected static void setupPutHuggingFaceRequestDeserializer(ObjectDeserializer<PutHuggingFaceRequest.Builder> op)
- final HuggingFaceTaskSettings taskSettings()
  Settings to configure the inference task.
- final HuggingFaceTaskType taskType()
  Required - The type of the inference task that the model will perform.
- final Time timeout()
  Specifies the amount of time to wait for the inference endpoint to be created.

Methods inherited from class co.elastic.clients.elasticsearch._types.RequestBase:
toString
Field Details

- _DESERIALIZER
  static final JsonpDeserializer<PutHuggingFaceRequest> _DESERIALIZER
  Json deserializer for PutHuggingFaceRequest
- _ENDPOINT
  static final Endpoint<PutHuggingFaceRequest, PutHuggingFaceResponse, ErrorResponse> _ENDPOINT
  Endpoint "inference.put_hugging_face".
Method Details
- of
  public static PutHuggingFaceRequest of(Function<PutHuggingFaceRequest.Builder, ObjectBuilder<PutHuggingFaceRequest>> fn)
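The `of` factory is the usual shorthand for constructing the request through its builder. The example below is a sketch with a hypothetical endpoint id and placeholder service settings; the nested builder names (`apiKey`, `url`) are assumptions.

```java
// Equivalent to new PutHuggingFaceRequest.Builder()...build();
// all values are placeholders, and serviceSettings field names are assumed.
PutHuggingFaceRequest request = PutHuggingFaceRequest.of(r -> r
    .taskType(HuggingFaceTaskType.TextEmbedding)
    .huggingfaceInferenceId("my-hugging-face-endpoint")
    .service(HuggingFaceServiceType.HuggingFace)
    .serviceSettings(s -> s
        .apiKey("<access-token>")
        .url("<endpoint-url>")
    )
);
```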
- chunkingSettings
  The chunking configuration object.
  API name: chunking_settings
- huggingfaceInferenceId
  Required - The unique identifier of the inference endpoint.
  API name: huggingface_inference_id
- service
  Required - The type of service supported for the specified task type. In this case, hugging_face.
  API name: service
- serviceSettings
  Required - Settings used to install the inference model. These settings are specific to the hugging_face service.
  API name: service_settings
- taskSettings
  Settings to configure the inference task. These settings are specific to the task type you specified.
  API name: task_settings
- taskType
  Required - The type of the inference task that the model will perform.
  API name: task_type
- timeout
  Specifies the amount of time to wait for the inference endpoint to be created.
  API name: timeout
- serialize
  Serialize this object to JSON.
  Specified by: serialize in interface JsonpSerializable
- serializeInternal
- setupPutHuggingFaceRequestDeserializer
  protected static void setupPutHuggingFaceRequestDeserializer(ObjectDeserializer<PutHuggingFaceRequest.Builder> op)