Class HuggingFaceServiceSettings

java.lang.Object
co.elastic.clients.elasticsearch.inference.HuggingFaceServiceSettings
All Implemented Interfaces:
JsonpSerializable

@JsonpDeserializable public class HuggingFaceServiceSettings extends Object implements JsonpSerializable
  • Field Details

  • Method Details

    • of

    • apiKey

      public final String apiKey()
      Required - A valid access token for your Hugging Face account. You can create or find your access tokens on the Hugging Face settings page.

      IMPORTANT: You need to provide the API key only once, when you create the inference model. The get inference endpoint API does not return your API key. After creating the inference model, you cannot change the associated API key. To use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      API name: api_key
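
The delete-and-recreate step described above can be sketched with the JDK's own HTTP types. This is a minimal sketch, not part of this class: the class name RotateApiKeySketch, the rotate helper, and all argument values are hypothetical, and it assumes the inference endpoint is managed through DELETE and PUT requests on _inference/<task_type>/<inference_id>. It only builds the two requests; sending them and authenticating against Elasticsearch are left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

// Hypothetical helper: builds (but does not send) the two requests needed
// to replace the API key of an existing hugging_face inference endpoint.
final class RotateApiKeySketch {

    static List<HttpRequest> rotate(String esBase, String taskType, String inferenceId,
                                    String huggingFaceUrl, String newApiKey) {
        // Assumed REST path: /_inference/<task_type>/<inference_id>
        URI endpoint = URI.create(esBase + "/_inference/" + taskType + "/" + inferenceId);

        // Naive JSON assembly: assumes the values contain no characters that need escaping.
        String body = "{\"service\":\"hugging_face\",\"service_settings\":{"
                + "\"api_key\":\"" + newApiKey + "\","
                + "\"url\":\"" + huggingFaceUrl + "\"}}";

        // Step 1: delete the existing inference endpoint.
        HttpRequest delete = HttpRequest.newBuilder(endpoint).DELETE().build();

        // Step 2: recreate it under the same name with the updated key.
        HttpRequest create = HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();

        return List.of(delete, create);
    }
}
```

The requests are returned in the order they would need to be sent: the PUT reuses the same inference ID only after the DELETE has succeeded.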

    • rateLimit

      @Nullable public final RateLimitSetting rateLimit()
      This setting helps to minimize the number of rate limit errors returned from Hugging Face. By default, the hugging_face service sets the number of requests allowed per minute to 3000 for all supported tasks. Hugging Face does not publish a universal rate limit, so actual limits may vary. Adjust this value based on the capacity and limits of your specific deployment environment.

      API name: rate_limit
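
To get a feel for what a requests-per-minute value means in practice, the arithmetic can be sketched as below. The class RateLimitArithmetic and its averageSpacing method are hypothetical and only illustrate the conversion; they say nothing about how the hugging_face service actually enforces the limit.

```java
import java.time.Duration;

// Hypothetical helper: converts a requests-per-minute budget into the
// average spacing between requests if they were spread evenly.
final class RateLimitArithmetic {

    static Duration averageSpacing(int requestsPerMinute) {
        if (requestsPerMinute <= 0) {
            throw new IllegalArgumentException("requestsPerMinute must be positive");
        }
        return Duration.ofMinutes(1).dividedBy(requestsPerMinute);
    }

    public static void main(String[] args) {
        // The default of 3000 requests per minute is 50 per second,
        // or one request every 20 ms on average.
        System.out.println(averageSpacing(3000).toMillis() + " ms");
    }
}
```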

    • url

      public final String url()
      Required - The URL endpoint to use for the requests.

      For completion and chat_completion tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (see the linked external documentation for details), and the endpoint URL for the request must include /v1/chat/completions.

      If the model supports the OpenAI Chat Completion schema, a toggle should appear in the interface. Enabling this toggle doesn't change any model behavior; it only reveals the full endpoint URL (which should include /v1/chat/completions) needed when configuring the inference endpoint in Elasticsearch. If the model doesn't support this schema, the toggle may not be shown.

      API name: url
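
The URL rule for completion and chat_completion tasks can be checked before a request is ever sent. The following is a minimal sketch under stated assumptions: the class HuggingFaceUrlCheck and its isUrlUsable method are hypothetical and not part of the client, and the example host names are placeholders.

```java
import java.net.URI;
import java.util.Set;

// Hypothetical pre-flight check for the url setting.
final class HuggingFaceUrlCheck {

    private static final String CHAT_PATH = "/v1/chat/completions";
    private static final Set<String> CHAT_TASKS = Set.of("completion", "chat_completion");

    static boolean isUrlUsable(String taskType, String url) {
        if (url == null) {
            return false; // url is a required setting
        }
        URI uri;
        try {
            uri = URI.create(url);
        } catch (IllegalArgumentException e) {
            return false; // not a syntactically valid URI
        }
        if (uri.getScheme() == null || uri.getHost() == null) {
            return false; // expect an absolute URL such as https://host/...
        }
        if (!CHAT_TASKS.contains(taskType)) {
            return true; // the path rule only applies to chat-style tasks
        }
        String path = uri.getPath();
        return path != null && path.contains(CHAT_PATH);
    }

    public static void main(String[] args) {
        // Placeholder host; only the path matters for this check.
        System.out.println(isUrlUsable("chat_completion",
                "https://example.invalid/v1/chat/completions"));
        System.out.println(isUrlUsable("chat_completion", "https://example.invalid"));
    }
}
```

The check is deliberately loose (a substring match on the path), mirroring the wording that the URL "must include" the chat completions path.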

    • modelId

      @Nullable public final String modelId()
      The name of the Hugging Face model to use for the inference task. For completion and chat_completion tasks, this field is optional but may be required for certain models, particularly when using serverless inference endpoints. For the text_embedding task, do not include this field; otherwise, the request will fail.

      API name: model_id
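
The task-dependent rule for model_id can be expressed as a small validation step. This is a sketch only: ModelIdRule and check are hypothetical names, and the helper covers just the one hard rule stated above (no model_id for text_embedding). Whether a particular completion or chat_completion model needs a model_id cannot be decided locally, so those cases are passed through.

```java
import java.util.Optional;

// Hypothetical validator for the model_id setting.
final class ModelIdRule {

    /** Returns an error message, or an empty Optional if no rule is violated. */
    static Optional<String> check(String taskType, String modelId) {
        boolean hasModelId = modelId != null && !modelId.isBlank();
        if ("text_embedding".equals(taskType) && hasModelId) {
            return Optional.of("model_id must be omitted for text_embedding");
        }
        // completion / chat_completion: optional here, though some models
        // (for example on serverless endpoints) may still require it.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "some-org/some-model" is a placeholder, not a real model name.
        System.out.println(check("text_embedding", "some-org/some-model").isPresent());
        System.out.println(check("chat_completion", "some-org/some-model").isPresent());
    }
}
```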

    • serialize

      public void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
      Serialize this object to JSON.
      Specified by:
      serialize in interface JsonpSerializable
    • serializeInternal

      protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • setupHuggingFaceServiceSettingsDeserializer

      protected static void setupHuggingFaceServiceSettingsDeserializer(ObjectDeserializer<HuggingFaceServiceSettings.Builder> op)