Class HuggingFaceServiceSettings.Builder

All Implemented Interfaces:
WithJson<HuggingFaceServiceSettings.Builder>, ObjectBuilder<HuggingFaceServiceSettings>
Enclosing class:
HuggingFaceServiceSettings

public static class HuggingFaceServiceSettings.Builder extends WithJsonObjectBuilderBase<HuggingFaceServiceSettings.Builder> implements ObjectBuilder<HuggingFaceServiceSettings>
  • Constructor Details

    • Builder

      public Builder()
  • Method Details

    • apiKey

      public final HuggingFaceServiceSettings.Builder apiKey(String value)
      Required - A valid access token for your Hugging Face account. You can create or find your access tokens on the Hugging Face settings page.

      IMPORTANT: You need to provide the API key only once, when you create the inference model. The get inference endpoint API does not return your API key. After creating the inference model, you cannot change the associated API key. To use a different API key, delete the inference model and recreate it with the same name and the new API key.

      API name: api_key

    • rateLimit

      public final HuggingFaceServiceSettings.Builder rateLimit(@Nullable RateLimitSetting value)
      This setting helps minimize the number of rate limit errors returned from Hugging Face. By default, the hugging_face service allows 3000 requests per minute for all supported tasks. Hugging Face does not publish a universal rate limit, so actual limits may vary. Adjust this value based on the capacity and limits of your specific deployment environment.

      API name: rate_limit
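
      To make the default concrete, the arithmetic behind a requests-per-minute budget can be sketched as follows. RateLimitMath and averageSpacing are hypothetical names introduced only for illustration; they are not part of the Elasticsearch Java client, and the sketch assumes requests are spread evenly over the minute.

```java
import java.time.Duration;

// Hypothetical helper for illustration; not part of the Elasticsearch Java client.
final class RateLimitMath {
    private RateLimitMath() {}

    /** Average gap between requests that a requests-per-minute budget allows. */
    static Duration averageSpacing(int requestsPerMinute) {
        if (requestsPerMinute <= 0) {
            throw new IllegalArgumentException("requestsPerMinute must be positive");
        }
        // One minute divided evenly across the allowed number of requests.
        return Duration.ofMinutes(1).dividedBy(requestsPerMinute);
    }

    public static void main(String[] args) {
        // Documented default for the hugging_face service: 3000 requests per minute.
        System.out.println(averageSpacing(3000).toMillis() + " ms"); // prints "20 ms"
    }
}
```

      A deployment that cannot sustain roughly one request every 20 ms is a candidate for a lower value.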

    • rateLimit

      This setting helps minimize the number of rate limit errors returned from Hugging Face. By default, the hugging_face service allows 3000 requests per minute for all supported tasks. Hugging Face does not publish a universal rate limit, so actual limits may vary. Adjust this value based on the capacity and limits of your specific deployment environment.

      API name: rate_limit

    • url

      public final HuggingFaceServiceSettings.Builder url(String value)
      Required - The URL endpoint to use for the requests. For completion and chat_completion tasks, the deployed model must be compatible with the Hugging Face Chat Completion interface (see the linked external documentation for details), and the endpoint URL must include /v1/chat/completions. If the model supports the OpenAI Chat Completion schema, a toggle should appear in the interface. Enabling this toggle doesn't change any model behavior; it only reveals the full endpoint URL (which should include /v1/chat/completions) that you need when configuring the inference endpoint in Elasticsearch. If the model doesn't support this schema, the toggle may not be shown.

      API name: url
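
      The path rule above can be checked before the value is passed to url(String). The following is a minimal sketch, not client code: HuggingFaceUrlCheck and isUrlAcceptable are hypothetical names, the task type is assumed to be available as a plain string, and example.invalid is a placeholder host.

```java
import java.net.URI;
import java.util.Set;

// Hypothetical helper for illustration; not part of the Elasticsearch Java client.
final class HuggingFaceUrlCheck {
    // Task types for which the documentation requires the chat completions path.
    private static final Set<String> CHAT_TASKS = Set.of("completion", "chat_completion");
    private static final String CHAT_PATH = "/v1/chat/completions";

    private HuggingFaceUrlCheck() {}

    /**
     * Returns true if the URL satisfies the documented path rule for the task type.
     * URI.create throws IllegalArgumentException for a syntactically invalid URL.
     */
    static boolean isUrlAcceptable(String taskType, String url) {
        if (url == null || url.isBlank()) {
            return false; // url is a required field
        }
        if (taskType == null || !CHAT_TASKS.contains(taskType)) {
            return true; // no path rule is documented for other task types
        }
        String path = URI.create(url).getPath();
        return path != null && path.contains(CHAT_PATH);
    }

    public static void main(String[] args) {
        System.out.println(isUrlAcceptable("chat_completion",
                "https://example.invalid/v1/chat/completions"));
        System.out.println(isUrlAcceptable("chat_completion",
                "https://example.invalid/"));
    }
}
```

      The check only inspects the URL path. It cannot confirm that the deployed model actually implements the Chat Completion interface.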

    • modelId

      public final HuggingFaceServiceSettings.Builder modelId(@Nullable String value)
      The name of the Hugging Face model to use for the inference task. For completion and chat_completion tasks, this field is optional but may be required for certain models, particularly when using serverless inference endpoints. For the text_embedding task, do not include this field; otherwise, the request will fail.

      API name: model_id
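
      Because a model_id on a text_embedding request causes the request to fail, it can help to catch the combination before calling modelId(String). This is a minimal sketch under that assumption; HuggingFaceModelIdCheck and validate are hypothetical names and are not part of the Elasticsearch Java client.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of the Elasticsearch Java client.
final class HuggingFaceModelIdCheck {
    private HuggingFaceModelIdCheck() {}

    /**
     * Returns an error message when the documented model_id rule is violated,
     * or an empty Optional when the combination is allowed.
     */
    static Optional<String> validate(String taskType, String modelId) {
        // Any non-null value counts as "included", since it would be sent in the request.
        if ("text_embedding".equals(taskType) && modelId != null) {
            return Optional.of("model_id must be omitted for the text_embedding task");
        }
        // For completion and chat_completion the field is optional, so both cases pass.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // "some-model" is a placeholder, not a real model name.
        System.out.println(validate("text_embedding", "some-model").isPresent());
        System.out.println(validate("chat_completion", "some-model").isPresent());
    }
}
```

      The sketch does not detect the opposite case, where a particular model requires model_id for completion or chat_completion, because the documentation does not say which models those are.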

    • self

      Specified by:
      self in class WithJsonObjectBuilderBase<HuggingFaceServiceSettings.Builder>
    • build

      Specified by:
      build in interface ObjectBuilder<HuggingFaceServiceSettings>
      Throws:
      NullPointerException - if any of the required fields is null.
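
      Since build() reports missing required fields through a NullPointerException, a caller may prefer to find out which field is missing beforehand. The sketch below assumes the two fields marked Required on this page (api_key and url) are the only required ones; HuggingFaceRequiredFields and missing are hypothetical names and are not part of the Elasticsearch Java client.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the Elasticsearch Java client.
final class HuggingFaceRequiredFields {
    private HuggingFaceRequiredFields() {}

    /** Returns the API names of required settings that are still null. */
    static List<String> missing(String apiKey, String url) {
        List<String> missing = new ArrayList<>();
        if (apiKey == null) {
            missing.add("api_key");
        }
        if (url == null) {
            missing.add("url");
        }
        return missing;
    }

    public static void main(String[] args) {
        // example.invalid is a placeholder host.
        System.out.println(missing(null, "https://example.invalid/")); // prints "[api_key]"
    }
}
```

      An empty list means both values are present, so a subsequent build() should not fail on these two fields.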