Class LlamaServiceSettings

java.lang.Object
co.elastic.clients.elasticsearch.inference.LlamaServiceSettings
All Implemented Interfaces:
JsonpSerializable

@JsonpDeserializable public class LlamaServiceSettings extends Object implements JsonpSerializable

  • Method Details

    • of

    • url

      public final String url()
      Required - The URL of the Llama stack endpoint. The URL must contain:
      • For text_embedding task - /v1/inference/embeddings.
      • For completion and chat_completion tasks - /v1/openai/v1/chat/completions.

      API name: url
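
      The path requirement above can be checked before building the settings. The following is a minimal sketch, not part of the client API: the class LlamaUrlCheck, the method hasRequiredPath, and the host localhost:8321 are hypothetical names chosen for illustration. Only the task types and path fragments come from the description above.

```java
import java.util.Map;

final class LlamaUrlCheck {
    // Path fragment required for each task type, as listed in the url() description.
    static final Map<String, String> REQUIRED_PATH = Map.of(
        "text_embedding", "/v1/inference/embeddings",
        "completion", "/v1/openai/v1/chat/completions",
        "chat_completion", "/v1/openai/v1/chat/completions"
    );

    // Returns true when the URL contains the path required for the given task type.
    static boolean hasRequiredPath(String taskType, String url) {
        String path = REQUIRED_PATH.get(taskType);
        if (path == null) {
            throw new IllegalArgumentException("Unknown task type: " + taskType);
        }
        return url.contains(path);
    }

    public static void main(String[] args) {
        // localhost:8321 is a placeholder host, not a documented default.
        String embeddingsUrl = "http://localhost:8321/v1/inference/embeddings";
        System.out.println(hasRequiredPath("text_embedding", embeddingsUrl)); // true
        System.out.println(hasRequiredPath("chat_completion", embeddingsUrl)); // false
    }
}
```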

    • modelId

      public final String modelId()
      Required - The name of the model to use for the inference task. Refer to the Llama downloading models documentation for different ways of listing and downloading available models. The service has been tested and confirmed to work with the following models:
      • For text_embedding task - all-MiniLM-L6-v2.
      • For completion and chat_completion tasks - llama3.2:3b.

      API name: model_id

    • maxInputTokens

      @Nullable public final Integer maxInputTokens()
      For a text_embedding task, the maximum number of tokens per input before chunking occurs.

      API name: max_input_tokens
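
      To illustrate what a per-input token limit implies, the sketch below splits a token list into consecutive chunks no longer than the limit. It is an illustration only: chunking is performed by the service, its actual strategy may differ, and TokenChunkSketch and chunk are hypothetical names that are not part of the client.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenChunkSketch {
    // Splits tokens into consecutive chunks of at most maxInputTokens each.
    static List<List<String>> chunk(List<String> tokens, int maxInputTokens) {
        if (maxInputTokens <= 0) {
            throw new IllegalArgumentException("maxInputTokens must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start += maxInputTokens) {
            int end = Math.min(start + maxInputTokens, tokens.size());
            chunks.add(new ArrayList<>(tokens.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Five tokens with a limit of two yield three chunks.
        System.out.println(chunk(List.of("a", "b", "c", "d", "e"), 2)); // [[a, b], [c, d], [e]]
    }
}
```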

    • similarity

      @Nullable public final LlamaSimilarityType similarity()
      For a text_embedding task, the similarity measure. One of cosine, dot_product, l2_norm.

      API name: similarity
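
      As a reminder of what the three measures compute, here is a plain-Java sketch using the standard definitions (dot product, cosine of the angle between vectors, Euclidean distance). SimilaritySketch and its methods are hypothetical helpers and not part of the client. How the service turns these values into scores is not described here.

```java
final class SimilaritySketch {
    // Sum of element-wise products of two equal-length vectors.
    static double dotProduct(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Cosine of the angle between a and b; both must be non-zero vectors.
    static double cosine(double[] a, double[] b) {
        return dotProduct(a, b) / (Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b)));
    }

    // Euclidean (L2) distance between a and b.
    static double l2Norm(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[] x = {1.0, 0.0};
        double[] y = {0.0, 1.0};
        System.out.println(dotProduct(x, y)); // 0.0 for orthogonal vectors
        System.out.println(cosine(x, y));     // 0.0 for orthogonal vectors
        System.out.println(l2Norm(x, y));     // square root of 2
    }
}
```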

    • rateLimit

      @Nullable public final RateLimitSetting rateLimit()
      This setting helps to minimize the number of rate limit errors returned from the Llama API. By default, the llama service sets the number of requests allowed per minute to 3000.

      API name: rate_limit
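
      For a sense of scale, the default of 3000 requests per minute corresponds to an average of one request every 20 milliseconds. The arithmetic is sketched below. RateLimitSketch and averageIntervalMillis are hypothetical names used only for this illustration, and the sketch says nothing about how the service enforces the limit.

```java
final class RateLimitSketch {
    // Default stated in the rateLimit() description.
    static final int DEFAULT_REQUESTS_PER_MINUTE = 3000;

    // Average spacing between requests, in milliseconds, at the given rate.
    static double averageIntervalMillis(int requestsPerMinute) {
        if (requestsPerMinute <= 0) {
            throw new IllegalArgumentException("requestsPerMinute must be positive");
        }
        return 60_000.0 / requestsPerMinute;
    }

    public static void main(String[] args) {
        System.out.println(averageIntervalMillis(DEFAULT_REQUESTS_PER_MINUTE)); // 20.0
    }
}
```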

    • serialize

      public void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
      Serialize this object to JSON.
      Specified by:
      serialize in interface JsonpSerializable
    • serializeInternal

      protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • setupLlamaServiceSettingsDeserializer

      protected static void setupLlamaServiceSettingsDeserializer(ObjectDeserializer<LlamaServiceSettings.Builder> op)