Class LlamaServiceSettings
java.lang.Object
co.elastic.clients.elasticsearch.inference.LlamaServiceSettings
- All Implemented Interfaces:
JsonpSerializable
Nested Class Summary
Nested Classes
static class LlamaServiceSettings.Builder
Field Summary
Fields
static final JsonpDeserializer&lt;LlamaServiceSettings&gt; _DESERIALIZER
Json deserializer for LlamaServiceSettings
Method Summary
final Integer maxInputTokens()
For a text_embedding task, the maximum number of tokens per input before chunking occurs.
final String modelId()
Required - The name of the model to use for the inference task.
static LlamaServiceSettings of(Function&lt;LlamaServiceSettings.Builder, ObjectBuilder&lt;LlamaServiceSettings&gt;&gt; fn)
final RateLimitSetting rateLimit()
This setting helps to minimize the number of rate limit errors returned from the Llama API.
void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
Serialize this object to JSON.
protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
protected static void setupLlamaServiceSettingsDeserializer(ObjectDeserializer&lt;LlamaServiceSettings.Builder&gt; op)
final LlamaSimilarityType similarity()
For a text_embedding task, the similarity measure.
String toString()
final String url()
Required - The URL of the Llama stack endpoint.
Field Details
_DESERIALIZER
Json deserializer for LlamaServiceSettings
Method Details
of
public static LlamaServiceSettings of(Function&lt;LlamaServiceSettings.Builder, ObjectBuilder&lt;LlamaServiceSettings&gt;&gt; fn)
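A minimal usage sketch, assuming the builder exposes setters named after the getters documented below (url, modelId); the host and port are placeholders:

import co.elastic.clients.elasticsearch.inference.LlamaServiceSettings;

// Settings for a text_embedding endpoint; the URL must end with the
// embeddings path described under url() below.
LlamaServiceSettings settings = LlamaServiceSettings.of(b -> b
    .url("http://localhost:8321/v1/inference/embeddings")
    .modelId("all-MiniLM-L6-v2")
);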
url
Required - The URL of the Llama stack endpoint. The URL must contain:
- For the text_embedding task: /v1/inference/embeddings
- For the completion and chat_completion tasks: /v1/openai/v1/chat/completions
API name: url
modelId
Required - The name of the model to use for the inference task. Refer to the Llama documentation on downloading models for the different ways to list available models and download them. The service has been tested and confirmed to work with the following models:
- For the text_embedding task: all-MiniLM-L6-v2
- For the completion and chat_completion tasks: llama3.2:3b
API name: model_id
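For comparison, a sketch of settings for the completion and chat_completion tasks, using the tested model listed above; the host and port are again placeholders and the builder setter names are assumed:

// Settings for completion / chat_completion tasks.
LlamaServiceSettings chatSettings = LlamaServiceSettings.of(b -> b
    .url("http://localhost:8321/v1/openai/v1/chat/completions")
    .modelId("llama3.2:3b")
);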
maxInputTokens
For a text_embedding task, the maximum number of tokens per input before chunking occurs.
API name: max_input_tokens
similarity
For a text_embedding task, the similarity measure. One of cosine, dot_product, l2_norm.
API name: similarity
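A sketch combining the optional text_embedding settings; the setter names and the LlamaSimilarityType constant (Cosine) are assumptions, not taken from this page:

import co.elastic.clients.elasticsearch.inference.LlamaSimilarityType;

// Optional embedding settings: chunk long inputs and pick a similarity measure.
LlamaServiceSettings embeddingSettings = LlamaServiceSettings.of(b -> b
    .url("http://localhost:8321/v1/inference/embeddings")
    .modelId("all-MiniLM-L6-v2")
    .maxInputTokens(512)                     // chunk inputs longer than 512 tokens
    .similarity(LlamaSimilarityType.Cosine)  // assumed constant name
);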
rateLimit
This setting helps to minimize the number of rate limit errors returned from the Llama API. By default, the llama service sets the number of requests allowed per minute to 3000.
API name: rate_limit
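A sketch overriding the default limit; it assumes the builder accepts a lambda for rate_limit and that RateLimitSetting exposes a requestsPerMinute setter:

// Lower the allowed request rate to 1000 requests per minute.
LlamaServiceSettings limited = LlamaServiceSettings.of(b -> b
    .url("http://localhost:8321/v1/inference/embeddings")
    .modelId("all-MiniLM-L6-v2")
    .rateLimit(r -> r.requestsPerMinute(1000))  // assumed setter name
);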
serialize
Serialize this object to JSON.
Specified by:
serialize in interface JsonpSerializable
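A sketch of writing the settings out as a JSON string; it assumes JacksonJsonpMapper is on the classpath and that the generator is created through the mapper's JSON provider:

import java.io.StringWriter;
import co.elastic.clients.json.JsonpMapper;
import co.elastic.clients.json.jackson.JacksonJsonpMapper;
import jakarta.json.stream.JsonGenerator;

JsonpMapper mapper = new JacksonJsonpMapper();
StringWriter writer = new StringWriter();
JsonGenerator generator = mapper.jsonProvider().createGenerator(writer);
settings.serialize(generator, mapper);   // settings built as in the of() example above
generator.close();
String json = writer.toString();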
serializeInternal
protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
toString
public String toString()
setupLlamaServiceSettingsDeserializer
protected static void setupLlamaServiceSettingsDeserializer(ObjectDeserializer<LlamaServiceSettings.Builder> op)