Class LlamaServiceSettings.Builder
java.lang.Object
  co.elastic.clients.util.ObjectBuilderBase
    co.elastic.clients.util.WithJsonObjectBuilderBase<LlamaServiceSettings.Builder>
      co.elastic.clients.elasticsearch.inference.LlamaServiceSettings.Builder
- All Implemented Interfaces:
- WithJson<LlamaServiceSettings.Builder>, ObjectBuilder<LlamaServiceSettings>
- Enclosing class:
- LlamaServiceSettings
public static class LlamaServiceSettings.Builder
extends WithJsonObjectBuilderBase<LlamaServiceSettings.Builder>
implements ObjectBuilder<LlamaServiceSettings>
Builder for LlamaServiceSettings.
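The builder is used fluently: set the required url and modelId, then call build() to obtain a LlamaServiceSettings instance. A minimal sketch, assuming the usual String setters for url and modelId; the host and port are illustrative, not prescribed by this class:

import co.elastic.clients.elasticsearch.inference.LlamaServiceSettings;

public class LlamaServiceSettingsExample {
    public static void main(String[] args) {
        // Required fields only. The address is illustrative; the path follows
        // the url documentation for completion/chat_completion tasks.
        LlamaServiceSettings settings = new LlamaServiceSettings.Builder()
                .url("http://localhost:8321/v1/openai/v1/chat/completions")
                .modelId("llama3.2:3b")
                .build();

        System.out.println(settings);
    }
}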
Constructor Summary

Constructors
- Builder()
Method Summary

- LlamaServiceSettings build()
  Builds a LlamaServiceSettings.
- LlamaServiceSettings.Builder maxInputTokens(Integer value)
  For a text_embedding task, the maximum number of tokens per input before chunking occurs.
- LlamaServiceSettings.Builder modelId(String value)
  Required - The name of the model to use for the inference task.
- LlamaServiceSettings.Builder rateLimit(RateLimitSetting value)
  This setting helps to minimize the number of rate limit errors returned from the Llama API.
- LlamaServiceSettings.Builder rateLimit(Function<RateLimitSetting.Builder, ObjectBuilder<RateLimitSetting>> fn)
  This setting helps to minimize the number of rate limit errors returned from the Llama API.
- protected LlamaServiceSettings.Builder self()
- LlamaServiceSettings.Builder similarity(LlamaSimilarityType value)
  For a text_embedding task, the similarity measure.
- LlamaServiceSettings.Builder url(String value)
  Required - The URL of the Llama stack endpoint.

Methods inherited from class co.elastic.clients.util.WithJsonObjectBuilderBase
- withJson

Methods inherited from class co.elastic.clients.util.ObjectBuilderBase
- _checkSingleUse, _listAdd, _listAddAll, _mapPut, _mapPutAll

Constructor Details

Builder
public Builder()

Method Details

url
Required - The URL of the Llama stack endpoint. The URL must contain:
- For the text_embedding task: /v1/inference/embeddings
- For the completion and chat_completion tasks: /v1/openai/v1/chat/completions

API name: url
modelId
Required - The name of the model to use for the inference task. Refer to the Llama downloading models documentation for the different ways of getting a list of available models and downloading them. The service has been tested and confirmed to work with the following models:
- For the text_embedding task: all-MiniLM-L6-v2
- For the completion and chat_completion tasks: llama3.2:3b

API name: model_id
maxInputTokens
For a text_embedding task, the maximum number of tokens per input before chunking occurs.

API name: max_input_tokens
similarity
For a text_embedding task, the similarity measure. One of cosine, dot_product, or l2_norm.

API name: similarity
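Taken together, the settings above cover a text_embedding configuration. A sketch under stated assumptions: the String setters for url and modelId, the enum constant name LlamaSimilarityType.Cosine, and the local endpoint address are assumptions; the path and model name follow the url and modelId documentation:

import co.elastic.clients.elasticsearch.inference.LlamaServiceSettings;
import co.elastic.clients.elasticsearch.inference.LlamaSimilarityType;

public class LlamaEmbeddingSettingsExample {
    public static void main(String[] args) {
        LlamaServiceSettings settings = new LlamaServiceSettings.Builder()
                // text_embedding requires the /v1/inference/embeddings path (host is illustrative)
                .url("http://localhost:8321/v1/inference/embeddings")
                .modelId("all-MiniLM-L6-v2")
                // Optional: chunk inputs longer than 512 tokens
                .maxInputTokens(512)
                // Optional: similarity measure; the constant name is an assumption
                .similarity(LlamaSimilarityType.Cosine)
                .build();
    }
}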
rateLimit
This setting helps to minimize the number of rate limit errors returned from the Llama API. By default, the llama service sets the number of requests allowed per minute to 3000.

API name: rate_limit
rateLimit
public final LlamaServiceSettings.Builder rateLimit(Function<RateLimitSetting.Builder, ObjectBuilder<RateLimitSetting>> fn)
This setting helps to minimize the number of rate limit errors returned from the Llama API. By default, the llama service sets the number of requests allowed per minute to 3000.

API name: rate_limit
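The Function overload builds the nested RateLimitSetting inline with a lambda instead of constructing it separately. A sketch, assuming RateLimitSetting.Builder exposes a requestsPerMinute setter matching the per-minute limit described above:

import co.elastic.clients.elasticsearch.inference.LlamaServiceSettings;

public class LlamaRateLimitExample {
    public static void main(String[] args) {
        LlamaServiceSettings settings = new LlamaServiceSettings.Builder()
                .url("http://localhost:8321/v1/openai/v1/chat/completions")
                .modelId("llama3.2:3b")
                // Inline builder for the nested rate_limit object; the
                // requestsPerMinute setter name is an assumption.
                .rateLimit(r -> r.requestsPerMinute(3000))
                .build();
    }
}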
self
protected LlamaServiceSettings.Builder self()
Specified by:
- self in class WithJsonObjectBuilderBase<LlamaServiceSettings.Builder>

build
public LlamaServiceSettings build()
Builds a LlamaServiceSettings.
Specified by:
- build in interface ObjectBuilder<LlamaServiceSettings>
Throws:
- NullPointerException - if some of the required fields are null.
 
 
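Since url and model_id are required, a partially configured builder fails at build() rather than when a setter is skipped. A small sketch of that behavior, based on the NullPointerException documented above (the exception message contents are up to the client and not specified here):

import co.elastic.clients.elasticsearch.inference.LlamaServiceSettings;

public class LlamaBuildCheckExample {
    public static void main(String[] args) {
        try {
            // modelId is deliberately left unset, so build() is expected to
            // throw NullPointerException for the missing required field.
            new LlamaServiceSettings.Builder()
                    .url("http://localhost:8321/v1/openai/v1/chat/completions")
                    .build();
        } catch (NullPointerException e) {
            System.out.println("Required field missing: " + e.getMessage());
        }
    }
}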