Class InferenceChunkingSettings
- All Implemented Interfaces:
JsonpSerializable
Nested Class Summary
Nested Classes -
Field Summary
Fields
Modifier and Type | Field | Description
static final JsonpDeserializer<InferenceChunkingSettings> | _DESERIALIZER | Json deserializer for InferenceChunkingSettings
-
Method Summary
Modifier and Type | Method | Description
final Integer | maxChunkSize() | The maximum size of a chunk in words.
static InferenceChunkingSettings | of(Function<InferenceChunkingSettings.Builder, ObjectBuilder<InferenceChunkingSettings>> fn) |
final Integer | overlap() | The number of overlapping words for chunks.
final Integer | sentenceOverlap() | The number of overlapping sentences for chunks.
final String | separatorGroup() | Only applicable to the recursive strategy and required when using it.
final List<String> | separators() | Only applicable to the recursive strategy and required when using it.
void | serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper) | Serialize this object to JSON.
protected void | serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper) |
protected static void | setupInferenceChunkingSettingsDeserializer(ObjectDeserializer<InferenceChunkingSettings.Builder> op) |
final String | strategy() | The chunking strategy: sentence, word, none or recursive.
String | toString() |
-
Field Details
-
_DESERIALIZER
Json deserializer for InferenceChunkingSettings
-
-
Method Details
-
of
public static InferenceChunkingSettings of(Function<InferenceChunkingSettings.Builder, ObjectBuilder<InferenceChunkingSettings>> fn) -
maxChunkSize
The maximum size of a chunk in words. This value cannot be lower than 20 (for sentence strategy) or 10 (for word strategy). This value should not exceed the window size for the associated model.
API name: max_chunk_size
-
overlap
The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.
API name: overlap
-
sentenceOverlap
The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.
API name: sentence_overlap
-
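The limits documented for max_chunk_size, overlap and sentence_overlap can be checked on the client side before a request is built. The following is a minimal sketch under stated assumptions: ChunkingLimits and its check method are hypothetical names and are not part of the client library, "half the max_chunk_size value" is interpreted without rounding, and settings that do not apply to the chosen strategy are reported rather than silently ignored. The server remains the authority on what is accepted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper mirroring the documented limits; not part of the client library. */
final class ChunkingLimits {
    private ChunkingLimits() {}

    /** Returns a description of each violated limit; the list is empty when none is violated. */
    static List<String> check(String strategy, Integer maxChunkSize,
                              Integer overlap, Integer sentenceOverlap) {
        List<String> problems = new ArrayList<>();

        if (maxChunkSize != null) {
            // Documented lower bounds: 20 for sentence, 10 for word.
            if ("sentence".equals(strategy) && maxChunkSize < 20) {
                problems.add("max_chunk_size cannot be lower than 20 for the sentence strategy");
            }
            if ("word".equals(strategy) && maxChunkSize < 10) {
                problems.add("max_chunk_size cannot be lower than 10 for the word strategy");
            }
        }

        if (overlap != null) {
            if (!"word".equals(strategy)) {
                problems.add("overlap is applicable only to the word strategy");
            } else if (maxChunkSize != null && overlap * 2 > maxChunkSize) {
                // Assumption: "half" is compared without rounding.
                problems.add("overlap cannot be higher than half of max_chunk_size");
            }
        }

        if (sentenceOverlap != null) {
            if (!"sentence".equals(strategy)) {
                problems.add("sentence_overlap is applicable only to the sentence strategy");
            } else if (sentenceOverlap != 0 && sentenceOverlap != 1) {
                problems.add("sentence_overlap can be either 1 or 0");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(check("word", 100, 20, null));
        System.out.println(check("word", 100, 60, null));
    }
}
```

Returning a list of messages rather than throwing on the first problem lets a caller report every violated limit at once.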
separatorGroup
Only applicable to the recursive strategy and required when using it.
Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.
Using this parameter is an alternative to manually specifying a custom separators list.
API name: separator_group
-
separators
Only applicable to the recursive strategy and required when using it.
A list of strings used as possible split points when chunking text.
Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.
After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.
API name: separators
-
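The split-then-recombine behaviour described for separators can be illustrated with a small standalone sketch. This is not the server's implementation: RecursiveChunkSketch is a hypothetical class, chunk size is counted in whitespace-separated words, every separator is treated as a regex, recombined pieces are joined with a single space (so the original separator text is dropped), and a piece that is still too large once the separators run out is emitted as-is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative sketch only; the real chunker runs server-side and may differ. */
final class RecursiveChunkSketch {
    private RecursiveChunkSketch() {}

    /** Counts whitespace-separated words. */
    static int wordCount(String s) {
        String t = s.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    /** Chunks text using the separators in list order, starting with the first. */
    static List<String> chunk(String text, List<String> separators, int maxWords) {
        return chunk(text, separators, 0, maxWords);
    }

    private static List<String> chunk(String text, List<String> separators,
                                      int level, int maxWords) {
        List<String> out = new ArrayList<>();
        // Stop when the text already fits or no separators remain.
        if (wordCount(text) <= maxWords || level >= separators.size()) {
            if (!text.trim().isEmpty()) {
                out.add(text.trim());
            }
            return out;
        }

        // Split on the current separator, then refine oversized pieces
        // with the next separator in the list.
        List<String> pieces = new ArrayList<>();
        for (String piece : Pattern.compile(separators.get(level)).split(text)) {
            pieces.addAll(chunk(piece, separators, level + 1, maxWords));
        }

        // Greedily recombine neighbouring pieces while the result stays within the limit.
        StringBuilder current = new StringBuilder();
        for (String piece : pieces) {
            if (current.length() > 0 && wordCount(current + " " + piece) > maxWords) {
                out.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(piece);
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> separators = List.of("\n\n", " ");
        System.out.println(chunk("a b c\n\nd e\n\nf g h i", separators, 5));
    }
}
```

In this sketch the paragraph separator is tried first, so paragraphs stay intact whenever they fit, and the single-space separator acts as a fallback for paragraphs that exceed the limit.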
strategy
The chunking strategy: sentence, word, none or recursive.
- If strategy is set to recursive, you must also specify:
  - max_chunk_size
  - either separators or separator_group
Learn more about different chunking strategies in the linked documentation.
API name: strategy
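The requirement attached to the recursive strategy can be expressed as a small pre-flight check. This is a sketch under stated assumptions: RecursiveStrategyCheck and isComplete are hypothetical names, not client API, and the check accepts settings that supply both separators and separator_group because the text above does not say whether the two may be combined.

```java
import java.util.List;

/** Hypothetical pre-flight check for the recursive strategy; not part of the client library. */
final class RecursiveStrategyCheck {
    private RecursiveStrategyCheck() {}

    /**
     * Returns true when the settings satisfy the documented rule: for the recursive
     * strategy, max_chunk_size plus either separators or separator_group must be set.
     * Other strategies are not constrained by this rule.
     */
    static boolean isComplete(String strategy, Integer maxChunkSize,
                              List<String> separators, String separatorGroup) {
        if (!"recursive".equals(strategy)) {
            return true;
        }
        boolean hasSeparators = separators != null && !separators.isEmpty();
        boolean hasGroup = separatorGroup != null;
        return maxChunkSize != null && (hasSeparators || hasGroup);
    }

    public static void main(String[] args) {
        System.out.println(isComplete("recursive", 200, null, "markdown"));
        System.out.println(isComplete("recursive", null, List.of("\n\n"), null));
    }
}
```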
-
serialize
Serialize this object to JSON.
- Specified by: serialize in interface JsonpSerializable
-
serializeInternal
-
toString
-
setupInferenceChunkingSettingsDeserializer
protected static void setupInferenceChunkingSettingsDeserializer(ObjectDeserializer<InferenceChunkingSettings.Builder> op)
-