Class InferenceChunkingSettings
- All Implemented Interfaces:
JsonpSerializable
-
Nested Class Summary
Nested Classes: InferenceChunkingSettings.Builder
-
Field Summary
Fields
static final JsonpDeserializer<InferenceChunkingSettings> _DESERIALIZER
Json deserializer for InferenceChunkingSettings
-
Method Summary
final Integer maxChunkSize()
The maximum size of a chunk in words.
static InferenceChunkingSettings of(Function<InferenceChunkingSettings.Builder, ObjectBuilder<InferenceChunkingSettings>> fn)
final Integer overlap()
The number of overlapping words for chunks.
final Integer sentenceOverlap()
The number of overlapping sentences for chunks.
final String separatorGroup()
Required - This parameter is only applicable when using the recursive chunking strategy.
separators()
Required - A list of strings used as possible split points when chunking text with the recursive strategy.
void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
Serialize this object to JSON.
protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
protected static void setupInferenceChunkingSettingsDeserializer(ObjectDeserializer<InferenceChunkingSettings.Builder> op)
final String strategy()
The chunking strategy: sentence, word, none or recursive.
String toString()
-
Field Details
-
_DESERIALIZER
Json deserializer for InferenceChunkingSettings
-
-
Method Details
-
of
public static InferenceChunkingSettings of(Function<InferenceChunkingSettings.Builder, ObjectBuilder<InferenceChunkingSettings>> fn)
-
maxChunkSize
The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for the sentence strategy) or 10 (for the word strategy).
API name: max_chunk_size
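The documented bounds can be restated as a small client-side check. This is only a sketch: the class and method names are hypothetical and not part of the client library, and the description above states bounds only for the sentence and word strategies, so any other strategy is rejected here rather than guessed at.

```java
// Hypothetical helper, not part of the Elasticsearch Java client.
final class MaxChunkSizeBounds {
    private MaxChunkSizeBounds() {}

    /**
     * Returns true if maxChunkSize lies within the range documented for the
     * given strategy: at most 300, and at least 20 (sentence) or 10 (word).
     */
    static boolean isWithinDocumentedBounds(String strategy, int maxChunkSize) {
        int lower;
        switch (strategy) {
            case "sentence":
                lower = 20;
                break;
            case "word":
                lower = 10;
                break;
            default:
                // No lower bound is documented here for other strategies.
                throw new IllegalArgumentException(
                        "no documented lower bound for strategy: " + strategy);
        }
        return maxChunkSize >= lower && maxChunkSize <= 300;
    }
}
```

For example, a value of 10 passes for the word strategy but fails for the sentence strategy.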
-
overlap
The number of overlapping words for chunks. It applies only to the word chunking strategy. This value cannot be higher than half the max_chunk_size value.
API name: overlap
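To illustrate what an overlap of words between consecutive chunks means, here is a minimal sliding-window sketch. It is an assumption-laden illustration, not the algorithm Elasticsearch runs: the class name is hypothetical, words are taken to be whitespace-separated tokens, and each chunk simply starts (maxChunkSize - overlap) words after the previous one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of word overlap; not the server-side implementation.
final class WordOverlapSketch {
    private WordOverlapSketch() {}

    static List<String> chunk(String text, int maxChunkSize, int overlap) {
        // Mirrors the documented rule: overlap may not exceed half of max_chunk_size.
        if (maxChunkSize < 1 || overlap < 0 || overlap * 2 > maxChunkSize) {
            throw new IllegalArgumentException("overlap must be between 0 and maxChunkSize / 2");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = maxChunkSize - overlap; // always >= 1 given the check above
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + maxChunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }
}
```

With the text "a b c d e f g", a chunk size of 4 and an overlap of 2, this sketch yields "a b c d", "c d e f" and "e f g": each chunk repeats the last two words of the previous one.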
-
sentenceOverlap
The number of overlapping sentences for chunks. It applies only to the sentence chunking strategy. It can be either 1 or 0.
API name: sentence_overlap
-
separatorGroup
Required - This parameter is only applicable when using the recursive chunking strategy.
Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.
Using this parameter is an alternative to manually specifying a custom separators list.
API name: separator_group
-
separators
Required - A list of strings used as possible split points when chunking text with the recursive strategy.
Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.
After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.
API name: separators
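The split-then-recombine behaviour described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the actual Elasticsearch implementation: the class name is hypothetical, every separator is passed to String.split and is therefore treated as a regex, chunk size is counted in whitespace-separated words, and recombined pieces are rejoined with a single space, so the original separator text is not preserved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of recursive chunking; not the server-side algorithm.
final class RecursiveChunkSketch {
    private RecursiveChunkSketch() {}

    static int wordCount(String s) {
        String t = s.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    static List<String> chunk(String text, List<String> separators, int maxChunkSize) {
        return split(text, separators, 0, maxChunkSize);
    }

    private static List<String> split(String text, List<String> separators,
                                      int index, int maxChunkSize) {
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        // Small enough, or no separators left to try: keep the text as one chunk.
        if (wordCount(trimmed) <= maxChunkSize || index >= separators.size()) {
            result.add(trimmed);
            return result;
        }
        // Split on the current separator; oversized parts fall through to the next one.
        List<String> pieces = new ArrayList<>();
        for (String part : trimmed.split(separators.get(index))) {
            pieces.addAll(split(part, separators, index + 1, maxChunkSize));
        }
        // Greedily recombine adjacent pieces while staying within the limit.
        StringBuilder current = new StringBuilder();
        for (String piece : pieces) {
            if (current.length() > 0
                    && wordCount(current.toString()) + wordCount(piece) > maxChunkSize) {
                result.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(piece);
        }
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }
}
```

Given three paragraphs of 3, 2 and 4 words separated by blank lines, the separators "\n\n" and " ", and a limit of 5 words, the sketch merges the first two paragraphs into one 5-word chunk and leaves the third as its own chunk.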
-
strategy
The chunking strategy: sentence, word, none or recursive.
- If strategy is set to recursive, you must also specify:
  - max_chunk_size
  - either separators or separator_group
Learn more about different chunking strategies in the linked documentation.
API name: strategy
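The requirement for the recursive strategy can be expressed as a simple pre-flight check before building the settings. The helper below is hypothetical and not part of the client library. It treats "either separators or separator_group" as "at least one of the two", which is an assumption: the text here does not say whether supplying both is accepted.

```java
import java.util.List;

// Hypothetical pre-flight check; not part of the Elasticsearch Java client.
final class RecursiveStrategyCheck {
    private RecursiveStrategyCheck() {}

    /**
     * Returns false only when strategy is "recursive" and max_chunk_size is
     * missing, or neither separators nor separator_group is provided.
     */
    static boolean hasRequiredSettings(String strategy, Integer maxChunkSize,
                                       List<String> separators, String separatorGroup) {
        if (!"recursive".equals(strategy)) {
            return true; // the extra requirements apply to recursive only
        }
        boolean hasSeparators = separators != null && !separators.isEmpty();
        boolean hasGroup = separatorGroup != null && !separatorGroup.isEmpty();
        return maxChunkSize != null && (hasSeparators || hasGroup);
    }
}
```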
-
serialize
Serialize this object to JSON.
Specified by: serialize in interface JsonpSerializable
-
serializeInternal
-
toString
-
setupInferenceChunkingSettingsDeserializer
protected static void setupInferenceChunkingSettingsDeserializer(ObjectDeserializer<InferenceChunkingSettings.Builder> op)
-