Class InferenceChunkingSettings

java.lang.Object
co.elastic.clients.elasticsearch.inference.InferenceChunkingSettings
All Implemented Interfaces:
JsonpSerializable

@JsonpDeserializable public class InferenceChunkingSettings extends Object implements JsonpSerializable
Chunking configuration object
  • Field Details

  • Method Details

    • of

    • maxChunkSize

      @Nullable public final Integer maxChunkSize()
      The maximum size of a chunk in words. This value cannot be lower than 20 (for sentence strategy) or 10 (for word strategy). This value should not exceed the window size for the associated model.

      API name: max_chunk_size

    • overlap

      @Nullable public final Integer overlap()
      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      API name: overlap
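
      The limits stated for max_chunk_size and overlap can be checked before a request is built. The following is a minimal sketch in plain Java, not part of the client: the class ChunkSizeRules and its methods are hypothetical names, and the sketch assumes that overlap is simply ignored for strategies other than word.

```java
public class ChunkSizeRules {

    // Minimum max_chunk_size per strategy, as stated in the descriptions above.
    // Only the sentence and word minimums are documented there.
    public static int minChunkSize(String strategy) {
        switch (strategy) {
            case "sentence":
                return 20;
            case "word":
                return 10;
            default:
                throw new IllegalArgumentException(
                        "no documented minimum for strategy: " + strategy);
        }
    }

    // Returns true when maxChunkSize and overlap satisfy the documented limits.
    // overlap may be null; it is only checked for the word strategy (assumption).
    public static boolean isValid(String strategy, int maxChunkSize, Integer overlap) {
        if (maxChunkSize < minChunkSize(strategy)) {
            return false;
        }
        // "Cannot be higher than half the max_chunk_size value":
        // comparing 2 * overlap avoids rounding issues with odd sizes.
        if ("word".equals(strategy) && overlap != null && overlap * 2 > maxChunkSize) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid("word", 100, 50));      // overlap is exactly half
        System.out.println(isValid("word", 100, 51));      // overlap above half
        System.out.println(isValid("sentence", 15, null)); // below the sentence minimum
    }
}
```

      The sketch does not cover the model's window size, which depends on the associated model and is not known on the client side.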

    • sentenceOverlap

      @Nullable public final Integer sentenceOverlap()
      The number of overlapping sentences for chunks. It applies only to the sentence chunking strategy and can be either 0 or 1.

      API name: sentence_overlap

    • separatorGroup

      @Nullable public final String separatorGroup()
      Applies only to the recursive strategy, which requires either this setting or separators.

      Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

      Using this parameter is an alternative to manually specifying a custom separators list.

      API name: separator_group

    • separators

      public final List<String> separators()
      Applies only to the recursive strategy, which requires either this setting or separator_group.

      A list of strings used as possible split points when chunking text.

      Each string can be a plain string or a regular expression (regex) pattern. The system tries each separator in order to split the text, starting from the first item in the list.

      After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

      API name: separators
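
      The split-then-recombine behavior described above can be illustrated with a small, self-contained sketch. It is not the server's implementation: the class RecursiveChunkSketch is a hypothetical name, every separator is treated as a regular expression, chunk size is counted in whitespace-separated words, and merged pieces are joined with a single space rather than the original separator.

```java
import java.util.ArrayList;
import java.util.List;

public class RecursiveChunkSketch {

    // Counts whitespace-separated words; an empty or blank string has none.
    static int wordCount(String s) {
        String t = s.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    // Splits text using the separators in order, then merges small pieces
    // back together while staying within maxWords.
    public static List<String> chunk(String text, List<String> separators, int maxWords) {
        return chunk(text, separators, 0, maxWords);
    }

    private static List<String> chunk(String text, List<String> seps, int idx, int maxWords) {
        List<String> out = new ArrayList<>();
        // Small enough already, or no separators left: keep the text as one piece.
        // A piece may still exceed maxWords if no separator applies to it.
        if (wordCount(text) <= maxWords || idx >= seps.size()) {
            if (!text.trim().isEmpty()) {
                out.add(text.trim());
            }
            return out;
        }
        // Split on the current separator; oversized parts fall through
        // to the next separator in the list.
        List<String> pieces = new ArrayList<>();
        for (String part : text.split(seps.get(idx))) {
            pieces.addAll(chunk(part, seps, idx + 1, maxWords));
        }
        // Greedily recombine adjacent pieces to reduce the number of chunks.
        StringBuilder current = new StringBuilder();
        for (String piece : pieces) {
            if (current.length() > 0
                    && wordCount(current.toString()) + wordCount(piece) > maxWords) {
                out.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(piece);
        }
        if (current.length() > 0) {
            out.add(current.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        // Three short paragraphs; paragraph breaks are tried before line breaks.
        String text = "a b c\n\nd e f\n\ng h";
        System.out.println(chunk(text, List.of("\n\n", "\n"), 6));
    }
}
```

      With a limit of 6 words, the first two three-word paragraphs are merged into one chunk and the last paragraph becomes a second chunk.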

    • strategy

      @Nullable public final String strategy()
      The chunking strategy: sentence, word, none or recursive.
      If strategy is set to recursive, you must also specify:
      • max_chunk_size
      • either separators or separator_group

      Learn more about different chunking strategies in the linked documentation.

      API name: strategy
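
      The requirement for the recursive strategy can be expressed as a simple pre-flight check. This is a hedged sketch in plain Java rather than client code: RecursiveStrategyCheck and hasRequiredFields are hypothetical names, and the sketch assumes that supplying both separators and separator_group is acceptable, which the description above does not confirm.

```java
import java.util.List;

public class RecursiveStrategyCheck {

    // Returns true when the settings satisfy the rule stated above:
    // for the recursive strategy, max_chunk_size must be set together with
    // either separators or separator_group. Other strategies pass unchecked.
    public static boolean hasRequiredFields(String strategy,
                                            Integer maxChunkSize,
                                            List<String> separators,
                                            String separatorGroup) {
        if (!"recursive".equals(strategy)) {
            return true;
        }
        boolean hasSeparators = separators != null && !separators.isEmpty();
        boolean hasGroup = separatorGroup != null;
        return maxChunkSize != null && (hasSeparators || hasGroup);
    }

    public static void main(String[] args) {
        System.out.println(hasRequiredFields("recursive", 200, null, "markdown"));
        System.out.println(hasRequiredFields("recursive", 200, null, null));
        System.out.println(hasRequiredFields("recursive", null, List.of("\n\n"), null));
    }
}
```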

    • serialize

      public void serialize(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
      Serialize this object to JSON.
      Specified by:
      serialize in interface JsonpSerializable
    • serializeInternal

      protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • setupInferenceChunkingSettingsDeserializer

      protected static void setupInferenceChunkingSettingsDeserializer(ObjectDeserializer<InferenceChunkingSettings.Builder> op)