Class InferenceChunkingSettings.Builder

All Implemented Interfaces:
WithJson<InferenceChunkingSettings.Builder>, ObjectBuilder<InferenceChunkingSettings>
Enclosing class:
InferenceChunkingSettings

public static class InferenceChunkingSettings.Builder extends WithJsonObjectBuilderBase<InferenceChunkingSettings.Builder> implements ObjectBuilder<InferenceChunkingSettings>
  • Constructor Details

    • Builder

      public Builder()
  • Method Details

    • maxChunkSize

      public final InferenceChunkingSettings.Builder maxChunkSize(@Nullable Integer value)
      The maximum size of a chunk in words. This value cannot be lower than 20 (for sentence strategy) or 10 (for word strategy). This value should not exceed the window size for the associated model.

      API name: max_chunk_size

    • overlap

      public final InferenceChunkingSettings.Builder overlap(@Nullable Integer value)
      The number of overlapping words for chunks. It applies only to the word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      API name: overlap

    • sentenceOverlap

      public final InferenceChunkingSettings.Builder sentenceOverlap(@Nullable Integer value)
      The number of overlapping sentences for chunks. It applies only to the sentence chunking strategy. The value must be either 0 or 1.

      API name: sentence_overlap
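
      The numeric limits documented for max_chunk_size, overlap and sentence_overlap can be sketched as a client-side pre-check. This is only an illustration: the class ChunkingLimits and its check method are hypothetical and are not part of the Elasticsearch Java client, and the server may apply further rules not listed here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the client) mirroring the documented limits. */
final class ChunkingLimits {
    private ChunkingLimits() {}

    /** Returns the violated limits; an empty list means none of the documented limits is broken. */
    static List<String> check(String strategy, Integer maxChunkSize,
                              Integer overlap, Integer sentenceOverlap) {
        List<String> problems = new ArrayList<>();
        if ("word".equals(strategy)) {
            if (maxChunkSize != null && maxChunkSize < 10) {
                problems.add("max_chunk_size must be at least 10 for the word strategy");
            }
            // overlap may not be higher than half of max_chunk_size
            if (maxChunkSize != null && overlap != null && overlap * 2 > maxChunkSize) {
                problems.add("overlap must not exceed half of max_chunk_size");
            }
        } else if ("sentence".equals(strategy)) {
            if (maxChunkSize != null && maxChunkSize < 20) {
                problems.add("max_chunk_size must be at least 20 for the sentence strategy");
            }
            if (sentenceOverlap != null && sentenceOverlap != 0 && sentenceOverlap != 1) {
                problems.add("sentence_overlap must be 0 or 1");
            }
        }
        return problems;
    }
}
```

      Running such a check before calling build() is optional; it only surfaces the documented limits earlier than a server-side rejection would.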

    • separatorGroup

      public final InferenceChunkingSettings.Builder separatorGroup(@Nullable String value)
      Only applicable to the recursive strategy, which requires either separator_group or separators.

      Sets a predefined list of separators in the saved chunking settings based on the selected text type. Values can be markdown or plaintext.

      Using this parameter is an alternative to manually specifying a custom separators list.

      API name: separator_group

    • separators

      public final InferenceChunkingSettings.Builder separators(List<String> list)
      Only applicable to the recursive strategy, which requires either separators or separator_group.

      A list of strings used as possible split points when chunking text.

      Each string can be a plain string or a regular expression (regex) pattern. The system tries the separators in list order to split the text, starting from the first item.

      After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

      API name: separators

      Adds all elements of list to separators.
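
      The split-then-recombine behaviour described above can be illustrated with a deliberately simplified sketch. It is not the Elasticsearch implementation: the class RecursiveSplitSketch and its methods are hypothetical, every separator is treated as a regex, matched separator text is dropped, sizes are counted in whitespace-separated words, and a piece that no separator can shrink is emitted as is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Simplified illustration only; not the Elasticsearch implementation. */
final class RecursiveSplitSketch {
    private RecursiveSplitSketch() {}

    static int wordCount(String text) {
        String t = text.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }

    /** Splits text with separators.get(level), recursing into pieces that are still too large. */
    static List<String> split(String text, List<String> separators, int level, int maxWords) {
        List<String> out = new ArrayList<>();
        if (wordCount(text) <= maxWords || level >= separators.size()) {
            if (!text.trim().isEmpty()) {
                out.add(text.trim());
            }
            return out;
        }
        for (String piece : Pattern.compile(separators.get(level)).split(text)) {
            out.addAll(split(piece, separators, level + 1, maxWords));
        }
        return out;
    }

    /** Greedily merges adjacent pieces while the merged chunk stays within maxWords. */
    static List<String> merge(List<String> pieces, int maxWords) {
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentWords = 0;
        for (String piece : pieces) {
            int words = wordCount(piece);
            if (currentWords > 0 && currentWords + words > maxWords) {
                chunks.add(current.toString()); // current chunk is full, start a new one
                current.setLength(0);
                currentWords = 0;
            }
            if (currentWords > 0) {
                current.append(' ');
            }
            current.append(piece);
            currentWords += words;
        }
        if (currentWords > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    static List<String> chunk(String text, List<String> separators, int maxWords) {
        return merge(split(text, separators, 0, maxWords), maxWords);
    }
}
```

      The order of the separators matters in this sketch: coarse separators such as a blank line are tried first, and finer ones are used only for pieces that are still larger than the limit.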

    • separators

      public final InferenceChunkingSettings.Builder separators(String value, String... values)
      Only applicable to the recursive strategy, which requires either separators or separator_group.

      A list of strings used as possible split points when chunking text.

      Each string can be a plain string or a regular expression (regex) pattern. The system tries the separators in list order to split the text, starting from the first item.

      After splitting, it attempts to recombine smaller pieces into larger chunks that stay within the max_chunk_size limit, to reduce the total number of chunks generated.

      API name: separators

      Adds one or more values to separators.

    • strategy

      public final InferenceChunkingSettings.Builder strategy(@Nullable String value)
      The chunking strategy: sentence, word, none or recursive.

      If strategy is set to recursive, you must also specify:
      • max_chunk_size
      • either separators or separator_group

      Learn more about different chunking strategies in the linked documentation.

      API name: strategy
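
      The extra requirements of the recursive strategy can be expressed as a small pre-check. This is a sketch under stated assumptions: the class RecursiveStrategyCheck and its missing method are hypothetical names and do not exist in the Elasticsearch Java client.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the client) for the recursive strategy requirements. */
final class RecursiveStrategyCheck {
    private RecursiveStrategyCheck() {}

    /** Returns the API names still missing when strategy is "recursive"; empty otherwise. */
    static List<String> missing(String strategy, Integer maxChunkSize,
                                List<String> separators, String separatorGroup) {
        List<String> missing = new ArrayList<>();
        if (!"recursive".equals(strategy)) {
            return missing; // the extra requirements apply only to the recursive strategy
        }
        if (maxChunkSize == null) {
            missing.add("max_chunk_size");
        }
        boolean hasSeparators = separators != null && !separators.isEmpty();
        boolean hasGroup = separatorGroup != null;
        if (!hasSeparators && !hasGroup) {
            missing.add("separators or separator_group");
        }
        return missing;
    }
}
```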

    • self

      Specified by:
      self in class WithJsonObjectBuilderBase<InferenceChunkingSettings.Builder>
    • build

      public InferenceChunkingSettings build()
      Specified by:
      build in interface ObjectBuilder<InferenceChunkingSettings>
      Throws:
      NullPointerException - if some of the required fields are null.