SplittableInputSource (druid-processing 27.0.0 API)

All Superinterfaces:

InputSource

All Known Implementing Classes:

CloudObjectInputSource, CombiningInputSource, HttpInputSource, LocalInputSource
```
public interface SplittableInputSource<T>
extends InputSource
```
An InputSource that can be divided into InputSplits, which ParallelIndexSupervisorTask can then process in parallel.

Field Summary

Fields
Modifier and Type	Field and Description
`static SplitHintSpec`	`DEFAULT_SPLIT_HINT_SPEC`

Fields inherited from interface org.apache.druid.data.input.InputSource
TYPE_PROPERTY

Method Summary

Modifier and Type	Method and Description
`Stream<InputSplit<T>>`	`createSplits(InputFormat inputFormat, SplitHintSpec splitHintSpec)` Creates a `Stream` of `InputSplit`s.
`int`	`estimateNumSplits(InputFormat inputFormat, SplitHintSpec splitHintSpec)` Returns an estimated total number of splits to be created via `createSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec)`.
`default SplitHintSpec`	`getSplitHintSpecOrDefault(SplitHintSpec splitHintSpec)`
`default boolean`	`isSplittable()` Returns true if this inputSource can be processed in parallel using ParallelIndexSupervisorTask.
`InputSource`	`withSplit(InputSplit<T> split)` Helper method for ParallelIndexSupervisorTask.

Methods inherited from interface org.apache.druid.data.input.InputSource
getTypes, needsFormat, reader

- Field Detail
  - DEFAULT_SPLIT_HINT_SPEC
```
static final SplitHintSpec DEFAULT_SPLIT_HINT_SPEC
```
- Method Detail
  - isSplittable
```
default boolean isSplittable()
```
    Description copied from interface: InputSource
    
    Returns true if this inputSource can be processed in parallel using ParallelIndexSupervisorTask. It must be castable to SplittableInputSource and the various SplittableInputSource methods must work as documented.
    
    Specified by:
    
    isSplittable in interface InputSource
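
    The check-then-cast contract described above can be sketched as follows. `Source` and `SplittableSource` are hypothetical stand-ins for InputSource and SplittableInputSource, not the Druid types, and the default returning true is an assumption about how a splittable source would report itself.

```java
import java.util.List;

// Hypothetical stand-ins for InputSource and SplittableInputSource (not the Druid types).
interface Source {
    boolean isSplittable();
}

interface SplittableSource<T> extends Source {
    // Assumption: a splittable source reports true by default.
    @Override
    default boolean isSplittable() {
        return true;
    }

    List<T> splits();
}

class IsSplittableSketch {
    // Returns the number of splits when the source is splittable, or 1 otherwise.
    static int parallelism(Source source) {
        if (source.isSplittable()) {
            // Per the contract, a source that reports true must be castable.
            SplittableSource<?> splittable = (SplittableSource<?>) source;
            return splittable.splits().size();
        }
        return 1;
    }
}
```

    A caller that sees false would fall back to processing the source as a single unit.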
  - createSplits
```
Stream<InputSplit<T>> createSplits(InputFormat inputFormat,
                                   @Nullable
                                   SplitHintSpec splitHintSpec)
                            throws IOException
```
    Creates a Stream of InputSplits. The returned stream should be evaluated lazily to avoid consuming too much memory, and implementations should take care NOT to cache the created splits in memory. This method is paired with estimateNumSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec): implementations can consider InputFormat.isSplittable() and the SplitHintSpec to create splits in the same way as estimateNumSplits does.
    
    Throws:
    
    IOException
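
    A minimal sketch of the lazy evaluation described above, using only java.util.stream. `FileSplit` is a hypothetical stand-in for InputSplit, and the `CREATED` counter exists only to make the laziness observable; neither is part of the Druid API.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Hypothetical stand-in for InputSplit<T> (not the Druid class).
final class FileSplit {
    final String path;

    FileSplit(String path) {
        this.path = path;
    }
}

class LazySplitsSketch {
    // Counts how many splits have actually been materialized.
    static final AtomicInteger CREATED = new AtomicInteger();

    // Maps each path to a split on demand; nothing is built until the stream is consumed,
    // and no list of splits is kept in memory.
    static Stream<FileSplit> createSplits(List<String> paths) {
        return paths.stream().map(path -> {
            CREATED.incrementAndGet();
            return new FileSplit(path);
        });
    }
}
```

    Calling `createSplits` alone creates no splits; a consumer that takes only the first element with `findFirst()` causes exactly one split to be built.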
  - estimateNumSplits
```
int estimateNumSplits(InputFormat inputFormat,
                      @Nullable
                      SplitHintSpec splitHintSpec)
               throws IOException
```
    Returns an estimated total number of splits to be created via createSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec). The estimate doesn't have to be accurate and can differ from the actual number of InputSplits that createSplits returns. It is used to estimate the progress of a phase in parallel indexing; see TaskMonitor for details of the progress estimation. This method can be expensive if an implementation iterates over all directories or other substructures to find all input entities. Implementations can consider InputFormat.isSplittable() and the SplitHintSpec to find splits in the same way as createSplits does.
    
    Throws:
    
    IOException
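
    To illustrate why the estimate may differ from the actual split count, here is a hedged sketch. It assumes a hypothetical size-based splitting rule (pack whole files into a split up to a byte limit); the class and method names are invented for illustration and are not the Druid implementation.

```java
class EstimateSplitsSketch {
    // Cheap estimate: total bytes divided by the target split size, rounded up.
    static int estimateNumSplits(long[] fileSizes, long maxSplitSize) {
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        return (int) ((total + maxSplitSize - 1) / maxSplitSize);
    }

    // Actual count: greedily pack whole files into a split until the limit would be exceeded.
    static int countActualSplits(long[] fileSizes, long maxSplitSize) {
        int splits = 0;
        long currentSize = 0;
        for (long size : fileSizes) {
            if (currentSize > 0 && currentSize + size > maxSplitSize) {
                splits++;
                currentSize = 0;
            }
            currentSize += size;
        }
        // Count the last, partially filled split if any file was added to it.
        return currentSize > 0 ? splits + 1 : splits;
    }
}
```

    For three 60-byte files and a 100-byte limit, the estimate is 2 while the packing yields 3 splits, because files are not divided across splits in this sketch. A progress estimate based on the first number would therefore be approximate.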
  - withSplit
```
InputSource withSplit(InputSplit<T> split)
```
    Helper method for ParallelIndexSupervisorTask. Most implementations can simply create a new instance with the given split.
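
    The "new instance with the given split" pattern might look like the sketch below. `PathSource` is a hypothetical file-based source in which a split is a single path; it is not one of the implementing classes listed on this page.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical file-based source; a split is represented here by a single path.
final class PathSource {
    private final List<String> paths;

    PathSource(List<String> paths) {
        this.paths = paths;
    }

    List<String> getPaths() {
        return paths;
    }

    // Returns a new source restricted to the given split; this instance is left unchanged.
    PathSource withSplit(String splitPath) {
        return new PathSource(Collections.singletonList(splitPath));
    }
}
```

    Keeping the original instance unchanged means the same source can hand out one narrowed copy per split.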
  - getSplitHintSpecOrDefault
```
default SplitHintSpec getSplitHintSpecOrDefault(@Nullable
                                                SplitHintSpec splitHintSpec)
```
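
    This method has no description on this page. Judging from its name and the @Nullable parameter, it presumably returns the given spec when it is non-null and DEFAULT_SPLIT_HINT_SPEC otherwise. The sketch below assumes that behavior; `HintSpec` is a hypothetical stand-in for SplitHintSpec, and the default value is a placeholder.

```java
// Hypothetical stand-in for SplitHintSpec (not the Druid interface).
final class HintSpec {
    final String name;

    HintSpec(String name) {
        this.name = name;
    }
}

class SplitHintDefaultSketch {
    // Placeholder default; the real DEFAULT_SPLIT_HINT_SPEC value is not described on this page.
    static final HintSpec DEFAULT_HINT_SPEC = new HintSpec("default");

    // Assumed behavior: fall back to the default only when the caller passes null.
    static HintSpec getHintSpecOrDefault(HintSpec hintSpec) {
        return hintSpec == null ? DEFAULT_HINT_SPEC : hintSpec;
    }
}
```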
