KEY - The key type.
VALUE - The value type.
@Beta public interface BatchReadable<KEY,VALUE>
To feed a dataset into a batch job, the dataset must be splittable into chunks so that its parts can be processed in parallel. Each chunk must be readable as a collection of {key,value} records.
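The contract described above can be illustrated with a minimal sketch. It is not the CDAP implementation: `Split`, `SplitReader`, and `BatchReadable` are simplified stand-ins declared locally so the example compiles on its own, and the reader methods `next()`, `key()`, and `value()` are assumptions that may not match the real `SplitReader`. `RangeSplit`, `ListDataset`, and `BatchReadableSketch` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-ins for the CDAP types (assumptions, not the real API).
interface Split {}

interface SplitReader<KEY, VALUE> {
  boolean next();   // hypothetical: advance to the next record; false when exhausted
  KEY key();        // hypothetical: key of the current record
  VALUE value();    // hypothetical: value of the current record
}

interface BatchReadable<KEY, VALUE> {
  List<Split> getSplits();
  SplitReader<KEY, VALUE> createSplitReader(Split split);
}

// Hypothetical split: a half-open index range [start, end) over an in-memory list.
final class RangeSplit implements Split {
  final int start;
  final int end;

  RangeSplit(int start, int end) {
    this.start = start;
    this.end = end;
  }
}

// Hypothetical dataset: a list of strings exposed as {index, string} records.
final class ListDataset implements BatchReadable<Integer, String> {
  private final List<String> rows;
  private final int chunkSize;

  ListDataset(List<String> rows, int chunkSize) {
    this.rows = rows;
    this.chunkSize = chunkSize;
  }

  @Override
  public List<Split> getSplits() {
    // Cut the list into consecutive chunks of at most chunkSize rows.
    List<Split> splits = new ArrayList<>();
    for (int start = 0; start < rows.size(); start += chunkSize) {
      splits.add(new RangeSplit(start, Math.min(start + chunkSize, rows.size())));
    }
    return splits;
  }

  @Override
  public SplitReader<Integer, String> createSplitReader(Split split) {
    final RangeSplit range = (RangeSplit) split;
    return new SplitReader<Integer, String>() {
      private int pos = range.start - 1;

      @Override public boolean next() { return ++pos < range.end; }
      @Override public Integer key() { return pos; }
      @Override public String value() { return rows.get(pos); }
    };
  }
}

final class BatchReadableSketch {
  // Reads every split (possibly in parallel) and collects all records into one map.
  static <K, V> Map<K, V> readAll(BatchReadable<K, V> dataset) {
    Map<K, V> out = new ConcurrentHashMap<>();
    dataset.getSplits().parallelStream().forEach(split -> {
      SplitReader<K, V> reader = dataset.createSplitReader(split);
      while (reader.next()) {
        out.put(reader.key(), reader.value());
      }
    });
    return out;
  }
}
```

Because each split gets its own reader, the chunks share no iteration state, which is what makes it safe to process them in parallel.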
Modifier and Type | Method and Description |
---|---|
SplitReader<KEY,VALUE> | createSplitReader(Split split): Creates a reader for the split of a dataset. |
List<Split> | getSplits(): Returns all splits of the dataset. |
List<Split> getSplits()
Returns all splits of the dataset, for feeding the whole dataset into a batch job.
Returns: A list of Splits.
SplitReader<KEY,VALUE> createSplitReader(Split split)
split - The split to create a reader for.
Returns: A SplitReader.
Copyright © 2024 Cask Data, Inc. Licensed under the Apache License, Version 2.0.