Package org.apache.druid.data.input
Interface InputFormat
All Known Implementing Classes:
CsvInputFormat, DelimitedInputFormat, FlatTextInputFormat, JsonInputFormat, NestedInputFormat, RegexInputFormat
public interface InputFormat
InputFormat abstracts the file format of input data. It creates an InputEntityReader to read data and parse it into InputRow. The created InputEntityReader is used by InputSourceReader.
See NestedInputFormat for nested input formats such as JSON.
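The relationship described above (a format object that creates a reader, which in turn parses raw input into rows) can be illustrated with a small self-contained sketch. It does not use the Druid classes: `Format`, `RowReader`, and `ToyCsvFormat` are hypothetical stand-ins invented for this example, with plain strings and maps in place of InputEntity, InputRowSchema, and InputRow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InputFormatSketch {
    // Hypothetical stand-in for InputEntityReader: turns raw text into rows.
    interface RowReader {
        List<Map<String, String>> read();
    }

    // Hypothetical stand-in for InputFormat: knows the format and creates readers.
    interface Format {
        boolean isSplittable();

        RowReader createReader(List<String> columns, String source);
    }

    // Toy comma-separated format; quoting and escaping are deliberately omitted.
    static class ToyCsvFormat implements Format {
        @Override
        public boolean isSplittable() {
            return true; // assumption made for this toy format only
        }

        @Override
        public RowReader createReader(List<String> columns, String source) {
            return () -> {
                List<Map<String, String>> rows = new ArrayList<>();
                for (String line : source.split("\n")) {
                    if (line.isEmpty()) {
                        continue; // skip blank lines
                    }
                    String[] values = line.split(",", -1);
                    Map<String, String> row = new LinkedHashMap<>();
                    for (int i = 0; i < columns.size() && i < values.length; i++) {
                        row.put(columns.get(i), values[i]);
                    }
                    rows.add(row);
                }
                return rows;
            };
        }
    }

    public static void main(String[] args) {
        Format format = new ToyCsvFormat();
        RowReader reader = format.createReader(
            Arrays.asList("ts", "page"),
            "2024-01-01,home\n2024-01-02,about");
        System.out.println(reader.read());
    }
}
```

The caller only depends on the `Format` and `RowReader` abstractions, which mirrors how a reader created by an input format can be consumed without knowing the underlying file format.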
Field Summary
Modifier and Type | Field | Description
static String | TYPE_PROPERTY |
Method Summary
Modifier and Type | Method | Description
InputEntityReader | createReader(InputRowSchema inputRowSchema, InputEntity source, File temporaryDirectory) |
default long | getWeightedSize(String path, long size) | Computes the weighted size of a given input object of the underlying input format type, weighted for its cost during ingestion.
boolean | isSplittable() | Trait to indicate that a file can be split into multiple InputSplits.
Field Detail
TYPE_PROPERTY
static final String TYPE_PROPERTY
See Also:
Constant Field Values
Method Detail
isSplittable
boolean isSplittable()
Trait to indicate that a file can be split into multiple InputSplits.
This method is not currently used anywhere, but should be considered in SplittableInputSource.createSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec) in the future.
createReader
InputEntityReader createReader(InputRowSchema inputRowSchema, InputEntity source, File temporaryDirectory)
getWeightedSize
default long getWeightedSize(String path, long size)
Computes the weighted size of a given input object of the underlying input format type, weighted for its cost during ingestion. The weight depends on the format type and on the compression type (CompressionUtils.Format), if any. Uncompressed newline-delimited JSON is the baseline, with scale factor 1: when computing the byte weight that an uncompressed newline-delimited JSON input object contributes to ingestion, the file size is taken as is, 1:1.
Parameters:
path - The path of the input object. Used to tell whether any compression is used.
size - The size of the input object in bytes.
Returns:
The weighted size of the input object.
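As an illustration of the weighting idea, the following minimal sketch scales the byte size when the path suggests compression and leaves it unchanged otherwise. It is not the Druid implementation: the class name, the check on the `.gz` suffix, and the factor of 4 are all assumptions chosen for the example, and the real logic is based on CompressionUtils.Format and the concrete format type.

```java
import java.util.Locale;

public class WeightedSizeSketch {
    // Hypothetical scale factor for gzip-compressed input; not taken from Druid.
    static final long ASSUMED_GZIP_FACTOR = 4L;

    // Returns the size unchanged for uncompressed input (baseline 1:1) and a
    // scaled size when the path ends in ".gz" (an assumed compression check).
    static long weightedSize(String path, long size) {
        if (path != null && path.toLowerCase(Locale.ROOT).endsWith(".gz")) {
            return size * ASSUMED_GZIP_FACTOR;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(weightedSize("events.json", 1000L));
        System.out.println(weightedSize("events.json.gz", 1000L));
    }
}
```

With these assumed values, an uncompressed 1000-byte file keeps a weight of 1000, while a gzip-compressed file of the same size on disk is weighted as 4000 to reflect the extra data it expands to during ingestion.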