Package org.apache.druid.data.input.impl
Class RegexInputFormat
- java.lang.Object
-
- org.apache.druid.data.input.impl.RegexInputFormat
-
- All Implemented Interfaces:
InputFormat
public class RegexInputFormat extends Object implements InputFormat
-
-
Field Summary
Modifier and Type, Field, Description
- static String TYPE_KEY
-
Fields inherited from interface org.apache.druid.data.input.InputFormat
TYPE_PROPERTY
-
-
Constructor Summary
Constructor, Description
- RegexInputFormat(String pattern, String listDelimiter, List<String> columns)
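This page does not describe how the three constructor arguments interact, so the following is only a minimal sketch of one plausible reading: each capture group of pattern is assigned, in order, to the corresponding name in columns, and a captured value containing listDelimiter is split into a list. The class RegexColumnsSketch and its parseLine method are hypothetical names made up for illustration. They are not part of Druid, and the real reader returned by createReader may differ in matching mode, error handling, and multi-value treatment.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Druid class: illustrates how a regex, a list
// delimiter, and a column list could combine to turn one text line into a row.
final class RegexColumnsSketch {

    static Map<String, Object> parseLine(
            String line, String pattern, String listDelimiter, List<String> columns) {
        Matcher matcher = Pattern.compile(pattern).matcher(line);
        if (!matcher.matches()) {
            // Assumption: a non-matching line is treated as an error.
            throw new IllegalArgumentException("Line does not match pattern: " + line);
        }
        Map<String, Object> row = new LinkedHashMap<>();
        // Capture groups are 1-based; group i + 1 is mapped to columns.get(i).
        for (int i = 0; i < columns.size() && i < matcher.groupCount(); i++) {
            String value = matcher.group(i + 1);
            if (listDelimiter != null && value != null && value.contains(listDelimiter)) {
                // Assumption: the delimiter is a literal string, so it is quoted.
                row.put(columns.get(i), Arrays.asList(value.split(Pattern.quote(listDelimiter))));
            } else {
                row.put(columns.get(i), value);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        Map<String, Object> row = parseLine(
                "alice 200 a|b", "(\\w+) (\\d+) (.*)", "|", List.of("user", "status", "tags"));
        System.out.println(row);
    }
}
```

In this sketch the column order follows the capture-group order, which is why a LinkedHashMap is used: it keeps the row's keys in the order given by columns.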
-
Method Summary
Modifier and Type, Method, Description
- InputEntityReader createReader(InputRowSchema inputRowSchema, InputEntity source, File temporaryDirectory)
- List<String> getColumns()
- String getListDelimiter()
- String getPattern()
- long getWeightedSize(String path, long size)
Computes the weighted size of a given input object of the underlying input format type, weighted for its cost during ingestion.
- boolean isSplittable()
Trait to indicate that a file can be split into multiple InputSplits.
-
-
-
Field Detail
-
TYPE_KEY
public static final String TYPE_KEY
- See Also:
- Constant Field Values
-
-
Method Detail
-
getPattern
public String getPattern()
-
isSplittable
public boolean isSplittable()
Description copied from interface: InputFormat
Trait to indicate that a file can be split into multiple InputSplits. This method is not currently used anywhere, but should be considered in SplittableInputSource.createSplits(org.apache.druid.data.input.InputFormat, org.apache.druid.data.input.SplitHintSpec) in the future.
- Specified by:
isSplittable in interface InputFormat
-
createReader
public InputEntityReader createReader(InputRowSchema inputRowSchema, InputEntity source, File temporaryDirectory)
- Specified by:
createReader in interface InputFormat
-
getWeightedSize
public long getWeightedSize(String path, long size)
Description copied from interface: InputFormat
Computes the weighted size of a given input object of the underlying input format type, weighted for its cost during ingestion. The weight depends on the format type and on the compression type (CompressionUtils.Format) used, if any. Uncompressed newline-delimited JSON is the baseline, with scale factor 1: when computing the byte weight that an uncompressed newline-delimited JSON input object contributes to ingestion, the file size is taken as is, 1:1.
- Specified by:
getWeightedSize in interface InputFormat
- Parameters:
path - The path of the input object. Used to tell whether any compression is used.
size - The size of the input object in bytes.
- Returns:
- The weighted size of the input object.
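The weighting idea described above can be sketched as follows. This is an illustration only, not Druid's implementation: the class WeightedSizeSketch, the extension list, and the scale factor ASSUMED_COMPRESSED_FACTOR are all assumptions chosen for the example, and the actual factors applied per format and compression type are not stated on this page.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not a Druid class: shows how a path-based compression
// check could scale a raw byte size into a weighted size.
final class WeightedSizeSketch {

    // Assumed scale factor for compressed input; the real value is not documented here.
    static final long ASSUMED_COMPRESSED_FACTOR = 4L;

    // Assumed set of file extensions treated as "compressed".
    static final List<String> COMPRESSED_EXTENSIONS = List.of(".gz", ".bz2", ".xz", ".zst", ".zip");

    static boolean looksCompressed(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        return COMPRESSED_EXTENSIONS.stream().anyMatch(lower::endsWith);
    }

    static long weightedSize(String path, long size) {
        // Baseline: uncompressed input counts 1:1 toward ingestion cost.
        return looksCompressed(path) ? size * ASSUMED_COMPRESSED_FACTOR : size;
    }

    public static void main(String[] args) {
        System.out.println(weightedSize("data/events.json", 1000L));
        System.out.println(weightedSize("data/events.json.gz", 1000L));
    }
}
```

Under these assumptions, an uncompressed file keeps its byte size as its weight, while a compressed file of the same on-disk size counts for more because it expands during ingestion.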
-
-