Package org.apache.parquet.hadoop.api

Class ReadSupport<T>

java.lang.Object
  org.apache.parquet.hadoop.api.ReadSupport<T>

Type Parameters:
  T - the type of the materialized record

Direct Known Subclasses:
  DelegatingReadSupport, GroupReadSupport

public abstract class ReadSupport<T> extends Object

Abstraction used by the ParquetInputFormat to materialize records.
Nested Class Summary

  static class ReadSupport.ReadContext
    information to read the file
Field Summary

  static String PARQUET_READ_SCHEMA
    configuration key for a parquet read projection schema
Constructor Summary

  ReadSupport()
Method Summary

  static org.apache.parquet.schema.MessageType
    getSchemaForRead(org.apache.parquet.schema.MessageType fileMessageType, String partialReadSchemaString)
      attempts to validate and construct a MessageType from a read projection schema

  static org.apache.parquet.schema.MessageType
    getSchemaForRead(org.apache.parquet.schema.MessageType fileMessageType, org.apache.parquet.schema.MessageType projectedMessageType)

  ReadSupport.ReadContext
    init(org.apache.hadoop.conf.Configuration configuration, Map<String,String> keyValueMetaData, org.apache.parquet.schema.MessageType fileSchema)
      Deprecated. override init(InitContext) instead

  ReadSupport.ReadContext
    init(InitContext context)
      called in InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext) in the front end

  abstract org.apache.parquet.io.api.RecordMaterializer<T>
    prepareForRead(org.apache.hadoop.conf.Configuration configuration, Map<String,String> keyValueMetaData, org.apache.parquet.schema.MessageType fileSchema, ReadSupport.ReadContext readContext)
      called in RecordReader.initialize(org.apache.hadoop.mapreduce.InputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext) in the back end; the returned RecordMaterializer will materialize the records and add them to the destination
Field Detail

PARQUET_READ_SCHEMA

public static final String PARQUET_READ_SCHEMA

configuration key for a parquet read projection schema

See Also:
  Constant Field Values
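As a sketch of how this key is typically used (assuming the parquet-hadoop and hadoop-client dependencies are on the classpath, and hypothetical column names that would have to match the actual file schema), a job can request a column projection before the input format is initialized:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.parquet.hadoop.api.ReadSupport;

public class ProjectionConfigExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Request only the "id" and "name" columns using the Parquet
        // message type syntax; columns absent from the projection are
        // skipped when the file is read.
        String projection =
            "message projection {\n" +
            "  required int64 id;\n" +
            "  optional binary name (UTF8);\n" +
            "}";
        conf.set(ReadSupport.PARQUET_READ_SCHEMA, projection);
    }
}
```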
Method Detail

getSchemaForRead

public static org.apache.parquet.schema.MessageType getSchemaForRead(org.apache.parquet.schema.MessageType fileMessageType, String partialReadSchemaString)

attempts to validate and construct a MessageType from a read projection schema

Parameters:
  fileMessageType - the typed schema of the source
  partialReadSchemaString - the requested projection schema
Returns:
  the typed schema that should be used to read

getSchemaForRead

public static org.apache.parquet.schema.MessageType getSchemaForRead(org.apache.parquet.schema.MessageType fileMessageType, org.apache.parquet.schema.MessageType projectedMessageType)
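A minimal sketch of projecting a schema with this method (the file schema and column names below are hypothetical, and parquet-column/parquet-hadoop are assumed to be on the classpath):

```java
import org.apache.parquet.hadoop.api.ReadSupport;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

public class SchemaProjectionExample {
    public static void main(String[] args) {
        // Hypothetical file schema with three columns.
        MessageType fileSchema = MessageTypeParser.parseMessageType(
            "message file {\n" +
            "  required int64 id;\n" +
            "  optional binary name (UTF8);\n" +
            "  optional double score;\n" +
            "}");
        // Project down to a subset; getSchemaForRead validates the
        // requested columns against the file schema and returns the
        // typed schema that should be used to read.
        MessageType readSchema = ReadSupport.getSchemaForRead(
            fileSchema,
            "message projection { required int64 id; }");
        System.out.println(readSchema);
    }
}
```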
init

@Deprecated
public ReadSupport.ReadContext init(org.apache.hadoop.conf.Configuration configuration, Map<String,String> keyValueMetaData, org.apache.parquet.schema.MessageType fileSchema)

Deprecated. override init(InitContext) instead

called in InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext) in the front end

Parameters:
  configuration - the job configuration
  keyValueMetaData - the app specific metadata from the file
  fileSchema - the schema of the file
Returns:
  the readContext that defines how to read the file
init

public ReadSupport.ReadContext init(InitContext context)

called in InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext) in the front end

Parameters:
  context - the initialisation context
Returns:
  the readContext that defines how to read the file
prepareForRead

public abstract org.apache.parquet.io.api.RecordMaterializer<T> prepareForRead(org.apache.hadoop.conf.Configuration configuration, Map<String,String> keyValueMetaData, org.apache.parquet.schema.MessageType fileSchema, ReadSupport.ReadContext readContext)

called in RecordReader.initialize(org.apache.hadoop.mapreduce.InputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext) in the back end; the returned RecordMaterializer will materialize the records and add them to the destination

Parameters:
  configuration - the job configuration
  keyValueMetaData - the app specific metadata from the file
  fileSchema - the schema of the file
  readContext - returned by the init method
Returns:
  the recordMaterializer that will materialize the records
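Putting init and prepareForRead together, a minimal subclass sketch modeled loosely on GroupReadSupport (it assumes parquet-hadoop and parquet-column on the classpath, and that GroupRecordConverter is an acceptable materializer for Group records):

```java
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.convert.GroupRecordConverter;
import org.apache.parquet.hadoop.api.InitContext;
import org.apache.parquet.hadoop.api.ReadSupport;
import org.apache.parquet.io.api.RecordMaterializer;
import org.apache.parquet.schema.MessageType;

public class SimpleGroupReadSupport extends ReadSupport<Group> {

    // Front end: resolve the requested projection (if any) against
    // the file schema and record it in the ReadContext.
    @Override
    public ReadContext init(InitContext context) {
        String partialSchema =
            context.getConfiguration().get(ReadSupport.PARQUET_READ_SCHEMA);
        MessageType requestedProjection =
            partialSchema == null
                ? context.getFileSchema()
                : getSchemaForRead(context.getFileSchema(), partialSchema);
        return new ReadContext(requestedProjection);
    }

    // Back end: return the materializer that assembles records;
    // the schema negotiated in init() drives the assembly.
    @Override
    public RecordMaterializer<Group> prepareForRead(
            Configuration configuration,
            Map<String, String> keyValueMetaData,
            MessageType fileSchema,
            ReadContext readContext) {
        return new GroupRecordConverter(readContext.getRequestedSchema());
    }
}
```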