Package org.apache.flink.orc
Class OrcColumnarRowInputFormat<BatchT,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit>
- java.lang.Object
  - org.apache.flink.orc.AbstractOrcFileInputFormat<org.apache.flink.table.data.RowData,BatchT,SplitT>
    - org.apache.flink.orc.OrcColumnarRowInputFormat<BatchT,SplitT>
- All Implemented Interfaces:
Serializable, org.apache.flink.api.java.typeutils.ResultTypeQueryable<org.apache.flink.table.data.RowData>, org.apache.flink.connector.file.src.reader.BulkFormat<org.apache.flink.table.data.RowData,SplitT>, org.apache.flink.table.connector.format.FileBasedStatisticsReportableInputFormat
public class OrcColumnarRowInputFormat<BatchT,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit> extends AbstractOrcFileInputFormat<org.apache.flink.table.data.RowData,BatchT,SplitT> implements org.apache.flink.table.connector.format.FileBasedStatisticsReportableInputFormat
An ORC reader that produces a stream of ColumnarRowData records.
This class can add extra fields through a ColumnBatchFactory, for example partition fields extracted from the file path. As a result, getProducedType() may differ from the physical ORC schema, and the types of the extra fields need to be added.
- See Also:
- Serialized Form
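As an illustration of the extension point described above, a batch factory can wrap the ORC batch's column vectors and append one extra column derived from the split. The sketch below is not the library's code: the helpers `toFlinkVector`, `constantStringVector`, and `partitionValueFromPath` are hypothetical placeholders for vector-wrapping logic, and the exact shape of the `ColumnBatchFactory` functional interface should be verified against your flink-orc version.

```java
// Sketch only: a ColumnBatchFactory that wraps two physical ORC columns and
// appends a constant "dt" partition column parsed from the split's file path.
ColumnBatchFactory<VectorizedRowBatch, FileSourceSplit> factory =
        (split, orcBatch) -> {
            ColumnVector[] vectors = new ColumnVector[3];
            vectors[0] = toFlinkVector(orcBatch.cols[0]);   // hypothetical wrapper helper
            vectors[1] = toFlinkVector(orcBatch.cols[1]);   // hypothetical wrapper helper
            vectors[2] = constantStringVector(              // hypothetical: constant-value column
                    partitionValueFromPath(split.path(), "dt"),
                    orcBatch.getMaxSize());
            return new VectorizedColumnBatch(vectors);
        };
```

The factory is handed to the constructor's `batchFactory` parameter; the produced-type information passed alongside it must then include the appended "dt" field.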
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.flink.orc.AbstractOrcFileInputFormat
AbstractOrcFileInputFormat.OrcReaderBatch<T,BatchT>, AbstractOrcFileInputFormat.OrcVectorizedReader<T,BatchT>
-
-
Field Summary
-
Fields inherited from class org.apache.flink.orc.AbstractOrcFileInputFormat
batchSize, conjunctPredicates, hadoopConfigWrapper, schema, selectedFields, shim
-
-
Constructor Summary
Constructors
Constructor Description
OrcColumnarRowInputFormat(OrcShim<BatchT> shim, org.apache.hadoop.conf.Configuration hadoopConfig, org.apache.orc.TypeDescription schema, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, int batchSize, ColumnBatchFactory<BatchT,SplitT> batchFactory, org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.table.data.RowData> producedTypeInfo)
-
Method Summary
All Methods  Static Methods  Instance Methods  Concrete Methods

Modifier and Type / Method / Description
- static <SplitT extends org.apache.flink.connector.file.src.FileSourceSplit> OrcColumnarRowInputFormat<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch,SplitT>
  createPartitionedFormat(OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch> shim, org.apache.hadoop.conf.Configuration hadoopConfig, org.apache.flink.table.types.logical.RowType tableType, List<String> partitionKeys, org.apache.flink.connector.file.table.PartitionFieldExtractor<SplitT> extractor, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, int batchSize, Function<org.apache.flink.table.types.logical.RowType,org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.table.data.RowData>> rowTypeInfoFactory)
  Creates a partitioned OrcColumnarRowInputFormat whose partition columns can be generated from the split.
- AbstractOrcFileInputFormat.OrcReaderBatch<org.apache.flink.table.data.RowData,BatchT>
  createReaderBatch(SplitT split, OrcVectorizedBatchWrapper<BatchT> orcBatch, org.apache.flink.connector.file.src.util.Pool.Recycler<AbstractOrcFileInputFormat.OrcReaderBatch<org.apache.flink.table.data.RowData,BatchT>> recycler, int batchSize)
  Creates the AbstractOrcFileInputFormat.OrcReaderBatch structure, which is responsible for holding the data structures of the batch (column vectors, row arrays, ...) and for the batch conversion from the ORC representation to the result format.
- org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.table.data.RowData>
  getProducedType()
  Gets the type produced by this format.
- org.apache.flink.table.plan.stats.TableStats
  reportStatistics(List<org.apache.flink.core.fs.Path> files, org.apache.flink.table.types.DataType producedDataType)
Methods inherited from class org.apache.flink.orc.AbstractOrcFileInputFormat
createReader, isSplittable, restoreReader
-
-
-
-
Constructor Detail
-
OrcColumnarRowInputFormat
public OrcColumnarRowInputFormat(OrcShim<BatchT> shim, org.apache.hadoop.conf.Configuration hadoopConfig, org.apache.orc.TypeDescription schema, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, int batchSize, ColumnBatchFactory<BatchT,SplitT> batchFactory, org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.table.data.RowData> producedTypeInfo)
-
-
Method Detail
-
createReaderBatch
public AbstractOrcFileInputFormat.OrcReaderBatch<org.apache.flink.table.data.RowData,BatchT> createReaderBatch(SplitT split, OrcVectorizedBatchWrapper<BatchT> orcBatch, org.apache.flink.connector.file.src.util.Pool.Recycler<AbstractOrcFileInputFormat.OrcReaderBatch<org.apache.flink.table.data.RowData,BatchT>> recycler, int batchSize)
Description copied from class: AbstractOrcFileInputFormat
Creates the AbstractOrcFileInputFormat.OrcReaderBatch structure, which is responsible for holding the data structures of the batch (column vectors, row arrays, ...) and for the batch conversion from the ORC representation to the result format.
- Specified by:
createReaderBatch in class AbstractOrcFileInputFormat<org.apache.flink.table.data.RowData,BatchT,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit>
-
getProducedType
public org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.table.data.RowData> getProducedType()
Description copied from class: AbstractOrcFileInputFormat
Gets the type produced by this format.
- Specified by:
getProducedType in interface org.apache.flink.connector.file.src.reader.BulkFormat<org.apache.flink.table.data.RowData,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit>
- Specified by:
getProducedType in interface org.apache.flink.api.java.typeutils.ResultTypeQueryable<org.apache.flink.table.data.RowData>
- Specified by:
getProducedType in class AbstractOrcFileInputFormat<org.apache.flink.table.data.RowData,BatchT,SplitT extends org.apache.flink.connector.file.src.FileSourceSplit>
-
reportStatistics
public org.apache.flink.table.plan.stats.TableStats reportStatistics(List<org.apache.flink.core.fs.Path> files, org.apache.flink.table.types.DataType producedDataType)
- Specified by:
reportStatistics in interface org.apache.flink.table.connector.format.FileBasedStatisticsReportableInputFormat
-
createPartitionedFormat
public static <SplitT extends org.apache.flink.connector.file.src.FileSourceSplit> OrcColumnarRowInputFormat<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch,SplitT> createPartitionedFormat(OrcShim<org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch> shim, org.apache.hadoop.conf.Configuration hadoopConfig, org.apache.flink.table.types.logical.RowType tableType, List<String> partitionKeys, org.apache.flink.connector.file.table.PartitionFieldExtractor<SplitT> extractor, int[] selectedFields, List<OrcFilters.Predicate> conjunctPredicates, int batchSize, Function<org.apache.flink.table.types.logical.RowType,org.apache.flink.api.common.typeinfo.TypeInformation<org.apache.flink.table.data.RowData>> rowTypeInfoFactory)
Creates a partitioned OrcColumnarRowInputFormat whose partition columns can be generated from the split.
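A usage sketch for a partitioned table might look as follows. This is an assumption-laden example, not canonical usage: the factory methods `OrcShim.defaultShim()`, `PartitionFieldExtractor.forFileSystem(...)`, and `InternalTypeInfo.of(...)` exist in recent Flink versions but their exact signatures, and the table path `/warehouse/my_table`, should be checked against your setup.

```java
// Sketch only: table with columns (id INT, name STRING) partitioned by dt STRING.
RowType tableType = RowType.of(
        new LogicalType[] {new IntType(), new VarCharType(), new VarCharType()},
        new String[] {"id", "name", "dt"});

OrcColumnarRowInputFormat<VectorizedRowBatch, FileSourceSplit> format =
        OrcColumnarRowInputFormat.createPartitionedFormat(
                OrcShim.defaultShim(),                           // assumed shim factory
                new Configuration(),                             // Hadoop configuration
                tableType,
                Collections.singletonList("dt"),                 // partition keys
                PartitionFieldExtractor.forFileSystem("__DEFAULT_PARTITION__"),
                new int[] {0, 1, 2},                             // project all fields, incl. "dt"
                Collections.emptyList(),                         // no pushed-down predicates
                1024,                                            // rows per vectorized batch
                InternalTypeInfo::of);                           // rowTypeInfoFactory

// The resulting BulkFormat can then back a FileSource of RowData:
FileSource<RowData> source =
        FileSource.forBulkFileFormat(format, new Path("/warehouse/my_table")).build();
```

Note that the partition column "dt" is listed in `tableType` and in `selectedFields` even though it is not stored in the ORC files; its values are supplied per split by the `PartitionFieldExtractor`.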
-
-