Class AvroParquetInputFormat<T>

  • Type Parameters:
    T - the Java type of objects produced by this InputFormat

    public class AvroParquetInputFormat<T>
    extends org.apache.parquet.hadoop.ParquetInputFormat<T>
    A Hadoop InputFormat that reads Parquet files and materializes each record as an object of type T using Avro.
    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

        org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter
    • Field Summary

      • Fields inherited from class org.apache.parquet.hadoop.ParquetInputFormat

        BLOOM_FILTERING_ENABLED, COLUMN_INDEX_FILTERING_ENABLED, DICTIONARY_FILTERING_ENABLED, FILTER_PREDICATE, PAGE_VERIFY_CHECKSUM_ENABLED, READ_SUPPORT_CLASS, RECORD_FILTERING_ENABLED, SPLIT_FILES, STATS_FILTERING_ENABLED, STRICT_TYPE_CHECKING, TASK_SIDE_METADATA, UNBOUND_RECORD_FILTER
      • Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

        DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE
    • Method Summary

      Modifier and Type Method Description
      static void setAvroDataSupplier(org.apache.hadoop.mapreduce.Job job, Class<? extends AvroDataSupplier> supplierClass)
        Uses an instance of the specified AvroDataSupplier class to control how the SpecificData instance that is used to find Avro specific records is created.
      static void setAvroReadSchema(org.apache.hadoop.mapreduce.Job job, org.apache.avro.Schema avroReadSchema)
        Override the Avro schema to use for reading.
      static void setRequestedProjection(org.apache.hadoop.mapreduce.Job job, org.apache.avro.Schema requestedProjection)
        Set the subset of columns to read (projection pushdown).
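The two schema setters listed above are typically called while a MapReduce job is being configured. The following is a minimal sketch rather than a definitive recipe. It assumes that parquet-avro, Avro, and the Hadoop MapReduce client are on the classpath and that AvroParquetInputFormat lives in the org.apache.parquet.avro package. The User record, its id and name fields, the job name, and the ProjectionJobSketch class are hypothetical names chosen for illustration.

```java
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.parquet.avro.AvroParquetInputFormat;

public class ProjectionJobSketch {

    // Hypothetical projection: only the "id" and "name" columns of a
    // hypothetical "User" record are requested from the Parquet files.
    static Schema projection() {
        return SchemaBuilder.record("User").namespace("example")
                .fields()
                .requiredLong("id")
                .optionalString("name")
                .endRecord();
    }

    // Builds a Job that reads Parquet input through AvroParquetInputFormat.
    static Job configure(Configuration conf, Path input) throws IOException {
        Job job = Job.getInstance(conf, "avro-parquet-projection");
        job.setInputFormatClass(AvroParquetInputFormat.class);
        FileInputFormat.addInputPath(job, input);

        Schema projection = projection();
        // Ask Parquet to read only the projected columns.
        AvroParquetInputFormat.setRequestedProjection(job, projection);
        // Use the same schema for reading so it stays compatible with the projection.
        AvroParquetInputFormat.setAvroReadSchema(job, projection);
        return job;
    }
}
```

Reusing one schema for both calls is a simplification made in this sketch. A job may pass a different read schema, provided it remains compatible with the requested projection.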
      • Methods inherited from class org.apache.parquet.hadoop.ParquetInputFormat

        createRecordReader, getFilter, getFooters, getFooters, getFooters, getGlobalMetaData, getReadSupportClass, getReadSupportInstance, getSplits, getSplits, getUnboundRecordFilter, isSplitable, isTaskSideMetaData, listStatus, setFilterPredicate, setReadSupportClass, setReadSupportClass, setTaskSideMetaData, setUnboundRecordFilter
      • Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat

        addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
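As an illustration of setAvroDataSupplier, the sketch below registers a custom supplier class on a job. It assumes that AvroDataSupplier is the parquet-avro interface with a single get() method returning an Avro GenericData, and that the supplier is instantiated reflectively, so it is given a public no-argument constructor. The DataSupplierSketch and ContextLoaderSpecificDataSupplier names and the job name are hypothetical.

```java
import java.io.IOException;

import org.apache.avro.generic.GenericData;
import org.apache.avro.specific.SpecificData;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.parquet.avro.AvroDataSupplier;
import org.apache.parquet.avro.AvroParquetInputFormat;

public class DataSupplierSketch {

    // Hypothetical supplier: creates a SpecificData that resolves generated
    // Avro classes through the current thread's context class loader.
    public static class ContextLoaderSpecificDataSupplier implements AvroDataSupplier {
        @Override
        public GenericData get() {
            return new SpecificData(Thread.currentThread().getContextClassLoader());
        }
    }

    // Builds a Job whose Avro data model comes from the supplier above.
    static Job configure(Configuration conf) throws IOException {
        Job job = Job.getInstance(conf, "avro-parquet-data-supplier");
        job.setInputFormatClass(AvroParquetInputFormat.class);
        AvroParquetInputFormat.setAvroDataSupplier(job, ContextLoaderSpecificDataSupplier.class);
        return job;
    }
}
```

A custom supplier of this kind is mainly useful when the generated Avro classes are not visible to the default class loader. Otherwise the library's default behavior may already be sufficient.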