Class ColumnChunkMetaData


  • public abstract class ColumnChunkMetaData
    extends Object
    Column metadata for a block, stored in the file footer and passed in the InputSplit.
    • Field Detail

      • rowGroupOrdinal

        protected int rowGroupOrdinal
    • Method Detail

      • get

        @Deprecated
        public static ColumnChunkMetaData get(org.apache.parquet.hadoop.metadata.ColumnPath path,
                                              PrimitiveType.PrimitiveTypeName type,
                                              org.apache.parquet.hadoop.metadata.CompressionCodecName codec,
                                              Set<Encoding> encodings,
                                              long firstDataPage,
                                              long dictionaryPageOffset,
                                              long valueCount,
                                              long totalSize,
                                              long totalUncompressedSize)
        Deprecated.
      • get

        @Deprecated
        public static ColumnChunkMetaData get(org.apache.parquet.hadoop.metadata.ColumnPath path,
                                              PrimitiveType.PrimitiveTypeName type,
                                              org.apache.parquet.hadoop.metadata.CompressionCodecName codec,
                                              EncodingStats encodingStats,
                                              Set<Encoding> encodings,
                                              Statistics statistics,
                                              long firstDataPage,
                                              long dictionaryPageOffset,
                                              long valueCount,
                                              long totalSize,
                                              long totalUncompressedSize)
        Parameters:
        path - the path of this column in the write schema
        type - primitive type for this column
        codec - the compression codec used to compress
        encodingStats - EncodingStats for the encodings used in this column
        encodings - the set of encodings used in this column
        statistics - statistics for the data in this column
        firstDataPage - offset of the first non-dictionary page
        dictionaryPageOffset - offset of the dictionary page
        valueCount - number of values
        totalSize - total compressed size
        totalUncompressedSize - uncompressed data size
        Returns:
        a column chunk metadata instance
      • get

        public static ColumnChunkMetaData get(org.apache.parquet.hadoop.metadata.ColumnPath path,
                                              PrimitiveType type,
                                              org.apache.parquet.hadoop.metadata.CompressionCodecName codec,
                                              EncodingStats encodingStats,
                                              Set<Encoding> encodings,
                                              Statistics statistics,
                                              long firstDataPage,
                                              long dictionaryPageOffset,
                                              long valueCount,
                                              long totalSize,
                                              long totalUncompressedSize)
      • setRowGroupOrdinal

        public void setRowGroupOrdinal(int rowGroupOrdinal)
      • getRowGroupOrdinal

        public int getRowGroupOrdinal()
      • getStartingPos

        public long getStartingPos()
        Returns:
        the offset of the first byte in the chunk
      • positiveLongFitsInAnInt

        protected static boolean positiveLongFitsInAnInt(long value)
        Checks that a positive long value fits in an int (reindexed on Integer.MIN_VALUE).
        Parameters:
        value - a long value
        Returns:
        whether it fits
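
        The "reindexed on Integer.MIN_VALUE" remark can be illustrated with a small standalone sketch. It assumes the idea is to shift a non-negative long down by 2^31 so that values up to 2^32 - 1 occupy the full int range. The class ReindexSketch and the helpers toReindexedInt and fromReindexedInt are hypothetical names used for illustration; they are not part of the Parquet API, and the library's own implementation may differ.

```java
// Hypothetical illustration of "reindexing on Integer.MIN_VALUE"; not the library's source.
final class ReindexSketch {

    // True when a non-negative long can be shifted into the int range.
    // The addition is done in long arithmetic, so it cannot overflow for value >= 0.
    static boolean positiveLongFitsInAnInt(long value) {
        return value >= 0 && value + Integer.MIN_VALUE <= Integer.MAX_VALUE;
    }

    // Assumed storage step: map [0, 2^32 - 1] onto [Integer.MIN_VALUE, Integer.MAX_VALUE].
    static int toReindexedInt(long value) {
        if (!positiveLongFitsInAnInt(value)) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        return (int) (value + Integer.MIN_VALUE);
    }

    // Inverse of toReindexedInt: widen to long first, then undo the shift.
    static long fromReindexedInt(int stored) {
        return (long) stored - Integer.MIN_VALUE;
    }

    public static void main(String[] args) {
        long offset = 3_000_000_000L; // larger than Integer.MAX_VALUE, smaller than 2^32
        int stored = toReindexedInt(offset);
        System.out.println(fromReindexedInt(stored) == offset);
    }
}
```

        Under this assumption an int field can hold any offset below 4 GiB, which is twice the range of a plain non-negative int.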
      • decryptIfNeeded

        protected void decryptIfNeeded()
      • getCodec

        public org.apache.parquet.hadoop.metadata.CompressionCodecName getCodec()
      • getPath

        public org.apache.parquet.hadoop.metadata.ColumnPath getPath()
        Returns:
        column identifier
      • getPrimitiveType

        public PrimitiveType getPrimitiveType()
        Returns:
        the primitive type object of the column
      • getFirstDataPageOffset

        public abstract long getFirstDataPageOffset()
        Returns:
        the offset of the start of the column data
      • getDictionaryPageOffset

        public abstract long getDictionaryPageOffset()
        Returns:
        the location of the dictionary page if any; 0 is returned if there is no dictionary page. Check hasDictionaryPage() to validate.
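
        Because 0 doubles as the "no dictionary page" value, the returned offset is only meaningful after consulting hasDictionaryPage(). A minimal sketch of that calling pattern follows. ChunkView is a hypothetical stand-in interface that mirrors only the two accessors involved, so the example runs without Parquet on the classpath; dictionaryOffset and fixed are likewise illustrative names.

```java
import java.util.OptionalLong;

// Hypothetical sketch of guarding getDictionaryPageOffset() with hasDictionaryPage().
final class DictionaryOffsetSketch {

    // Stand-in for the two ColumnChunkMetaData accessors used here (assumption).
    interface ChunkView {
        boolean hasDictionaryPage();
        long getDictionaryPageOffset();
    }

    // Returns the dictionary page offset only when a dictionary page is reported,
    // so callers never mistake the 0 placeholder for a real file position.
    static OptionalLong dictionaryOffset(ChunkView chunk) {
        return chunk.hasDictionaryPage()
                ? OptionalLong.of(chunk.getDictionaryPageOffset())
                : OptionalLong.empty();
    }

    // Fixed-value implementation used only to exercise the sketch.
    static ChunkView fixed(boolean hasDictionary, long offset) {
        return new ChunkView() {
            @Override public boolean hasDictionaryPage() { return hasDictionary; }
            @Override public long getDictionaryPageOffset() { return offset; }
        };
    }

    public static void main(String[] args) {
        System.out.println(dictionaryOffset(fixed(true, 4L)).isPresent());
        System.out.println(dictionaryOffset(fixed(false, 0L)).isPresent());
    }
}
```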
      • getValueCount

        public abstract long getValueCount()
        Returns:
        count of values in this block of the column
      • getTotalUncompressedSize

        public abstract long getTotalUncompressedSize()
        Returns:
        the total uncompressed size of this column chunk
      • getTotalSize

        public abstract long getTotalSize()
        Returns:
        the total compressed size of this column chunk
      • getStatistics

        public abstract Statistics getStatistics()
        Returns:
        the stats for this column
      • getColumnIndexReference

        @Private
        public IndexReference getColumnIndexReference()
        Returns:
        the reference to the column index
      • setColumnIndexReference

        @Private
        public void setColumnIndexReference(IndexReference indexReference)
        Parameters:
        indexReference - the reference to the column index
      • getOffsetIndexReference

        @Private
        public IndexReference getOffsetIndexReference()
        Returns:
        the reference to the offset index
      • setOffsetIndexReference

        @Private
        public void setOffsetIndexReference(IndexReference offsetIndexReference)
        Parameters:
        offsetIndexReference - the reference to the offset index
      • setBloomFilterOffset

        @Private
        public void setBloomFilterOffset(long bloomFilterOffset)
        Parameters:
        bloomFilterOffset - the offset to the Bloom filter
      • getBloomFilterOffset

        @Private
        public long getBloomFilterOffset()
        Returns:
        the offset to the Bloom filter, or -1 if there is no Bloom filter for this column chunk
      • getEncodings

        public Set<Encoding> getEncodings()
        Returns:
        all the encodings used in this column
      • hasDictionaryPage

        public boolean hasDictionaryPage()