Class ColumnChunkMetaData
- java.lang.Object
-
- org.apache.parquet.hadoop.metadata.ColumnChunkMetaData
-
public abstract class ColumnChunkMetaData extends Object
Column metadata for a block, stored in the file footer and passed in the InputSplit.
-
-
Field Summary
Fields
Modifier and Type, Field, Description
protected int
rowGroupOrdinal
-
Constructor Summary
Constructors
Modifier, Constructor, Description
protected
ColumnChunkMetaData(EncodingStats encodingStats, ColumnChunkProperties columnChunkProperties)
protected
ColumnChunkMetaData(ColumnChunkProperties columnChunkProperties)
-
Method Summary
Modifier and Type, Method, Description
protected void
decryptIfNeeded()
static ColumnChunkMetaData
get(org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType.PrimitiveTypeName type, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, Set<Encoding> encodings, long firstDataPage, long dictionaryPageOffset, long valueCount, long totalSize, long totalUncompressedSize)
Deprecated.
static ColumnChunkMetaData
get(org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType.PrimitiveTypeName type, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, Set<Encoding> encodings, Statistics statistics, long firstDataPage, long dictionaryPageOffset, long valueCount, long totalSize, long totalUncompressedSize)
Deprecated.
static ColumnChunkMetaData
get(org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType.PrimitiveTypeName type, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, EncodingStats encodingStats, Set<Encoding> encodings, Statistics statistics, long firstDataPage, long dictionaryPageOffset, long valueCount, long totalSize, long totalUncompressedSize)
Deprecated. Will be removed in 2.0.0.
static ColumnChunkMetaData
get(org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType type, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, EncodingStats encodingStats, Set<Encoding> encodings, Statistics statistics, long firstDataPage, long dictionaryPageOffset, long valueCount, long totalSize, long totalUncompressedSize)
long
getBloomFilterOffset()
org.apache.parquet.hadoop.metadata.CompressionCodecName
getCodec()
IndexReference
getColumnIndexReference()
abstract long
getDictionaryPageOffset()
Set<Encoding>
getEncodings()
EncodingStats
getEncodingStats()
abstract long
getFirstDataPageOffset()
IndexReference
getOffsetIndexReference()
org.apache.parquet.hadoop.metadata.ColumnPath
getPath()
PrimitiveType
getPrimitiveType()
int
getRowGroupOrdinal()
long
getStartingPos()
abstract Statistics
getStatistics()
abstract long
getTotalSize()
abstract long
getTotalUncompressedSize()
PrimitiveType.PrimitiveTypeName
getType()
Deprecated. Will be removed in 2.0.0.
abstract long
getValueCount()
static ColumnChunkMetaData
getWithEncryptedMetadata(ParquetMetadataConverter parquetMetadataConverter, org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType type, byte[] encryptedMetadata, byte[] columnKeyMetadata, InternalFileDecryptor fileDecryptor, int rowGroupOrdinal, int columnOrdinal, String createdBy)
boolean
hasDictionaryPage()
protected static boolean
positiveLongFitsInAnInt(long value)
Checks that a positive long value fits in an int.
void
setBloomFilterOffset(long bloomFilterOffset)
void
setColumnIndexReference(IndexReference indexReference)
void
setOffsetIndexReference(IndexReference offsetIndexReference)
void
setRowGroupOrdinal(int rowGroupOrdinal)
String
toString()
-
-
-
Constructor Detail
-
ColumnChunkMetaData
protected ColumnChunkMetaData(ColumnChunkProperties columnChunkProperties)
-
ColumnChunkMetaData
protected ColumnChunkMetaData(EncodingStats encodingStats, ColumnChunkProperties columnChunkProperties)
-
-
Method Detail
-
get
@Deprecated public static ColumnChunkMetaData get(org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType.PrimitiveTypeName type, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, Set<Encoding> encodings, long firstDataPage, long dictionaryPageOffset, long valueCount, long totalSize, long totalUncompressedSize)
Deprecated.
-
get
@Deprecated public static ColumnChunkMetaData get(org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType.PrimitiveTypeName type, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, Set<Encoding> encodings, Statistics statistics, long firstDataPage, long dictionaryPageOffset, long valueCount, long totalSize, long totalUncompressedSize)
Deprecated.
-
get
@Deprecated public static ColumnChunkMetaData get(org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType.PrimitiveTypeName type, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, EncodingStats encodingStats, Set<Encoding> encodings, Statistics statistics, long firstDataPage, long dictionaryPageOffset, long valueCount, long totalSize, long totalUncompressedSize)
Deprecated. Will be removed in 2.0.0. Use get(ColumnPath, PrimitiveType, CompressionCodecName, EncodingStats, Set, Statistics, long, long, long, long, long) instead.
- Parameters:
path
- the path of this column in the write schema
type
- primitive type for this column
codec
- the compression codec used for this column
encodingStats
- EncodingStats for the encodings used in this column
encodings
- the set of encodings used in this column
statistics
- statistics for the data in this column
firstDataPage
- offset of the first non-dictionary page
dictionaryPageOffset
- offset of the dictionary page
valueCount
- number of values
totalSize
- total compressed size
totalUncompressedSize
- uncompressed data size
- Returns:
- a column chunk metadata instance
-
get
public static ColumnChunkMetaData get(org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType type, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, EncodingStats encodingStats, Set<Encoding> encodings, Statistics statistics, long firstDataPage, long dictionaryPageOffset, long valueCount, long totalSize, long totalUncompressedSize)
-
getWithEncryptedMetadata
public static ColumnChunkMetaData getWithEncryptedMetadata(ParquetMetadataConverter parquetMetadataConverter, org.apache.parquet.hadoop.metadata.ColumnPath path, PrimitiveType type, byte[] encryptedMetadata, byte[] columnKeyMetadata, InternalFileDecryptor fileDecryptor, int rowGroupOrdinal, int columnOrdinal, String createdBy)
-
setRowGroupOrdinal
public void setRowGroupOrdinal(int rowGroupOrdinal)
-
getRowGroupOrdinal
public int getRowGroupOrdinal()
-
getStartingPos
public long getStartingPos()
- Returns:
- the offset of the first byte in the chunk
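A minimal sketch of how the first byte of a chunk could be derived from the two offsets this class exposes, getDictionaryPageOffset() and getFirstDataPageOffset(). It assumes that a dictionary page, when present, may precede the first data page, and that an offset of 0 means there is no dictionary page. The class and method names are hypothetical, and this is not presented as the library's implementation.

```java
// Illustrative sketch only: StartingPosSketch and startingPos are hypothetical names.
final class StartingPosSketch {

    // Returns the smaller of the two offsets when a dictionary page exists,
    // otherwise the offset of the first data page.
    static long startingPos(long dictionaryPageOffset, long firstDataPageOffset) {
        // Assumption: 0 means "no dictionary page", as documented for getDictionaryPageOffset().
        if (dictionaryPageOffset > 0 && dictionaryPageOffset < firstDataPageOffset) {
            return dictionaryPageOffset;
        }
        return firstDataPageOffset;
    }
}
```

For example, with a dictionary page at offset 4 and a first data page at offset 100, the sketch yields 4; with no dictionary page it yields the data page offset.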
-
positiveLongFitsInAnInt
protected static boolean positiveLongFitsInAnInt(long value)
Checks that a positive long value fits in an int (reindexed on Integer.MIN_VALUE).
- Parameters:
value
- a long value
- Returns:
- whether it fits
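The phrase "reindexed on Integer.MIN_VALUE" can be read as follows: a non-negative long is shifted down by 2^31 so that 0 maps to Integer.MIN_VALUE, which lets any value in [0, 2^32 - 1] be held in 32 bits. The sketch below illustrates that reading. The class name and the two conversion helpers are hypothetical and are not claimed to match the library's code.

```java
// Illustrative sketch only: IntReindexSketch and its conversion helpers are hypothetical.
final class IntReindexSketch {

    // True when a non-negative long, shifted down by 2^31, still fits in an int.
    // This accepts values in the range [0, 2^32 - 1].
    static boolean positiveLongFitsInAnInt(long value) {
        return value >= 0 && value + Integer.MIN_VALUE <= Integer.MAX_VALUE;
    }

    // Stores the value in 32 bits by mapping 0 to Integer.MIN_VALUE.
    static int toReindexedInt(long value) {
        if (!positiveLongFitsInAnInt(value)) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        return (int) (value + Integer.MIN_VALUE);
    }

    // Recovers the original non-negative long from the reindexed int.
    static long fromReindexedInt(int stored) {
        return (long) stored - Integer.MIN_VALUE;
    }
}
```

Under this scheme an offset such as 3000000000, which exceeds Integer.MAX_VALUE, still round-trips through an int, while 4294967296 (2^32) is rejected.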
-
decryptIfNeeded
protected void decryptIfNeeded()
-
getCodec
public org.apache.parquet.hadoop.metadata.CompressionCodecName getCodec()
-
getPath
public org.apache.parquet.hadoop.metadata.ColumnPath getPath()
- Returns:
- column identifier
-
getType
@Deprecated public PrimitiveType.PrimitiveTypeName getType()
Deprecated. Will be removed in 2.0.0. Use getPrimitiveType() instead.
- Returns:
- type of the column
-
getPrimitiveType
public PrimitiveType getPrimitiveType()
- Returns:
- the primitive type object of the column
-
getFirstDataPageOffset
public abstract long getFirstDataPageOffset()
- Returns:
- the offset at which the column data starts
-
getDictionaryPageOffset
public abstract long getDictionaryPageOffset()
- Returns:
- the location of the dictionary page, if any; 0 is returned if there is no dictionary page. Check hasDictionaryPage() to validate.
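Because 0 doubles as the "no dictionary page" value, callers may prefer to consult hasDictionaryPage() before trusting the offset. The sketch below shows one way to do that. ChunkView is a hypothetical stand-in interface that mirrors only the two documented methods, so the example compiles without the Parquet library; with the library on the classpath the same guard could be applied to a ColumnChunkMetaData instance.

```java
// Illustrative sketch only: DictionaryOffsetSketch, ChunkView and fixed(...) are hypothetical.
final class DictionaryOffsetSketch {

    // Stand-in for the two ColumnChunkMetaData methods used in this example.
    interface ChunkView {
        boolean hasDictionaryPage();
        long getDictionaryPageOffset();
    }

    // Returns the dictionary page offset only when a dictionary page exists.
    static java.util.OptionalLong dictionaryPageOffset(ChunkView chunk) {
        return chunk.hasDictionaryPage()
                ? java.util.OptionalLong.of(chunk.getDictionaryPageOffset())
                : java.util.OptionalLong.empty();
    }

    // Builds a fixed-value view for demonstration purposes.
    static ChunkView fixed(boolean hasDictionary, long offset) {
        return new ChunkView() {
            @Override
            public boolean hasDictionaryPage() {
                return hasDictionary;
            }

            @Override
            public long getDictionaryPageOffset() {
                return offset;
            }
        };
    }
}
```

Returning an OptionalLong is a design choice of this sketch: it forces the caller to handle the missing-page case instead of comparing against 0.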
-
getValueCount
public abstract long getValueCount()
- Returns:
- count of values in this block of the column
-
getTotalUncompressedSize
public abstract long getTotalUncompressedSize()
- Returns:
- the total uncompressed size of this column chunk
-
getTotalSize
public abstract long getTotalSize()
- Returns:
- the total compressed size of this column chunk
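The values of getTotalUncompressedSize() and getTotalSize() can be combined into a compression ratio for a chunk. The helper below is a hypothetical sketch that takes the two sizes as plain longs; it is not part of this class.

```java
// Illustrative sketch only: CompressionRatioSketch is a hypothetical helper.
final class CompressionRatioSketch {

    // Ratio of uncompressed to compressed size; values above 1.0 mean the codec saved space.
    static double compressionRatio(long totalUncompressedSize, long totalSize) {
        if (totalSize <= 0) {
            throw new IllegalArgumentException("totalSize must be positive: " + totalSize);
        }
        return (double) totalUncompressedSize / totalSize;
    }
}
```

For instance, an uncompressed size of 1000 and a compressed size of 250 give a ratio of 4.0.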
-
getStatistics
public abstract Statistics getStatistics()
- Returns:
- the stats for this column
-
getColumnIndexReference
@Private public IndexReference getColumnIndexReference()
- Returns:
- the reference to the column index
-
setColumnIndexReference
@Private public void setColumnIndexReference(IndexReference indexReference)
- Parameters:
indexReference
- the reference to the column index
-
getOffsetIndexReference
@Private public IndexReference getOffsetIndexReference()
- Returns:
- the reference to the offset index
-
setOffsetIndexReference
@Private public void setOffsetIndexReference(IndexReference offsetIndexReference)
- Parameters:
offsetIndexReference
- the reference to the offset index
-
setBloomFilterOffset
@Private public void setBloomFilterOffset(long bloomFilterOffset)
- Parameters:
bloomFilterOffset
- the offset to the Bloom filter
-
getBloomFilterOffset
@Private public long getBloomFilterOffset()
- Returns:
- the offset to the Bloom filter, or -1 if there is no Bloom filter for this column chunk
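Since -1 signals a missing Bloom filter, a caller might wrap the raw offset before using it to seek in the file. The helper below is a hypothetical sketch operating on the plain long value; treating every negative value as "absent" is an assumption of this example, not documented behavior.

```java
// Illustrative sketch only: BloomFilterOffsetSketch is a hypothetical helper.
final class BloomFilterOffsetSketch {

    // Maps the -1 sentinel (and, by assumption, any negative value) to an empty result.
    static java.util.OptionalLong bloomFilterOffset(long rawOffset) {
        return rawOffset < 0
                ? java.util.OptionalLong.empty()
                : java.util.OptionalLong.of(rawOffset);
    }
}
```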
-
getEncodingStats
public EncodingStats getEncodingStats()
-
hasDictionaryPage
public boolean hasDictionaryPage()
-
-