public static class ParquetProperties.Builder extends Object
Modifier and Type | Method and Description |
---|---|
ParquetProperties | `build()` |
ParquetProperties.Builder | `estimateRowCountForPageSizeCheck(boolean estimateNextSizeCheck)` |
ParquetProperties.Builder | `withAdaptiveBloomFilterEnabled(boolean enabled)` Whether to use adaptive bloom filter to automatically adjust the bloom filter size according to `parquet.bloom.filter.max.bytes`. |
ParquetProperties.Builder | `withAllocator(ByteBufferAllocator allocator)` |
ParquetProperties.Builder | `withBloomFilterCandidatesNumber(String columnPath, int number)` When `AdaptiveBloomFilter` is enabled, set how many bloom filter candidates to use. |
ParquetProperties.Builder | `withBloomFilterEnabled(boolean enabled)` Enable or disable the bloom filter for the columns not specified by `withBloomFilterEnabled(String, boolean)`. |
ParquetProperties.Builder | `withBloomFilterEnabled(String columnPath, boolean enabled)` Enable or disable the bloom filter for the specified column. |
ParquetProperties.Builder | `withBloomFilterFPP(String columnPath, double fpp)` |
ParquetProperties.Builder | `withBloomFilterNDV(String columnPath, long ndv)` Set Bloom filter NDV (number of distinct values) for the specified column. |
ParquetProperties.Builder | `withByteStreamSplitEncoding(boolean enable)` Enable or disable BYTE_STREAM_SPLIT encoding for FLOAT and DOUBLE columns. |
ParquetProperties.Builder | `withByteStreamSplitEncoding(String columnPath, boolean enable)` Enable or disable BYTE_STREAM_SPLIT encoding for specified columns. |
ParquetProperties.Builder | `withColumnIndexTruncateLength(int length)` |
ParquetProperties.Builder | `withDictionaryEncoding(boolean enableDictionary)` Enable or disable dictionary encoding. |
ParquetProperties.Builder | `withDictionaryEncoding(String columnPath, boolean enableDictionary)` Enable or disable dictionary encoding for the specified column. |
ParquetProperties.Builder | `withDictionaryPageSize(int dictionaryPageSize)` Set the Parquet format dictionary page size. |
ParquetProperties.Builder | `withExtendedByteStreamSplitEncoding(boolean enable)` Enable or disable BYTE_STREAM_SPLIT encoding for FLOAT, DOUBLE, INT32, INT64 and FIXED_LEN_BYTE_ARRAY columns. |
ParquetProperties.Builder | `withExtraMetaData(Map<String,String> extraMetaData)` |
ParquetProperties.Builder | `withMaxBloomFilterBytes(int maxBloomFilterBytes)` Set max Bloom filter bytes for related columns. |
ParquetProperties.Builder | `withMaxRowCountForPageSizeCheck(int max)` |
ParquetProperties.Builder | `withMinRowCountForPageSizeCheck(int min)` |
ParquetProperties.Builder | `withPageRowCountLimit(int rowCount)` |
ParquetProperties.Builder | `withPageSize(int pageSize)` Set the Parquet format page size. |
ParquetProperties.Builder | `withPageValueCountThreshold(int value)` |
ParquetProperties.Builder | `withPageWriteChecksumEnabled(boolean val)` |
ParquetProperties.Builder | `withStatisticsTruncateLength(int length)` |
ParquetProperties.Builder | `withValuesWriterFactory(ValuesWriterFactory factory)` |
ParquetProperties.Builder | `withWriterVersion(ParquetProperties.WriterVersion version)` Set the format version. |
public ParquetProperties.Builder withPageSize(int pageSize)
Set the Parquet format page size.
pageSize - an integer size in bytes
public ParquetProperties.Builder withDictionaryEncoding(boolean enableDictionary)
Enable or disable dictionary encoding.
enableDictionary - whether dictionary encoding should be enabled
public ParquetProperties.Builder withDictionaryEncoding(String columnPath, boolean enableDictionary)
Enable or disable dictionary encoding for the specified column.
columnPath - the path of the column (dot-string)
enableDictionary - whether dictionary encoding should be enabled
public ParquetProperties.Builder withByteStreamSplitEncoding(boolean enable)
Enable or disable BYTE_STREAM_SPLIT encoding for FLOAT and DOUBLE columns.
enable - whether BYTE_STREAM_SPLIT encoding should be enabled
public ParquetProperties.Builder withByteStreamSplitEncoding(String columnPath, boolean enable)
Enable or disable BYTE_STREAM_SPLIT encoding for specified columns.
columnPath - the path of the column (dot-string)
enable - whether BYTE_STREAM_SPLIT encoding should be enabled
public ParquetProperties.Builder withExtendedByteStreamSplitEncoding(boolean enable)
Enable or disable BYTE_STREAM_SPLIT encoding for FLOAT, DOUBLE, INT32, INT64 and FIXED_LEN_BYTE_ARRAY columns.
enable - whether BYTE_STREAM_SPLIT encoding should be enabled
public ParquetProperties.Builder withDictionaryPageSize(int dictionaryPageSize)
Set the Parquet format dictionary page size.
dictionaryPageSize - an integer size in bytes
public ParquetProperties.Builder withWriterVersion(ParquetProperties.WriterVersion version)
Set the format version.
version - a WriterVersion
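
To make the BYTE_STREAM_SPLIT options above more concrete, the following is a minimal standalone sketch of the byte layout that encoding produces for FLOAT values, as described in the Parquet format specification: byte 0 of every value goes to stream 0, byte 1 to stream 1, and so on, with the streams concatenated in order. The class `ByteStreamSplitSketch` and its methods are hypothetical names for illustration only. It is not the parquet-mr values writer and does not use `ParquetProperties`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Illustrative only: a hypothetical stand-in, not the library's encoder. */
final class ByteStreamSplitSketch {

    /** Scatters the little-endian bytes of each float into 4 concatenated streams. */
    static byte[] encode(float[] values) {
        int n = values.length;
        int width = Float.BYTES; // 4 bytes per FLOAT value
        byte[] out = new byte[n * width];
        ByteBuffer buf = ByteBuffer.allocate(width).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < n; i++) {
            buf.clear();
            buf.putFloat(values[i]);
            for (int b = 0; b < width; b++) {
                // Stream b starts at offset b * n; value i sits at position i within it.
                out[b * n + i] = buf.get(b);
            }
        }
        return out;
    }

    /** Gathers the bytes back from the 4 streams and rebuilds each float. */
    static float[] decode(byte[] encoded) {
        int width = Float.BYTES;
        int n = encoded.length / width;
        float[] values = new float[n];
        ByteBuffer buf = ByteBuffer.allocate(width).order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < n; i++) {
            buf.clear();
            for (int b = 0; b < width; b++) {
                buf.put(encoded[b * n + i]);
            }
            values[i] = buf.getFloat(0);
        }
        return values;
    }

    public static void main(String[] args) {
        float[] input = {1.0f, 2.0f, 3.5f};
        byte[] encoded = encode(input);
        System.out.println(Arrays.toString(encoded));
        System.out.println(Arrays.toString(decode(encoded)));
    }
}
```

Grouping bytes of the same significance tends to place similar bytes next to each other, which is why this encoding is usually paired with a general-purpose compressor. The sketch performs no compression itself.
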
public ParquetProperties.Builder withMinRowCountForPageSizeCheck(int min)
public ParquetProperties.Builder withMaxRowCountForPageSizeCheck(int max)
public ParquetProperties.Builder withPageValueCountThreshold(int value)
public ParquetProperties.Builder estimateRowCountForPageSizeCheck(boolean estimateNextSizeCheck)
public ParquetProperties.Builder withAllocator(ByteBufferAllocator allocator)
public ParquetProperties.Builder withValuesWriterFactory(ValuesWriterFactory factory)
public ParquetProperties.Builder withColumnIndexTruncateLength(int length)
public ParquetProperties.Builder withStatisticsTruncateLength(int length)
public ParquetProperties.Builder withMaxBloomFilterBytes(int maxBloomFilterBytes)
Set max Bloom filter bytes for related columns.
maxBloomFilterBytes - the max bytes of a Bloom filter bitset for a column
public ParquetProperties.Builder withBloomFilterNDV(String columnPath, long ndv)
Set Bloom filter NDV (number of distinct values) for the specified column (see also withBloomFilterEnabled(String, boolean)).
columnPath - the path of the column (dot-string)
ndv - the NDV of the column
public ParquetProperties.Builder withBloomFilterFPP(String columnPath, double fpp)
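
As a rough illustration of how the NDV, FPP, and max-bytes settings relate, the sketch below applies the textbook Bloom filter sizing formula, m = -n ln(p) / (ln 2)^2 bits, and then caps the result at a maximum byte count. This is an assumption made for illustration: the Bloom filter implementation in Parquet may use a different formula and rounding, so the numbers here indicate trends only. `BloomSizingSketch`, `optimalBits`, and `estimateBytes` are hypothetical names, not part of the library.

```java
/** Illustrative only: textbook Bloom filter sizing, not the library's formula. */
final class BloomSizingSketch {

    /** Textbook bit count for n distinct values at false-positive probability p. */
    static long optimalBits(long ndv, double fpp) {
        if (ndv <= 0) {
            throw new IllegalArgumentException("ndv must be positive");
        }
        if (fpp <= 0.0 || fpp >= 1.0) {
            throw new IllegalArgumentException("fpp must be in (0, 1)");
        }
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-ndv * Math.log(fpp) / (ln2 * ln2));
    }

    /** Converts the bit count to whole bytes and applies the upper bound. */
    static long estimateBytes(long ndv, double fpp, long maxBytes) {
        long bytes = (optimalBits(ndv, fpp) + 7) / 8; // round up to whole bytes
        return Math.min(bytes, maxBytes);
    }

    public static void main(String[] args) {
        long cap = 1_048_576L; // hypothetical cap chosen for this example
        System.out.println(estimateBytes(1_000L, 0.01, cap));
        System.out.println(estimateBytes(100_000_000L, 0.01, cap));
    }
}
```

Under this approximation, a larger NDV or a smaller FPP calls for a larger filter, and the cap limits how large it can grow, at the cost of a higher effective false-positive rate once the cap is reached.
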
public ParquetProperties.Builder withBloomFilterEnabled(boolean enabled)
Enable or disable the bloom filter for the columns not specified by withBloomFilterEnabled(String, boolean).
enabled - whether bloom filter shall be enabled for all columns
public ParquetProperties.Builder withAdaptiveBloomFilterEnabled(boolean enabled)
Whether to use adaptive bloom filter to automatically adjust the bloom filter size according to `parquet.bloom.filter.max.bytes`.
enabled - whether to use adaptive bloom filter
public ParquetProperties.Builder withBloomFilterCandidatesNumber(String columnPath, int number)
When `AdaptiveBloomFilter` is enabled, set how many bloom filter candidates to use.
columnPath - the path of the column (dot-string)
number - the number of candidates
public ParquetProperties.Builder withBloomFilterEnabled(String columnPath, boolean enabled)
Enable or disable the bloom filter for the specified column. Either invoke withBloomFilterEnabled(boolean) with a false value and then enable the bloom filters for the required columns one-by-one by invoking this method, or vice versa.
columnPath - the path of the column (dot-string)
enabled - whether bloom filter shall be enabled
public ParquetProperties.Builder withPageRowCountLimit(int rowCount)
public ParquetProperties.Builder withPageWriteChecksumEnabled(boolean val)
public ParquetProperties.Builder withExtraMetaData(Map<String,String> extraMetaData)
public ParquetProperties build()
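
Several methods above come in pairs: a global overload such as withBloomFilterEnabled(boolean) and a per-column overload such as withBloomFilterEnabled(String, boolean). The following standalone sketch models that "default plus per-column override" pattern with a hypothetical `ColumnToggle` class. It assumes that an explicit per-column setting takes precedence over the default regardless of call order. It is a simplified stand-in for illustration and is not how `ParquetProperties.Builder` is implemented.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a hypothetical model of a default with per-column overrides. */
final class ColumnToggle {
    private boolean defaultValue;
    private final Map<String, Boolean> overrides = new HashMap<>();

    ColumnToggle(boolean initialDefault) {
        this.defaultValue = initialDefault;
    }

    /** Sets the value used for every column without an explicit override. */
    ColumnToggle withDefault(boolean enabled) {
        this.defaultValue = enabled;
        return this; // returning this allows fluent chaining, as a builder does
    }

    /** Sets the value for one column, identified by its dot-string path. */
    ColumnToggle withColumn(String columnPath, boolean enabled) {
        overrides.put(columnPath, enabled);
        return this;
    }

    /** Resolves the effective value: the override if present, else the default. */
    boolean isEnabled(String columnPath) {
        return overrides.getOrDefault(columnPath, defaultValue);
    }

    public static void main(String[] args) {
        // Disable everywhere, then opt in for a single column.
        ColumnToggle bloom = new ColumnToggle(true)
                .withDefault(false)
                .withColumn("user.id", true);
        System.out.println(bloom.isEnabled("user.id"));
        System.out.println(bloom.isEnabled("user.name"));
    }
}
```

The "vice versa" case mentioned for withBloomFilterEnabled(String, boolean) corresponds to `withDefault(true)` followed by `withColumn(path, false)` for the columns to exclude.
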
Copyright © 2023 The Apache Software Foundation. All rights reserved.