Class ParquetProperties.Builder

    • Method Detail

      • withPageSize

        public ParquetProperties.Builder withPageSize(int pageSize)
        Set the Parquet format page size.
        Parameters:
        pageSize - an integer size in bytes
        Returns:
        this builder for method chaining.
      • withDictionaryEncoding

        public ParquetProperties.Builder withDictionaryEncoding(boolean enableDictionary)
        Enable or disable dictionary encoding.
        Parameters:
        enableDictionary - whether dictionary encoding should be enabled
        Returns:
        this builder for method chaining.
      • withDictionaryEncoding

        public ParquetProperties.Builder withDictionaryEncoding(String columnPath,
                                                                boolean enableDictionary)
        Enable or disable dictionary encoding for the specified column.
        Parameters:
        columnPath - the path of the column (dot-string)
        enableDictionary - whether dictionary encoding should be enabled
        Returns:
        this builder for method chaining.
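        The two overloads above can be read as a default flag plus per-column overrides keyed by dot-string path. The following is a minimal sketch of that reading, not the library's implementation: DictionaryFlags, isEnabled and the initial default value are hypothetical names and choices made for illustration, and the assumption that the boolean-only overload acts as the default for columns without their own setting is inferred from the descriptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; this is not a Parquet class.
final class DictionaryFlags {
    // Placeholder default chosen for this sketch.
    private boolean defaultEnabled = true;
    private final Map<String, Boolean> perColumn = new HashMap<>();

    // Models the boolean-only overload: sets the value used when no column override exists.
    DictionaryFlags withDictionaryEncoding(boolean enableDictionary) {
        this.defaultEnabled = enableDictionary;
        return this;
    }

    // Models the per-column overload: the key is the column path as a dot-string.
    DictionaryFlags withDictionaryEncoding(String columnPath, boolean enableDictionary) {
        perColumn.put(columnPath, enableDictionary);
        return this;
    }

    // Joins the path segments with '.' to form the dot-string, then resolves the flag.
    boolean isEnabled(String... pathSegments) {
        String dotString = String.join(".", pathSegments);
        return perColumn.getOrDefault(dotString, defaultEnabled);
    }

    public static void main(String[] args) {
        DictionaryFlags flags = new DictionaryFlags()
            .withDictionaryEncoding(false)                  // default for all columns
            .withDictionaryEncoding("address.city", true);  // override for one nested column
        System.out.println(flags.isEnabled("address", "city"));
        System.out.println(flags.isEnabled("address", "zip"));
    }
}
```

        A nested field such as city inside address is addressed as "address.city" in this sketch, which is the dot-string form the columnPath parameter describes.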
      • withByteStreamSplitEncoding

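        public ParquetProperties.Builder withByteStreamSplitEncoding(boolean enableByteStreamSplit)
        Enable or disable BYTE_STREAM_SPLIT encoding.
        Parameters:
        enableByteStreamSplit - whether BYTE_STREAM_SPLIT encoding should be enabled
        Returns:
        this builder for method chaining.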
      • withDictionaryPageSize

        public ParquetProperties.Builder withDictionaryPageSize(int dictionaryPageSize)
        Set the Parquet format dictionary page size.
        Parameters:
        dictionaryPageSize - an integer size in bytes
        Returns:
        this builder for method chaining.
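        Each setter returns the builder, so several settings can be configured in one chained expression. The sketch below shows the pattern with a hypothetical stand-in class: WriterSettings, its fields and its placeholder defaults are assumptions made for illustration and are not part of Parquet.

```java
// Hypothetical stand-in for illustration only; this is not the Parquet class.
final class WriterSettings {
    final int pageSize;
    final int dictionaryPageSize;
    final boolean dictionaryEnabled;

    private WriterSettings(Builder b) {
        this.pageSize = b.pageSize;
        this.dictionaryPageSize = b.dictionaryPageSize;
        this.dictionaryEnabled = b.dictionaryEnabled;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        // Placeholder defaults chosen for this sketch only.
        private int pageSize = 64 * 1024;
        private int dictionaryPageSize = 64 * 1024;
        private boolean dictionaryEnabled = true;

        Builder withPageSize(int pageSize) {
            this.pageSize = pageSize;
            return this; // returning the builder is what makes chaining possible
        }

        Builder withDictionaryPageSize(int dictionaryPageSize) {
            this.dictionaryPageSize = dictionaryPageSize;
            return this;
        }

        Builder withDictionaryEncoding(boolean enableDictionary) {
            this.dictionaryEnabled = enableDictionary;
            return this;
        }

        WriterSettings build() {
            return new WriterSettings(this);
        }
    }

    public static void main(String[] args) {
        WriterSettings s = WriterSettings.builder()
            .withPageSize(1024 * 1024)          // sizes are given in bytes
            .withDictionaryPageSize(512 * 1024)
            .withDictionaryEncoding(false)
            .build();
        System.out.println(s.pageSize + " " + s.dictionaryPageSize + " " + s.dictionaryEnabled);
    }
}
```

        With the real class, the equivalent chain would start from ParquetProperties.builder() and end with build().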
      • estimateRowCountForPageSizeCheck

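        public ParquetProperties.Builder estimateRowCountForPageSizeCheck(boolean estimateNextSizeCheck)
        Enable or disable estimation of the row count at which the next page size check is performed.
        Parameters:
        estimateNextSizeCheck - whether the row count for the next page size check should be estimated
        Returns:
        this builder for method chaining.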
      • withMaxBloomFilterBytes

        public ParquetProperties.Builder withMaxBloomFilterBytes(int maxBloomFilterBytes)
        Set the maximum size, in bytes, of the Bloom filter bitset written for a column.
        Parameters:
        maxBloomFilterBytes - the maximum number of bytes in a column's Bloom filter bitset
        Returns:
        this builder for method chaining.
      • withBloomFilterNDV

        public ParquetProperties.Builder withBloomFilterNDV(String columnPath,
                                                            long ndv)
        Set the Bloom filter NDV (number of distinct values) for the specified column. Setting an NDV for a column automatically enables writing of the Bloom filter for that column (see withBloomFilterEnabled(String, boolean)).
        Parameters:
        columnPath - the path of the column (dot-string)
        ndv - the NDV of the column
        Returns:
        this builder for method chaining.
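        The side effect described for withBloomFilterNDV can be sketched as two maps that are updated together. This is a hypothetical model, not the library's code: BloomNdvModel and its accessor methods are names invented here, and the sketch assumes Bloom filters are off for columns that have no setting.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical model for illustration only; this is not a Parquet class.
final class BloomNdvModel {
    private final Map<String, Long> ndvs = new HashMap<>();
    private final Map<String, Boolean> enabled = new HashMap<>();

    // Models the documented behavior: recording an NDV also enables the filter for that column.
    BloomNdvModel withBloomFilterNDV(String columnPath, long ndv) {
        ndvs.put(columnPath, ndv);
        enabled.put(columnPath, true);
        return this;
    }

    // Assumption of this sketch: columns without any setting have no Bloom filter.
    boolean isBloomFilterEnabled(String columnPath) {
        return enabled.getOrDefault(columnPath, false);
    }

    // Returns the NDV for the column, or empty when none was set.
    OptionalLong ndv(String columnPath) {
        Long value = ndvs.get(columnPath);
        return value == null ? OptionalLong.empty() : OptionalLong.of(value);
    }

    public static void main(String[] args) {
        BloomNdvModel model = new BloomNdvModel().withBloomFilterNDV("user.id", 1_000_000L);
        System.out.println(model.isBloomFilterEnabled("user.id"));
        System.out.println(model.isBloomFilterEnabled("payload"));
        System.out.println(model.ndv("user.id").getAsLong());
    }
}
```

        In this model, a caller who sets an NDV does not need a separate call to enable the filter for the same column.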
      • withBloomFilterEnabled

        public ParquetProperties.Builder withBloomFilterEnabled(boolean enabled)
        Enable or disable the Bloom filter for the columns not specified by withBloomFilterEnabled(String, boolean).
        Parameters:
        enabled - whether Bloom filters should be enabled for columns without a column-specific setting
        Returns:
        this builder for method chaining.
      • withBloomFilterEnabled

        public ParquetProperties.Builder withBloomFilterEnabled(String columnPath,
                                                                boolean enabled)
        Enable or disable the Bloom filter for the specified column. To write Bloom filters for only a few columns, invoke withBloomFilterEnabled(boolean) with false and then enable the required columns one by one with this method. To write them for all but a few columns, invoke withBloomFilterEnabled(boolean) with true and disable the unwanted columns with this method.
        Parameters:
        columnPath - the path of the column (dot-string)
        enabled - whether the Bloom filter should be enabled for the column
        Returns:
        this builder for method chaining.
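        The two ways of combining the global and per-column switches can be sketched as an allow list and a deny list. The class below is a hypothetical model for illustration: BloomToggleModel, allowList, denyList and isEnabled are invented names, and the lookup rule (a column override wins, otherwise the global value applies) is an assumption drawn from the descriptions above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; this is not a Parquet class.
final class BloomToggleModel {
    private boolean defaultEnabled = false; // placeholder default for this sketch
    private final Map<String, Boolean> overrides = new HashMap<>();

    BloomToggleModel withBloomFilterEnabled(boolean enabled) {
        this.defaultEnabled = enabled;
        return this;
    }

    BloomToggleModel withBloomFilterEnabled(String columnPath, boolean enabled) {
        overrides.put(columnPath, enabled);
        return this;
    }

    // A column-specific setting wins; otherwise the global value applies.
    boolean isEnabled(String columnPath) {
        return overrides.getOrDefault(columnPath, defaultEnabled);
    }

    // Disable everywhere, then enable only the listed columns.
    static BloomToggleModel allowList(String... columnPaths) {
        BloomToggleModel model = new BloomToggleModel().withBloomFilterEnabled(false);
        for (String path : columnPaths) {
            model.withBloomFilterEnabled(path, true);
        }
        return model;
    }

    // Enable everywhere, then disable only the listed columns.
    static BloomToggleModel denyList(String... columnPaths) {
        BloomToggleModel model = new BloomToggleModel().withBloomFilterEnabled(true);
        for (String path : columnPaths) {
            model.withBloomFilterEnabled(path, false);
        }
        return model;
    }

    public static void main(String[] args) {
        BloomToggleModel allow = allowList("user.id");
        BloomToggleModel deny = denyList("payload");
        System.out.println(allow.isEnabled("user.id") + " " + allow.isEnabled("payload"));
        System.out.println(deny.isEnabled("user.id") + " " + deny.isEnabled("payload"));
    }
}
```

        With the real builder, the allow-list form would be a chain such as ParquetProperties.builder().withBloomFilterEnabled(false).withBloomFilterEnabled("user.id", true), where "user.id" is an example column path.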