Class ParquetWriter<T>

    • Field Detail


        public static final org.apache.parquet.hadoop.metadata.CompressionCodecName DEFAULT_COMPRESSION_CODEC_NAME

        public static final boolean DEFAULT_IS_DICTIONARY_ENABLED
        See Also:
        Constant Field Values

        public static final boolean DEFAULT_IS_VALIDATING_ENABLED
        See Also:
        Constant Field Values

        public static final int MAX_PADDING_SIZE_DEFAULT
        See Also:
        Constant Field Values
    • Constructor Detail

      • ParquetWriter

        public ParquetWriter(org.apache.hadoop.fs.Path file,
                             WriteSupport<T> writeSupport,
                             org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName,
                             int blockSize,
                             int pageSize)
                      throws IOException
        Deprecated. Will be removed in 2.0.0.
        Create a new ParquetWriter with dictionary encoding enabled and validation off.
        Parameters:
        file - the file to create
        writeSupport - the implementation to write a record to a RecordConsumer
        compressionCodecName - the compression codec to use
        blockSize - the block size threshold
        pageSize - the page size threshold
        Throws:
        IOException - if there is an error while writing
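Because blockSize and pageSize are declared as int, a size written in larger units has to fit in 32 bits after conversion. The following is a minimal sketch, assuming the thresholds are byte counts (the text above does not state the unit); `ByteSizes` and `mebibytes` are hypothetical helper names, not part of the Parquet API.

```java
// Hypothetical helper for illustration only; not part of Parquet.
final class ByteSizes {
    private ByteSizes() {}

    // Converts a size in MiB to a byte count that fits in an int.
    // Assumption: the int thresholds above are expressed in bytes.
    static int mebibytes(int mib) {
        if (mib < 0) {
            throw new IllegalArgumentException("size must be non-negative: " + mib);
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(mib, 1024 * 1024);
    }

    public static void main(String[] args) {
        int blockSize = mebibytes(128); // example value, not a documented default
        int pageSize = mebibytes(1);
        System.out.println("blockSize=" + blockSize + " pageSize=" + pageSize);
    }
}
```

The resulting values could then be passed as the blockSize and pageSize arguments. Any request of 2048 MiB or more fails fast with an ArithmeticException rather than wrapping to a negative threshold.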
      • ParquetWriter

        public ParquetWriter(org.apache.hadoop.fs.Path file,
                             WriteSupport<T> writeSupport,
                             org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName,
                             int blockSize,
                             int pageSize,
                             boolean enableDictionary,
                             boolean validating)
                      throws IOException
        Deprecated. Will be removed in 2.0.0.
        Create a new ParquetWriter.
        Parameters:
        file - the file to create
        writeSupport - the implementation to write a record to a RecordConsumer
        compressionCodecName - the compression codec to use
        blockSize - the block size threshold
        pageSize - the page size threshold (both data and dictionary)
        enableDictionary - to turn dictionary encoding on
        validating - to turn on validation using the schema
        Throws:
        IOException - if there is an error while writing
      • ParquetWriter

        public ParquetWriter(org.apache.hadoop.fs.Path file,
                             WriteSupport<T> writeSupport,
                             org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName,
                             int blockSize,
                             int pageSize,
                             int dictionaryPageSize,
                             boolean enableDictionary,
                             boolean validating)
                      throws IOException
        Deprecated. Will be removed in 2.0.0.
        Create a new ParquetWriter.
        Parameters:
        file - the file to create
        writeSupport - the implementation to write a record to a RecordConsumer
        compressionCodecName - the compression codec to use
        blockSize - the block size threshold
        pageSize - the page size threshold
        dictionaryPageSize - the page size threshold for the dictionary pages
        enableDictionary - to turn dictionary encoding on
        validating - to turn on validation using the schema
        Throws:
        IOException - if there is an error while writing
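The three overloads above differ only in which settings the caller supplies. One way to read them is as a telescoping chain in which each shorter form fills in the values its description names: dictionary encoding on and validation off for the shortest, and a single pageSize covering both data and dictionary pages for the middle one. The sketch below models that reading with a hypothetical stand-in class, `WriterSettingsSketch`; it is not the Parquet source, and the delegation order is an assumption.

```java
// Hypothetical stand-in for illustration only; not the Parquet class.
final class WriterSettingsSketch {
    final int blockSize;
    final int pageSize;
    final int dictionaryPageSize;
    final boolean enableDictionary;
    final boolean validating;

    // Shortest form: dictionary encoding enabled, validation off.
    WriterSettingsSketch(int blockSize, int pageSize) {
        this(blockSize, pageSize, true, false);
    }

    // Middle form: pageSize applies to both data and dictionary pages.
    WriterSettingsSketch(int blockSize, int pageSize,
                         boolean enableDictionary, boolean validating) {
        this(blockSize, pageSize, pageSize, enableDictionary, validating);
    }

    // Longest form: every setting is explicit.
    WriterSettingsSketch(int blockSize, int pageSize, int dictionaryPageSize,
                         boolean enableDictionary, boolean validating) {
        this.blockSize = blockSize;
        this.pageSize = pageSize;
        this.dictionaryPageSize = dictionaryPageSize;
        this.enableDictionary = enableDictionary;
        this.validating = validating;
    }

    public static void main(String[] args) {
        WriterSettingsSketch s = new WriterSettingsSketch(1024, 64);
        System.out.println(s.dictionaryPageSize + " " + s.enableDictionary + " " + s.validating);
    }
}
```

Under this reading, passing a separate dictionaryPageSize only matters when dictionary pages should be sized differently from data pages.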
      • ParquetWriter

        public ParquetWriter(org.apache.hadoop.fs.Path file,
                             WriteSupport<T> writeSupport,
                             org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName,
                             int blockSize,
                             int pageSize,
                             int dictionaryPageSize,
                             boolean enableDictionary,
                             boolean validating,
                             ParquetProperties.WriterVersion writerVersion)
                      throws IOException
        Deprecated. Will be removed in 2.0.0.
        Create a new ParquetWriter. Directly instantiates a Hadoop Configuration, which reads configuration from the classpath.
        Parameters:
        file - the file to create
        writeSupport - the implementation to write a record to a RecordConsumer
        compressionCodecName - the compression codec to use
        blockSize - the block size threshold
        pageSize - the page size threshold
        dictionaryPageSize - the page size threshold for the dictionary pages
        enableDictionary - to turn dictionary encoding on
        validating - to turn on validation using the schema
        writerVersion - the writer version, one of the ParquetProperties.WriterVersion values
        Throws:
        IOException - if there is an error while writing
      • ParquetWriter

        public ParquetWriter(org.apache.hadoop.fs.Path file,
                             WriteSupport<T> writeSupport,
                             org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName,
                             int blockSize,
                             int pageSize,
                             int dictionaryPageSize,
                             boolean enableDictionary,
                             boolean validating,
                             ParquetProperties.WriterVersion writerVersion,
                             org.apache.hadoop.conf.Configuration conf)
                      throws IOException
        Deprecated. Will be removed in 2.0.0.
        Create a new ParquetWriter.
        Parameters:
        file - the file to create
        writeSupport - the implementation to write a record to a RecordConsumer
        compressionCodecName - the compression codec to use
        blockSize - the block size threshold
        pageSize - the page size threshold
        dictionaryPageSize - the page size threshold for the dictionary pages
        enableDictionary - to turn dictionary encoding on
        validating - to turn on validation using the schema
        writerVersion - the writer version, one of the ParquetProperties.WriterVersion values
        conf - Hadoop configuration to use while accessing the filesystem
        Throws:
        IOException - if there is an error while writing
      • ParquetWriter

        public ParquetWriter(org.apache.hadoop.fs.Path file,
                             ParquetFileWriter.Mode mode,
                             WriteSupport<T> writeSupport,
                             org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName,
                             int blockSize,
                             int pageSize,
                             int dictionaryPageSize,
                             boolean enableDictionary,
                             boolean validating,
                             ParquetProperties.WriterVersion writerVersion,
                             org.apache.hadoop.conf.Configuration conf)
                      throws IOException
        Deprecated. Will be removed in 2.0.0.
        Create a new ParquetWriter.
        Parameters:
        file - the file to create
        mode - file creation mode
        writeSupport - the implementation to write a record to a RecordConsumer
        compressionCodecName - the compression codec to use
        blockSize - the block size threshold
        pageSize - the page size threshold
        dictionaryPageSize - the page size threshold for the dictionary pages
        enableDictionary - to turn dictionary encoding on
        validating - to turn on validation using the schema
        writerVersion - the writer version, one of the ParquetProperties.WriterVersion values
        conf - Hadoop configuration to use while accessing the filesystem
        Throws:
        IOException - if there is an error while writing
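The mode parameter is described only as the file creation mode. As an analogy, the sketch below uses plain java.nio.file rather than Hadoop or Parquet classes, and assumes two modes named CREATE and OVERWRITE, where CREATE refuses to replace an existing file and OVERWRITE truncates it. The class `CreationModeSketch`, its enum, and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical local-filesystem analogy; not ParquetFileWriter.Mode itself.
final class CreationModeSketch {
    enum Mode { CREATE, OVERWRITE }

    static OutputStream open(Path file, Mode mode) throws IOException {
        if (mode == Mode.CREATE) {
            // CREATE_NEW fails with FileAlreadyExistsException if the file exists.
            return Files.newOutputStream(file,
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
        }
        // OVERWRITE creates the file if needed and discards any old content.
        return Files.newOutputStream(file, StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE);
    }

    // Returns true if CREATE refuses to open a file that already exists.
    static boolean createRejectsExistingFile() {
        try {
            Path file = Files.createTempFile("mode-sketch", ".bin"); // file now exists
            try (OutputStream out = open(file, Mode.CREATE)) {
                return false; // unexpected: CREATE opened an existing file
            } catch (FileAlreadyExistsException expected) {
                return true;
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes 8 bytes, reopens with OVERWRITE, writes 3 bytes, returns the final size.
    static long overwriteResultSize() {
        try {
            Path file = Files.createTempFile("mode-sketch", ".bin");
            try {
                Files.write(file, new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
                try (OutputStream out = open(file, Mode.OVERWRITE)) {
                    out.write(new byte[] {9, 9, 9});
                }
                return Files.size(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(createRejectsExistingFile());
        System.out.println(overwriteResultSize());
    }
}
```

If the mode behaves as assumed here, a caller that reruns a job against the same output path would need the overwrite mode, or would have to delete the old file first.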
      • ParquetWriter

        public ParquetWriter(org.apache.hadoop.fs.Path file,
                             WriteSupport<T> writeSupport)
                      throws IOException
        Deprecated. Will be removed in 2.0.0.
        Create a new ParquetWriter. The default block size is 50 MB. The default page size is 1 MB. Default compression is no compression. Dictionary encoding is disabled.
        Parameters:
        file - the file to create
        writeSupport - the implementation to write a record to a RecordConsumer
        Throws:
        IOException - if there is an error while writing