Class ZstandardCodec

  • All Implemented Interfaces:
    org.apache.hadoop.conf.Configurable, org.apache.hadoop.io.compress.CompressionCodec

    public class ZstandardCodec
    extends Object
    implements org.apache.hadoop.conf.Configurable, org.apache.hadoop.io.compress.CompressionCodec
    ZSTD compression codec for Parquet. We do not use the default Hadoop codec because it would require (1) setting up Hadoop on the local development machine and (2) upgrading Hadoop to a newer version that supports ZSTD, which is more cumbersome than upgrading the Parquet version. This implementation relies on ZSTD JNI (https://github.com/luben/zstd-jni), which is already a Parquet dependency. The ZSTD JNI classes ZstdOutputStream and ZstdInputStream use Zstd internally, so ZstandardCodec does not need to create a compressor or decompressor.
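    The stream-only design described above can be illustrated with a minimal, hypothetical sketch. It is not Parquet's implementation: to stay runnable with the JDK alone, it swaps in java.util.zip.DeflaterOutputStream and InflaterInputStream where the real codec would use ZstdOutputStream and ZstdInputStream from ZSTD JNI. The class name StreamOnlyCodecSketch and its methods are assumptions made for illustration only. The point is that the wrapper streams carry the compression state themselves, so no separate compressor or decompressor object is needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical sketch; not part of Parquet or ZSTD JNI. */
public class StreamOnlyCodecSketch {

    // Stand-in for createOutputStream(OutputStream): the returned wrapper
    // owns all compression state. DEFLATE is used here instead of ZSTD.
    public static OutputStream wrapForCompression(OutputStream sink) {
        return new DeflaterOutputStream(sink);
    }

    // Stand-in for createInputStream(InputStream): the returned wrapper
    // owns all decompression state.
    public static InputStream wrapForDecompression(InputStream source) {
        return new InflaterInputStream(source);
    }

    // Compresses a byte array by writing it through the wrapper stream.
    public static byte[] compress(byte[] raw) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = wrapForCompression(sink)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the wrapper above finished the compressed stream.
        return sink.toByteArray();
    }

    // Decompresses a byte array by reading it back through the wrapper stream.
    public static byte[] decompress(byte[] packed) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = wrapForDecompression(new ByteArrayInputStream(packed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "parquet page bytes, parquet page bytes".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(raw));
        // Reports whether the round trip preserved the input.
        System.out.println(Arrays.equals(raw, restored));
    }
}
```

    With ZSTD JNI on the classpath, the two wrapper methods would presumably construct ZstdOutputStream and ZstdInputStream instead; the rest of the round trip would keep the same shape.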
    • Field Detail

      • PARQUET_COMPRESS_ZSTD_BUFFERPOOL_ENABLED

        public static final String PARQUET_COMPRESS_ZSTD_BUFFERPOOL_ENABLED
        See Also:
        Constant Field Values
      • DEFAULT_PARQUET_COMPRESS_ZSTD_BUFFERPOOL_ENABLED

        public static final boolean DEFAULT_PARQUET_COMPRESS_ZSTD_BUFFERPOOL_ENABLED
        See Also:
        Constant Field Values
      • DEFAULT_PARQUET_COMPRESS_ZSTD_LEVEL

        public static final int DEFAULT_PARQUET_COMPRESS_ZSTD_LEVEL
        See Also:
        Constant Field Values
      • DEFAULTPARQUET_COMPRESS_ZSTD_WORKERS

        public static final int DEFAULTPARQUET_COMPRESS_ZSTD_WORKERS
        See Also:
        Constant Field Values
    • Constructor Detail

      • ZstandardCodec

        public ZstandardCodec()
    • Method Detail

      • setConf

        public void setConf(org.apache.hadoop.conf.Configuration conf)
        Specified by:
        setConf in interface org.apache.hadoop.conf.Configurable
      • getConf

        public org.apache.hadoop.conf.Configuration getConf()
        Specified by:
        getConf in interface org.apache.hadoop.conf.Configurable
      • createCompressor

        public org.apache.hadoop.io.compress.Compressor createCompressor()
        Specified by:
        createCompressor in interface org.apache.hadoop.io.compress.CompressionCodec
      • createDecompressor

        public org.apache.hadoop.io.compress.Decompressor createDecompressor()
        Specified by:
        createDecompressor in interface org.apache.hadoop.io.compress.CompressionCodec
      • createInputStream

        public org.apache.hadoop.io.compress.CompressionInputStream createInputStream(InputStream stream,
                                                                                      org.apache.hadoop.io.compress.Decompressor decompressor)
                                                                               throws IOException
        Specified by:
        createInputStream in interface org.apache.hadoop.io.compress.CompressionCodec
        Throws:
        IOException
      • createInputStream

        public org.apache.hadoop.io.compress.CompressionInputStream createInputStream(InputStream stream)
                                                                               throws IOException
        Specified by:
        createInputStream in interface org.apache.hadoop.io.compress.CompressionCodec
        Throws:
        IOException
      • createOutputStream

        public org.apache.hadoop.io.compress.CompressionOutputStream createOutputStream(OutputStream stream,
                                                                                        org.apache.hadoop.io.compress.Compressor compressor)
                                                                                 throws IOException
        Specified by:
        createOutputStream in interface org.apache.hadoop.io.compress.CompressionCodec
        Throws:
        IOException
      • createOutputStream

        public org.apache.hadoop.io.compress.CompressionOutputStream createOutputStream(OutputStream stream)
                                                                                 throws IOException
        Specified by:
        createOutputStream in interface org.apache.hadoop.io.compress.CompressionCodec
        Throws:
        IOException
      • getCompressorType

        public Class<? extends org.apache.hadoop.io.compress.Compressor> getCompressorType()
        Specified by:
        getCompressorType in interface org.apache.hadoop.io.compress.CompressionCodec
      • getDecompressorType

        public Class<? extends org.apache.hadoop.io.compress.Decompressor> getDecompressorType()
        Specified by:
        getDecompressorType in interface org.apache.hadoop.io.compress.CompressionCodec
      • getDefaultExtension

        public String getDefaultExtension()
        Specified by:
        getDefaultExtension in interface org.apache.hadoop.io.compress.CompressionCodec