Package org.apache.parquet.hadoop
Class ParquetOutputFormat<T>
- java.lang.Object
-
- org.apache.hadoop.mapreduce.OutputFormat<K,V>
-
- org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<Void,T>
-
- org.apache.parquet.hadoop.ParquetOutputFormat<T>
-
- Type Parameters:
T - the type of the materialized records
- Direct Known Subclasses:
ExampleOutputFormat
public class ParquetOutputFormat<T> extends org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<Void,T>
OutputFormat to write to a Parquet file. It requires a WriteSupport to convert the actual records to the underlying format. It requires the schema of the incoming records (provided by the write support). It allows storing extra metadata in the footer (for example, for schema compatibility purposes when converting from a different schema language).
The format configuration settings in the job configuration:
# The block size is the size of a row group being buffered in memory;
# this limits the memory usage when writing.
# Larger values will improve the IO when reading but consume more memory when writing.
parquet.block.size=134217728 # in bytes, default = 128 * 1024 * 1024
# The page size is for compression. When reading, each page can be decompressed independently.
# A block is composed of pages. The page is the smallest unit that must be read fully to access a single record.
# If this value is too small, the compression will deteriorate.
parquet.page.size=1048576 # in bytes, default = 1 * 1024 * 1024
# There is one dictionary page per column per row group when dictionary encoding is used.
# The dictionary page size works like the page size, but for dictionaries.
parquet.dictionary.page.size=1048576 # in bytes, default = 1 * 1024 * 1024
# The compression algorithm used to compress pages.
parquet.compression=UNCOMPRESSED # one of: UNCOMPRESSED, SNAPPY, GZIP, LZO. Default: UNCOMPRESSED. Supersedes mapred.output.compress*
# The write support class to convert the records written to the OutputFormat into the events accepted by the record consumer.
# Usually provided by a specific ParquetOutputFormat subclass.
parquet.write.support.class= # fully qualified name
# To enable/disable dictionary encoding.
parquet.enable.dictionary=true # false to disable dictionary encoding
# To enable/disable summary metadata aggregation at the end of a MR job.
# The default is true (enabled).
parquet.enable.summary-metadata=true # false to disable summary aggregation
# Maximum size (in bytes) allowed as padding to align row groups.
# This is also the minimum size of a row group. Default: 8388608
parquet.writer.max-padding=8388608 # 8 MB
If parquet.compression is not set, the following properties are checked (FileOutputFormat behavior). Note that we explicitly disallow custom codecs:
mapred.output.compress=true
mapred.output.compression.codec=org.apache.hadoop.io.compress.SomeCodec # the codec must be one of Snappy, GZip or LZO
If none of those is set, the data is uncompressed.
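As a sketch, the properties above can also be set programmatically in a MapReduce driver through the static setters on ParquetOutputFormat. The driver method, the write support class MyWriteSupport, and the output path below are illustrative placeholders, not part of the Parquet API:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.parquet.hadoop.ParquetOutputFormat;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;

public class ParquetJobSetup {
    public static void configure(Job job) {
        job.setOutputFormatClass(ParquetOutputFormat.class);
        // Row group size: larger values improve read IO but use more write-side memory
        ParquetOutputFormat.setBlockSize(job, 128 * 1024 * 1024);    // parquet.block.size
        ParquetOutputFormat.setPageSize(job, 1024 * 1024);           // parquet.page.size
        ParquetOutputFormat.setDictionaryPageSize(job, 1024 * 1024); // parquet.dictionary.page.size
        ParquetOutputFormat.setCompression(job, CompressionCodecName.SNAPPY); // parquet.compression
        ParquetOutputFormat.setEnableDictionary(job, true);          // parquet.enable.dictionary
        // MyWriteSupport is a placeholder; this is usually set by a ParquetOutputFormat subclass
        ParquetOutputFormat.setWriteSupportClass(job, MyWriteSupport.class);
        FileOutputFormat.setOutputPath(job, new Path("/tmp/parquet-out"));
    }
}

This requires the parquet-hadoop and Hadoop MapReduce jars on the classpath; it mirrors the property listing above one setter per key.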
-
-
Nested Class Summary
Nested Classes
static class ParquetOutputFormat.JobSummaryLevel
-
Field Summary
Fields
static String BLOCK_SIZE
static String BLOOM_FILTER_ENABLED
static String BLOOM_FILTER_EXPECTED_NDV
static String BLOOM_FILTER_MAX_BYTES
static String COLUMN_INDEX_TRUNCATE_LENGTH
static String COMPRESSION
static String DICTIONARY_PAGE_SIZE
static String ENABLE_DICTIONARY
static String ENABLE_JOB_SUMMARY (Deprecated)
static String ESTIMATE_PAGE_SIZE_CHECK
static String JOB_SUMMARY_LEVEL (must be one of the values in ParquetOutputFormat.JobSummaryLevel, case insensitive)
static String MAX_PADDING_BYTES
static String MAX_ROW_COUNT_FOR_PAGE_SIZE_CHECK
static String MEMORY_POOL_RATIO
static String MIN_MEMORY_ALLOCATION
static String MIN_ROW_COUNT_FOR_PAGE_SIZE_CHECK
static String PAGE_ROW_COUNT_LIMIT
static String PAGE_SIZE
static String PAGE_WRITE_CHECKSUM_ENABLED
static String STATISTICS_TRUNCATE_LENGTH
static String VALIDATION
static String WRITE_SUPPORT_CLASS
static String WRITER_VERSION
-
Constructor Summary
Constructors
ParquetOutputFormat() - used when directly using the output format and configuring the write support implementation via parquet.write.support.class
ParquetOutputFormat(S writeSupport) - constructor used when this OutputFormat is wrapped in another one (in Pig, for example)
-
Method Summary
Methods
static FileEncryptionProperties createEncryptionProperties(org.apache.hadoop.conf.Configuration fileHadoopConfig, org.apache.hadoop.fs.Path tempFilePath, WriteSupport.WriteContext fileWriteContext)
static int getBlockSize(org.apache.hadoop.conf.Configuration configuration) (Deprecated)
static int getBlockSize(org.apache.hadoop.mapreduce.JobContext jobContext)
static boolean getBloomFilterEnabled(org.apache.hadoop.conf.Configuration conf)
static int getBloomFilterMaxBytes(org.apache.hadoop.conf.Configuration conf)
static org.apache.parquet.hadoop.metadata.CompressionCodecName getCompression(org.apache.hadoop.conf.Configuration configuration)
static org.apache.parquet.hadoop.metadata.CompressionCodecName getCompression(org.apache.hadoop.mapreduce.JobContext jobContext)
static int getDictionaryPageSize(org.apache.hadoop.conf.Configuration configuration)
static int getDictionaryPageSize(org.apache.hadoop.mapreduce.JobContext jobContext)
static boolean getEnableDictionary(org.apache.hadoop.conf.Configuration configuration)
static boolean getEnableDictionary(org.apache.hadoop.mapreduce.JobContext jobContext)
static boolean getEstimatePageSizeCheck(org.apache.hadoop.conf.Configuration configuration)
static ParquetOutputFormat.JobSummaryLevel getJobSummaryLevel(org.apache.hadoop.conf.Configuration conf)
static long getLongBlockSize(org.apache.hadoop.conf.Configuration configuration)
static int getMaxRowCountForPageSizeCheck(org.apache.hadoop.conf.Configuration configuration)
static MemoryManager getMemoryManager()
static int getMinRowCountForPageSizeCheck(org.apache.hadoop.conf.Configuration configuration)
org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context)
static int getPageSize(org.apache.hadoop.conf.Configuration configuration)
static int getPageSize(org.apache.hadoop.mapreduce.JobContext jobContext)
static boolean getPageWriteChecksumEnabled(org.apache.hadoop.conf.Configuration conf)
org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path file, org.apache.parquet.hadoop.metadata.CompressionCodecName codec)
org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path file, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, ParquetFileWriter.Mode mode)
org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext)
org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext, org.apache.hadoop.fs.Path file)
org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext, org.apache.hadoop.fs.Path file, ParquetFileWriter.Mode mode)
org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext, ParquetFileWriter.Mode mode)
static boolean getValidation(org.apache.hadoop.conf.Configuration configuration)
static boolean getValidation(org.apache.hadoop.mapreduce.JobContext jobContext)
static ParquetProperties.WriterVersion getWriterVersion(org.apache.hadoop.conf.Configuration configuration)
WriteSupport<T> getWriteSupport(org.apache.hadoop.conf.Configuration configuration)
static Class<?> getWriteSupportClass(org.apache.hadoop.conf.Configuration configuration)
static boolean isCompressionSet(org.apache.hadoop.conf.Configuration configuration)
static boolean isCompressionSet(org.apache.hadoop.mapreduce.JobContext jobContext)
static void setBlockSize(org.apache.hadoop.mapreduce.Job job, int blockSize)
static void setColumnIndexTruncateLength(org.apache.hadoop.conf.Configuration conf, int length)
static void setColumnIndexTruncateLength(org.apache.hadoop.mapreduce.JobContext jobContext, int length)
static void setCompression(org.apache.hadoop.mapreduce.Job job, org.apache.parquet.hadoop.metadata.CompressionCodecName compression)
static void setDictionaryPageSize(org.apache.hadoop.mapreduce.Job job, int pageSize)
static void setEnableDictionary(org.apache.hadoop.mapreduce.Job job, boolean enableDictionary)
static void setMaxPaddingSize(org.apache.hadoop.conf.Configuration conf, int maxPaddingSize)
static void setMaxPaddingSize(org.apache.hadoop.mapreduce.JobContext jobContext, int maxPaddingSize)
static void setPageRowCountLimit(org.apache.hadoop.conf.Configuration conf, int rowCount)
static void setPageRowCountLimit(org.apache.hadoop.mapreduce.JobContext jobContext, int rowCount)
static void setPageSize(org.apache.hadoop.mapreduce.Job job, int pageSize)
static void setPageWriteChecksumEnabled(org.apache.hadoop.conf.Configuration conf, boolean val)
static void setPageWriteChecksumEnabled(org.apache.hadoop.mapreduce.JobContext jobContext, boolean val)
static void setStatisticsTruncateLength(org.apache.hadoop.mapreduce.JobContext jobContext, int length)
static void setValidation(org.apache.hadoop.conf.Configuration configuration, boolean validating)
static void setValidation(org.apache.hadoop.mapreduce.JobContext jobContext, boolean validating)
static void setWriteSupportClass(org.apache.hadoop.mapred.JobConf job, Class<?> writeSupportClass)
static void setWriteSupportClass(org.apache.hadoop.mapreduce.Job job, Class<?> writeSupportClass)
-
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCompressorClass, getOutputName, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputName, setOutputPath
-
-
-
-
Field Detail
-
ENABLE_JOB_SUMMARY
@Deprecated public static final String ENABLE_JOB_SUMMARY
Deprecated. An alias for JOB_SUMMARY_LEVEL, where true means ALL and false means NONE.
- See Also:
- Constant Field Values
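A config-fragment sketch of this deprecated alias, using the property key given in the class description above; the boolean-to-level mapping follows the note here:

# Deprecated boolean alias for the job summary level:
parquet.enable.summary-metadata=true   # equivalent to summary level ALL
parquet.enable.summary-metadata=false  # equivalent to summary level NONE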
-
JOB_SUMMARY_LEVEL
public static final String JOB_SUMMARY_LEVEL
Must be one of the values in ParquetOutputFormat.JobSummaryLevel (case insensitive).
- See Also:
- Constant Field Values
-
BLOCK_SIZE
public static final String BLOCK_SIZE
- See Also:
- Constant Field Values
-
PAGE_SIZE
public static final String PAGE_SIZE
- See Also:
- Constant Field Values
-
COMPRESSION
public static final String COMPRESSION
- See Also:
- Constant Field Values
-
WRITE_SUPPORT_CLASS
public static final String WRITE_SUPPORT_CLASS
- See Also:
- Constant Field Values
-
DICTIONARY_PAGE_SIZE
public static final String DICTIONARY_PAGE_SIZE
- See Also:
- Constant Field Values
-
ENABLE_DICTIONARY
public static final String ENABLE_DICTIONARY
- See Also:
- Constant Field Values
-
VALIDATION
public static final String VALIDATION
- See Also:
- Constant Field Values
-
WRITER_VERSION
public static final String WRITER_VERSION
- See Also:
- Constant Field Values
-
MEMORY_POOL_RATIO
public static final String MEMORY_POOL_RATIO
- See Also:
- Constant Field Values
-
MIN_MEMORY_ALLOCATION
public static final String MIN_MEMORY_ALLOCATION
- See Also:
- Constant Field Values
-
MAX_PADDING_BYTES
public static final String MAX_PADDING_BYTES
- See Also:
- Constant Field Values
-
MIN_ROW_COUNT_FOR_PAGE_SIZE_CHECK
public static final String MIN_ROW_COUNT_FOR_PAGE_SIZE_CHECK
- See Also:
- Constant Field Values
-
MAX_ROW_COUNT_FOR_PAGE_SIZE_CHECK
public static final String MAX_ROW_COUNT_FOR_PAGE_SIZE_CHECK
- See Also:
- Constant Field Values
-
ESTIMATE_PAGE_SIZE_CHECK
public static final String ESTIMATE_PAGE_SIZE_CHECK
- See Also:
- Constant Field Values
-
COLUMN_INDEX_TRUNCATE_LENGTH
public static final String COLUMN_INDEX_TRUNCATE_LENGTH
- See Also:
- Constant Field Values
-
STATISTICS_TRUNCATE_LENGTH
public static final String STATISTICS_TRUNCATE_LENGTH
- See Also:
- Constant Field Values
-
BLOOM_FILTER_ENABLED
public static final String BLOOM_FILTER_ENABLED
- See Also:
- Constant Field Values
-
BLOOM_FILTER_EXPECTED_NDV
public static final String BLOOM_FILTER_EXPECTED_NDV
- See Also:
- Constant Field Values
-
BLOOM_FILTER_MAX_BYTES
public static final String BLOOM_FILTER_MAX_BYTES
- See Also:
- Constant Field Values
-
PAGE_ROW_COUNT_LIMIT
public static final String PAGE_ROW_COUNT_LIMIT
- See Also:
- Constant Field Values
-
PAGE_WRITE_CHECKSUM_ENABLED
public static final String PAGE_WRITE_CHECKSUM_ENABLED
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
ParquetOutputFormat
public ParquetOutputFormat(S writeSupport)
Constructor used when this OutputFormat is wrapped in another one (in Pig, for example).
- Type Parameters:
S - the Java write support type
- Parameters:
writeSupport - the class used to convert the incoming records
-
ParquetOutputFormat
public ParquetOutputFormat()
Used when directly using the output format and configuring the write support implementation via parquet.write.support.class.
- Type Parameters:
S - the Java write support type
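A brief sketch of the two construction paths described above; MyWriteSupport and MyRecord are hypothetical placeholders for a user-provided WriteSupport implementation and its record type:

// Path 1: no-arg constructor; the write support class is read from the
// configuration key "parquet.write.support.class" (the WRITE_SUPPORT_CLASS constant).
org.apache.hadoop.conf.Configuration conf = new org.apache.hadoop.conf.Configuration();
conf.set(ParquetOutputFormat.WRITE_SUPPORT_CLASS, MyWriteSupport.class.getName());
ParquetOutputFormat<MyRecord> direct = new ParquetOutputFormat<>();

// Path 2: wrapping constructor; the WriteSupport instance is supplied directly,
// as a wrapping OutputFormat (e.g. in Pig) would do.
ParquetOutputFormat<MyRecord> wrapped = new ParquetOutputFormat<>(new MyWriteSupport());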
-
-
Method Detail
-
getJobSummaryLevel
public static ParquetOutputFormat.JobSummaryLevel getJobSummaryLevel(org.apache.hadoop.conf.Configuration conf)
-
setWriteSupportClass
public static void setWriteSupportClass(org.apache.hadoop.mapreduce.Job job, Class<?> writeSupportClass)
-
setWriteSupportClass
public static void setWriteSupportClass(org.apache.hadoop.mapred.JobConf job, Class<?> writeSupportClass)
-
getWriteSupportClass
public static Class<?> getWriteSupportClass(org.apache.hadoop.conf.Configuration configuration)
-
setBlockSize
public static void setBlockSize(org.apache.hadoop.mapreduce.Job job, int blockSize)
-
setPageSize
public static void setPageSize(org.apache.hadoop.mapreduce.Job job, int pageSize)
-
setDictionaryPageSize
public static void setDictionaryPageSize(org.apache.hadoop.mapreduce.Job job, int pageSize)
-
setCompression
public static void setCompression(org.apache.hadoop.mapreduce.Job job, org.apache.parquet.hadoop.metadata.CompressionCodecName compression)
-
setEnableDictionary
public static void setEnableDictionary(org.apache.hadoop.mapreduce.Job job, boolean enableDictionary)
-
getEnableDictionary
public static boolean getEnableDictionary(org.apache.hadoop.mapreduce.JobContext jobContext)
-
getBloomFilterMaxBytes
public static int getBloomFilterMaxBytes(org.apache.hadoop.conf.Configuration conf)
-
getBloomFilterEnabled
public static boolean getBloomFilterEnabled(org.apache.hadoop.conf.Configuration conf)
-
getBlockSize
public static int getBlockSize(org.apache.hadoop.mapreduce.JobContext jobContext)
-
getPageSize
public static int getPageSize(org.apache.hadoop.mapreduce.JobContext jobContext)
-
getDictionaryPageSize
public static int getDictionaryPageSize(org.apache.hadoop.mapreduce.JobContext jobContext)
-
getCompression
public static org.apache.parquet.hadoop.metadata.CompressionCodecName getCompression(org.apache.hadoop.mapreduce.JobContext jobContext)
-
isCompressionSet
public static boolean isCompressionSet(org.apache.hadoop.mapreduce.JobContext jobContext)
-
setValidation
public static void setValidation(org.apache.hadoop.mapreduce.JobContext jobContext, boolean validating)
-
getValidation
public static boolean getValidation(org.apache.hadoop.mapreduce.JobContext jobContext)
-
getEnableDictionary
public static boolean getEnableDictionary(org.apache.hadoop.conf.Configuration configuration)
-
getMinRowCountForPageSizeCheck
public static int getMinRowCountForPageSizeCheck(org.apache.hadoop.conf.Configuration configuration)
-
getMaxRowCountForPageSizeCheck
public static int getMaxRowCountForPageSizeCheck(org.apache.hadoop.conf.Configuration configuration)
-
getEstimatePageSizeCheck
public static boolean getEstimatePageSizeCheck(org.apache.hadoop.conf.Configuration configuration)
-
getBlockSize
@Deprecated public static int getBlockSize(org.apache.hadoop.conf.Configuration configuration)
Deprecated.
-
getLongBlockSize
public static long getLongBlockSize(org.apache.hadoop.conf.Configuration configuration)
-
getPageSize
public static int getPageSize(org.apache.hadoop.conf.Configuration configuration)
-
getDictionaryPageSize
public static int getDictionaryPageSize(org.apache.hadoop.conf.Configuration configuration)
-
getWriterVersion
public static ParquetProperties.WriterVersion getWriterVersion(org.apache.hadoop.conf.Configuration configuration)
-
getCompression
public static org.apache.parquet.hadoop.metadata.CompressionCodecName getCompression(org.apache.hadoop.conf.Configuration configuration)
-
isCompressionSet
public static boolean isCompressionSet(org.apache.hadoop.conf.Configuration configuration)
-
setValidation
public static void setValidation(org.apache.hadoop.conf.Configuration configuration, boolean validating)
-
getValidation
public static boolean getValidation(org.apache.hadoop.conf.Configuration configuration)
-
setMaxPaddingSize
public static void setMaxPaddingSize(org.apache.hadoop.mapreduce.JobContext jobContext, int maxPaddingSize)
-
setMaxPaddingSize
public static void setMaxPaddingSize(org.apache.hadoop.conf.Configuration conf, int maxPaddingSize)
-
setColumnIndexTruncateLength
public static void setColumnIndexTruncateLength(org.apache.hadoop.mapreduce.JobContext jobContext, int length)
-
setColumnIndexTruncateLength
public static void setColumnIndexTruncateLength(org.apache.hadoop.conf.Configuration conf, int length)
-
setStatisticsTruncateLength
public static void setStatisticsTruncateLength(org.apache.hadoop.mapreduce.JobContext jobContext, int length)
-
setPageRowCountLimit
public static void setPageRowCountLimit(org.apache.hadoop.mapreduce.JobContext jobContext, int rowCount)
-
setPageRowCountLimit
public static void setPageRowCountLimit(org.apache.hadoop.conf.Configuration conf, int rowCount)
-
setPageWriteChecksumEnabled
public static void setPageWriteChecksumEnabled(org.apache.hadoop.mapreduce.JobContext jobContext, boolean val)
-
setPageWriteChecksumEnabled
public static void setPageWriteChecksumEnabled(org.apache.hadoop.conf.Configuration conf, boolean val)
-
getPageWriteChecksumEnabled
public static boolean getPageWriteChecksumEnabled(org.apache.hadoop.conf.Configuration conf)
-
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext) throws IOException, InterruptedException
- Specified by:
getRecordWriter
in class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<Void,T>
- Throws:
IOException
InterruptedException
-
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext, ParquetFileWriter.Mode mode) throws IOException, InterruptedException
- Throws:
IOException
InterruptedException
-
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext, org.apache.hadoop.fs.Path file) throws IOException, InterruptedException
- Throws:
IOException
InterruptedException
-
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext taskAttemptContext, org.apache.hadoop.fs.Path file, ParquetFileWriter.Mode mode) throws IOException, InterruptedException
- Throws:
IOException
InterruptedException
-
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path file, org.apache.parquet.hadoop.metadata.CompressionCodecName codec) throws IOException, InterruptedException
- Throws:
IOException
InterruptedException
-
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<Void,T> getRecordWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path file, org.apache.parquet.hadoop.metadata.CompressionCodecName codec, ParquetFileWriter.Mode mode) throws IOException, InterruptedException
- Throws:
IOException
InterruptedException
-
getWriteSupport
public WriteSupport<T> getWriteSupport(org.apache.hadoop.conf.Configuration configuration)
- Parameters:
configuration - the configuration in which to find the configured write support class
- Returns:
- the configured write support
-
getOutputCommitter
public org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context) throws IOException
- Overrides:
getOutputCommitter
in class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<Void,T>
- Throws:
IOException
-
getMemoryManager
public static MemoryManager getMemoryManager()
-
createEncryptionProperties
public static FileEncryptionProperties createEncryptionProperties(org.apache.hadoop.conf.Configuration fileHadoopConfig, org.apache.hadoop.fs.Path tempFilePath, WriteSupport.WriteContext fileWriteContext)
-
-