Class DictionaryValuesWriter

    • Field Detail

      • encodingForDictionaryPage

        protected final Encoding encodingForDictionaryPage
      • maxDictionaryByteSize

        protected final int maxDictionaryByteSize
      • dictionaryTooBig

        protected boolean dictionaryTooBig
      • dictionaryByteSize

        protected long dictionaryByteSize
      • lastUsedDictionaryByteSize

        protected int lastUsedDictionaryByteSize
      • lastUsedDictionarySize

        protected int lastUsedDictionarySize
      • encodedValues

        protected IntList encodedValues
      • firstPage

        protected boolean firstPage
        Indicates whether this is the first page being processed.
      • allocator

        protected org.apache.parquet.bytes.ByteBufferAllocator allocator
    • Constructor Detail

      • DictionaryValuesWriter

        protected DictionaryValuesWriter(int maxDictionaryByteSize,
                                         Encoding encodingForDataPage,
                                         Encoding encodingForDictionaryPage,
                                         org.apache.parquet.bytes.ByteBufferAllocator allocator)
    • Method Detail

      • shouldFallBack

        public boolean shouldFallBack()
        Description copied from interface: RequiresFallback
        With a dictionary-based encoding, we fall back if the dictionary becomes too big.
        Specified by:
        shouldFallBack in interface RequiresFallback
        Returns:
        true to notify the parent that we should fall back to another encoding
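
        The size-triggered fallback described above can be sketched as follows. This is a minimal illustration, not the Parquet implementation: the class DictionaryBudgetSketch, its write method, and the byte cost model (a 4-byte length prefix plus the UTF-8 payload) are assumptions. Only the names maxDictionaryByteSize, dictionaryByteSize, dictionaryTooBig, and shouldFallBack come from this page.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a dictionary writer that tracks its byte budget. */
final class DictionaryBudgetSketch {
    private final int maxDictionaryByteSize;
    private final Map<String, Integer> dictionary = new HashMap<>();
    private long dictionaryByteSize;
    private boolean dictionaryTooBig;

    DictionaryBudgetSketch(int maxDictionaryByteSize) {
        this.maxDictionaryByteSize = maxDictionaryByteSize;
    }

    /** Adds a value to the dictionary if it is new and returns its id. */
    int write(String value) {
        Integer id = dictionary.get(value);
        if (id == null) {
            id = dictionary.size();
            dictionary.put(value, id);
            // Assumed cost model: 4-byte length prefix plus the UTF-8 bytes.
            dictionaryByteSize += 4 + value.getBytes(StandardCharsets.UTF_8).length;
            if (dictionaryByteSize > maxDictionaryByteSize) {
                dictionaryTooBig = true;
            }
        }
        return id;
    }

    /** True once the dictionary has outgrown its budget. */
    boolean shouldFallBack() {
        return dictionaryTooBig;
    }
}
```

        Repeated values do not grow the dictionary, so in this sketch only distinct values count against the budget.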
      • isCompressionSatisfying

        public boolean isCompressionSatisfying(long rawSize,
                                               long encodedSize)
        Description copied from interface: RequiresFallback
        Before writing the first page, we verify whether the encoding is worthwhile, and fall back if a simpler encoding would do better.
        Specified by:
        isCompressionSatisfying in interface RequiresFallback
        Parameters:
        rawSize - the size if encoded with plain
        encodedSize - the size as encoded by the current encoding
        Returns:
        true if we keep this encoding
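
        One plausible form of this check is sketched below. It assumes the dictionary page's own size is charged against the encoded size before comparing with the plain size. The class CompressionCheckSketch and the explicit dictionaryByteSize parameter are hypothetical and used only for illustration.

```java
/** Hypothetical sketch of the "is the encoding worth it" decision. */
final class CompressionCheckSketch {
    private CompressionCheckSketch() {}

    /**
     * Returns true when the dictionary-encoded data plus the dictionary itself
     * is smaller than the same data encoded with plain.
     */
    static boolean isCompressionSatisfying(long rawSize, long encodedSize, long dictionaryByteSize) {
        // Assumption: the dictionary page must be written too, so it counts as cost.
        return encodedSize + dictionaryByteSize < rawSize;
    }
}
```

        Under this rule, a column with few repeated values tends to fail the check, because the dictionary costs nearly as much as the plain data.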
      • fallBackAllValuesTo

        public void fallBackAllValuesTo(ValuesWriter writer)
        Description copied from interface: RequiresFallback
        When falling back to a different encoding, we must re-encode all the values seen so far.
        Specified by:
        fallBackAllValuesTo in interface RequiresFallback
        Parameters:
        writer - the new encoder to write the current values to
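
        Re-encoding on fallback amounts to replaying the buffered dictionary ids through the dictionary into the new writer. The sketch below illustrates the idea with stand-in types: FallbackReplaySketch and PlainSink are hypothetical, and a real ValuesWriter exposes typed write methods instead of a single writeValue.

```java
import java.util.List;

/** Hypothetical sketch of replaying dictionary-encoded values into another writer. */
final class FallbackReplaySketch {
    private FallbackReplaySketch() {}

    /** Stand-in for the writer we fall back to. */
    interface PlainSink {
        void writeValue(String value);
    }

    /**
     * Looks up each buffered id in the dictionary and writes the original
     * value to the fallback sink, preserving write order.
     */
    static void fallBackAllValuesTo(List<String> dictionary, int[] encodedIds, PlainSink sink) {
        for (int id : encodedIds) {
            sink.writeValue(dictionary.get(id));
        }
    }
}
```

        After a replay like this, a writer would typically discard its dictionary and buffered ids, since the fallback writer now holds every value.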
      • fallBackDictionaryEncodedData

        protected abstract void fallBackDictionaryEncodedData(ValuesWriter writer)
      • getBufferedSize

        public long getBufferedSize()
        Description copied from class: ValuesWriter
        Used to decide whether to move on to the next page.
        Specified by:
        getBufferedSize in class ValuesWriter
        Returns:
        the size of the currently buffered data (in bytes)
      • getBytes

        public org.apache.parquet.bytes.BytesInput getBytes()
        Specified by:
        getBytes in class ValuesWriter
        Returns:
        the bytes buffered so far to write to the current page
      • getEncoding

        public Encoding getEncoding()
        Description copied from class: ValuesWriter
        called after getBytes() and before reset()
        Specified by:
        getEncoding in class ValuesWriter
        Returns:
        the encoding that was used to encode the bytes
      • reset

        public void reset()
        Description copied from class: ValuesWriter
        called after getBytes() to reset the current buffer and start writing the next page
        Specified by:
        reset in class ValuesWriter
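
        The call order stated for getBytes(), getEncoding(), and reset() can be sketched as a per-page flush. The types below are hypothetical stand-ins (PageLifecycleSketch, Page, flushPage, and the string encoding name "PLAIN" are assumptions), intended only to show the sequence, not how Parquet drives its writers.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of the getBytes -> getEncoding -> reset page cycle. */
final class PageLifecycleSketch {
    /** Simple holder for one flushed page. */
    static final class Page {
        final byte[] bytes;
        final String encoding;

        Page(byte[] bytes, String encoding) {
            this.bytes = bytes;
            this.encoding = encoding;
        }
    }

    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    /** Buffers one 4-byte big-endian integer. */
    void writeInteger(int v) {
        buffer.write(v >>> 24);
        buffer.write(v >>> 16);
        buffer.write(v >>> 8);
        buffer.write(v);
    }

    long getBufferedSize() { return buffer.size(); }

    byte[] getBytes() { return buffer.toByteArray(); }

    String getEncoding() { return "PLAIN"; }

    void reset() { buffer.reset(); }

    /** Flushes the current page in the documented order. */
    static Page flushPage(PageLifecycleSketch writer) {
        byte[] bytes = writer.getBytes();        // 1. take the buffered bytes
        String encoding = writer.getEncoding();  // 2. ask which encoding produced them
        writer.reset();                          // 3. start the next page
        return new Page(bytes, encoding);
    }
}
```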
      • close

        public void close()
        Description copied from class: ValuesWriter
        Called to close the values writer. Any output stream is closed and can no longer be used. All resources are released.
        Overrides:
        close in class ValuesWriter
      • clearDictionaryContent

        protected abstract void clearDictionaryContent()
        Clears and frees the underlying dictionary content.
      • getDictionarySize

        protected abstract int getDictionarySize()
        Returns:
        the dictionary size, in number of entries