Class LazyBinaryFormat<T>

  • All Implemented Interfaces:
    BinaryFormat
    Direct Known Subclasses:
    BinaryRawValueData, BinaryStringData

    @Internal
    public abstract class LazyBinaryFormat<T>
    extends Object
    implements BinaryFormat
    An abstract implementation of BinaryFormat that is lazily serialized into binary form or lazily deserialized into a Java object.

    This data structure exists to avoid repeated (de)serialization in nested function calls. Consider the following function call chain:

    UDF0(input) -> UDF1(result0) -> UDF2(result1) -> UDF3(result2)

    If the UDFs in such a chain return their values in Java object format, every call forces a conversion between Java object and binary format:

     converterToBinary(UDF0(converterToJavaObject(input))) ->
       converterToBinary(UDF1(converterToJavaObject(result0))) ->
         converterToBinary(UDF2(converterToJavaObject(result1))) ->
           ...
     

    LazyBinaryFormat was introduced to avoid this redundant cost. It has three forms:

    • Binary form
    • Java object form
    • Both binary and Java object forms

    It defers conversions for as long as possible: a value is converted into the required form only when that form is actually needed.
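
    The lazy-conversion idea described above can be illustrated with a minimal, self-contained sketch. It does not use any Flink classes: LazyText, its factory methods, and the two counters are hypothetical names introduced only for illustration, and UTF-8 string encoding stands in for whatever serializer a real implementation would use.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a lazily converted value; NOT part of Flink. */
final class LazyText {
    private byte[] binary;      // binary form, null until it is needed
    private String javaObject;  // Java object form, null until it is needed
    int serializeCount = 0;     // how often object -> binary actually ran
    int deserializeCount = 0;   // how often binary -> object actually ran

    private LazyText(byte[] binary, String javaObject) {
        this.binary = binary;
        this.javaObject = javaObject;
    }

    /** Starts in "Java object form". */
    static LazyText fromObject(String value) {
        return new LazyText(null, value);
    }

    /** Starts in "binary form". */
    static LazyText fromBinary(byte[] bytes) {
        return new LazyText(bytes, null);
    }

    /** Converts to binary only on first request; afterwards both forms exist. */
    byte[] toBinary() {
        if (binary == null) {
            binary = javaObject.getBytes(StandardCharsets.UTF_8);
            serializeCount++;
        }
        return binary;
    }

    /** Converts to a Java object only on first request; afterwards both forms exist. */
    String toObject() {
        if (javaObject == null) {
            javaObject = new String(binary, StandardCharsets.UTF_8);
            deserializeCount++;
        }
        return javaObject;
    }

    public static void main(String[] args) {
        LazyText value = LazyText.fromObject("flink");
        // A chain of consumers that only needs the object form never serializes.
        value.toObject();
        value.toObject();
        System.out.println("serializations after object reads: " + value.serializeCount);
        // The first binary read converts; later reads reuse the cached bytes.
        value.toBinary();
        value.toBinary();
        System.out.println("serializations after binary reads: " + value.serializeCount);
    }
}
```

    In this sketch a value created from a Java object is serialized at most once, no matter how many consumers ask for its binary form, and not at all if every consumer reads the object form.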

    • Constructor Detail

      • LazyBinaryFormat

        public LazyBinaryFormat()
      • LazyBinaryFormat

        public LazyBinaryFormat​(org.apache.flink.core.memory.MemorySegment[] segments,
                                int offset,
                                int sizeInBytes,
                                T javaObject)
      • LazyBinaryFormat

        public LazyBinaryFormat​(org.apache.flink.core.memory.MemorySegment[] segments,
                                int offset,
                                int sizeInBytes)
      • LazyBinaryFormat

        public LazyBinaryFormat​(T javaObject)
      • LazyBinaryFormat

        public LazyBinaryFormat​(T javaObject,
                                BinarySection binarySection)
    • Method Detail

      • getJavaObject

        public T getJavaObject()
      • setJavaObject

        public void setJavaObject​(T javaObject)
        Must be public as it is used during code generation.
      • getSegments

        public org.apache.flink.core.memory.MemorySegment[] getSegments()
        Description copied from interface: BinaryFormat
        Gets the underlying MemorySegments this binary format spans.
        Specified by:
        getSegments in interface BinaryFormat
      • getOffset

        public int getOffset()
        Description copied from interface: BinaryFormat
        Gets the start offset of this binary data in the MemorySegments.
        Specified by:
        getOffset in interface BinaryFormat
      • getSizeInBytes

        public int getSizeInBytes()
        Description copied from interface: BinaryFormat
        Gets the size in bytes of this binary data.
        Specified by:
        getSizeInBytes in interface BinaryFormat
      • ensureMaterialized

        public final void ensureMaterialized​(org.apache.flink.api.common.typeutils.TypeSerializer<T> serializer)
        Ensures that the binary format has been materialized.
      • materialize

        protected abstract BinarySection materialize​(org.apache.flink.api.common.typeutils.TypeSerializer<T> serializer)
                                              throws IOException
        Materializes the Java object into binary format. Subclasses must hold any information they need to do so (for example, RawValueData needs a javaObjectSerializer).
        Throws:
        IOException
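
        The relationship between ensureMaterialized (final, no checked exception) and materialize (abstract, throws IOException) resembles a template method. The following is a simplified sketch of that pattern under stated assumptions, not Flink's actual implementation: LazySketch, LazyInt, and MaterializeDemo are hypothetical names, the serializer parameter is omitted, a plain byte[] replaces BinarySection, and wrapping the IOException in UncheckedIOException is an illustrative choice.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Small driver for the sketch below; all classes here are hypothetical. */
class MaterializeDemo {
    public static void main(String[] args) {
        LazyInt value = new LazyInt(7);
        value.ensureMaterialized();
        value.ensureMaterialized(); // second call is a no-op
        System.out.println("materialize() calls: " + value.materializeCalls);
        System.out.println("binary size in bytes: " + value.getBinary().length);
    }
}

/** Hypothetical, simplified stand-in; NOT Flink's LazyBinaryFormat. */
abstract class LazySketch<T> {
    protected final T javaObject;
    private byte[] binary; // null until materialized

    LazySketch(T javaObject) {
        this.javaObject = javaObject;
    }

    /** Materializes at most once and hides the checked exception from callers. */
    final void ensureMaterialized() {
        if (binary == null) {
            try {
                binary = materialize();
            } catch (IOException e) {
                // Assumption: rethrow unchecked so the method needs no throws clause.
                throw new UncheckedIOException(e);
            }
        }
    }

    byte[] getBinary() {
        return binary;
    }

    /** Subclasses decide how the Java object is turned into bytes. */
    protected abstract byte[] materialize() throws IOException;
}

/** Example subclass that writes an Integer as four big-endian bytes. */
final class LazyInt extends LazySketch<Integer> {
    int materializeCalls = 0;

    LazyInt(int value) {
        super(value);
    }

    @Override
    protected byte[] materialize() throws IOException {
        materializeCalls++;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new DataOutputStream(out).writeInt(javaObject);
        return out.toByteArray();
    }
}
```

        Keeping the null check in the final method means subclasses only describe how to serialize, while the base class decides whether serialization happens at all.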