Class Frame


  • public class Frame
    extends Object
    A data frame. Frames are split into contiguous "regions". With columnar frames (FrameType.COLUMNAR) each region is a column. With row-based frames (FrameType.ROW_BASED) there are always two regions: row offsets and row data.

    This object is lightweight. It has constant overhead regardless of the number of rows or regions.

    Frames wrap some Memory. If the memory is backed by a resource that requires explicit releasing, such as direct off-heap memory or a memory-mapped file, the creator of the Memory is responsible for releasing that resource when the frame is no longer needed.

    Frames are written with FrameWriter and read with FrameReader.

    Frame format:
    - 1 byte: FrameType.version()
    - 8 bytes: size in bytes of the frame, little-endian long
    - 4 bytes: number of rows, little-endian int
    - 4 bytes: number of regions, little-endian int
    - 1 byte: 0 if frame is non-permuted, 1 if frame is permuted
    - 4 bytes x numRows: permutation section; only present for permuted frames. Array of little-endian ints mapping logical row numbers to physical row numbers.
    - 8 bytes x numRegions: region offsets. Array of end offsets of each region (exclusive), relative to start of frame, as little-endian longs.
    - NNN bytes: regions, back-to-back.

    There is also a compressed frame format. Compressed frames are written by writeTo(java.nio.channels.WritableByteChannel, boolean, java.nio.ByteBuffer, org.apache.druid.frame.channel.ByteTracker) when "compress" is true, and decompressed by decompress(org.apache.datasketches.memory.Memory, long, long).

    Compressed frame format:
    - 1 byte: compression type: CompressionStrategy.getId(). Currently, only LZ4 is supported.
    - 8 bytes: compressed frame length, little-endian long
    - 8 bytes: uncompressed frame length (numBytes), little-endian long
    - NNN bytes: LZ4-compressed frame
    - 8 bytes: 64-bit xxhash checksum of prior content, including the 17-byte header (1 + 8 + 8 bytes, per the fields above) and the compressed frame, little-endian long

    Note to developers: if we end up needing to add more fields here, consider introducing a Smile (or Protobuf, etc.) header to make it simpler to add more fields.
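    The uncompressed header layout described above can be read with nothing but java.nio. The following is a minimal sketch, not Druid's implementation: the class name FrameHeaderSketch, its fields, and the regionsStart() helper are hypothetical, and the byte offsets are derived only from the field list in this description (1 + 8 + 4 + 4 + 1 = 18 header bytes).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative only: reads the uncompressed frame header fields from the documented layout. */
final class FrameHeaderSketch {
    // Assumption: offsets follow directly from the documented field order and widths.
    static final int HEADER_SIZE = 18;

    final byte version;
    final long numBytes;
    final int numRows;
    final int numRegions;
    final boolean permuted;

    private FrameHeaderSketch(byte version, long numBytes, int numRows, int numRegions, boolean permuted) {
        this.version = version;
        this.numBytes = numBytes;
        this.numRows = numRows;
        this.numRegions = numRegions;
        this.permuted = permuted;
    }

    /** Parses the header starting at index 0 of the buffer, ignoring its position. */
    static FrameHeaderSketch parse(ByteBuffer frame) {
        if (frame.limit() < HEADER_SIZE) {
            throw new IllegalArgumentException("Too short for a frame header");
        }
        // duplicate() shares the bytes but has its own byte order setting.
        ByteBuffer le = frame.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        return new FrameHeaderSketch(
            le.get(0),        // 1 byte: version
            le.getLong(1),    // 8 bytes: total frame size
            le.getInt(9),     // 4 bytes: number of rows
            le.getInt(13),    // 4 bytes: number of regions
            le.get(17) != 0   // 1 byte: permuted flag
        );
    }

    /** Offset of the first region: header, optional permutation section, then region offsets. */
    long regionsStart() {
        return HEADER_SIZE + (permuted ? 4L * numRows : 0L) + 8L * numRegions;
    }
}
```

    For example, a permuted frame with 3 rows and 2 regions would place its first region at byte 18 + 12 + 16 = 46 under these assumptions.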
    • Field Detail

      • COMPRESSED_FRAME_HEADER_SIZE

        public static final int COMPRESSED_FRAME_HEADER_SIZE
        See Also:
        Constant Field Values
      • COMPRESSED_FRAME_TRAILER_SIZE

        public static final int COMPRESSED_FRAME_TRAILER_SIZE
        See Also:
        Constant Field Values
      • COMPRESSED_FRAME_ENVELOPE_SIZE

        public static final int COMPRESSED_FRAME_ENVELOPE_SIZE
        See Also:
        Constant Field Values
    • Method Detail

      • wrap

        public static Frame wrap(org.apache.datasketches.memory.Memory memory)
        Returns a frame backed by the provided Memory. This operation does not do any copies or allocations. The Memory must be in little-endian byte order. Behavior is undefined if the memory is modified at any time during the lifetime of the Frame object.
      • wrap

        public static Frame wrap(ByteBuffer buffer)
        Returns a frame backed by the provided ByteBuffer. This operation does not do any copies or allocations. The position and limit of the buffer are ignored. If you need them to be respected, call ByteBuffer.slice() first, or use wrap(Memory) to wrap a particular region.
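        Because position and limit are ignored, a frame stored in the middle of a larger buffer has to be isolated first. The sketch below shows the ByteBuffer.slice() approach using only the JDK; the class SliceSketch and the helper window() are hypothetical names, and the call to wrap(ByteBuffer) appears only as a comment.

```java
import java.nio.ByteBuffer;

/** Illustrative only: isolates a sub-range of a buffer so that index 0 is the range start. */
final class SliceSketch {
    /**
     * Returns a view covering [position, limit) of the source. The view's index 0 maps to
     * the source's "position", and its capacity is limit - position.
     */
    static ByteBuffer window(ByteBuffer source, int position, int limit) {
        // Work on a duplicate so the caller's position and limit stay untouched.
        ByteBuffer dup = source.duplicate();
        dup.limit(limit);
        dup.position(position);
        return dup.slice();
        // The result could then be passed to Frame.wrap(ByteBuffer).
    }
}
```

        The slice shares storage with the source buffer, so no bytes are copied.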
      • wrap

        public static Frame wrap(byte[] bytes)
        Returns a frame backed by the provided byte array. This operation does not do any copies or allocations. A byte array has no position or limit, so the frame starts at index 0 of the array. To wrap only a particular region, use wrap(Memory).
      • decompress

        public static Frame decompress(org.apache.datasketches.memory.Memory memory,
                                       long position,
                                       long length)
        Decompresses the provided memory and returns a frame backed by that decompressed memory. This operation is safe even on corrupt data: it validates position, length, and checksum prior to decompressing. This operation allocates memory on-heap to store the decompressed frame.
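        The position and length checks can be illustrated against the compressed frame format given in the class description. This is a sketch under stated assumptions, not the actual implementation: CompressedEnvelopeSketch and read() are hypothetical, the sizes are derived from the documented field widths, and checksum verification and LZ4 decompression are omitted because neither is available in the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative only: bounds-checks and reads the envelope around a compressed frame. */
final class CompressedEnvelopeSketch {
    // Assumption: sizes derived from the documented layout, not read from Druid's constants.
    static final int HEADER_SIZE = 1 + 8 + 8; // compression id + two lengths
    static final int TRAILER_SIZE = 8;        // checksum
    static final int ENVELOPE_SIZE = HEADER_SIZE + TRAILER_SIZE;

    final byte compressionId;
    final long compressedLength;
    final long uncompressedLength;
    final long checksum;

    private CompressedEnvelopeSketch(byte id, long compressed, long uncompressed, long checksum) {
        this.compressionId = id;
        this.compressedLength = compressed;
        this.uncompressedLength = uncompressed;
        this.checksum = checksum;
    }

    static CompressedEnvelopeSketch read(ByteBuffer data, int position, int length) {
        // Reject ranges that are too short or fall outside the buffer.
        if (position < 0 || length < ENVELOPE_SIZE || position > data.limit() - length) {
            throw new IllegalArgumentException("Position and length are out of bounds");
        }
        ByteBuffer le = data.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        byte id = le.get(position);
        long compressed = le.getLong(position + 1);
        long uncompressed = le.getLong(position + 9);
        // The declared compressed length must match what the range can actually hold.
        if (compressed != length - ENVELOPE_SIZE) {
            throw new IllegalArgumentException("Compressed length does not match range length");
        }
        if (uncompressed < 0) {
            throw new IllegalArgumentException("Negative uncompressed length");
        }
        long checksum = le.getLong(position + length - TRAILER_SIZE);
        return new CompressedEnvelopeSketch(id, compressed, uncompressed, checksum);
    }
}
```

        Checking the declared lengths before touching the payload is what lets a reader reject truncated or corrupt input without reading out of bounds.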
      • numBytes

        public long numBytes()
      • numRows

        public int numRows()
      • numRegions

        public int numRegions()
      • isPermuted

        public boolean isPermuted()
      • physicalRow

        public int physicalRow(int logicalRow)
        Maps a logical row number to a physical row number. If the frame is non-permuted, these are the same. If the frame is permuted, this uses the sorted-row mappings to remap the row number.
        Throws:
        IllegalArgumentException - if "logicalRow" is out of bounds
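        The mapping can be sketched directly against the frame format from the class description. The code below is an illustration, not the real method: PermutationSketch is a hypothetical class, and it assumes the permutation section starts immediately after the 18-byte header, as the documented field order suggests.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative only: logical-to-physical row lookup over the documented frame layout. */
final class PermutationSketch {
    // Assumption: 1 + 8 + 4 + 4 + 1 header bytes precede the permutation section.
    static final int HEADER_SIZE = 18;

    static int physicalRow(ByteBuffer frame, int logicalRow) {
        ByteBuffer le = frame.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        int numRows = le.getInt(9);
        boolean permuted = le.get(17) != 0;
        if (logicalRow < 0 || logicalRow >= numRows) {
            throw new IllegalArgumentException("Row " + logicalRow + " out of bounds");
        }
        if (!permuted) {
            // Non-permuted frames store rows in logical order.
            return logicalRow;
        }
        // Permuted frames hold one little-endian int per logical row.
        return le.getInt(HEADER_SIZE + Integer.BYTES * logicalRow);
    }
}
```

        Keeping the permutation as a separate array means rows can be reordered by rewriting 4 bytes per row, leaving the region data in place.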
      • region

        public org.apache.datasketches.memory.Memory region(int regionNumber)
        Returns memory corresponding to a particular region of this frame.
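        Region boundaries follow from the region offsets array in the frame format. The sketch below computes the start and end of a region from raw bytes; RegionBoundsSketch is a hypothetical class, and it assumes that region 0 begins right after the region offsets array, which is inferred from the "back-to-back" layout rather than confirmed by this page.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative only: derives [start, end) of a region from the documented frame layout. */
final class RegionBoundsSketch {
    static final int HEADER_SIZE = 18; // assumption: 1 + 8 + 4 + 4 + 1 bytes

    /** Returns {start, end}, both relative to the start of the frame; end is exclusive. */
    static long[] regionBounds(ByteBuffer frame, int regionNumber) {
        ByteBuffer le = frame.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        int numRows = le.getInt(9);
        int numRegions = le.getInt(13);
        boolean permuted = le.get(17) != 0;
        if (regionNumber < 0 || regionNumber >= numRegions) {
            throw new IllegalArgumentException("Region " + regionNumber + " out of bounds");
        }
        // The offsets array follows the header and, if present, the permutation section.
        int offsetsStart = HEADER_SIZE + (permuted ? Integer.BYTES * numRows : 0);
        long regionsStart = offsetsStart + (long) Long.BYTES * numRegions;
        // Each entry is an end offset, so a region starts where the previous one ends.
        long start = regionNumber == 0
            ? regionsStart
            : le.getLong(offsetsStart + Long.BYTES * (regionNumber - 1));
        long end = le.getLong(offsetsStart + Long.BYTES * regionNumber);
        return new long[] {start, end};
    }
}
```

        Storing only end offsets is enough because the regions are contiguous: one array of numRegions longs describes every boundary.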
      • writableMemory

        public org.apache.datasketches.memory.WritableMemory writableMemory()
        Direct, writable access to this frame's memory. Used by operations that modify the frame in-place, like FrameSort. Most callers should use region(int) and physicalRow(int), rather than this direct-access method.
        Throws:
        IllegalStateException - if this frame wraps non-writable memory
      • writeTo

        public long writeTo(WritableByteChannel channel,
                            boolean compress,
                            @Nullable
                            ByteBuffer compressionBuffer,
                            ByteTracker byteTracker)
                     throws IOException
        Writes this frame to a channel, optionally compressing it as well. Returns the number of bytes written. The provided compressionBuffer is used to hold compressed data temporarily, prior to writing it to the channel. It must be at least as large as compressionBufferSize(numBytes()), or else an IllegalStateException is thrown. It may be null if "compress" is false.
        Throws:
        IOException