Class OrcColumnVector

  • All Implemented Interfaces:
    AutoCloseable

    public class OrcColumnVector
    extends ColumnVector
    A column vector class wrapping Hive's ColumnVector. Because Spark's ColumnarBatch only accepts Spark's vectorized.ColumnVector, this class adapts Hive's ColumnVector to Spark's ColumnVector.
    • Method Detail

      • setBatchSize

        public void setBatchSize​(int batchSize)
      • close

        public void close()
        Description copied from class: ColumnVector
        Cleans up memory for this column vector. The column vector is not usable after this. This overrides `AutoCloseable.close` to remove the `throws` clause, because a column vector is in-memory and no exception is expected during closing.
        Specified by:
        close in interface AutoCloseable
        Specified by:
        close in class ColumnVector
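        The effect of dropping the `throws` clause can be seen in a minimal sketch. `InMemoryVector` below is a hypothetical stand-in written against the JDK only; it is not Spark's ColumnVector. Because its `close` declares no checked exception, the try-with-resources statement needs neither a catch block nor a `throws` clause on the enclosing method.

```java
// Hypothetical stand-in for an in-memory column vector (not a Spark class).
final class InMemoryVector implements AutoCloseable {
    private long[] data;

    InMemoryVector(int capacity) {
        this.data = new long[capacity];
    }

    boolean isClosed() {
        return data == null;
    }

    // AutoCloseable.close() declares "throws Exception";
    // this override narrows the signature and declares nothing.
    @Override
    public void close() {
        data = null; // release the backing storage
    }

    // Note: no "throws Exception" here and no catch block below.
    static boolean openAndClose() {
        InMemoryVector vector = new InMemoryVector(4);
        try (InMemoryVector resource = vector) {
            // The vector would be read here while it is still open.
            if (resource.isClosed()) {
                throw new IllegalStateException("vector closed too early");
            }
        }
        // close() has run by the time the try block exits.
        return vector.isClosed();
    }
}
```

        Had `close` kept the inherited signature, `openAndClose` would not compile without handling or declaring `Exception`.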
      • hasNull

        public boolean hasNull()
        Description copied from class: ColumnVector
        Returns true if this column vector contains any null values.
        Specified by:
        hasNull in class ColumnVector
      • numNulls

        public int numNulls()
        Description copied from class: ColumnVector
        Returns the number of nulls in this column vector.
        Specified by:
        numNulls in class ColumnVector
      • isNullAt

        public boolean isNullAt​(int rowId)
        Description copied from class: ColumnVector
        Returns whether the value at rowId is NULL.
        Specified by:
        isNullAt in class ColumnVector
      • getBoolean

        public boolean getBoolean​(int rowId)
        Description copied from class: ColumnVector
        Returns the boolean type value for rowId. If the slot for rowId is null, the return value is undefined.
        Specified by:
        getBoolean in class ColumnVector
      • getByte

        public byte getByte​(int rowId)
        Description copied from class: ColumnVector
        Returns the byte type value for rowId. If the slot for rowId is null, the return value is undefined.
        Specified by:
        getByte in class ColumnVector
      • getShort

        public short getShort​(int rowId)
        Description copied from class: ColumnVector
        Returns the short type value for rowId. If the slot for rowId is null, the return value is undefined.
        Specified by:
        getShort in class ColumnVector
      • getInt

        public int getInt​(int rowId)
        Description copied from class: ColumnVector
        Returns the int type value for rowId. If the slot for rowId is null, the return value is undefined.
        Specified by:
        getInt in class ColumnVector
      • getLong

        public long getLong​(int rowId)
        Description copied from class: ColumnVector
        Returns the long type value for rowId. If the slot for rowId is null, the return value is undefined.
        Specified by:
        getLong in class ColumnVector
      • getFloat

        public float getFloat​(int rowId)
        Description copied from class: ColumnVector
        Returns the float type value for rowId. If the slot for rowId is null, the return value is undefined.
        Specified by:
        getFloat in class ColumnVector
      • getDouble

        public double getDouble​(int rowId)
        Description copied from class: ColumnVector
        Returns the double type value for rowId. If the slot for rowId is null, the return value is undefined.
        Specified by:
        getDouble in class ColumnVector
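        Since the primitive getters above return an undefined value for null slots, callers usually consult the null-inspection methods first. The following is a minimal sketch of that pattern. `NullableIntVector` and `sumNonNull` are hypothetical names that only mirror the method names documented here; the class is not Spark's implementation.

```java
// Hypothetical int vector with a null mask (not a Spark class).
final class NullableIntVector {
    private final int[] values;
    private final boolean[] isNull;
    private final int numNulls;

    NullableIntVector(int[] values, boolean[] isNull) {
        this.values = values;
        this.isNull = isNull;
        int count = 0;
        for (boolean b : isNull) {
            if (b) {
                count++;
            }
        }
        this.numNulls = count;
    }

    boolean hasNull() { return numNulls > 0; }

    int numNulls() { return numNulls; }

    boolean isNullAt(int rowId) { return isNull[rowId]; }

    // For a null slot this returns whatever happens to be stored there.
    int getInt(int rowId) { return values[rowId]; }

    // Sums the non-null rows, skipping per-row null checks when possible.
    static long sumNonNull(NullableIntVector v, int numRows) {
        long sum = 0;
        if (!v.hasNull()) {
            // Fast path: no nulls anywhere, so every slot is meaningful.
            for (int i = 0; i < numRows; i++) {
                sum += v.getInt(i);
            }
            return sum;
        }
        for (int i = 0; i < numRows; i++) {
            if (!v.isNullAt(i)) { // guard before reading a primitive
                sum += v.getInt(i);
            }
        }
        return sum;
    }
}
```

        In the sketch, a null slot may hold any leftover value, which is why the guarded loop never reads it.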
      • getDecimal

        public org.apache.spark.sql.types.Decimal getDecimal​(int rowId,
                                                             int precision,
                                                             int scale)
        Description copied from class: ColumnVector
        Returns the decimal type value for rowId. If the slot for rowId is null, it should return null.
        Specified by:
        getDecimal in class ColumnVector
      • getUTF8String

        public org.apache.spark.unsafe.types.UTF8String getUTF8String​(int rowId)
        Description copied from class: ColumnVector
        Returns the string type value for rowId. If the slot for rowId is null, it should return null. Note that the returned UTF8String may point to the data of this column vector. Copy it if you need to keep it after this column vector is freed.
        Specified by:
        getUTF8String in class ColumnVector
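        The reason for copying can be illustrated with JDK classes alone. This sketch assumes a value that shares a backing byte array with its source, which is the situation the note above warns about; `SharedBytesDemo` and `viewVersusCopy` are hypothetical names, and the overwrite merely simulates the vector's memory being reused.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo of a shared view versus a private copy.
final class SharedBytesDemo {
    // Returns { text read through the view, text read from the copy }.
    static String[] viewVersusCopy() {
        byte[] backing = "spark".getBytes(StandardCharsets.UTF_8);

        // The view shares the backing array; nothing is copied.
        ByteBuffer view = ByteBuffer.wrap(backing, 0, backing.length);
        // The copy owns its bytes.
        byte[] copy = Arrays.copyOfRange(backing, 0, backing.length);

        // Simulate the backing memory being reused for other data.
        byte[] next = "OTHER".getBytes(StandardCharsets.UTF_8);
        System.arraycopy(next, 0, backing, 0, next.length);

        String fromView = StandardCharsets.UTF_8.decode(view).toString();
        String fromCopy = new String(copy, StandardCharsets.UTF_8);
        return new String[] {fromView, fromCopy};
    }
}
```

        The view reflects the overwritten bytes while the copy keeps the original text. The sketch sticks to JDK types so that it runs without Spark on the classpath.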
      • getBinary

        public byte[] getBinary​(int rowId)
        Description copied from class: ColumnVector
        Returns the binary type value for rowId. If the slot for rowId is null, it should return null.
        Specified by:
        getBinary in class ColumnVector
      • getArray

        public ColumnarArray getArray​(int rowId)
        Description copied from class: ColumnVector
        Returns the array type value for rowId. If the slot for rowId is null, it should return null. To support array type, implementations must construct a ColumnarArray and return it in this method. ColumnarArray requires a ColumnVector that stores the data of all the elements of all the arrays in this vector, plus an offset and a length that point to a range in that ColumnVector; that range represents the array for rowId. Implementations are free to decide where to put the data vector and the offsets and lengths. For example, we can use the first child vector as the data vector, and store offsets and lengths in 2 int arrays in this vector.
        Specified by:
        getArray in class ColumnVector
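        The offset-and-length layout described for getArray can be sketched as follows. `FlatArrayVector` is a hypothetical class that uses plain int arrays in place of a child ColumnVector, and it copies the range out for simplicity, whereas the description above suggests ColumnarArray refers to a range of the data vector.

```java
import java.util.Arrays;

// Hypothetical flat storage for a column of int arrays (not a Spark class).
final class FlatArrayVector {
    private final int[] elements;   // elements of all arrays, back to back
    private final int[] offsets;    // start of each row's range in elements
    private final int[] lengths;    // number of elements in each row's array
    private final boolean[] isNull; // null mask, one entry per row

    FlatArrayVector(int[] elements, int[] offsets, int[] lengths, boolean[] isNull) {
        this.elements = elements;
        this.offsets = offsets;
        this.lengths = lengths;
        this.isNull = isNull;
    }

    // Returns the array stored at rowId, or null for a null slot.
    int[] getArray(int rowId) {
        if (isNull[rowId]) {
            return null;
        }
        int start = offsets[rowId];
        return Arrays.copyOfRange(elements, start, start + lengths[rowId]);
    }

    // Rows: [10, 20], null, [], [30, 40, 50]
    static FlatArrayVector example() {
        return new FlatArrayVector(
            new int[] {10, 20, 30, 40, 50},
            new int[] {0, 2, 2, 2},
            new int[] {2, 0, 0, 3},
            new boolean[] {false, true, false, false});
    }
}
```

        Note that a null row and an empty row are distinct: the first is flagged in the null mask, the second simply has length 0.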
      • getMap

        public ColumnarMap getMap​(int rowId)
        Description copied from class: ColumnVector
        Returns the map type value for rowId. If the slot for rowId is null, it should return null. In Spark, a map type value is basically a key data array and a value data array. A key from the key array and the value at the same index in the value array together form one entry of the map. To support map type, implementations must construct a ColumnarMap and return it in this method. ColumnarMap requires a ColumnVector that stores the data of all the keys of all the maps in this vector, another ColumnVector that stores the data of all the values of all the maps in this vector, and an offset and length pair that specifies the range of the key/value arrays belonging to the map type value at rowId.
        Specified by:
        getMap in class ColumnVector
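        The parallel key and value layout described for getMap can be sketched in the same way. `FlatMapVector` is a hypothetical class that uses plain arrays in place of the two ColumnVectors, and it materializes a java.util.Map for readability rather than returning a ColumnarMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical flat storage for a column of string-to-int maps (not a Spark class).
final class FlatMapVector {
    private final String[] keys;    // keys of all maps, back to back
    private final int[] values;     // values of all maps, aligned with keys
    private final int[] offsets;    // start of each row's range
    private final int[] lengths;    // number of entries in each row's map
    private final boolean[] isNull; // null mask, one entry per row

    FlatMapVector(String[] keys, int[] values, int[] offsets, int[] lengths, boolean[] isNull) {
        this.keys = keys;
        this.values = values;
        this.offsets = offsets;
        this.lengths = lengths;
        this.isNull = isNull;
    }

    // Returns the map stored at rowId, or null for a null slot.
    Map<String, Integer> getMap(int rowId) {
        if (isNull[rowId]) {
            return null;
        }
        Map<String, Integer> result = new LinkedHashMap<>();
        int start = offsets[rowId];
        for (int i = start; i < start + lengths[rowId]; i++) {
            // The key and value at the same index form one entry.
            result.put(keys[i], values[i]);
        }
        return result;
    }

    // Rows: {a=1, b=2}, null, {c=3}
    static FlatMapVector example() {
        return new FlatMapVector(
            new String[] {"a", "b", "c"},
            new int[] {1, 2, 3},
            new int[] {0, 2, 2},
            new int[] {2, 0, 1},
            new boolean[] {false, true, false});
    }
}
```

        A single offset and length pair per row suffices because the key and value arrays are index-aligned.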