shark.execution

MapJoinOperator

class MapJoinOperator extends CommonJoinOperator[MapJoinDesc]

A join operator optimized for joining one large table with a number of small tables that fit in memory. The join can be performed as a map-only job, which avoids an expensive shuffle.

Unlike Hive, we don't spill the hash tables to disk. If the "small" tables are too big to fit in memory, the normal join should be used instead.
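
The build-then-probe idea described above can be illustrated with plain Scala collections. This is a minimal sketch under stated assumptions, not Shark's implementation: the names MapJoinSketch, Row, buildHashTable and probe are hypothetical, rows are modeled as Seq[Any], and only a two-table inner join is shown.

```scala
// Minimal sketch of a map-side hash join using plain Scala collections.
// All names here (MapJoinSketch, Row, buildHashTable, probe) are hypothetical
// and are not part of Shark or Hive.
object MapJoinSketch {
  type Row = Seq[Any]

  // Build phase: index the small table by its join key.
  // The whole table stays in memory; nothing is spilled to disk.
  def buildHashTable(small: Seq[Row], keyIdx: Int): Map[Any, Seq[Row]] =
    small.groupBy(row => row(keyIdx))

  // Probe phase: stream through the large table one row at a time and emit
  // one joined row per match (inner join). The large table is never shuffled.
  def probe(large: Iterator[Row], keyIdx: Int, table: Map[Any, Seq[Row]]): Iterator[Row] =
    large.flatMap { bigRow =>
      table.getOrElse(bigRow(keyIdx), Seq.empty).map(smallRow => bigRow ++ smallRow)
    }

  def main(args: Array[String]): Unit = {
    val users = Seq[Row](Seq(1, "alice"), Seq(2, "bob"))                        // small table
    val clicks = Iterator[Row](Seq(1, "/home"), Seq(3, "/x"), Seq(2, "/cart"))  // large table
    probe(clicks, 0, buildHashTable(users, 0)).foreach(println)
  }
}
```

Because the probe side is an Iterator, memory use is bounded by the size of the hash table rather than by the large table, which is why the small tables must fit in memory.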

Linear Supertypes
CommonJoinOperator[MapJoinDesc], JoinFilter[MapJoinDesc], NaryOperator[MapJoinDesc], Operator[MapJoinDesc], Serializable, Serializable, LogHelper, Logging, AnyRef, Any

Instance Constructors

  1. new MapJoinOperator()

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def addChild(child: Operator[_ <: HiveDesc]): Unit

    Definition Classes
    Operator
  7. def addParent(parent: Operator[_ <: HiveDesc]): Unit

    Definition Classes
    Operator
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. var bigTableAlias: Int

  10. var bigTableAliasByte: Byte

  11. def childOperators: ArrayBuffer[Operator[_ <: HiveDesc]]

    Definition Classes
    Operator
  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. def combineMultipleRdds(rdds: Seq[(Int, RDD[_])]): RDD[_]

    Called on master.

    Definition Classes
    MapJoinOperator → NaryOperator
  14. def computeJoinKeyValuesOnPartition[T](iter: Iterator[T], posByte: Byte): Iterator[(Seq[AnyRef], Seq[Array[AnyRef]])]

  15. var conf: MapJoinDesc

    Definition Classes
    CommonJoinOperator
  16. def desc: MapJoinDesc

    Definition Classes
    Operator
  17. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  18. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  19. def errStream(): PrintStream

    Definition Classes
    LogHelper
  20. def execute(): RDD[_]

    Execute the operator. This should recursively execute parent operators.

    Definition Classes
    MapJoinOperator → NaryOperator → Operator
  21. def executeParents(): Seq[(Int, RDD[_])]

    Definition Classes
    MapJoinOperator → Operator
  22. def filterEval(data: AnyRef): Boolean

    Definition Classes
    CommonJoinOperator
    Annotations
    @inline()
  23. var filterMap: Array[Array[Int]]

    Definition Classes
    CommonJoinOperator
  24. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  25. def generate[B <: AnyRef](elements: Array[B]): Array[AnyRef]

    Create the new output tuple based on the input elements and return it.

    The join sequence generally looks like:

                  join (filter value eval) ...
                  /    \
        join (filter value eval)    \
          /    \                     \
    table1(col1,col2..)  table2(colx, coly..)  table3(cola, colb..) ...

    elements
      [input] the values of all input tables' columns

    Definition Classes
    JoinFilter
  26. def getBigTableAlias(): Int

  27. def getBigTableAliasByte(): Byte

  28. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  29. def getConf(): MapJoinDesc

    Definition Classes
    CommonJoinOperator
  30. def getInputOffsets(): Array[Int]

    Definition Classes
    JoinFilter
  31. def getJoinConditions(): Array[JoinCondDesc]

    Definition Classes
    CommonJoinOperator
  32. def getNullCheck(): Boolean

    Definition Classes
    CommonJoinOperator
  33. def getNumTables(): Int

    Definition Classes
    CommonJoinOperator
  34. def getOrder(): Array[Byte]

    Definition Classes
    CommonJoinOperator
  35. def getPosBigTable(): Int

  36. def getResultRowSize(): Int

    Definition Classes
    JoinFilter
  37. def getResultTupleSizes(): Array[Int]

    Definition Classes
    JoinFilter
  38. def getTag: Int

    Return the join tag. This is usually just 0. ReduceSink might set it to something else.

    Definition Classes
    Operator
  39. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  40. def hconf: HiveConf

    Definition Classes
    Operator
  41. def initEvaluators(evals: Array[ExprNodeEvaluator], start: Int, length: Int, rowInspector: ObjectInspector): Array[ObjectInspector]

    Initialize an array of ExprNodeEvaluator from start, for specified length and return the result ObjectInspectors.

    Attributes
    protected
    Definition Classes
    Operator
  42. def initEvaluators(evals: Array[ExprNodeEvaluator], rowInspector: ObjectInspector): Array[ObjectInspector]

    Initialize an array of ExprNodeEvaluator and return the result ObjectInspectors.

    Attributes
    protected
    Definition Classes
    Operator
  43. def initEvaluatorsAndReturnStruct(evals: Array[ExprNodeEvaluator], outputColName: List[String], rowInspector: ObjectInspector): StructObjectInspector

    Initialize an array of ExprNodeEvaluator and put the return values into a StructObjectInspector with integer field names.

    Attributes
    protected
    Definition Classes
    Operator
  44. def initEvaluatorsAndReturnStruct(evals: Array[ExprNodeEvaluator], fieldObjectInspectors: Array[ObjectInspector], distinctColIndices: List[List[Integer]], outputColNames: List[String], length: Int, rowInspector: ObjectInspector): StructObjectInspector

    Copied from org.apache.hadoop.hive.ql.exec.ReduceSinkOperator. Initializes an array of ExprNodeEvaluator, adds a Union field for the distinct column indices for group by, and puts the return values into a StructObjectInspector with the output column names.

    If distinctColIndices is empty, the object inspector is the same as that of Operator#initEvaluatorsAndReturnStruct(ExprNodeEvaluator[], List, ObjectInspector).

    Attributes
    protected
    Definition Classes
    Operator
  45. def initEvaluatorsAndReturnStruct(evals: Array[ExprNodeEvaluator], distinctColIndices: List[List[Integer]], outputColNames: List[String], length: Int, rowInspector: ObjectInspector): StructObjectInspector

    Copied from org.apache.hadoop.hive.ql.exec.ReduceSinkOperator. Initializes an array of ExprNodeEvaluator, adds a Union field for the distinct column indices for group by, and puts the return values into a StructObjectInspector with the output column names.

    If distinctColIndices is empty, the object inspector is the same as that of Operator#initEvaluatorsAndReturnStruct(ExprNodeEvaluator[], List, ObjectInspector).

    Attributes
    protected
    Definition Classes
    Operator
  46. def initializeJoinFilterOnMaster(): Unit

    Attributes
    protected
    Definition Classes
    JoinFilter
  47. def initializeMasterOnAll(): Unit

    Recursively calls initializeOnMaster() for the entire query plan. Parent operators are called before children.

    Definition Classes
    Operator
  48. def initializeOnMaster(): Unit

    Initialize the operator on the master node. This can depend on other nodes. When an operator's initializeOnMaster() is invoked, initializeOnMaster() has already been invoked on all of its parents.

    Definition Classes
    MapJoinOperator → CommonJoinOperator → Operator
  49. def initializeOnSlave(): Unit

    Initialize the operator on slave nodes. This method should have no dependency on parents or children. Everything that is not used in this method should be marked @transient.

    Definition Classes
    MapJoinOperator → CommonJoinOperator → Operator
  50. def inputObjectInspectors(): Seq[ObjectInspector]

    Attributes
    protected
    Definition Classes
    Operator
  51. var inputOffsets: Array[Int]

    Definition Classes
    JoinFilter
  52. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  53. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  54. var joinConditions: Array[JoinCondDesc]

    Definition Classes
    CommonJoinOperator
  55. var joinFilterObjectInspectors: Array[List[ObjectInspector]]

    Definition Classes
    CommonJoinOperator
  56. var joinFilters: Array[List[ExprNodeEvaluator]]

    Definition Classes
    CommonJoinOperator
  57. var joinKeys: Array[List[ExprNodeEvaluator]]

  58. var joinKeysObjectInspectors: Array[List[ObjectInspector]]

  59. def joinOnPartition[T](iter: Iterator[T], hashtables: Map[Int, HashMap[Seq[AnyRef], Array[Array[AnyRef]]]]): Iterator[_]

    Stream through the large table and process the join using the hash tables. Note that this is a specialized processPartition that accepts an extra parameter for the hash tables (built from the small tables).

  60. var joinVals: Array[List[ExprNodeEvaluator]]

    Definition Classes
    CommonJoinOperator
  61. var joinValues: Array[List[ExprNodeEvaluator]]

  62. var joinValuesObjectInspectors: Array[List[ObjectInspector]]

    Definition Classes
    CommonJoinOperator
  63. var joinValuesStandardObjectInspectors: Array[List[ObjectInspector]]

    Definition Classes
    CommonJoinOperator
  64. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  65. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  66. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  67. def logError(msg: String, exception: Throwable): Unit

    Definition Classes
    LogHelper
  68. def logError(msg: String, detail: String): Unit

    Definition Classes
    LogHelper
  69. def logError(msg: ⇒ String): Unit

    Definition Classes
    LogHelper → Logging
  70. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  71. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  72. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  73. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  74. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  75. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  76. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  77. val metadataKeyTag: Int

  78. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  79. var noOuterJoin: Boolean

    Definition Classes
    CommonJoinOperator
  80. final def notify(): Unit

    Definition Classes
    AnyRef
  81. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  82. var nullCheck: Boolean

    Definition Classes
    CommonJoinOperator
  83. var numTables: Int

    Definition Classes
    CommonJoinOperator
  84. var objectInspectors: Seq[ObjectInspector]

    Definition Classes
    Operator
  85. var order: Array[Byte]

    Definition Classes
    CommonJoinOperator
  86. def outStream(): PrintStream

    Definition Classes
    LogHelper
  87. def outputObjectInspector(): StandardStructObjectInspector

    Definition Classes
    MapJoinOperator → CommonJoinOperator → Operator
  88. def parentOperators: ArrayBuffer[Operator[_ <: HiveDesc]]

    Definition Classes
    Operator
  89. def parentOperatorsAsJavaList: List[Operator[_ <: HiveDesc]]

    Return the parent operators as a Java List. This is for interoperability with Java. We use this in explain's Java code.

    Definition Classes
    Operator
  90. var posBigTable: Int

  91. def postprocessRdd(rdd: RDD[_]): RDD[_]

    Called on master.

    Definition Classes
    NaryOperator
  92. def processPartition(split: Int, iter: Iterator[_]): Iterator[_]

    Process a partition. Called on slaves.

    Definition Classes
    MapJoinOperator → NaryOperator → Operator
  93. var resultRowSize: Int

    Definition Classes
    JoinFilter
  94. var resultTupleSizes: Array[Int]

    Definition Classes
    JoinFilter
  95. def returnTerminalOperators(): Seq[Operator[_ <: HiveDesc]]

    Definition Classes
    Operator
  96. def returnTopOperators(): Seq[Operator[_]]

    Definition Classes
    Operator
  97. var rowBuffer: Array[AnyRef]

    Definition Classes
    CommonJoinOperator
  98. def setBigTableAlias(arg0: Int): Unit

  99. def setBigTableAliasByte(arg0: Byte): Unit

  100. def setColumnValues(data: AnyRef, outputRow: Array[AnyRef], tblIdx: Int, offsets: IndexedSeq[Int]): Unit

    Copy the column values of the input table to the output tuple.

    Attributes
    protected
    Definition Classes
    JoinFilter
  101. def setConf(arg0: MapJoinDesc): Unit

    Definition Classes
    CommonJoinOperator
  102. def setDesc[B >: MapJoinDesc](d: B): Unit

    Definition Classes
    Operator
  103. def setInputOffsets(arg0: Array[Int]): Unit

    Definition Classes
    JoinFilter
  104. def setJoinConditions(arg0: Array[JoinCondDesc]): Unit

    Definition Classes
    CommonJoinOperator
  105. def setNullCheck(arg0: Boolean): Unit

    Definition Classes
    CommonJoinOperator
  106. def setNumTables(arg0: Int): Unit

    Definition Classes
    CommonJoinOperator
  107. def setOrder(arg0: Array[Byte]): Unit

    Definition Classes
    CommonJoinOperator
  108. def setPosBigTable(arg0: Int): Unit

  109. def setResultRowSize(arg0: Int): Unit

    Definition Classes
    JoinFilter
  110. def setResultTupleSizes(arg0: Array[Int]): Unit

    Definition Classes
    JoinFilter
  111. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  112. var tagLen: Int

    Definition Classes
    CommonJoinOperator
  113. def toString(): String

    Definition Classes
    AnyRef → Any
  114. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  115. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  116. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
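
The description of joinOnPartition in the member list above (stream through the large table and probe hash tables built from the small tables) can be sketched as follows. This is a simplified illustration, not the actual method: it assumes each large-table row has already been split into a join key and its column values, it models inner-join semantics only, and the object name JoinOnPartitionSketch and the aliases Key and Values are hypothetical. Only the shape of the hashtables parameter mirrors the documented signature.

```scala
import scala.collection.mutable.HashMap

// Hypothetical sketch; not Shark's MapJoinOperator.joinOnPartition.
object JoinOnPartitionSketch {
  type Key = Seq[AnyRef]
  type Values = Array[AnyRef]

  // Assumption: each big-table row arrives as (join key, column values),
  // and a key missing from any small table produces no output (inner join).
  def joinOnPartition(
      iter: Iterator[(Key, Values)],
      hashtables: Map[Int, HashMap[Key, Array[Values]]]): Iterator[Seq[AnyRef]] = {
    // Visit the small tables in a fixed order so the output column order is stable.
    val tables = hashtables.toSeq.sortBy(_._1).map(_._2)
    iter.flatMap { case (key, bigValues) =>
      // Start from the big-table row and extend it with every matching row
      // of each small table in turn.
      tables.foldLeft(Seq[Seq[AnyRef]](bigValues.toSeq)) { (partial, table) =>
        val matches = table.getOrElse(key, Array.empty[Values])
        for (row <- partial; m <- matches.toSeq) yield row ++ m.toSeq
      }
    }
  }
}
```

The result is produced lazily, so the large table is consumed one row at a time while only the hash tables are held in memory.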

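The ordering contract stated for initializeMasterOnAll and initializeOnMaster (parent operators are initialized before their children) can be illustrated with a small stand-in class. PlanNode is hypothetical and is not Shark's Operator; the log parameter and the initialized flag are assumptions of this sketch, added only to make the visiting order observable and to avoid initializing a shared parent twice.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for an operator node; not Shark's Operator class.
class PlanNode(val name: String) {
  val parents: ArrayBuffer[PlanNode] = ArrayBuffer.empty[PlanNode]
  var initialized = false

  def addParent(p: PlanNode): Unit = { parents += p }

  // Stand-in for initializeOnMaster(): records the order of initialization.
  def initializeOnMaster(log: ArrayBuffer[String]): Unit = { log += name }

  // Initialize every ancestor first, then this node. The flag keeps a node
  // that is reachable through several paths from being initialized twice
  // (an assumption of this sketch, not something the documentation states).
  def initializeMasterOnAll(log: ArrayBuffer[String]): Unit = {
    if (!initialized) {
      parents.foreach(_.initializeMasterOnAll(log))
      initializeOnMaster(log)
      initialized = true
    }
  }
}
```

Calling initializeMasterOnAll on a join node with two scan parents records both scans before the join, which matches the guarantee that an operator can rely on its parents being initialized.
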