execution

Type Members

case class Aggregate(partial: Boolean, groupingExpressions: Seq[Expression], aggregateExpressions: Seq[NamedExpression], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi :: Groups input data by groupingExpressions and computes the aggregateExpressions for each group.
:: DeveloperApi :: Groups input data by groupingExpressions and computes the aggregateExpressions for each group.
partial
if true then aggregation is done partially on local data without shuffling to ensure all values where groupingExpressions are equal are present.
groupingExpressions
expressions that are evaluated to determine grouping.
aggregateExpressions
expressions that are computed for each group.
child
the input data source.

Annotations
@DeveloperApi()
case class AggregateEvaluation(schema: Seq[Attribute], initialValues: Seq[Expression], update: Seq[Expression], result: Expression) extends Product with Serializable
case class BatchPythonEvaluation(udf: PythonUDF, output: Seq[Attribute], child: SparkPlan) extends SparkPlan with Product with Serializable

:: DeveloperApi :: Uses PythonRDD to evaluate a PythonUDF, one partition of tuples at a time.
:: DeveloperApi :: Uses PythonRDD to evaluate a PythonUDF, one partition of tuples at a time. The input data is cached and zipped with the result of the udf evaluation.

Annotations
@DeveloperApi()
case class CacheTableCommand(tableName: String, plan: Option[LogicalPlan], isLazy: Boolean) extends Command with RunnableCommand with Product with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
case class DescribeCommand(child: SparkPlan, output: Seq[Attribute], isExtended: Boolean) extends Command with RunnableCommand with Product with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
case class Distinct(partial: Boolean, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi :: Computes the set of distinct input rows using a HashSet.
:: DeveloperApi :: Computes the set of distinct input rows using a HashSet.
partial
when true the distinct operation is performed partially, per partition, without shuffling the data.
child
the input query plan.

Annotations
@DeveloperApi()
case class EvaluatePython(udf: PythonUDF, child: LogicalPlan, resultAttribute: AttributeReference) extends catalyst.plans.logical.UnaryNode with Product with Serializable

:: DeveloperApi :: Evaluates a PythonUDF, appending the result to the end of the input tuple.
:: DeveloperApi :: Evaluates a PythonUDF, appending the result to the end of the input tuple.

Annotations
@DeveloperApi()
case class Except(left: SparkPlan, right: SparkPlan) extends SparkPlan with BinaryNode with Product with Serializable

:: DeveloperApi :: Returns a table with the elements from left that are not in right using the built-in spark subtract function.
:: DeveloperApi :: Returns a table with the elements from left that are not in right using the built-in spark subtract function.

Annotations
@DeveloperApi()
case class Exchange(newPartitioning: Partitioning, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
case class ExecutedCommand(cmd: RunnableCommand) extends SparkPlan with Product with Serializable

A physical operator that executes the run method of a RunnableCommand and saves the result to prevent multiple executions.
case class Expand(projections: Seq[GroupExpression], output: Seq[Attribute], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

Apply the all of the GroupExpressions to every input row, hence we will get multiple output rows for a input row.
Apply the all of the GroupExpressions to every input row, hence we will get multiple output rows for a input row.
projections
The group of expressions, all of the group expressions should output the same schema specified bye the parameter output
output
The output Schema
child
Child operator

Annotations
@DeveloperApi()
case class ExplainCommand(logicalPlan: LogicalPlan, output: Seq[Attribute] = ..., extended: Boolean = false) extends Command with RunnableCommand with Product with Serializable

An explain command for users to see how a command will be executed.
An explain command for users to see how a command will be executed.
Note that this command takes in a logical plan, runs the optimizer on the logical plan (but do NOT actually execute it).
:: DeveloperApi ::

Annotations
@DeveloperApi()
case class ExternalSort(sortOrder: Seq[SortOrder], global: Boolean, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi :: Performs a sort, spilling to disk as needed.
:: DeveloperApi :: Performs a sort, spilling to disk as needed.
global
when true performs a global sort of all partitions by shuffling the data first if necessary.

Annotations
@DeveloperApi()
case class Filter(condition: Expression, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
case class Generate(generator: Generator, join: Boolean, outer: Boolean, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi :: Applies a Generator to a stream of input rows, combining the output of each into a new stream of rows.
:: DeveloperApi :: Applies a Generator to a stream of input rows, combining the output of each into a new stream of rows. This operation is similar to a flatMap in functional programming with one important additional feature, which allows the input rows to be joined with their output.
join
when true, each output row is implicitly joined with the input tuple that produced it.
outer
when true, each input row will be output at least once, even if the output of the given generator is empty. outer has no effect when join is false.

Annotations
@DeveloperApi()
case class GeneratedAggregate(partial: Boolean, groupingExpressions: Seq[Expression], aggregateExpressions: Seq[NamedExpression], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi :: Alternate version of aggregation that leverages projection and thus code generation.
:: DeveloperApi :: Alternate version of aggregation that leverages projection and thus code generation. Aggregations are converted into a set of projections from a aggregation buffer tuple back onto itself. Currently only used for simple aggregations like SUM, COUNT, or AVERAGE are supported.
partial
if true then aggregation is done partially on local data without shuffling to ensure all values where groupingExpressions are equal are present.
groupingExpressions
expressions that are evaluated to determine grouping.
aggregateExpressions
expressions that are computed for each group.
child
the input data source.

Annotations
@DeveloperApi()
case class Intersect(left: SparkPlan, right: SparkPlan) extends SparkPlan with BinaryNode with Product with Serializable

:: DeveloperApi :: Returns the rows in left that also appear in right using the built in spark intersection function.
:: DeveloperApi :: Returns the rows in left that also appear in right using the built in spark intersection function.

Annotations
@DeveloperApi()
case class Limit(limit: Int, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi :: Take the first limit elements.
:: DeveloperApi :: Take the first limit elements. Note that the implementation is different depending on whether this is a terminal operator or not. If it is terminal and is invoked using executeCollect, this operator uses something similar to Spark's take method on the Spark driver. If it is not terminal or is invoked using execute, we first take the limit on each partition, and then repartition all the data to a single partition to compute the global limit.

Annotations
@DeveloperApi()
case class LocalTableScan(output: Seq[Attribute], rows: Seq[Row]) extends SparkPlan with LeafNode with Product with Serializable

Physical plan node for scanning data from a local collection.
case class LogicalLocalTable(output: Seq[Attribute], rows: Seq[Row])(sqlContext: SQLContext) extends LogicalPlan with MultiInstanceRelation with Product with Serializable

Logical plan node for scanning data from a local collection.
case class LogicalRDD(output: Seq[Attribute], rdd: RDD[Row])(sqlContext: SQLContext) extends LogicalPlan with MultiInstanceRelation with Product with Serializable

Logical plan node for scanning data from an RDD.
case class OutputFaker(output: Seq[Attribute], child: SparkPlan) extends SparkPlan with Product with Serializable

:: DeveloperApi :: A plan node that does nothing but lie about the output of its child.
:: DeveloperApi :: A plan node that does nothing but lie about the output of its child. Used to spice a (hopefully structurally equivalent) tree from a different optimization sequence into an already resolved tree.

Annotations
@DeveloperApi()
case class PhysicalRDD(output: Seq[Attribute], rdd: RDD[Row]) extends SparkPlan with LeafNode with Product with Serializable

Physical plan node for scanning data from an RDD.
case class Project(projectList: Seq[NamedExpression], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
class QueryExecutionException extends Exception
trait RunnableCommand extends Command

A logical command that is executed for its side-effects.
A logical command that is executed for its side-effects. RunnableCommands are wrapped in ExecutedCommand during execution.
case class Sample(fraction: Double, withReplacement: Boolean, seed: Long, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
case class SetCommand(kv: Option[(String, Option[String])], output: Seq[Attribute]) extends Command with RunnableCommand with Logging with Product with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
case class ShowTablesCommand(databaseName: Option[String]) extends Command with RunnableCommand with Product with Serializable

A command for users to get tables in the given database.
A command for users to get tables in the given database. If a databaseName is not given, the current database will be used. The syntax of using this command in SQL is:
```
SHOW TABLES [IN databaseName]
```
:: DeveloperApi ::
Annotations
@DeveloperApi()
case class Sort(sortOrder: Seq[SortOrder], global: Boolean, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi :: Performs a sort on-heap.
:: DeveloperApi :: Performs a sort on-heap.
global
when true performs a global sort of all partitions by shuffling the data first if necessary.

Annotations
@DeveloperApi()
abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
case class TakeOrdered(limit: Int, sortOrder: Seq[SortOrder], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

:: DeveloperApi :: Take the first limit elements as defined by the sortOrder.
:: DeveloperApi :: Take the first limit elements as defined by the sortOrder. This is logically equivalent to having a Limit operator after a Sort operator. This could have been named TopK, but Spark's top operator does the opposite in ordering so we name it TakeOrdered to avoid confusion.

Annotations
@DeveloperApi()
case class UncacheTableCommand(tableName: String) extends Command with RunnableCommand with Product with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
case class Union(children: Seq[SparkPlan]) extends SparkPlan with Product with Serializable

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()

Value Members

object ClearCacheCommand extends Command with RunnableCommand with Product with Serializable

:: DeveloperApi :: Clear all cached data from the in-memory cache.
:: DeveloperApi :: Clear all cached data from the in-memory cache.

Annotations
@DeveloperApi()
object EvaluatePython extends Serializable
object RDDConversions

:: DeveloperApi ::
:: DeveloperApi ::

Annotations
@DeveloperApi()
object SparkPlan extends Serializable
package debug

:: DeveloperApi :: Contains methods for debugging query execution.
:: DeveloperApi :: Contains methods for debugging query execution.
Usage:
```
import org.apache.spark.sql.execution.debug._
sql("SELECT key FROM src").debug()
dataFrame.typeCheck()
```
package joins

:: DeveloperApi :: Physical execution operators for join operations.

package execution

Type Members

case class Aggregate(partial: Boolean, groupingExpressions: Seq[Expression], aggregateExpressions: Seq[NamedExpression], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class AggregateEvaluation(schema: Seq[Attribute], initialValues: Seq[Expression], update: Seq[Expression], result: Expression) extends Product with Serializable

case class BatchPythonEvaluation(udf: PythonUDF, output: Seq[Attribute], child: SparkPlan) extends SparkPlan with Product with Serializable

case class CacheTableCommand(tableName: String, plan: Option[LogicalPlan], isLazy: Boolean) extends Command with RunnableCommand with Product with Serializable

case class DescribeCommand(child: SparkPlan, output: Seq[Attribute], isExtended: Boolean) extends Command with RunnableCommand with Product with Serializable

case class Distinct(partial: Boolean, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class EvaluatePython(udf: PythonUDF, child: LogicalPlan, resultAttribute: AttributeReference) extends catalyst.plans.logical.UnaryNode with Product with Serializable

case class Except(left: SparkPlan, right: SparkPlan) extends SparkPlan with BinaryNode with Product with Serializable

case class Exchange(newPartitioning: Partitioning, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class ExecutedCommand(cmd: RunnableCommand) extends SparkPlan with Product with Serializable

case class Expand(projections: Seq[GroupExpression], output: Seq[Attribute], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class ExplainCommand(logicalPlan: LogicalPlan, output: Seq[Attribute] = ..., extended: Boolean = false) extends Command with RunnableCommand with Product with Serializable

case class ExternalSort(sortOrder: Seq[SortOrder], global: Boolean, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class Filter(condition: Expression, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class Generate(generator: Generator, join: Boolean, outer: Boolean, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class GeneratedAggregate(partial: Boolean, groupingExpressions: Seq[Expression], aggregateExpressions: Seq[NamedExpression], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class Intersect(left: SparkPlan, right: SparkPlan) extends SparkPlan with BinaryNode with Product with Serializable

case class Limit(limit: Int, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class LocalTableScan(output: Seq[Attribute], rows: Seq[Row]) extends SparkPlan with LeafNode with Product with Serializable

case class LogicalLocalTable(output: Seq[Attribute], rows: Seq[Row])(sqlContext: SQLContext) extends LogicalPlan with MultiInstanceRelation with Product with Serializable

case class LogicalRDD(output: Seq[Attribute], rdd: RDD[Row])(sqlContext: SQLContext) extends LogicalPlan with MultiInstanceRelation with Product with Serializable

case class OutputFaker(output: Seq[Attribute], child: SparkPlan) extends SparkPlan with Product with Serializable

case class PhysicalRDD(output: Seq[Attribute], rdd: RDD[Row]) extends SparkPlan with LeafNode with Product with Serializable

case class Project(projectList: Seq[NamedExpression], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

class QueryExecutionException extends Exception

trait RunnableCommand extends Command

case class Sample(fraction: Double, withReplacement: Boolean, seed: Long, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class SetCommand(kv: Option[(String, Option[String])], output: Seq[Attribute]) extends Command with RunnableCommand with Logging with Product with Serializable

case class ShowTablesCommand(databaseName: Option[String]) extends Command with RunnableCommand with Product with Serializable

case class Sort(sortOrder: Seq[SortOrder], global: Boolean, child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializable

case class TakeOrdered(limit: Int, sortOrder: Seq[SortOrder], child: SparkPlan) extends SparkPlan with UnaryNode with Product with Serializable

case class UncacheTableCommand(tableName: String) extends Command with RunnableCommand with Product with Serializable

case class Union(children: Seq[SparkPlan]) extends SparkPlan with Product with Serializable

Value Members

object ClearCacheCommand extends Command with RunnableCommand with Product with Serializable

object EvaluatePython extends Serializable

object RDDConversions

object SparkPlan extends Serializable

package debug

package joins

Inherited from AnyRef

Inherited from Any

Ungrouped