org.apache.spark.sql

GroupedData

class GroupedData extends AnyRef

:: Experimental :: A set of methods for aggregations on a DataFrame, created by DataFrame.groupBy.

Annotations: @Experimental()

Linear Supertypes

AnyRef, Any

Instance Constructors

new GroupedData(df: DataFrame, groupingExprs: Seq[Expression])

Attributes
protected[org.apache.spark.sql]

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def agg(expr: Column, exprs: Column*): DataFrame

Compute aggregates by specifying a series of aggregate columns. Unlike the other methods in this class, the resulting DataFrame does not automatically include the grouping columns; pass them to agg explicitly, as the example below does with the department column.
The available aggregate methods are defined in org.apache.spark.sql.functions.
```
// Selects the age of the oldest employee and the aggregate expense for each department

// Scala (the $"..." syntax needs import sqlContext.implicits._, where sqlContext is your SQLContext):
import org.apache.spark.sql.functions._
df.groupBy("department").agg($"department", max($"age"), sum($"expense"))

// Java:
import static org.apache.spark.sql.functions.*;
df.groupBy("department").agg(col("department"), max(col("age")), sum(col("expense")));
```
Annotations
@varargs()
def agg(exprs: java.util.Map[String, String]): DataFrame

(Java-specific) Compute aggregates by specifying a map from column name to aggregate method. The resulting DataFrame will also contain the grouping columns.
The available aggregate methods are avg, max, min, sum, count.
```
// Selects the age of the oldest employee and the aggregate expense for each department
import com.google.common.collect.ImmutableMap;
df.groupBy("department").agg(ImmutableMap.<String, String>builder()
  .put("age", "max")
  .put("expense", "sum")
  .build());
```
def agg(exprs: Map[String, String]): DataFrame

(Scala-specific) Compute aggregates by specifying a map from column name to aggregate method. The resulting DataFrame will also contain the grouping columns.
The available aggregate methods are avg, max, min, sum, count.
```
// Selects the age of the oldest employee and the aggregate expense for each department
df.groupBy("department").agg(Map(
  "age" -> "max",
  "expense" -> "sum"
))
```
def agg(aggExpr: (String, String), aggExprs: (String, String)*): DataFrame

(Scala-specific) Compute aggregates by specifying pairs of column name and aggregate method. The resulting DataFrame will also contain the grouping columns.
The available aggregate methods are avg, max, min, sum, count.
```
// Selects the age of the oldest employee and the aggregate expense for each department
df.groupBy("department").agg(
  "age" -> "max",
  "expense" -> "sum"
)
```
final def asInstanceOf[T0]: T0

Definition Classes
Any
def avg(colNames: String*): DataFrame

Compute the mean value of each numeric column for each group. The resulting DataFrame will also contain the grouping columns. When column names are given, the mean is computed only for those columns.
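The following is a minimal sketch of avg restricted to a single column, assuming a Spark 1.3-style SQLContext running in local mode. The object name AvgByDepartment, the sample rows, and the positional row access (grouping column first, aggregate second) are illustrative assumptions, not part of this API.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical helper object; not part of Spark.
object AvgByDepartment {
  // Returns department -> mean age for a small, made-up data set.
  def run(): Map[String, Double] = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[1]").setAppName("avg-by-department"))
    try {
      val sqlContext = new SQLContext(sc)
      import sqlContext.implicits._ // enables toDF on local sequences

      val df = Seq(("eng", 30, 100), ("eng", 40, 300), ("ops", 50, 200))
        .toDF("department", "age", "expense")

      // Only "age" is averaged because a column name is given.
      // Assumed layout: column 0 is the grouping column, column 1 the mean.
      df.groupBy("department").avg("age")
        .collect()
        .map(row => row.getString(0) -> row.getDouble(1))
        .toMap
    } finally {
      sc.stop() // always release the local SparkContext
    }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Calling avg() with no column names would instead average every numeric column, which in this sample data means both age and expense.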

Annotations
@varargs()
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def count(): DataFrame

Count the number of rows for each group. The resulting DataFrame will also contain the grouping columns.
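As a rough illustration of count, the sketch below groups a small in-memory DataFrame and collects the per-group row counts. It assumes a Spark 1.3-style SQLContext in local mode; the object name CountByDepartment, the sample rows, and the positional row access are assumptions made for the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical helper object; not part of Spark.
object CountByDepartment {
  // Returns department -> number of rows for a small, made-up data set.
  def run(): Map[String, Long] = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[1]").setAppName("count-by-department"))
    try {
      val sqlContext = new SQLContext(sc)
      import sqlContext.implicits._ // enables toDF on local sequences

      val df = Seq(("eng", 30), ("eng", 40), ("ops", 50))
        .toDF("department", "age")

      // Assumed layout: column 0 is the grouping column, column 1 the count.
      df.groupBy("department").count()
        .collect()
        .map(row => row.getString(0) -> row.getLong(1))
        .toMap
    } finally {
      sc.stop() // always release the local SparkContext
    }
  }

  def main(args: Array[String]): Unit = println(run())
}
```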
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def max(colNames: String*): DataFrame

Compute the max value of each numeric column for each group. The resulting DataFrame will also contain the grouping columns. When column names are given, the max is computed only for those columns.
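Because max takes a variable number of column names, several columns can be aggregated in one call. The sketch below shows this under the assumption of a Spark 1.3-style SQLContext in local mode; the object name MaxByDepartment, the sample rows, and the positional row access are hypothetical. The same call shape should apply to min.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical helper object; not part of Spark.
object MaxByDepartment {
  // Returns department -> (max age, max expense) for made-up sample rows.
  def run(): Map[String, (Int, Int)] = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[1]").setAppName("max-by-department"))
    try {
      val sqlContext = new SQLContext(sc)
      import sqlContext.implicits._ // enables toDF on local sequences

      val df = Seq(("eng", 30, 100), ("eng", 40, 300), ("ops", 50, 200))
        .toDF("department", "age", "expense")

      // Two column names are passed through the varargs parameter.
      // Assumed layout: grouping column, then one max per requested column.
      df.groupBy("department").max("age", "expense")
        .collect()
        .map(row => row.getString(0) -> ((row.getInt(1), row.getInt(2))))
        .toMap
    } finally {
      sc.stop() // always release the local SparkContext
    }
  }

  def main(args: Array[String]): Unit = println(run())
}
```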

Annotations
@varargs()
def mean(colNames: String*): DataFrame

Compute the average value of each numeric column for each group. This is an alias for avg. The resulting DataFrame will also contain the grouping columns. When column names are given, the average is computed only for those columns.

Annotations
@varargs()
def min(colNames: String*): DataFrame

Compute the min value of each numeric column for each group. The resulting DataFrame will also contain the grouping columns. When column names are given, the min is computed only for those columns.

Annotations
@varargs()
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def sum(colNames: String*): DataFrame

Compute the sum of each numeric column for each group. The resulting DataFrame will also contain the grouping columns. When column names are given, the sum is computed only for those columns.

Annotations
@varargs()
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
