package expressions
- Alphabetic
- Public
- All
Type Members
- 
      
      
      
        
      
    
      
        abstract 
        class
      
      
        Aggregator[-IN, BUF, OUT] extends Serializable
      
      
      A base class for user-defined aggregations, which can be used in Datasetoperations to take all of the elements of a group and reduce them to a single value.A base class for user-defined aggregations, which can be used in Datasetoperations to take all of the elements of a group and reduce them to a single value.For example, the following aggregator extracts an intfrom a specific class and adds them up:case class Data(i: Int) val customSummer = new Aggregator[Data, Int, Int] { def zero: Int = 0 def reduce(b: Int, a: Data): Int = b + a.i def merge(b1: Int, b2: Int): Int = b1 + b2 def finish(r: Int): Int = r def bufferEncoder: Encoder[Int] = Encoders.scalaInt def outputEncoder: Encoder[Int] = Encoders.scalaInt }.toColumn() val ds: Dataset[Data] = ... val aggregated = ds.select(customSummer) Based loosely on Aggregator from Algebird: https://github.com/twitter/algebird - IN
- The input type for the aggregation. 
- BUF
- The type of the intermediate value of the reduction. 
- OUT
- The type of the final output result. 
 - Since
- 1.6.0 
 
- 
      
      
      
        
      
    
      
        abstract 
        class
      
      
        MutableAggregationBuffer extends Row
      
      
      A Rowrepresenting a mutable aggregation buffer.A Rowrepresenting a mutable aggregation buffer.This is not meant to be extended outside of Spark. - Annotations
- @Stable()
- Since
- 1.5.0 
 
- 
      
      
      
        
      
    
      
        sealed abstract 
        class
      
      
        UserDefinedFunction extends AnyRef
      
      
      A user-defined function. A user-defined function. To create one, use the udffunctions infunctions.As an example: // Define a UDF that returns true or false based on some numeric score. val predict = udf((score: Double) => score > 0.5) // Projects a column that adds a prediction column based on the score column. df.select( predict(df("score")) ) - Annotations
- @Stable()
- Since
- 1.3.0 
 
- 
      
      
      
        
      
    
      
        
        class
      
      
        Window extends AnyRef
      
      
      Utility functions for defining window in DataFrames. Utility functions for defining window in DataFrames. // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW Window.partitionBy("country").orderBy("date") .rowsBetween(Window.unboundedPreceding, Window.currentRow) // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3) - Annotations
- @Stable()
- Since
- 1.4.0 
 
- 
      
      
      
        
      
    
      
        
        class
      
      
        WindowSpec extends AnyRef
      
      
      A window specification that defines the partitioning, ordering, and frame boundaries. A window specification that defines the partitioning, ordering, and frame boundaries. Use the static methods in Window to create a WindowSpec. - Annotations
- @Stable()
- Since
- 1.4.0 
 
- 
      
      
      
        
      
    
      
        abstract 
        class
      
      
        UserDefinedAggregateFunction extends Serializable
      
      
      The base class for implementing user-defined aggregate functions (UDAF). The base class for implementing user-defined aggregate functions (UDAF). - Annotations
- @Stable() @deprecated
- Deprecated
- (Since version 3.0.0) 
- Since
- 1.5.0 
 
Value Members
- 
      
      
      
        
      
    
      
        
        object
      
      
        Window
      
      
      Utility functions for defining window in DataFrames. Utility functions for defining window in DataFrames. // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW Window.partitionBy("country").orderBy("date") .rowsBetween(Window.unboundedPreceding, Window.currentRow) // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3) - Annotations
- @Stable()
- Since
- 1.4.0 
- Note
- When ordering is not defined, an unbounded window frame (rowFrame, unboundedPreceding, unboundedFollowing) is used by default. When ordering is defined, a growing window frame (rangeFrame, unboundedPreceding, currentRow) is used by default.