Package

org.apache.flink.ml

pipeline

package pipeline

Type Members

  1. case class ChainedPredictor[T <: Transformer[T], P <: Predictor[P]](transformer: T, predictor: P) extends Predictor[ChainedPredictor[T, P]] with Product with Serializable

    Predictor which represents a pipeline of possibly multiple Transformers and a trailing Predictor.

    The ChainedPredictor can be used as a regular Predictor. Upon calling the fit method, the input data is piped through all preceding Transformers in the pipeline and the resulting data is given to the trailing Predictor. The same holds true for the predict operation.

    The pipeline mechanism has been inspired by scikit-learn

    T

    Type of the preceding Transformer

    P

    Type of the trailing Predictor

    transformer

    Preceding Transformer of the pipeline

    predictor

    Trailing Predictor of the pipeline
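
    The fit and predict flow described above can be sketched as follows. This is a deliberately simplified illustration, not FlinkML's code: plain methods stand in for the implicit type class operations, Seq[Double] stands in for DataSet, the type parameters T and P are dropped, and Centering and RangeScaler are hypothetical operators invented for the example.

    ```scala
    // Simplified stand-ins, not FlinkML's definitions: plain methods replace the
    // implicit type class operations and Seq[Double] replaces DataSet.
    trait Transformer {
      def fit(data: Seq[Double]): Unit
      def transform(data: Seq[Double]): Seq[Double]
    }

    trait Predictor {
      def fit(data: Seq[Double]): Unit
      def predict(data: Seq[Double]): Seq[Double]
    }

    case class ChainedPredictor(transformer: Transformer, predictor: Predictor) extends Predictor {
      // Fit the transformer first, then fit the predictor on the transformed data.
      def fit(data: Seq[Double]): Unit = {
        transformer.fit(data)
        predictor.fit(transformer.transform(data))
      }

      // Prediction pipes the input through the same transformer.
      def predict(data: Seq[Double]): Seq[Double] =
        predictor.predict(transformer.transform(data))
    }

    // Hypothetical transformer: subtracts the training mean (assumes non-empty data).
    class Centering extends Transformer {
      private var mean = 0.0
      def fit(data: Seq[Double]): Unit = { mean = data.sum / data.size }
      def transform(data: Seq[Double]): Seq[Double] = data.map(_ - mean)
    }

    // Hypothetical predictor: divides by the largest absolute training value
    // (assumes non-empty data).
    class RangeScaler extends Predictor {
      private var maxAbs = 1.0
      def fit(data: Seq[Double]): Unit = { maxAbs = data.map(x => math.abs(x)).max }
      def predict(data: Seq[Double]): Seq[Double] = data.map(_ / maxAbs)
    }
    ```

    Because the chained object is itself a Predictor in this sketch, callers only see fit and predict and do not need to know how many stages precede the trailing predictor.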

  2. case class ChainedTransformer[L <: Transformer[L], R <: Transformer[R]](left: L, right: R) extends Transformer[ChainedTransformer[L, R]] with Product with Serializable

    Transformer which represents the chaining of two Transformers.

    A ChainedTransformer can be treated as a regular Transformer. Upon calling the fit or transform operation, the data is piped through all Transformers of the pipeline.

    The pipeline mechanism has been inspired by scikit-learn

    L

    Type of the left Transformer

    R

    Type of the right Transformer

    left

    Left Transformer of the pipeline

    right

    Right Transformer of the pipeline
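
    One way to picture how a chained transform operation can be derived from the operations of its two parts is the sketch below. It is an assumption-laden illustration rather than FlinkML's implementation: TransformOp, Chained, chain, Doubler and Stringifier are hypothetical names, Seq replaces DataSet, and fitting and parameter maps are left out.

    ```scala
    // Simplified stand-ins; not FlinkML's actual definitions.
    trait TransformOp[Self, In, Out] {
      def transform(instance: Self, input: Seq[In]): Seq[Out]
    }

    trait Transformer[Self <: Transformer[Self]] { that: Self =>
      def transform[In, Out](input: Seq[In])(implicit op: TransformOp[Self, In, Out]): Seq[Out] =
        op.transform(this, input)

      // Chaining yields a new transformer whose left part is this one.
      def chain[T <: Transformer[T]](next: T): Chained[Self, T] = Chained(this, next)
    }

    case class Chained[L <: Transformer[L], R <: Transformer[R]](left: L, right: R)
      extends Transformer[Chained[L, R]]

    object Chained {
      // Derive the chained operation: pipe the data through the left
      // transformer, then through the right one.
      implicit def chainedOp[L <: Transformer[L], R <: Transformer[R], In, Mid, Out](
          implicit leftOp: TransformOp[L, In, Mid],
          rightOp: TransformOp[R, Mid, Out]): TransformOp[Chained[L, R], In, Out] =
        new TransformOp[Chained[L, R], In, Out] {
          def transform(c: Chained[L, R], input: Seq[In]): Seq[Out] =
            rightOp.transform(c.right, leftOp.transform(c.left, input))
        }
    }

    // Hypothetical transformer that doubles every Int.
    class Doubler extends Transformer[Doubler]
    object Doubler {
      implicit val op: TransformOp[Doubler, Int, Int] = new TransformOp[Doubler, Int, Int] {
        def transform(d: Doubler, input: Seq[Int]): Seq[Int] = input.map(_ * 2)
      }
    }

    // Hypothetical transformer that renders every Int as a String.
    class Stringifier extends Transformer[Stringifier]
    object Stringifier {
      implicit val op: TransformOp[Stringifier, Int, String] =
        new TransformOp[Stringifier, Int, String] {
          def transform(s: Stringifier, input: Seq[Int]): Seq[String] = input.map(_.toString)
        }
    }
    ```

    In this sketch the intermediate type Mid is never written by the caller; the compiler determines it while resolving the left operation and then uses it to look up the right one.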

  3. trait Estimator[Self] extends WithParameters

    Base trait for Flink's pipeline operators.

    An estimator can be fitted to input data. In order to do that the implementing class has to provide an implementation of a FitOperation with the correct input type. In order to make the FitOperation retrievable by the Scala compiler, the implementation should be placed in the companion object of the implementing class.

    The pipeline mechanism has been inspired by scikit-learn

  4. trait EvaluateDataSetOperation[Instance, Testing, Prediction] extends Serializable

    Type class for the evaluate operation of Predictor. This evaluate operation works on DataSets.

    It takes a DataSet of some type. For each element of this DataSet the evaluate method computes the prediction value and returns a tuple of true label value and prediction value.

    Instance

    The concrete type of the Predictor instance that we will use to make the predictions

    Testing

    The type of the example that we will use to make the predictions (input)

    Prediction

    The type of the label that the prediction operation will produce (output)

  5. trait FitOperation[Self, Training] extends AnyRef

    Type class for the fit operation of an Estimator.

    The FitOperation has a Self type parameter so that the Scala compiler also searches the companion object of that type for implicit values.

    Self

    Type of the Estimator subclass for which the FitOperation is defined

    Training

    Type of the training data
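
    The companion object lookup can be illustrated with a small self-contained sketch. The definitions below are simplified stand-ins and not FlinkML's actual signatures: Seq replaces DataSet, parameter maps are omitted, and MeanEstimator is a hypothetical estimator invented for the example.

    ```scala
    // Simplified stand-ins for illustration; not FlinkML's actual definitions.
    trait FitOperation[Self, Training] {
      def fit(instance: Self, input: Seq[Training]): Unit
    }

    trait Estimator[Self] { that: Self =>
      // Because Self appears in the requested implicit type, the compiler
      // includes the companion object of Self in its implicit search.
      def fit[Training](input: Seq[Training])(implicit op: FitOperation[Self, Training]): Unit =
        op.fit(this, input)
    }

    // Hypothetical estimator that learns the mean of its training data.
    class MeanEstimator extends Estimator[MeanEstimator] {
      var mean: Option[Double] = None
    }

    object MeanEstimator {
      // Found without an import because it lives in the companion object.
      implicit val fitDoubles: FitOperation[MeanEstimator, Double] =
        new FitOperation[MeanEstimator, Double] {
          def fit(instance: MeanEstimator, input: Seq[Double]): Unit =
            instance.mean = if (input.isEmpty) None else Some(input.sum / input.size)
        }
    }
    ```

    Calling fit with a training type for which no FitOperation is in scope, for example a Seq[String] here, is rejected at compile time rather than failing at runtime.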

  6. trait PredictDataSetOperation[Self, Testing, Prediction] extends Serializable

    Type class for the predict operation of Predictor. This predict operation works on DataSets.

    Predictors either have to implement this trait or the PredictOperation trait. The implementation has to be made available as an implicit value or function in the scope of their companion objects.

    The first type parameter is the type of the implementing Predictor class so that the Scala compiler includes the companion object of this class in the search scope for the implicit values.

    Self

    Type of the implementing Predictor class

    Testing

    Type of testing data

    Prediction

    Type of predicted data

  7. trait PredictOperation[Instance, Model, Testing, Prediction] extends Serializable

    Type class for predict operation. It takes an element and the model and then computes the prediction value for this element.

    It is sufficient for a Predictor to implement only this trait to support both the evaluate and the predict methods.

    Instance

    The concrete type of the Predictor that we will use for predictions

    Model

    The representation of the predictive model for the algorithm, for example a Vector of weights

    Testing

    The type of the example that we will use to make the predictions (input)

    Prediction

    The type of the label that the prediction operation will produce (output)
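
    Why a single element-wise operation is enough for both predict and evaluate can be sketched as follows. The names PredictOp, predictAll, evaluateAll and Scale are hypothetical, Seq stands in for DataSet, and the signatures are simplified assumptions rather than FlinkML's own.

    ```scala
    // Simplified stand-ins; not FlinkML's actual definitions.
    trait PredictOp[Instance, Model, Testing, Prediction] {
      def getModel(instance: Instance): Model
      def predict(value: Testing, model: Model): Prediction
    }

    object PredictOp {
      // Lift the element-wise operation to a whole collection.
      def predictAll[I, M, T, P](instance: I, input: Seq[T])(
          implicit op: PredictOp[I, M, T, P]): Seq[P] = {
        val model = op.getModel(instance)
        input.map(x => op.predict(x, model))
      }

      // Pair each true label with the prediction for the same element.
      def evaluateAll[I, M, T, P](instance: I, input: Seq[(T, P)])(
          implicit op: PredictOp[I, M, T, P]): Seq[(P, P)] = {
        val model = op.getModel(instance)
        input.map { case (x, label) => (label, op.predict(x, model)) }
      }
    }

    // Hypothetical predictor whose model is a single weight.
    class Scale(val weight: Double)

    object Scale {
      implicit val op: PredictOp[Scale, Double, Double, Double] =
        new PredictOp[Scale, Double, Double, Double] {
          def getModel(s: Scale): Double = s.weight
          def predict(x: Double, w: Double): Double = x * w
        }
    }
    ```

    Both helpers reuse the same getModel and predict pair, which is the sense in which one element-wise implementation can back both operations in this sketch.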

  8. trait Predictor[Self] extends Estimator[Self] with WithParameters

    Predictor trait for Flink's pipeline operators.

    A Predictor calculates predictions for testing data based on the model it learned during the fit operation (training phase). In order to do that, the implementing class has to provide a FitOperation and a PredictDataSetOperation implementation for the correct types. The implicit values should be put into the scope of the companion object of the implementing class to make them retrievable for the Scala compiler.

    The pipeline mechanism has been inspired by scikit-learn

    Self

    Type of the implementing class

  9. trait TransformDataSetOperation[Instance, Input, Output] extends Serializable

    Type class for a transform operation of Transformer. This operation works on a DataSet of elements.

    The TransformDataSetOperation takes the type of the Transformer as its first type parameter (Instance) so that the Scala compiler also searches the companion object of that type for implicit values.

    Instance

    Type of the Transformer for which the TransformDataSetOperation is defined

    Input

    Input data type

    Output

    Output data type

  10. trait TransformOperation[Instance, Model, Input, Output] extends Serializable

    Type class for a transform operation which works on a single element and the corresponding model of the Transformer.

  11. trait Transformer[Self <: Transformer[Self]] extends Estimator[Self] with WithParameters with Serializable

    Transformer trait for Flink's pipeline operators.

    A Transformer transforms a DataSet of an input type into a DataSet of an output type. Furthermore, a Transformer is also an Estimator, because some transformations depend on the training data. The implementing class therefore has to provide a TransformDataSetOperation and a FitOperation implementation. The Scala compiler finds these implicit values if they are placed in the scope of the companion object of the implementing class.

    Transformers can be chained with other Transformers and a Predictor to create pipelines. These pipelines can consist of an arbitrary number of Transformers and at most one trailing Predictor.

    The pipeline mechanism has been inspired by scikit-learn

Value Members

  1. object ChainedPredictor extends Serializable
  2. object ChainedTransformer extends Serializable
  3. object Estimator
  4. object Predictor
  5. object Transformer extends Serializable