Maps a vector into the polynomial feature space.
This transformer takes a vector of values (x, y, z, ...) and maps it into the polynomial feature space of degree d. That is to say, it calculates the following representation:
(x, y, z, x^2, xy, y^2, yz, z^2, x^3, x^2y, x^2z, xyz, ...)^T
This transformer can be prepended to all org.apache.flink.ml.pipeline.Transformer and org.apache.flink.ml.pipeline.Predictor implementations which expect an input of LabeledVector.
val trainingDS: DataSet[LabeledVector] = ...
val polyFeatures = PolynomialFeatures()
  .setDegree(3)
val mlr = MultipleLinearRegression()
val pipeline = polyFeatures.chainPredictor(mlr)
pipeline.fit(trainingDS)
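The polynomial expansion described above can be illustrated without any library. The following is a minimal sketch in plain Scala, not the transformer's actual implementation: the object name `PolynomialSketch` and its methods are hypothetical, and the order in which the terms are emitted is an assumption that may differ from the transformer's output.

```scala
// Illustrative sketch only; PolynomialSketch is a hypothetical helper,
// not part of org.apache.flink.ml.
object PolynomialSketch {
  // All monomials of exactly degree `d` over `values`. Only indices >= `start`
  // are used, so each combination of variables is produced exactly once.
  private def monomials(values: Seq[Double], d: Int, start: Int): Seq[Double] =
    if (d == 0) Seq(1.0)
    else (start until values.length).flatMap { i =>
      monomials(values, d - 1, i).map(_ * values(i))
    }

  // All monomials of degree 1 up to `degree`, lowest degree first.
  // The term order is an assumption made for this sketch.
  def expand(values: Seq[Double], degree: Int): Seq[Double] =
    (1 to degree).flatMap(d => monomials(values, d, 0))
}

// For (x, y) = (2, 3) and degree 2 this yields x, y, x^2, xy, y^2:
// PolynomialSketch.expand(Seq(2.0, 3.0), 2) == Seq(2.0, 3.0, 4.0, 6.0, 9.0)
```

For three input values and degree 3, the sketch produces 3 + 6 + 10 = 19 features, which shows how quickly the feature space grows with the degree.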
Scales observations, so that all features have a user-specified mean and standard deviation. By default, the StandardScaler transformer uses mean=0.0 and std=1.0.
This transformer takes a subtype of Vector of values and maps it to a scaled subtype of Vector such that each feature has a user-specified mean and standard deviation.
This transformer can be prepended to all Transformer and org.apache.flink.ml.pipeline.Predictor implementations which expect as input a subtype of Vector.
val trainingDS: DataSet[Vector] = env.fromCollection(data)
val transformer = StandardScaler().setMean(10.0).setStd(2.0)
transformer.fit(trainingDS)
val transformedDS = transformer.transform(trainingDS)
Parameters
- Mean: The mean value of the transformed data set; by default equal to 0
- Std: The standard deviation of the transformed data set; by default equal to 1
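The arithmetic behind scaling to a target mean and standard deviation can be sketched for a single feature column in plain Scala. This is an illustration under stated assumptions rather than the transformer's implementation: `StandardScalingSketch` is a hypothetical name, the sketch uses the population standard deviation (dividing by n), and it maps a constant column to the target mean. The text above does not specify either of those choices.

```scala
// Illustrative sketch only; StandardScalingSketch is a hypothetical helper,
// not part of org.apache.flink.ml.
object StandardScalingSketch {
  // Rescales one feature column to the target mean and standard deviation.
  def scale(column: Seq[Double],
            targetMean: Double = 0.0,
            targetStd: Double = 1.0): Seq[Double] = {
    val n = column.length
    val mean = column.sum / n
    // Assumption: population standard deviation (divide by n, not n - 1).
    val std = math.sqrt(column.map(v => (v - mean) * (v - mean)).sum / n)
    // Assumption: a constant column (std == 0) is mapped to the target mean
    // to avoid dividing by zero.
    if (std == 0.0) column.map(_ => targetMean)
    else column.map(v => (v - mean) / std * targetStd + targetMean)
  }
}

// With the values from the example above (mean 10.0, std 2.0):
// StandardScalingSketch.scale(Seq(1.0, 3.0), 10.0, 2.0) == Seq(8.0, 12.0)
```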
Scales observations, so that all features are in a user-specified range. By default, the MinMaxScaler transformer uses the range [0,1].
This transformer takes a subtype of Vector of values and maps it to a scaled subtype of Vector such that each feature lies within a user-specified range.
This transformer can be prepended to all Transformer and org.apache.flink.ml.pipeline.Predictor implementations which expect as input a subtype of Vector or a LabeledVector.
Parameters
- Min: The minimum value of the range of the transformed data set; by default equal to 0
- Max: The maximum value of the range of the transformed data set; by default equal to 1
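The range scaling described above amounts to a linear map from the observed minimum and maximum of each feature to the target range. The following plain Scala sketch shows that map for a single feature column. It is not the transformer's implementation: `MinMaxScalingSketch` is a hypothetical name, and mapping a constant column to the lower bound is an assumption made here to avoid dividing by zero.

```scala
// Illustrative sketch only; MinMaxScalingSketch is a hypothetical helper,
// not part of org.apache.flink.ml.
object MinMaxScalingSketch {
  // Linearly rescales one feature column into [targetMin, targetMax].
  def scale(column: Seq[Double],
            targetMin: Double = 0.0,
            targetMax: Double = 1.0): Seq[Double] =
    if (column.isEmpty) Seq.empty
    else {
      val lo = column.min
      val hi = column.max
      val span = hi - lo
      // Assumption: a constant column (span == 0) is mapped to targetMin.
      if (span == 0.0) column.map(_ => targetMin)
      else column.map(v => (v - lo) / span * (targetMax - targetMin) + targetMin)
    }
}

// Default range [0,1]:
// MinMaxScalingSketch.scale(Seq(2.0, 4.0, 6.0)) == Seq(0.0, 0.5, 1.0)
// Custom range [-1,1]:
// MinMaxScalingSketch.scale(Seq(2.0, 4.0, 6.0), -1.0, 1.0) == Seq(-1.0, 0.0, 1.0)
```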