Function that fits the binary model
the predictor to wrap
Suggested depth for treeAggregate (greater than or equal to 2). If the feature dimension or the number of partitions is large, this parameter can be set to a larger value. Default is 2.
Set the ElasticNet mixing parameter alpha. For alpha = 0, the penalty is an L2 penalty. For alpha = 1, it is an L1 penalty. For alpha in (0,1), the penalty is a combination of L1 and L2. Default is 0.0, which is an L2 penalty.
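The mixing behaviour described above can be sketched in plain Python. This is only an illustration of the usual elastic net form, regParam * (alpha * L1 + (1 - alpha) / 2 * L2); the function name and arguments are hypothetical and are not part of the wrapper or of Spark.

```python
def elastic_net_penalty(coefficients, reg_param, alpha):
    """Illustrative elastic net penalty (hypothetical helper, not a Spark API).

    alpha = 0 gives a pure L2 penalty, alpha = 1 a pure L1 penalty,
    and values in between blend the two.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    l1 = sum(abs(w) for w in coefficients)    # sum of absolute values
    l2 = sum(w * w for w in coefficients)     # sum of squares
    return reg_param * (alpha * l1 + (1.0 - alpha) * 0.5 * l2)


if __name__ == "__main__":
    weights = [3.0, -4.0]
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, elastic_net_penalty(weights, reg_param=1.0, alpha=alpha))
```

With reg_param left at its default of 0.0 the penalty vanishes regardless of alpha, so the mixing parameter only matters once a non-zero regularization parameter is set.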
The shape parameter to control the amount of robustness. Must be > 1.0. At larger values of epsilon, the Huber criterion becomes more similar to least squares regression; at smaller values of epsilon, the criterion is more similar to L1 regression. Default is 1.35 to get as much robustness as possible while retaining 95% statistical efficiency for normally distributed data. It matches sklearn HuberRegressor and corresponds to "M" in "A robust hybrid of lasso and ridge regression". Only valid when "loss" is "huber".
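To make the role of epsilon concrete, here is a minimal sketch of the textbook Huber loss on a single residual: quadratic inside the threshold, linear outside it. It is an assumption-laden illustration, not the objective the underlying solver actually minimizes (which may also involve a scale term), and `huber_loss` is a hypothetical name.

```python
def huber_loss(residual, epsilon=1.35):
    """Textbook Huber loss for one residual (illustrative, not a Spark API).

    Quadratic for |residual| <= epsilon, linear beyond it, so large
    residuals (outliers) are penalized less than under squared error.
    """
    a = abs(residual)
    if a <= epsilon:
        return 0.5 * a * a
    return epsilon * (a - 0.5 * epsilon)


if __name__ == "__main__":
    for r in (0.5, 1.0, 5.0, 20.0):
        # Compare against half the squared error for the same residual.
        print(r, huber_loss(r), 0.5 * r * r)
```

Raising epsilon widens the quadratic region, which is why larger values behave more like least squares.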
Set if we should fit the intercept. Default is true.
Set the type of loss function to be optimized. Supported options: "squaredError" (https://en.wikipedia.org/wiki/Mean_squared_error) and "huber" (https://en.wikipedia.org/wiki/Huber_loss). Default is squaredError.
Set the maximum number of iterations. Default is 100.
Set the regularization parameter. Default is 0.0.
Set the solver algorithm used for optimization. For linear regression, this can be "l-bfgs", "normal", or "auto".
See also: LinearRegression.MAX_FEATURES_FOR_NORMAL_SOLVER.
Whether to standardize the training features before fitting the model. Model coefficients are always returned on the original scale, so standardization is transparent to users. Default is true.
With or without standardization, the models should always converge to the same solution when no regularization is applied. In R's GLMNET package, standardization is enabled by default as well.
Set the convergence tolerance of iterations. A smaller value leads to higher accuracy at the cost of more iterations. Default is 1E-6.
Whether to over-/under-sample training instances according to the given weights in weightCol. If not set or empty, all instances are treated equally (weight 1.0). Default is not set.
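One way to read instance weights: in a weighted squared-error sum, a row with weight 2 counts as if it appeared twice. The sketch below shows that equivalence in plain Python; `weighted_squared_error` is a hypothetical helper and makes no claim about how the wrapped estimator applies weights internally.

```python
def weighted_squared_error(labels, predictions, weights=None):
    """Sum of squared errors with optional per-instance weights (illustrative).

    When weights is None, every instance gets weight 1.0, mirroring the
    "all instances are treated equally" default described above.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    return sum(w * (y - p) ** 2 for y, p, w in zip(labels, predictions, weights))


if __name__ == "__main__":
    # Weight 2.0 on the first row ...
    weighted = weighted_squared_error([1.0, 2.0], [0.0, 0.0], [2.0, 1.0])
    # ... matches duplicating that row with unit weights.
    duplicated = weighted_squared_error([1.0, 1.0, 2.0], [0.0, 0.0, 0.0])
    print(weighted, duplicated)
```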
stage uid
Wrapper around the Spark ML linear regression, org.apache.spark.ml.regression.LinearRegression.