java.lang.Object

co.elastic.clients.util.ObjectBuilderBase

co.elastic.clients.util.WithJsonObjectBuilderBase<BuilderT>

co.elastic.clients.elasticsearch.ml.DataframeAnalysisBase.AbstractBuilder<BuilderT>

All Implemented Interfaces: WithJson<BuilderT>

Direct Known Subclasses: DataframeAnalysisClassification.Builder, DataframeAnalysisRegression.Builder

Enclosing class: DataframeAnalysisBase

public abstract static class DataframeAnalysisBase.AbstractBuilder<BuilderT extends DataframeAnalysisBase.AbstractBuilder<BuilderT>> extends WithJsonObjectBuilderBase<BuilderT>
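The recursive type parameter in this declaration is what lets the `final` setters return the concrete builder type, so subclass-only setters remain reachable in the middle of a call chain. The following is a minimal sketch of that self-typed builder pattern using only the standard library. The class names `AnalysisBuilderSketch` and `RegressionBuilderSketch`, the `lossFunction` setter, and the `describe()` helper are hypothetical stand-ins for illustration, not the client's own classes.

```java
class SelfTypedBuilderDemo {

    // Hypothetical stand-in for an abstract builder with a self-referencing type parameter.
    abstract static class AnalysisBuilderSketch<B extends AnalysisBuilderSketch<B>> {
        protected String dependentVariable;
        protected Integer maxTrees;

        // Each concrete builder returns itself, typed as the concrete class.
        protected abstract B self();

        public final B dependentVariable(String value) {
            this.dependentVariable = value;
            return self();
        }

        public final B maxTrees(Integer value) {
            this.maxTrees = value;
            return self();
        }
    }

    // Hypothetical concrete builder that adds one setter of its own.
    static final class RegressionBuilderSketch
            extends AnalysisBuilderSketch<RegressionBuilderSketch> {
        private String lossFunction;

        @Override
        protected RegressionBuilderSketch self() {
            return this;
        }

        // Subclass-only setter; callable after base setters because they return B.
        public RegressionBuilderSketch lossFunction(String value) {
            this.lossFunction = value;
            return this;
        }

        public String describe() {
            return dependentVariable + "/" + maxTrees + "/" + lossFunction;
        }
    }

    public static void main(String[] args) {
        String summary = new RegressionBuilderSketch()
            .dependentVariable("price")   // inherited setter, returns RegressionBuilderSketch
            .maxTrees(100)                // inherited setter
            .lossFunction("mse")          // subclass setter still available in the chain
            .describe();
        System.out.println(summary);
    }
}
```

Without the self type, `dependentVariable(...)` would return the abstract base type and the chained call to `lossFunction(...)` would not compile.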

Constructor Summary

| Constructor | Description |
| --- | --- |
| AbstractBuilder() | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| final BuilderT | alpha(Double value) | Advanced configuration option. |
| final BuilderT | dependentVariable(String value) | Required - Defines which field of the document is to be predicted. |
| final BuilderT | downsampleFactor(Double value) | Advanced configuration option. |
| final BuilderT | earlyStoppingEnabled(Boolean value) | Advanced configuration option. |
| final BuilderT | eta(Double value) | Advanced configuration option. |
| final BuilderT | etaGrowthRatePerTree(Double value) | Advanced configuration option. |
| final BuilderT | featureBagFraction(Double value) | Advanced configuration option. |
| final BuilderT | featureProcessors(DataframeAnalysisFeatureProcessor value, DataframeAnalysisFeatureProcessor... values) | Advanced configuration option. |
| final BuilderT | featureProcessors(Function<DataframeAnalysisFeatureProcessor.Builder,ObjectBuilder<DataframeAnalysisFeatureProcessor>> fn) | Advanced configuration option. |
| final BuilderT | featureProcessors(List<DataframeAnalysisFeatureProcessor> list) | Advanced configuration option. |
| final BuilderT | gamma(Double value) | Advanced configuration option. |
| final BuilderT | lambda(Double value) | Advanced configuration option. |
| final BuilderT | maxOptimizationRoundsPerHyperparameter(Integer value) | Advanced configuration option. |
| final BuilderT | maxTrees(Integer value) | Advanced configuration option. |
| final BuilderT | numTopFeatureImportanceValues(Integer value) | Advanced configuration option. |
| final BuilderT | predictionFieldName(String value) | Defines the name of the prediction field in the results. |
| final BuilderT | randomizeSeed(Double value) | Defines the seed for the random generator that is used to pick training data. |
| protected abstract BuilderT | self() | |
| final BuilderT | softTreeDepthLimit(Integer value) | Advanced configuration option. |
| final BuilderT | softTreeDepthTolerance(Double value) | Advanced configuration option. |
| final BuilderT | trainingPercent(String value) | Defines the percentage of the eligible documents that will be used for training. |

Methods inherited from class co.elastic.clients.util.WithJsonObjectBuilderBase
withJson

Methods inherited from class co.elastic.clients.util.ObjectBuilderBase
_checkSingleUse, _listAdd, _listAddAll, _mapPut, _mapPutAll

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface co.elastic.clients.json.WithJson
withJson, withJson

Constructor Details
- AbstractBuilder
  
  public AbstractBuilder()

Method Details
- alpha
  
  public final BuilderT alpha(@Nullable Double value)
  
  Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.
  API name: alpha
- dependentVariable
  
  public final BuilderT dependentVariable(String value)
  
  Required - Defines which field of the document is to be predicted. It must match one of the fields in the index used for training. If this field is missing from a document, that document is not used for training, but a prediction is still generated for it with the trained model. It is also known as the continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean, and the field must contain no more than 30 different values. For regression analysis, the data type of the field must be numeric.
  API name: dependent_variable
- downsampleFactor
  
  public final BuilderT downsampleFactor(@Nullable Double value)
  
  Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.
  API name: downsample_factor
- earlyStoppingEnabled
  
  public final BuilderT earlyStoppingEnabled(@Nullable Boolean value)
  
  Advanced configuration option. Specifies whether the training process should finish if it is not finding any better-performing models. If disabled, the training process can take significantly longer and the chance of finding a better-performing model is unremarkable.
  API name: early_stopping_enabled
- eta
  
  public final BuilderT eta(@Nullable Double value)
  
  Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests, which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.
  API name: eta
- etaGrowthRatePerTree
  
  public final BuilderT etaGrowthRatePerTree(@Nullable Double value)
  
  Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.
  API name: eta_growth_rate_per_tree
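As a worked illustration of the 1.05 example above, the sketch below assumes plain compounding: eta is multiplied by the growth rate once for each extra tree. That reading, the helper name `etaAfterTrees`, and the starting value of 0.1 are assumptions for illustration; the actual training code may bound or adjust eta differently.

```java
class EtaGrowthDemo {

    // Assumes simple compounding: eta is multiplied by the rate once per added tree.
    static double etaAfterTrees(double initialEta, double growthRatePerTree, int extraTrees) {
        return initialEta * Math.pow(growthRatePerTree, extraTrees);
    }

    public static void main(String[] args) {
        // Print how eta would evolve over the first few extra trees.
        for (int trees = 0; trees <= 3; trees++) {
            System.out.printf("extra trees=%d eta=%.6f%n",
                trees, etaAfterTrees(0.1, 1.05, trees));
        }
    }
}
```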
- featureBagFraction
  
  public final BuilderT featureBagFraction(@Nullable Double value)
  
  Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.
  API name: feature_bag_fraction
- featureProcessors
  
  public final BuilderT featureProcessors(List<DataframeAnalysisFeatureProcessor> list)
  
  Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.
  API name: feature_processors
  Adds all elements of list to featureProcessors.
- featureProcessors
  
  public final BuilderT featureProcessors(DataframeAnalysisFeatureProcessor value, DataframeAnalysisFeatureProcessor... values)
  
  Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.
  API name: feature_processors
  Adds one or more values to featureProcessors.
- featureProcessors
  
  public final BuilderT featureProcessors(Function<DataframeAnalysisFeatureProcessor.Builder,ObjectBuilder<DataframeAnalysisFeatureProcessor>> fn)
  
  Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.
  API name: feature_processors
  Adds a value to featureProcessors using a builder lambda.
- gamma
  
  public final BuilderT gamma(@Nullable Double value)
  
  Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.
  API name: gamma
- lambda
  
  public final BuilderT lambda(@Nullable Double value)
  
  Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.
  API name: lambda
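To make the roles of gamma and lambda concrete, here is a minimal sketch of a regularization penalty for a single tree. It assumes the form commonly used in gradient boosted trees: gamma times the number of leaves plus lambda times the sum of squared leaf weights. Measuring tree size by leaf count, and the helper name `penalty`, are assumptions; the description above only states that gamma multiplies a linear size penalty and lambda multiplies an L2 term on leaf weights.

```java
class RegularizationPenaltyDemo {

    // Assumed form: gamma * (number of leaves) + lambda * (sum of squared leaf weights).
    static double penalty(double gamma, double lambda, double[] leafWeights) {
        double sumOfSquares = 0.0;
        for (double weight : leafWeights) {
            sumOfSquares += weight * weight;
        }
        return gamma * leafWeights.length + lambda * sumOfSquares;
    }

    public static void main(String[] args) {
        double[] leafWeights = {1.0, -2.0};
        // A higher gamma penalizes the leaf count; a higher lambda penalizes large weights.
        System.out.println(penalty(2.0, 0.5, leafWeights));
        System.out.println(penalty(4.0, 0.5, leafWeights));
        System.out.println(penalty(2.0, 1.0, leafWeights));
    }
}
```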
- maxOptimizationRoundsPerHyperparameter
  
  public final BuilderT maxOptimizationRoundsPerHyperparameter(@Nullable Integer value)
  
  Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.
  API name: max_optimization_rounds_per_hyperparameter
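The step budget described above is a simple product, sketched here with a hypothetical helper `maxOptimizationSteps`. The input counts are made-up example values.

```java
class OptimizationBudgetDemo {

    // Maximum steps = undefined hyperparameters * rounds per hyperparameter.
    static int maxOptimizationSteps(int undefinedHyperparameters, int roundsPerHyperparameter) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(undefinedHyperparameters, roundsPerHyperparameter);
    }

    public static void main(String[] args) {
        // Example: 5 hyperparameters left undefined, 2 rounds allowed for each.
        System.out.println(maxOptimizationSteps(5, 2));
    }
}
```

Every hyperparameter that is set explicitly lowers the number of undefined hyperparameters, and therefore the maximum number of optimization steps.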
- maxTrees
  
  public final BuilderT maxTrees(@Nullable Integer value)
  
  Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.
  API name: max_trees
- numTopFeatureImportanceValues
  
  public final BuilderT numTopFeatureImportanceValues(@Nullable Integer value)
  
  Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.
  API name: num_top_feature_importance_values
- predictionFieldName
  
  public final BuilderT predictionFieldName(@Nullable String value)
  
  Defines the name of the prediction field in the results. Defaults to <dependent_variable>_prediction.
  API name: prediction_field_name
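The default naming rule can be sketched as follows. The helper `effectivePredictionField` is hypothetical and only mirrors the documented default; in practice the server resolves the name.

```java
class PredictionFieldNameDemo {

    // Returns the explicit name if one was set, otherwise <dependent_variable>_prediction.
    static String effectivePredictionField(String dependentVariable, String predictionFieldName) {
        return predictionFieldName != null
            ? predictionFieldName
            : dependentVariable + "_prediction";
    }

    public static void main(String[] args) {
        System.out.println(effectivePredictionField("price", null));
        System.out.println(effectivePredictionField("price", "estimated_price"));
    }
}
```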
- randomizeSeed
  
  public final BuilderT randomizeSeed(@Nullable Double value)
  
  Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).
  API name: randomize_seed
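The reproducibility that a fixed seed provides can be illustrated with `java.util.Random`. This sketch is not the generator or the sampling scheme Elasticsearch uses; the `pickTraining` helper, the per-document selection rule, and the `long` seed type are assumptions made only to show that the same seed yields the same selection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class SeededSelectionDemo {

    // Picks each document id independently with the given probability,
    // driven by a Random created from the supplied seed.
    static List<Integer> pickTraining(int documentCount, double fraction, long seed) {
        Random random = new Random(seed);
        List<Integer> picked = new ArrayList<>();
        for (int id = 0; id < documentCount; id++) {
            if (random.nextDouble() < fraction) {
                picked.add(id);
            }
        }
        return picked;
    }

    public static void main(String[] args) {
        List<Integer> firstRun = pickTraining(20, 0.8, 42L);
        List<Integer> secondRun = pickTraining(20, 0.8, 42L);
        // Same seed and same inputs give the same selection.
        System.out.println(firstRun.equals(secondRun));
    }
}
```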
- softTreeDepthLimit
  
  public final BuilderT softTreeDepthLimit(@Nullable Integer value)
  
  Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.
  API name: soft_tree_depth_limit
- softTreeDepthTolerance
  
  public final BuilderT softTreeDepthTolerance(@Nullable Double value)
  
  Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.
  API name: soft_tree_depth_tolerance
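The numeric ranges documented for the advanced options above can be collected into a client-side pre-check. This is a sketch under stated assumptions: the `HyperparameterRangeCheck` class and its map-based input are hypothetical, the "between" bounds for eta and eta_growth_rate_per_tree are treated as inclusive, and the server remains the authority on what it accepts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoublePredicate;

class HyperparameterRangeCheck {

    private static void check(Map<String, Double> settings, List<String> errors,
                              String name, DoublePredicate valid, String rule) {
        Double value = settings.get(name);
        // A missing value is allowed: the option is then left to hyperparameter optimization.
        if (value != null && !valid.test(value)) {
            errors.add(name + " must be " + rule + " but was " + value);
        }
    }

    // Returns one message per setting that falls outside its documented range.
    static List<String> violations(Map<String, Double> settings) {
        List<String> errors = new ArrayList<>();
        check(settings, errors, "alpha", v -> v >= 0, ">= 0");
        check(settings, errors, "downsample_factor", v -> v > 0 && v <= 1, "> 0 and <= 1");
        check(settings, errors, "eta", v -> v >= 0.001 && v <= 1, "between 0.001 and 1");
        check(settings, errors, "eta_growth_rate_per_tree", v -> v >= 0.5 && v <= 2, "between 0.5 and 2");
        check(settings, errors, "gamma", v -> v >= 0, ">= 0");
        check(settings, errors, "lambda", v -> v >= 0, ">= 0");
        check(settings, errors, "max_trees", v -> v <= 2000, "<= 2000");
        check(settings, errors, "soft_tree_depth_limit", v -> v >= 0, ">= 0");
        check(settings, errors, "soft_tree_depth_tolerance", v -> v >= 0.01, ">= 0.01");
        return errors;
    }

    public static void main(String[] args) {
        Map<String, Double> settings = new LinkedHashMap<>();
        settings.put("eta", 0.1);
        settings.put("downsample_factor", 0.0); // out of range: must be > 0
        settings.put("max_trees", 2001.0);      // out of range: maximum is 2000
        violations(settings).forEach(System.out::println);
    }
}
```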
- trainingPercent
  
  public final BuilderT trainingPercent(@Nullable String value)
  
  Defines the percentage of the eligible documents that will be used for training. Documents that are ignored by the analysis (for example, those that contain arrays with more than one value) are not included in the calculation of the used percentage.
  API name: training_percent
- self
  
  protected abstract BuilderT self()
  
  Specified by: self in class WithJsonObjectBuilderBase<BuilderT extends DataframeAnalysisBase.AbstractBuilder<BuilderT>>
