whether to cache the Samples after preprocessing. Default: true
Clear the gradient clipping parameters; in this case, gradient clipping will not be applied.
Constant gradient clipping thresholds.
Return a deep copy of the DLEstimator. Note that trainSummary and validationSummary will not be copied to the new instance since they are currently not thread-safe.
BigDL criterion
When to stop the training, passed in a Trigger. E.g. Trigger.maxIterations
Get the validation configuration used during training.
an Option of Tuple(ValidationTrigger, validation data, Array[ValidationMethod[T]], batch size)
Statistics (LearningRate, Loss, Throughput, Parameters) collected during training on the validation data, if validation data is set; these can be used for visualization via Tensorboard. Use setValidationSummary to enable the validation logger. The log will then be saved to logDir/appName/ as specified by the parameters of validationSummary.
Default: None
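For instance, assuming an already-constructed NNEstimator instance named `estimator` (hypothetical), the validation logger might be enabled like this (a sketch; ValidationSummary comes from BigDL's visualization package, and the log directory and app name are illustrative):

```scala
import com.intel.analytics.bigdl.visualization.ValidationSummary

// Validation logs will be written under /tmp/bigdl_logs/myApp/ (paths are illustrative)
val valSummary = ValidationSummary("/tmp/bigdl_logs", "myApp")
estimator.setValidationSummary(valSummary)
```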
L2 norm gradient clipping threshold.
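Putting the clipping-related setters together, a minimal sketch might look like the following (assuming an existing NNEstimator instance named `estimator`; the threshold values are illustrative):

```scala
// Constant clipping: each gradient value is clamped into [min, max]
estimator.setConstantGradientClipping(-0.5f, 0.5f)

// Alternatively, clip gradients by their total L2 norm
estimator.setGradientClippingByL2Norm(2.0f)

// Remove any clipping settings, so clipping is no longer applied
estimator.clearGradientClipping()
```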
learning rate for the optimizer in the NNEstimator. Default: 0.001
learning rate decay for each iteration. Default: 0
Maximum number of epochs for the training; an epoch refers to one traverse over the training data. Default: 50
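The training hyperparameters above are typically configured through chained setters. A minimal sketch (assuming an existing NNEstimator instance named `estimator`; the values are illustrative):

```scala
import com.intel.analytics.bigdl.optim.Trigger

estimator
  .setLearningRate(0.001)
  .setLearningRateDecay(0.0)
  .setMaxEpoch(50)
  // Optionally override the stopping condition with a Trigger,
  // e.g. stop after a fixed number of iterations instead of epochs
  .setEndWhen(Trigger.maxIteration(1000))
```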
BigDL module to be optimized
optimization method to be used. BigDL supports many optimization methods like Adam, SGD and LBFGS. Refer to package com.intel.analytics.bigdl.optim for all the options. Default: SGD
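For example, the default SGD might be replaced with Adam as follows (a sketch, assuming an existing NNEstimator instance named `estimator`):

```scala
import com.intel.analytics.bigdl.optim.Adam

// Use Adam instead of the default SGD optimizer
estimator.setOptimMethod(new Adam[Float]())
```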
Statistics (LearningRate, Loss, Throughput, Parameters) collected during training on the training data, which can be used for visualization via Tensorboard. Use setTrainSummary to enable the train logger. The log will then be saved to logDir/appName/train as specified by the parameters of TrainSummary.
Default: Not enabled
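A minimal sketch of enabling the train logger (assuming an existing NNEstimator instance named `estimator`; the log directory and app name are illustrative):

```scala
import com.intel.analytics.bigdl.visualization.TrainSummary

// Training logs will be written under /tmp/bigdl_logs/myApp/train (paths are illustrative)
estimator.setTrainSummary(TrainSummary("/tmp/bigdl_logs", "myApp"))
```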
Set a validation evaluation during training.
how often to evaluate the validation set
validation data set
a set of validation methods (ValidationMethod)
batch size for validation
this optimizer
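For example, validation might be configured to run at the end of every epoch (a sketch, assuming an existing NNEstimator instance named `estimator` and a validation DataFrame named `valDF`, both hypothetical):

```scala
import com.intel.analytics.bigdl.optim.{Top1Accuracy, Trigger}

// Evaluate Top-1 accuracy on valDF at the end of every epoch, with batch size 64
estimator.setValidation(Trigger.everyEpoch, valDF, Array(new Top1Accuracy[Float]()), 64)
```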
Enable the validation summary.
Subclasses can extend this method to return the required model for different transform tasks.
NNEstimator extends org.apache.spark.ml.Estimator and supports training a BigDL model with Spark DataFrame data. It can be integrated into a standard Spark ML Pipeline, allowing users to combine the components of BigDL and Spark MLlib.
NNEstimator supports different feature and label data types through Preprocessing. We provide pre-defined Preprocessing implementations for popular data types like Array or Vector in the package com.intel.analytics.zoo.feature, while users can also develop customized Preprocessing. During fit, NNEstimator extracts feature and label data from the input DataFrame and uses the Preprocessing to prepare data for the model. Using Preprocessing allows NNEstimator to cache only the raw data, which decreases memory consumption during feature conversion and training. More concrete examples are available in the package com.intel.analytics.zoo.examples.nnframes
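An end-to-end usage might look like the following sketch. The model, feature/label sizes, column names, and the DataFrame `trainDF` are all illustrative assumptions, not part of the library's documentation:

```scala
import com.intel.analytics.bigdl.nn.{ClassNLLCriterion, Linear, LogSoftMax, Sequential}
import com.intel.analytics.zoo.pipeline.nnframes.NNEstimator

// A toy two-class classifier over 2-dimensional features
val model = Sequential[Float]()
  .add(Linear[Float](2, 2))
  .add(LogSoftMax[Float]())
val criterion = ClassNLLCriterion[Float]()

// featureSize / labelSize describe the tensor shapes of the feature and label columns
val estimator = NNEstimator(model, criterion, Array(2), Array(1))
  .setBatchSize(32)
  .setMaxEpoch(10)

// trainDF is a DataFrame with "features" and "label" columns (hypothetical)
val nnModel = estimator.fit(trainDF)
val predictions = nnModel.transform(trainDF)
```

The fitted NNModel is itself a Spark ML Transformer, so it can be used as a pipeline stage or applied directly with transform.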
data type of BigDL Model