Class BatchNormalization.Builder

java.lang.Object
  org.deeplearning4j.nn.conf.layers.Layer.Builder<T>
    org.deeplearning4j.nn.conf.layers.BaseLayer.Builder<T>
      org.deeplearning4j.nn.conf.layers.FeedForwardLayer.Builder<BatchNormalization.Builder>
        org.deeplearning4j.nn.conf.layers.BatchNormalization.Builder

Enclosing class:
BatchNormalization

public static class BatchNormalization.Builder
extends FeedForwardLayer.Builder<BatchNormalization.Builder>
Field Summary

Fields
- protected double beta
  Used only when 'true' is passed to lockGammaBeta(boolean).
- protected List<LayerConstraint> betaConstraints
  Set constraints to be applied to the beta parameter of this batch normalisation layer.
- protected CNN2DFormat cnn2DFormat
- protected boolean cudnnAllowFallback
  When using CuDNN and an error is encountered, should fallback to the non-CuDNN implementation be allowed? If set to false, an exception in CuDNN will be propagated back to the user.
- protected double decay
  At test time: we can use a global estimate of the mean and variance, calculated using a moving average of the batch means/variances.
- protected double eps
  Epsilon value for batch normalization; small floating-point value added to variance (algorithm 1 in https://arxiv.org/pdf/1502.03167v3.pdf) to reduce/avoid underflow issues. Default: 1e-5
- protected double gamma
  Used only when 'true' is passed to lockGammaBeta(boolean).
- protected List<LayerConstraint> gammaConstraints
  Set constraints to be applied to the gamma parameter of this batch normalisation layer.
- protected boolean isMinibatch
  If doing minibatch training or not.
- protected boolean lockGammaBeta
  If set to true: lock the gamma and beta parameters to the values for each activation, specified by gamma(double) and beta(double).
- protected boolean useLogStd
  How should the moving average of variance be stored? Two different parameterizations are supported.
Fields inherited from class org.deeplearning4j.nn.conf.layers.FeedForwardLayer.Builder
nIn, nOut
-
Fields inherited from class org.deeplearning4j.nn.conf.layers.BaseLayer.Builder
activationFn, biasInit, biasUpdater, gainInit, gradientNormalization, gradientNormalizationThreshold, iupdater, regularization, regularizationBias, weightInitFn, weightNoise
-
Fields inherited from class org.deeplearning4j.nn.conf.layers.Layer.Builder
allParamConstraints, biasConstraints, iDropout, layerName, weightConstraints
-
Method Summary

- BatchNormalization.Builder beta(double beta)
  Used only when 'true' is passed to lockGammaBeta(boolean).
- BatchNormalization build()
- BatchNormalization.Builder constrainBeta(LayerConstraint... constraints)
  Set constraints to be applied to the beta parameter of this batch normalisation layer.
- BatchNormalization.Builder constrainGamma(LayerConstraint... constraints)
  Set constraints to be applied to the gamma parameter of this batch normalisation layer.
- BatchNormalization.Builder cudnnAllowFallback(boolean allowFallback)
  Deprecated.
- BatchNormalization.Builder dataFormat(CNN2DFormat format)
  Set the input and output array data format.
- BatchNormalization.Builder decay(double decay)
  At test time: we can use a global estimate of the mean and variance, calculated using a moving average of the batch means/variances.
- BatchNormalization.Builder eps(double eps)
  Epsilon value for batch normalization; small floating-point value added to variance (algorithm 1 in https://arxiv.org/pdf/1502.03167v3.pdf) to reduce/avoid underflow issues. Default: 1e-5
- BatchNormalization.Builder gamma(double gamma)
  Used only when 'true' is passed to lockGammaBeta(boolean).
- BatchNormalization.Builder helperAllowFallback(boolean allowFallback)
  When using CuDNN or MKLDNN and an error is encountered, should fallback to the non-helper implementation be allowed? If set to false, an exception in the helper will be propagated back to the user.
- BatchNormalization.Builder lockGammaBeta(boolean lockGammaBeta)
  If set to true: lock the gamma and beta parameters to the values for each activation, specified by gamma(double) and beta(double).
- BatchNormalization.Builder minibatch(boolean minibatch)
  If doing minibatch training or not.
- BatchNormalization.Builder useLogStd(boolean useLogStd)
  How should the moving average of variance be stored? Two different parameterizations are supported.
Methods inherited from class org.deeplearning4j.nn.conf.layers.FeedForwardLayer.Builder
nIn, nIn, nOut, nOut, units
-
Methods inherited from class org.deeplearning4j.nn.conf.layers.BaseLayer.Builder
activation, activation, biasInit, biasUpdater, dist, gainInit, gradientNormalization, gradientNormalizationThreshold, l1, l1Bias, l2, l2Bias, regularization, regularizationBias, updater, updater, weightDecay, weightDecay, weightDecayBias, weightDecayBias, weightInit, weightInit, weightInit, weightNoise
-
Methods inherited from class org.deeplearning4j.nn.conf.layers.Layer.Builder
constrainAllParameters, constrainBias, constrainWeights, dropOut, dropOut, name
-
Field Detail
-
decay
protected double decay
At test time: we can use a global estimate of the mean and variance, calculated using a moving average of the batch means/variances. This moving average is implemented as:
globalMeanEstimate = decay * globalMeanEstimate + (1-decay) * batchMean
globalVarianceEstimate = decay * globalVarianceEstimate + (1-decay) * batchVariance
-
eps
protected double eps
Epsilon value for batch normalization; small floating point value added to variance (algorithm 1 in https://arxiv.org/pdf/1502.03167v3.pdf) to reduce/avoid underflow issues.
Default: 1e-5
-
isMinibatch
protected boolean isMinibatch
If doing minibatch training or not. Default: true. Under most circumstances, this should be set to true. If doing full batch training (i.e., all examples in a single DataSet object - very small data sets) then this should be set to false. Affects how global mean/variance estimates are calculated.
-
lockGammaBeta
protected boolean lockGammaBeta
If set to true: lock the gamma and beta parameters to the values for each activation, specified by gamma(double) and beta(double). Default: false -> learn gamma and beta parameter values during network training.
-
gamma
protected double gamma
Used only when 'true' is passed to lockGammaBeta(boolean). Value is not used otherwise.
Default: 1.0
-
beta
protected double beta
Used only when 'true' is passed to lockGammaBeta(boolean). Value is not used otherwise.
Default: 0.0
-
betaConstraints
protected List<LayerConstraint> betaConstraints
Set constraints to be applied to the beta parameter of this batch normalisation layer. Default: no constraints.
Constraints can be used to enforce certain conditions (non-negativity of parameters, max-norm regularization, etc). These constraints are applied at each iteration, after the parameters have been updated.
-
gammaConstraints
protected List<LayerConstraint> gammaConstraints
Set constraints to be applied to the gamma parameter of this batch normalisation layer. Default: no constraints.
Constraints can be used to enforce certain conditions (non-negativity of parameters, max-norm regularization, etc). These constraints are applied at each iteration, after the parameters have been updated.
-
cudnnAllowFallback
protected boolean cudnnAllowFallback
When using CuDNN and an error is encountered, should fallback to the non-CuDNN implementation be allowed? If set to false, an exception in CuDNN will be propagated back to the user. If true, the built-in (non-CuDNN) implementation for BatchNormalization will be used.
-
useLogStd
protected boolean useLogStd
How should the moving average of variance be stored? Two different parameterizations are supported.
useLogStd(false): equivalent to 1.0.0-beta3 and earlier. The variance "parameter" is stored directly as a variable.
useLogStd(true): (Default) variance is stored as log10(stdev).
The motivation here is numerical stability (FP16, etc.) and distributed training: storing the variance directly can cause numerical issues. For example, a standard deviation of 1e-3 (something that could be encountered in practice) gives a variance of 1e-6, which can be problematic for 16-bit floating point.
-
cnn2DFormat
protected CNN2DFormat cnn2DFormat
-
-
Method Detail
-
dataFormat
public BatchNormalization.Builder dataFormat(CNN2DFormat format)
Set the input and output array data format. Defaults to NCHW format - i.e., channels first. See CNN2DFormat for more details.
- Parameters:
  format - Format to use
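For illustration, a minimal sketch of configuring a channels-last layer (the NHWC choice and the nOut(32) channel count are assumptions for the example, not defaults):

    import org.deeplearning4j.nn.conf.CNN2DFormat;
    import org.deeplearning4j.nn.conf.layers.BatchNormalization;

    // Batch normalization over NHWC (channels-last) CNN activations
    BatchNormalization bn = new BatchNormalization.Builder()
            .dataFormat(CNN2DFormat.NHWC)
            .nOut(32)   // illustrative channel count
            .build();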
-
minibatch
public BatchNormalization.Builder minibatch(boolean minibatch)
If doing minibatch training or not. Default: true. Under most circumstances, this should be set to true. If doing full batch training (i.e., all examples in a single DataSet object - very small data sets) then this should be set to false. Affects how global mean/variance estimates are calculated.
- Parameters:
  minibatch - Minibatch parameter
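For example, a minimal sketch of the full-batch case (the only non-default setting shown is minibatch(false)):

    import org.deeplearning4j.nn.conf.layers.BatchNormalization;

    // Full-batch training (all examples in a single DataSet): disable
    // minibatch mode so global mean/variance estimates are computed accordingly
    BatchNormalization bn = new BatchNormalization.Builder()
            .minibatch(false)
            .build();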
-
gamma
public BatchNormalization.Builder gamma(double gamma)
Used only when 'true' is passed to lockGammaBeta(boolean). Value is not used otherwise.
Default: 1.0
- Parameters:
  gamma - Gamma parameter for all activations, used only with locked gamma/beta configuration mode
-
beta
public BatchNormalization.Builder beta(double beta)
Used only when 'true' is passed to lockGammaBeta(boolean). Value is not used otherwise.
Default: 0.0
- Parameters:
  beta - Beta parameter for all activations, used only with locked gamma/beta configuration mode
-
eps
public BatchNormalization.Builder eps(double eps)
Epsilon value for batch normalization; small floating-point value added to variance (algorithm 1 in https://arxiv.org/pdf/1502.03167v3.pdf) to reduce/avoid underflow issues.
Default: 1e-5
- Parameters:
  eps - Epsilon value to use
-
decay
public BatchNormalization.Builder decay(double decay)
At test time: we can use a global estimate of the mean and variance, calculated using a moving average of the batch means/variances. This moving average is implemented as:
globalMeanEstimate = decay * globalMeanEstimate + (1-decay) * batchMean
globalVarianceEstimate = decay * globalVarianceEstimate + (1-decay) * batchVariance
- Parameters:
  decay - Decay value to use for global stats calculation
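A sketch combining the two numerics-related settings, decay and eps (the decay value 0.99 is illustrative; eps(1e-5) restates the documented default):

    import org.deeplearning4j.nn.conf.layers.BatchNormalization;

    // globalMeanEstimate = decay * globalMeanEstimate + (1-decay) * batchMean
    // (same update for the variance estimate); a decay closer to 1.0 gives a
    // smoother, slower-moving global estimate
    BatchNormalization bn = new BatchNormalization.Builder()
            .decay(0.99)   // illustrative value
            .eps(1e-5)     // documented default
            .build();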
-
lockGammaBeta
public BatchNormalization.Builder lockGammaBeta(boolean lockGammaBeta)
If set to true: lock the gamma and beta parameters to the values for each activation, specified by gamma(double) and beta(double). Default: false -> learn gamma and beta parameter values during network training.
- Parameters:
  lockGammaBeta - If true: use fixed beta/gamma values. False: learn during training
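A sketch of the locked configuration (the gamma/beta values shown restate the documented defaults for locked mode):

    import org.deeplearning4j.nn.conf.layers.BatchNormalization;

    // Fixed (non-learned) scale and shift: gamma and beta are locked to the
    // given values instead of being trained
    BatchNormalization bn = new BatchNormalization.Builder()
            .lockGammaBeta(true)
            .gamma(1.0)   // fixed scale
            .beta(0.0)    // fixed shift
            .build();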
-
constrainBeta
public BatchNormalization.Builder constrainBeta(LayerConstraint... constraints)
Set constraints to be applied to the beta parameter of this batch normalisation layer. Default: no constraints.
Constraints can be used to enforce certain conditions (non-negativity of parameters, max-norm regularization, etc). These constraints are applied at each iteration, after the parameters have been updated.- Parameters:
constraints
- Constraints to apply to the beta parameter of this layer
-
constrainGamma
public BatchNormalization.Builder constrainGamma(LayerConstraint... constraints)
Set constraints to be applied to the gamma parameter of this batch normalisation layer. Default: no constraints.
Constraints can be used to enforce certain conditions (non-negativity of parameters, max-norm regularization, etc). These constraints are applied at each iteration, after the parameters have been updated.- Parameters:
constraints
- Constraints to apply to the gamma parameter of this layer
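For example, a sketch applying one constraint to each parameter (the particular constraint classes and the max-norm value 3.0 are illustrative choices, not recommendations):

    import org.deeplearning4j.nn.conf.constraint.MaxNormConstraint;
    import org.deeplearning4j.nn.conf.constraint.NonNegativeConstraint;
    import org.deeplearning4j.nn.conf.layers.BatchNormalization;

    // Applied after each parameter update: keep gamma non-negative and
    // cap the L2 norm of beta along dimension 1 (the parameter row vector)
    BatchNormalization bn = new BatchNormalization.Builder()
            .constrainGamma(new NonNegativeConstraint())
            .constrainBeta(new MaxNormConstraint(3.0, 1))
            .build();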
-
cudnnAllowFallback
@Deprecated
public BatchNormalization.Builder cudnnAllowFallback(boolean allowFallback)
Deprecated.
When using CuDNN and an error is encountered, should fallback to the non-CuDNN implementation be allowed? If set to false, an exception in CuDNN will be propagated back to the user. If true, the built-in (non-CuDNN) implementation for BatchNormalization will be used.
- Parameters:
  allowFallback - Whether fallback to non-CuDNN implementation should be used
-
helperAllowFallback
public BatchNormalization.Builder helperAllowFallback(boolean allowFallback)
When using CuDNN or MKLDNN and an error is encountered, should fallback to the non-helper implementation be allowed? If set to false, an exception in the helper will be propagated back to the user. If true, the built-in (non-MKL/CuDNN) implementation for BatchNormalizationLayer will be used.
- Parameters:
  allowFallback - Whether fallback to the non-helper implementation should be allowed
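A sketch of the strict setting, useful when a silent fallback would mask performance problems:

    import org.deeplearning4j.nn.conf.layers.BatchNormalization;

    // If the cuDNN/MKL-DNN helper fails, propagate the exception to the
    // user instead of silently using the built-in implementation
    BatchNormalization bn = new BatchNormalization.Builder()
            .helperAllowFallback(false)
            .build();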
-
useLogStd
public BatchNormalization.Builder useLogStd(boolean useLogStd)
How should the moving average of variance be stored? Two different parameterizations are supported.
useLogStd(false): equivalent to 1.0.0-beta3 and earlier. The variance "parameter" is stored directly as a variable.
useLogStd(true): (Default) variance is stored as log10(stdev).
The motivation here is numerical stability (FP16, etc.) and distributed training: storing the variance directly can cause numerical issues. For example, a standard deviation of 1e-3 (something that could be encountered in practice) gives a variance of 1e-6, which can be problematic for 16-bit floating point.
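A small worked illustration of the two parameterizations for the stdev = 1e-3 example above (values only; this is not the layer's internal code):

    double stdev = 1e-3;
    double storedDirect = stdev * stdev;      // 1e-6: subnormal territory in FP16
    double storedLogStd = Math.log10(stdev);  // -3.0: comfortably representable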
-
build
public BatchNormalization build()
- Specified by:
  build in class Layer.Builder<BatchNormalization.Builder>
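For context, a minimal end-to-end sketch of using this builder inside a network configuration (all layer sizes, activations, and hyperparameters are illustrative, not defaults of this builder):

    import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
    import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
    import org.deeplearning4j.nn.conf.inputs.InputType;
    import org.deeplearning4j.nn.conf.layers.BatchNormalization;
    import org.deeplearning4j.nn.conf.layers.DenseLayer;
    import org.deeplearning4j.nn.conf.layers.OutputLayer;
    import org.nd4j.linalg.activations.Activation;
    import org.nd4j.linalg.lossfunctions.LossFunctions;

    // Dense -> batch normalization -> output; nIn values are inferred
    // from the declared input type
    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .list()
            .layer(new DenseLayer.Builder().nOut(256)
                    .activation(Activation.RELU).build())
            .layer(new BatchNormalization.Builder()
                    .useLogStd(true)   // documented default
                    .decay(0.9)        // illustrative
                    .build())
            .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                    .nOut(10).activation(Activation.SOFTMAX).build())
            .setInputType(InputType.feedForward(784))
            .build();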