Class BatchNormalization
java.lang.Object
  org.deeplearning4j.nn.layers.AbstractLayer&lt;LayerConfT&gt;
    org.deeplearning4j.nn.layers.BaseLayer&lt;BatchNormalization&gt;
      org.deeplearning4j.nn.layers.normalization.BatchNormalization
-
- All Implemented Interfaces:
Serializable, Cloneable, Layer, Model, Trainable
public class BatchNormalization extends BaseLayer<BatchNormalization>
- See Also:
- Serialized Form
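A minimal usage sketch, assuming an MNIST-sized dense network; networks are built from the configuration class of the same name (org.deeplearning4j.nn.conf.layers.BatchNormalization), which creates instances of this runtime class during init():

import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.BatchNormalization;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Batch normalization inserted after a dense layer; sizes are illustrative.
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .list()
        .layer(new DenseLayer.Builder().nIn(784).nOut(128)
                .activation(Activation.RELU).build())
        .layer(new BatchNormalization.Builder().build())   // configuration class, not this runtime class
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
                .nOut(10).activation(Activation.SOFTMAX).build())
        .build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();   // creates the org.deeplearning4j.nn.layers.normalization.BatchNormalization instances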
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.deeplearning4j.nn.api.Layer
Layer.TrainingMode, Layer.Type
-
-
Field Summary
Modifier and Type                 Field
static String                     BATCH_NORM_CUDNN_HELPER_CLASS_NAME
protected int                     helperCountFail
protected int                     index
protected List&lt;TrainingListener&gt;  listeners
protected static double           ONE_ON_2LOGE_10
protected INDArray                std
protected INDArray                xHat
protected INDArray                xMu
-
Fields inherited from class org.deeplearning4j.nn.layers.BaseLayer
gradient, gradientsFlattened, gradientViews, optimizer, params, paramsFlattened, score, solver, weightNoiseParams
-
Fields inherited from class org.deeplearning4j.nn.layers.AbstractLayer
cacheMode, conf, dataType, dropoutApplied, epochCount, input, inputModificationAllowed, iterationCount, maskArray, maskState, preOutput, trainingListeners
-
-
Constructor Summary
Constructor
BatchNormalization(NeuralNetConfiguration conf, DataType dataType)
-
Method Summary
INDArray activate(boolean training, LayerWorkspaceMgr workspaceMgr)
    Perform forward pass and return the activations array with the last set input
Pair&lt;Gradient,INDArray&gt; backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)
    Calculate the gradient relative to the error in the next layer
void fit(INDArray input, LayerWorkspaceMgr workspaceMgr)
    Fit the model to the given data
LayerHelper getHelper()
int getIndex()
    Get the layer index.
Collection&lt;TrainingListener&gt; getListeners()
    Get the iteration listeners for this layer.
long[] getShape(INDArray x)
Gradient gradient()
    Get the gradient.
boolean isPretrainLayer()
    Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc)
INDArray preOutput(INDArray x, Layer.TrainingMode training, LayerWorkspaceMgr workspaceMgr)
void setIndex(int index)
    Set the layer index.
void setListeners(TrainingListener... listeners)
    Set the TrainingListeners for this model.
Layer.Type type()
    Returns the layer type
boolean updaterDivideByMinibatch(String paramName)
    DL4J layers typically produce the sum of the gradients during the backward pass for each layer, and if required (if minibatch=true) then divide by the minibatch size. However, there are some exceptions, such as the batch norm mean/variance estimate parameters: these "gradients" are actually not gradients, but are updates to be applied directly to the parameter vector.
Methods inherited from class org.deeplearning4j.nn.layers.BaseLayer
calcRegularizationScore, clear, clearNoiseWeightParams, clone, computeGradientAndScore, fit, getGradientsViewArray, getOptimizer, getParam, getParamWithNoise, hasBias, hasLayerNorm, layerConf, numParams, params, paramTable, paramTable, preOutput, preOutputWithPreNorm, score, setBackpropGradientsViewArray, setParam, setParams, setParams, setParamsViewArray, setParamTable, setScoreWithZ, toString, update, update
-
Methods inherited from class org.deeplearning4j.nn.layers.AbstractLayer
activate, addListeners, allowInputModification, applyConstraints, applyDropOutIfNecessary, applyMask, assertInputSet, backpropDropOutIfPresent, batchSize, close, conf, feedForwardMaskArray, getConfig, getEpochCount, getInput, getInputMiniBatchSize, getMaskArray, gradientAndScore, init, input, layerId, numParams, setCacheMode, setConf, setEpochCount, setInput, setInputMiniBatchSize, setListeners, setMaskArray
-
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface org.deeplearning4j.nn.api.Layer
getIterationCount, setIterationCount
-
-
-
-
Field Detail
-
ONE_ON_2LOGE_10
protected static final double ONE_ON_2LOGE_10
-
helperCountFail
protected int helperCountFail
-
index
protected int index
-
listeners
protected List<TrainingListener> listeners
-
std
protected INDArray std
-
xMu
protected INDArray xMu
-
xHat
protected INDArray xHat
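These cached arrays correspond to the standard batch normalization quantities. A comment sketch of the assumed relationship (standard batch-norm notation, not the literal implementation):

// Assumed correspondence with the usual batch-norm formulation (per feature,
// over the current minibatch); a conceptual sketch, not the literal code:
//   xMu  = x - mean(x)            // input minus minibatch mean
//   std  = sqrt(var(x) + eps)     // standard deviation, eps for numerical stability
//   xHat = xMu / std              // normalized activations
// The layer output is then y = gamma * xHat + beta, and caching these arrays
// during the forward pass lets backpropGradient() reuse them.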
-
BATCH_NORM_CUDNN_HELPER_CLASS_NAME
public static final String BATCH_NORM_CUDNN_HELPER_CLASS_NAME
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
BatchNormalization
public BatchNormalization(NeuralNetConfiguration conf, DataType dataType)
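User code rarely invokes this constructor directly; layers are instantiated by the network during init(). A sketch of retrieving the runtime instance afterwards, assuming the net and layer index from the earlier configuration sketch:

org.deeplearning4j.nn.api.Layer l = net.getLayer(1);   // index 1 = the batch norm layer in the sketch above
if (l instanceof org.deeplearning4j.nn.layers.normalization.BatchNormalization) {
    org.deeplearning4j.nn.layers.normalization.BatchNormalization bn =
            (org.deeplearning4j.nn.layers.normalization.BatchNormalization) l;
    System.out.println("layer index: " + bn.getIndex());
}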
-
-
Method Detail
-
type
public Layer.Type type()
Description copied from interface: Layer
Returns the layer type
- Specified by:
type in interface Layer
- Overrides:
type in class AbstractLayer&lt;BatchNormalization&gt;
- Returns:
- the layer type
-
backpropGradient
public Pair<Gradient,INDArray> backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Calculate the gradient relative to the error in the next layer
- Specified by:
backpropGradient in interface Layer
- Overrides:
backpropGradient in class BaseLayer&lt;BatchNormalization&gt;
- Parameters:
epsilon - w^(L+1)*delta^(L+1). Or, equivalently, dC/da, i.e., (dC/dz)*(dz/da) = dC/da, where C is the cost function and a = sigma(z) is the activation.
workspaceMgr - Workspace manager
- Returns:
- Pair&lt;Gradient,INDArray&gt;, where Gradient is the gradient for this layer and INDArray is the epsilon (activation gradient) needed by the next layer, but before the element-wise multiply by sigmaPrime(z). So for a standard feed-forward layer, if this layer is L, then return.getSecond() == dL/dIn = (w^(L)*(delta^(L))^T)^T. Note that the returned array should be placed in the ArrayType.ACTIVATION_GRAD workspace via the workspace manager.
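A hedged sketch of the call contract; in practice the network invokes this during backprop, and bn, epsilon, and workspaceMgr below are assumed to be supplied by that machinery:

// epsilon: dL/dOutput for this layer, with the same shape as the layer's activations.
Pair&lt;Gradient, INDArray&gt; result = bn.backpropGradient(epsilon, workspaceMgr);
Gradient layerGradient = result.getFirst();    // gradients for gamma/beta, plus mean/var updates
INDArray epsilonBelow  = result.getSecond();   // dL/dInput, handed to the preceding layer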
-
fit
public void fit(INDArray input, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Model
Fit the model to the given data
- Specified by:
fit in interface Model
- Overrides:
fit in class BaseLayer&lt;BatchNormalization&gt;
- Parameters:
input - the data to fit the model to
-
activate
public INDArray activate(boolean training, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Perform forward pass and return the activations array with the last set input
- Specified by:
activate in interface Layer
- Overrides:
activate in class BaseLayer&lt;BatchNormalization&gt;
- Parameters:
training - training or test mode
workspaceMgr - Workspace manager
- Returns:
- the activation (layer output) of the last specified input. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager.
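Because batch normalization normalizes with minibatch statistics in training mode but with the running mean/variance estimates at inference, the training flag changes the output. A network-level sketch, assuming the net from the earlier example (Nd4j and DataType from org.nd4j.linalg):

INDArray features = Nd4j.rand(DataType.FLOAT, 32, 784);   // minibatch of 32, matching nIn above
INDArray trainOut = net.output(features, true);    // training mode: minibatch statistics
INDArray testOut  = net.output(features, false);   // inference mode: running mean/variance estimates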
-
gradient
public Gradient gradient()
Description copied from interface: Model
Get the gradient. Note that this method will not calculate the gradient; it will rather return the gradient that has been computed before. For calculating the gradient, see Model.computeGradientAndScore(LayerWorkspaceMgr).
- Specified by:
gradient in interface Model
- Overrides:
gradient in class BaseLayer&lt;BatchNormalization&gt;
- Returns:
- the gradient for this model, as calculated before
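A sketch of the compute-then-read pattern this contract implies, assuming a MultiLayerNetwork with features and labels already available; the no-workspace manager is used here purely for illustration:

net.setInput(features);
net.setLabels(labels);
net.computeGradientAndScore(LayerWorkspaceMgr.noWorkspaces());   // populates the stored gradient
Gradient g = net.gradient();   // returns the gradient computed above, without recomputing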
-
preOutput
public INDArray preOutput(INDArray x, Layer.TrainingMode training, LayerWorkspaceMgr workspaceMgr)
-
getListeners
public Collection<TrainingListener> getListeners()
Description copied from interface: Layer
Get the iteration listeners for this layer.
- Specified by:
getListeners in interface Layer
- Overrides:
getListeners in class AbstractLayer&lt;BatchNormalization&gt;
-
setListeners
public void setListeners(TrainingListener... listeners)
Description copied from interface: Layer
Set the TrainingListeners for this model. If any listeners have previously been set, they will be replaced by this method.
- Specified by:
setListeners in interface Layer
- Specified by:
setListeners in interface Model
- Overrides:
setListeners in class AbstractLayer&lt;BatchNormalization&gt;
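For example, attaching a score listener (ScoreIterationListener from org.deeplearning4j.optimize.listeners; the print frequency of 10 is an arbitrary choice):

net.setListeners(new ScoreIterationListener(10));   // replaces any previously set listeners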
-
setIndex
public void setIndex(int index)
Description copied from interface: Layer
Set the layer index.
- Specified by:
setIndex in interface Layer
- Overrides:
setIndex in class AbstractLayer&lt;BatchNormalization&gt;
-
getIndex
public int getIndex()
Description copied from interface: Layer
Get the layer index.
- Specified by:
getIndex in interface Layer
- Overrides:
getIndex in class AbstractLayer&lt;BatchNormalization&gt;
-
isPretrainLayer
public boolean isPretrainLayer()
Description copied from interface: Layer
Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc)
- Returns:
- true if the layer can be pretrained (using fit(INDArray)), false otherwise
-
getHelper
public LayerHelper getHelper()
- Specified by:
getHelper in interface Layer
- Overrides:
getHelper in class AbstractLayer&lt;BatchNormalization&gt;
- Returns:
- the layer helper, if any
-
getShape
public long[] getShape(INDArray x)
-
updaterDivideByMinibatch
public boolean updaterDivideByMinibatch(String paramName)
Description copied from interface: Trainable
DL4J layers typically produce the sum of the gradients during the backward pass for each layer, and if required (if minibatch=true) then divide by the minibatch size.
However, there are some exceptions, such as the batch norm mean/variance estimate parameters: these "gradients" are actually not gradients, but are updates to be applied directly to the parameter vector. Put another way, most gradients should be divided by the minibatch size to get the average; some "gradients" are actually final updates already, and should not be divided by the minibatch size.
- Specified by:
updaterDivideByMinibatch in interface Trainable
- Overrides:
updaterDivideByMinibatch in class AbstractLayer&lt;BatchNormalization&gt;
- Parameters:
paramName - Name of the parameter
- Returns:
- True if gradients should be divided by minibatch (most params); false otherwise (edge cases like batch norm mean/variance estimates)
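A hedged illustration using the batch norm parameter keys from BatchNormalizationParamInitializer (org.deeplearning4j.nn.params), with bn from the earlier sketch; the expected values follow from the description above:

// gamma/beta carry true gradients and are averaged over the minibatch:
boolean divGamma = bn.updaterDivideByMinibatch(BatchNormalizationParamInitializer.GAMMA);        // expected: true
// the running mean "gradient" is a direct parameter update, not averaged:
boolean divMean  = bn.updaterDivideByMinibatch(BatchNormalizationParamInitializer.GLOBAL_MEAN);  // expected: false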
-
-