Class BatchNorm
- All Implemented Interfaces:
Block
- Direct Known Subclasses:
GhostBatchNorm
Because the distribution of the input data varies from batch to batch, training a deep network must compensate for each batch's distribution: parameter values change as each new batch is processed, which in turn shifts the distribution of the activations of the network and of each of its layers. This condition is termed internal covariate shift, and it prevents the network from learning faster and generalizing better to unseen data.
Batch normalization speeds up learning: because all inputs are normalized, the scale of the weight updates during backpropagation is reduced, so larger learning rates can be used without causing gradient problems. In some cases, a batch normalization layer also regularizes the network and reduces, or even eliminates, the need for dropout, which accelerates training further since dropout slows training down by a factor of 2-3. However, batch normalization has been reported to be less beneficial when small batch sizes are used.
Formally, batch normalization is represented below:
\(\hat{x} = \frac{x - \mu_{batch}}{\sqrt{\sigma^2_{batch} + \epsilon}}\),
where \(\hat{x}\) is the normalized input, \(\mu_{batch}\) and \(\sigma^2_{batch}\) respectively
denote the mean and variance of a batch, and \(\epsilon\) (epsilon) is a constant for numerical
stability. The scale and shift operation can be formally defined as follows:
\(y = \gamma\hat{x} + \beta\),
where \(\gamma\) is the scale factor and \(\beta\) is the shift factor.
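To make the two formulas concrete, here is a minimal sketch that computes them by hand with DJL's NDArray API. It is illustrative only: the batch shape, epsilon, gamma, and beta values are arbitrary, and this is not how the class itself is implemented.

import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.Shape;

public final class BatchNormFormulaExample {

    public static void main(String[] args) {
        try (NDManager manager = NDManager.newBaseManager()) {
            // Toy batch: 4 samples, 3 features.
            NDArray x = manager.randomNormal(new Shape(4, 3));
            float eps = 1e-5f; // numerical-stability constant

            NDArray mean = x.mean(new int[] {0});                 // batch mean per feature
            NDArray var = x.sub(mean).pow(2).mean(new int[] {0}); // batch variance per feature

            // Normalize: xHat = (x - mean) / sqrt(var + eps)
            NDArray xHat = x.sub(mean).div(var.add(eps).sqrt());

            // Scale and shift: y = gamma * xHat + beta
            NDArray gamma = manager.ones(new Shape(3));
            NDArray beta = manager.zeros(new Shape(3));
            NDArray y = xHat.mul(gamma).add(beta);
            System.out.println(y);
        }
    }
}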
-
Nested Class Summary
Nested Classes
- static class BatchNorm.BaseBuilder<T extends BatchNorm.BaseBuilder<T>>
- static class BatchNorm.Builder
The Builder to construct a BatchNorm.
-
Field Summary
Fields inherited from class ai.djl.nn.AbstractBlock
children, parameters
Fields inherited from class ai.djl.nn.AbstractBaseBlock
inputNames, inputShapes, outputDataTypes, version
-
Method Summary
- static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar)
Applies Batch Normalization for each channel across a batch of data.
- static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta)
Applies Batch Normalization for each channel across a batch of data.
- static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta, int axis)
Applies Batch Normalization for each channel across a batch of data.
- static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta, int axis, float momentum, float eps, boolean training)
Applies Batch Normalization for each channel across a batch of data.
- protected void beforeInitialize(Shape... inputShapes)
Performs any action necessary before initialization.
- static BatchNorm.BaseBuilder<?> builder()
Creates a builder to build a BatchNorm.
- protected NDList forwardInternal(ParameterStore parameterStore, NDList inputs, boolean training, ai.djl.util.PairList<String, Object> params)
A helper for Block.forward(ParameterStore, NDList, boolean, PairList) after initialization.
- Shape[] getOutputShapes(Shape[] inputShapes)
Returns the expected output shapes of the block for the specified input shapes.
- void loadMetadata(byte loadVersion, DataInputStream is)
Overwrite this to load additional metadata with the parameter values.
- void prepare(Shape[] inputShapes)
Sets the shape of Parameters.
- protected void saveMetadata(DataOutputStream os)
Override this method to save additional data apart from parameter values.
Methods inherited from class ai.djl.nn.AbstractBlock
addChildBlock, addChildBlock, addChildBlockSingleton, addParameter, getChildren, getDirectParameters
Methods inherited from class ai.djl.nn.AbstractBaseBlock
cast, clear, describeInput, forward, forward, forwardInternal, getInputShapes, getOutputDataTypes, getParameters, initialize, initializeChildBlocks, isInitialized, loadParameters, readInputShapes, saveInputShapes, saveParameters, setInitializer, setInitializer, setInitializer, toString
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface ai.djl.nn.Block
forward, freezeParameters, freezeParameters, getCustomMetadata, getOutputShapes
-
Method Details
-
forwardInternal
protected NDList forwardInternal(ParameterStore parameterStore, NDList inputs, boolean training, ai.djl.util.PairList<String, Object> params)
A helper for Block.forward(ParameterStore, NDList, boolean, PairList) after initialization.
- Specified by:
forwardInternal in class AbstractBaseBlock
- Parameters:
parameterStore - the parameter store
inputs - the input NDList
training - true for a training forward pass
params - optional parameters
- Returns:
the output of the forward pass
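As a usage sketch: forwardInternal is protected and is reached through Block.forward once the block is initialized. The following assumes a DJL engine is available at runtime and that the builder's build() returns a BatchNorm; the shapes are illustrative.

import ai.djl.ndarray.NDList;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.DataType;
import ai.djl.ndarray.types.Shape;
import ai.djl.nn.norm.BatchNorm;
import ai.djl.training.ParameterStore;

public final class BatchNormForwardExample {

    public static void main(String[] args) {
        try (NDManager manager = NDManager.newBaseManager()) {
            // Initialize for NCHW input: batch of 2, 3 channels, 4x4 feature maps.
            BatchNorm bn = BatchNorm.builder().build();
            bn.initialize(manager, DataType.FLOAT32, new Shape(2, 3, 4, 4));

            // forward() delegates to forwardInternal after initialization.
            ParameterStore ps = new ParameterStore(manager, false);
            NDList out = bn.forward(
                    ps, new NDList(manager.randomNormal(new Shape(2, 3, 4, 4))), false);
            System.out.println(out.singletonOrThrow().getShape());
        }
    }
}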
-
getOutputShapes
public Shape[] getOutputShapes(Shape[] inputShapes)
Returns the expected output shapes of the block for the specified input shapes.
- Parameters:
inputShapes - the shapes of the inputs
- Returns:
the expected output shapes of the block
-
beforeInitialize
protected void beforeInitialize(Shape... inputShapes)
Performs any action necessary before initialization. For example, keep the input information or verify the layout.
- Overrides:
beforeInitialize in class AbstractBaseBlock
- Parameters:
inputShapes - the expected shapes of the input
-
prepare
public void prepare(Shape[] inputShapes)
Sets the shape of Parameters.
- Overrides:
prepare in class AbstractBaseBlock
- Parameters:
inputShapes - the shapes of inputs
-
saveMetadata
protected void saveMetadata(DataOutputStream os) throws IOException
Override this method to save additional data apart from parameter values. This default implementation saves the currently set input shapes.
- Overrides:
saveMetadata in class AbstractBaseBlock
- Parameters:
os - the non-null output stream the parameter values and metadata are written to
- Throws:
IOException - saving failed
-
loadMetadata
public void loadMetadata(byte loadVersion, DataInputStream is) throws IOException, MalformedModelException
Overwrite this to load additional metadata with the parameter values. If you overwrite AbstractBaseBlock.saveMetadata(DataOutputStream) or need to provide backward compatibility to older binary formats, you probably need to overwrite this. This default implementation checks whether the version number fits and throws a MalformedModelException if it does not. After that it restores the input shapes.
- Overrides:
loadMetadata in class AbstractBaseBlock
- Parameters:
loadVersion - the version used for loading this metadata
is - the input stream we are loading from
- Throws:
IOException - loading failed
MalformedModelException - data can be loaded but has wrong format
-
batchNorm
public static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar)
Applies Batch Normalization for each channel across a batch of data.
- Parameters:
input - the input NDArray of shape (batchSize, inputChannel, *), * could be empty, width, (height, width), (depth, height, width)
runningMean - runningMean NDArray
runningVar - runningVar NDArray
- Returns:
the output NDArray of shape (batchSize, inputChannel, *), * could be empty, width, (height, width), (depth, height, width)
-
batchNorm
public static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta)
Applies Batch Normalization for each channel across a batch of data.
- Parameters:
input - the input NDArray of shape (batchSize, inputChannel, *), * could be empty, width, (height, width), (depth, height, width)
runningMean - runningMean NDArray
runningVar - runningVar NDArray
gamma - gamma weight NDArray
beta - beta weight NDArray
- Returns:
the output NDArray of shape (batchSize, inputChannel, *), * could be empty, width, (height, width), (depth, height, width)
-
batchNorm
public static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta, int axis)
Applies Batch Normalization for each channel across a batch of data.
- Parameters:
input - the input NDArray of shape (batchSize, inputChannel, *), * could be empty, width, (height, width), (depth, height, width)
runningMean - runningMean NDArray
runningVar - runningVar NDArray
gamma - gamma weight NDArray
beta - beta weight NDArray
axis - the axis that should be normalized
- Returns:
the output NDArray of shape (batchSize, inputChannel, *), * could be empty, width, (height, width), (depth, height, width)
-
batchNorm
public static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta, int axis, float momentum, float eps, boolean training)
Applies Batch Normalization for each channel across a batch of data.
- Parameters:
input - the input NDArray of shape (batchSize, inputChannel, *), * could be empty, width, (height, width), (depth, height, width)
runningMean - runningMean NDArray
runningVar - runningVar NDArray
gamma - gamma weight NDArray
beta - beta weight NDArray
axis - the axis that should be normalized
momentum - the value used for the runningMean and runningVar computation
eps - a value added to the denominator for numerical stability
training - true for training mode
- Returns:
the output NDArray of shape (batchSize, inputChannel, *), * could be empty, width, (height, width), (depth, height, width)
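A minimal inference-mode sketch of this overload, with illustrative shapes and the conventional axis/momentum/eps values (these are examples, not required defaults):

import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDList;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.Shape;
import ai.djl.nn.norm.BatchNorm;

public final class BatchNormStaticExample {

    public static void main(String[] args) {
        try (NDManager manager = NDManager.newBaseManager()) {
            // (batchSize, inputChannel, height, width) = (2, 3, 4, 4)
            NDArray input = manager.randomNormal(new Shape(2, 3, 4, 4));
            NDArray runningMean = manager.zeros(new Shape(3));
            NDArray runningVar = manager.ones(new Shape(3));
            NDArray gamma = manager.ones(new Shape(3));
            NDArray beta = manager.zeros(new Shape(3));

            // training = false: the running statistics are used for normalization.
            NDList out = BatchNorm.batchNorm(
                    input, runningMean, runningVar, gamma, beta,
                    1, 0.9f, 1e-5f, false);
            System.out.println(out.singletonOrThrow().getShape()); // (2, 3, 4, 4)
        }
    }
}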
-
builder
public static BatchNorm.BaseBuilder<?> builder()
Creates a builder to build a BatchNorm.
- Returns:
a new builder
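As a usage sketch, a BatchNorm built through this builder is typically placed right after a convolution. The optAxis and optEpsilon setters below are assumed from BatchNorm.BaseBuilder; verify the exact names against your DJL version.

import ai.djl.ndarray.types.Shape;
import ai.djl.nn.SequentialBlock;
import ai.djl.nn.convolutional.Conv2d;
import ai.djl.nn.norm.BatchNorm;

public final class BatchNormBuilderExample {

    public static void main(String[] args) {
        // Conv -> BatchNorm, a common pairing in CNNs (NCHW layout assumed).
        SequentialBlock net = new SequentialBlock()
                .add(Conv2d.builder()
                        .setKernelShape(new Shape(3, 3))
                        .setFilters(32)
                        .build())
                .add(BatchNorm.builder()
                        .optAxis(1)        // channel axis for NCHW input (assumed setter)
                        .optEpsilon(1e-5f) // numerical-stability constant (assumed setter)
                        .build());
        System.out.println(net);
    }
}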