public class BatchNorm extends AbstractBlock
Because the distribution of the input data varies from batch to batch, training a deep network must compensate for each batch's distribution: every parameter update made while processing a new batch changes the distribution of the network's activations (and those of each of its layers). This condition is termed internal covariate shift, and it prevents the network from learning quickly and from generalizing well to unseen data.
Batch normalization speeds up learning: because all inputs are normalized, larger learning rates can be used without causing gradient problems during backpropagation, since normalization reduces the impact of the weight scale on the updates. In some cases, a batch normalization layer also regularizes the network and reduces, or even eliminates, the need for dropout, which accelerates training further since dropout slows training by a factor of 2-3. However, batch normalization has been reported to be less beneficial when small batch sizes are used.
Formally, batch normalization is represented below:
\(\hat{x} = \frac{x - \mu_{batch}}{\sqrt{\sigma^2_{batch} + \epsilon}}\),
where \(\hat{x}\) is the normalized input, \(\mu_{batch}\) and \(\sigma^2_{batch}\) respectively
denote the mean and variance of a batch, and \(\epsilon\) (epsilon) is a constant for numerical
stability. The scale and shift operation can be formally defined as follows:
\(y = \gamma\hat{x} + \beta\),
where \(\gamma\) is the scale factor and \(\beta\) is the shift factor.
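As a quick illustration of the two formulas above, the following self-contained sketch normalizes a batch of scalars and then applies the scale and shift. It uses plain Java arrays rather than DJL's `NDArray`, so it is illustrative only:

```java
// Sketch of batch normalization for a 1-D batch: normalize with the
// batch statistics, then apply the learned scale (gamma) and shift (beta).
public class BatchNormFormula {

    static double[] batchNorm(double[] x, double gamma, double beta, double eps) {
        int n = x.length;
        // Batch mean mu_batch
        double mean = 0;
        for (double v : x) mean += v;
        mean /= n;
        // Batch variance sigma^2_batch (biased estimate over the batch)
        double var = 0;
        for (double v : x) var += (v - mean) * (v - mean);
        var /= n;
        double[] y = new double[n];
        for (int i = 0; i < n; i++) {
            double xHat = (x[i] - mean) / Math.sqrt(var + eps); // normalize
            y[i] = gamma * xHat + beta;                          // scale and shift
        }
        return y;
    }

    public static void main(String[] args) {
        double[] y = batchNorm(new double[] {1, 2, 3, 4}, 1.0, 0.0, 1e-5);
        // With gamma = 1 and beta = 0 the output has roughly zero mean
        // and unit variance.
        System.out.println(java.util.Arrays.toString(y));
    }
}
```

With the default \(\gamma = 1\) and \(\beta = 0\), the output is simply the standardized batch; training then adjusts \(\gamma\) and \(\beta\) so the layer can recover any scale and shift it needs.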
Modifier and Type | Class and Description
---|---
`static class` | `BatchNorm.Builder` The Builder to construct a `BatchNorm`.
Fields inherited from class ai.djl.nn.AbstractBlock: children, inputNames, inputShapes, parameters, parameterShapeCallbacks, version
Modifier and Type | Method and Description
---|---
`static NDList` | `batchNorm(NDArray input, NDArray runningMean, NDArray runningVar)` Applies Batch Normalization for each channel across a batch of data.
`static NDList` | `batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta)` Applies Batch Normalization for each channel across a batch of data.
`static NDList` | `batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta, int axis)` Applies Batch Normalization for each channel across a batch of data.
`static NDList` | `batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta, int axis, float momentum, float eps, boolean training)` Applies Batch Normalization for each channel across a batch of data.
`void` | `beforeInitialize(Shape[] inputShapes)` Performs any action necessary before initialization.
`static BatchNorm.Builder` | `builder()` Creates a builder to build a `BatchNorm`.
`NDList` | `forward(ParameterStore parameterStore, NDList inputs, boolean training, ai.djl.util.PairList<java.lang.String,java.lang.Object> params)` Applies the operating function of the block once.
`Shape[]` | `getOutputShapes(NDManager manager, Shape[] inputShapes)` Returns the expected output shapes of the block for the specified input shapes.
`void` | `loadMetadata(byte version, java.io.DataInputStream is)` Overwrite this to load additional metadata with the parameter values.
`protected void` | `saveMetadata(java.io.DataOutputStream os)` Override this method to save additional data apart from parameter values.
Methods inherited from class ai.djl.nn.AbstractBlock: addChildBlock, addParameter, addParameter, addParameter, cast, clear, describeInput, getChildren, getDirectParameters, getParameters, getParameterShape, initialize, initializeChildBlocks, isInitialized, loadParameters, readInputShapes, saveInputShapes, saveParameters, setInitializer, setInitializer, toString

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface ai.djl.nn.Block: forward, forward, validateLayout
public NDList forward(ParameterStore parameterStore, NDList inputs, boolean training, ai.djl.util.PairList<java.lang.String,java.lang.Object> params)

Applies the operating function of the block once.

Parameters:
- parameterStore - the parameter store
- inputs - the input NDList
- training - true for a training forward pass
- params - optional parameters

public Shape[] getOutputShapes(NDManager manager, Shape[] inputShapes)

Returns the expected output shapes of the block for the specified input shapes.

Parameters:
- manager - an NDManager
- inputShapes - the shapes of the inputs

public void beforeInitialize(Shape[] inputShapes)

Performs any action necessary before initialization.

Overrides:
beforeInitialize in class AbstractBlock

Parameters:
- inputShapes - the expected shapes of the input

protected void saveMetadata(java.io.DataOutputStream os) throws java.io.IOException

Override this method to save additional data apart from parameter values. This default implementation saves the currently set input shapes.

Overrides:
saveMetadata in class AbstractBlock

Parameters:
- os - the non-null output stream the parameter values and metadata are written to

Throws:
- java.io.IOException - saving failed

public void loadMetadata(byte version, java.io.DataInputStream is) throws java.io.IOException, MalformedModelException

Overwrite this to load additional metadata with the parameter values. If you overwrite AbstractBlock.saveMetadata(DataOutputStream) or need to provide backward compatibility to older binary formats, you probably need to overwrite this. This default implementation checks whether the version number fits; if it does not, it throws a MalformedModelException. After that, it restores the input shapes.

Overrides:
loadMetadata in class AbstractBlock

Parameters:
- version - the version used for loading this metadata
- is - the input stream we are loading from

Throws:
- java.io.IOException - loading failed
- MalformedModelException - data can be loaded but has wrong format

public static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar)

Applies Batch Normalization for each channel across a batch of data.

Parameters:
- input - the input NDArray of shape (batchSize, inputChannel, *), where * could be empty, width, (height, width), or (depth, height, width)
- runningMean - runningMean NDArray
- runningVar - runningVar NDArray

Returns:
the output NDArray of shape (batchSize, inputChannel, *), where * could be empty, width, (height, width), or (depth, height, width)

public static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta)

Applies Batch Normalization for each channel across a batch of data.

Parameters:
- input - the input NDArray of shape (batchSize, inputChannel, *), where * could be empty, width, (height, width), or (depth, height, width)
- runningMean - runningMean NDArray
- runningVar - runningVar NDArray
- gamma - gamma weight NDArray
- beta - beta weight NDArray

Returns:
the output NDArray of shape (batchSize, inputChannel, *), where * could be empty, width, (height, width), or (depth, height, width)

public static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta, int axis)

Applies Batch Normalization for each channel across a batch of data.

Parameters:
- input - the input NDArray of shape (batchSize, inputChannel, *), where * could be empty, width, (height, width), or (depth, height, width)
- runningMean - runningMean NDArray
- runningVar - runningVar NDArray
- gamma - gamma weight NDArray
- beta - beta weight NDArray
- axis - the axis that should be normalized

Returns:
the output NDArray of shape (batchSize, inputChannel, *), where * could be empty, width, (height, width), or (depth, height, width)

public static NDList batchNorm(NDArray input, NDArray runningMean, NDArray runningVar, NDArray gamma, NDArray beta, int axis, float momentum, float eps, boolean training)

Applies Batch Normalization for each channel across a batch of data.

Parameters:
- input - the input NDArray of shape (batchSize, inputChannel, *), where * could be empty, width, (height, width), or (depth, height, width)
- runningMean - runningMean NDArray
- runningVar - runningVar NDArray
- gamma - gamma weight NDArray
- beta - beta weight NDArray
- axis - the axis that should be normalized
- momentum - the value used for the runningMean and runningVar computation
- eps - a value added to the denominator for numerical stability
- training - indicates training mode if true

Returns:
the output NDArray of shape (batchSize, inputChannel, *), where * could be empty, width, (height, width), or (depth, height, width)

public static BatchNorm.Builder builder()

Creates a builder to build a BatchNorm.
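To make the semantics of the full `batchNorm` overload concrete, here is a plain-Java sketch of per-channel normalization over an input of shape (batchSize, inputChannel), including the running-statistics update controlled by `momentum` and the `training` flag. This is not DJL code, and the exact momentum convention (running = momentum * running + (1 - momentum) * batch statistic) is an assumption here; the engine backing DJL may define it differently:

```java
// Illustrative sketch of per-channel batch normalization with running
// statistics, for a 2-D input (batchSize, channels). Plain arrays only.
public class ChannelBatchNorm {

    static double[][] batchNorm(double[][] input, double[] runningMean, double[] runningVar,
                                double[] gamma, double[] beta,
                                float momentum, float eps, boolean training) {
        int batch = input.length;
        int channels = input[0].length;
        double[][] out = new double[batch][channels];
        for (int c = 0; c < channels; c++) {
            double mean;
            double var;
            if (training) {
                // Statistics of the current batch for this channel
                mean = 0;
                for (double[] row : input) mean += row[c];
                mean /= batch;
                var = 0;
                for (double[] row : input) var += (row[c] - mean) * (row[c] - mean);
                var /= batch;
                // Update running statistics in place (assumed convention)
                runningMean[c] = momentum * runningMean[c] + (1 - momentum) * mean;
                runningVar[c] = momentum * runningVar[c] + (1 - momentum) * var;
            } else {
                // Inference: use the accumulated running statistics instead
                mean = runningMean[c];
                var = runningVar[c];
            }
            for (int b = 0; b < batch; b++) {
                out[b][c] = gamma[c] * (input[b][c] - mean) / Math.sqrt(var + eps) + beta[c];
            }
        }
        return out;
    }
}
```

Note the training/inference asymmetry: during training each channel is normalized with the statistics of the current batch while the running statistics are updated as a side effect; at inference time the running statistics are used directly, so the output for a single example no longer depends on which batch it arrives in.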