Package org.deeplearning4j.nn.updater
Class BaseMultiLayerUpdater<T extends Model>
- java.lang.Object
-
- org.deeplearning4j.nn.updater.BaseMultiLayerUpdater<T>
-
- All Implemented Interfaces:
Serializable, Updater
- Direct Known Subclasses:
ComputationGraphUpdater, LayerUpdater, MultiLayerUpdater
public abstract class BaseMultiLayerUpdater<T extends Model> extends Object implements Updater
- See Also:
- Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field):
protected List<INDArray> gradientsForMinibatchDivision
protected boolean initializedMinibatchDivision
protected Map<String,Trainable> layersByName
protected T network
protected List<UpdaterBlock> updaterBlocks
protected INDArray updaterStateViewArray
-
Constructor Summary
Constructors:
BaseMultiLayerUpdater(T network)
BaseMultiLayerUpdater(T network, INDArray updaterState)
-
Method Summary
Methods (Modifier and Type, Method, Description):
protected void divideByMinibatch(boolean isExternal, Gradient gradient, int batchSize)
boolean equals(Object o)
abstract INDArray getFlattenedGradientsView()
protected List<INDArray> getMinibatchDivisionSubsets(INDArray from)
protected abstract Trainable[] getOrderedLayers()
protected abstract INDArray getParams()
INDArray getStateViewArray()
INDArray getStateViewArrayCopy()
    A synchronized version of getStateViewArray() that duplicates the view array internally.
int hashCode()
protected abstract boolean isMiniBatch()
protected boolean isSingleLayerUpdater()
void preApply(Trainable layer, Gradient gradient, int iteration)
    Pre-apply: Apply gradient normalization/clipping
void setStateViewArray(Trainable layer, INDArray viewArray, boolean initialize)
    Set the internal (historical) state view array for this updater
void setStateViewArray(INDArray viewArray)
    Set the view array.
void update(Trainable layer, Gradient gradient, int iteration, int epoch, int batchSize, LayerWorkspaceMgr workspaceMgr)
    Updater: updates the model
void update(Gradient gradient, int iteration, int epoch, int batchSize, LayerWorkspaceMgr workspaceMgr)
    Update the gradient for the model.
-
-
-
Field Detail
-
updaterBlocks
protected final List<UpdaterBlock> updaterBlocks
-
updaterStateViewArray
protected INDArray updaterStateViewArray
-
initializedMinibatchDivision
protected boolean initializedMinibatchDivision
-
-
Constructor Detail
-
BaseMultiLayerUpdater
public BaseMultiLayerUpdater(T network)
-
-
Method Detail
-
getOrderedLayers
protected abstract Trainable[] getOrderedLayers()
- Returns:
- Array of layers, in the correct order (i.e., same order as the parameter/gradient/updater flattening order - input to output for MultiLayerNetwork, or topological order for ComputationGraph)
-
getFlattenedGradientsView
public abstract INDArray getFlattenedGradientsView()
- Returns:
- The flattened gradient view array for the model
-
getParams
protected abstract INDArray getParams()
- Returns:
- The flattened parameter array for the model
-
isMiniBatch
protected abstract boolean isMiniBatch()
- Returns:
- True if the configuration for the model is set to minibatch (divide by minibatch size), false otherwise
-
setStateViewArray
public void setStateViewArray(INDArray viewArray)
Set the view array. Note that this performs an assign operation: the values are copied, and the provided array is not stored internally.
- Parameters:
- viewArray - The new updater state
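The assign behaviour described above can be illustrated with a minimal sketch. It uses plain `double[]` arrays rather than `INDArray`, and the class and method names (`AssignSemanticsSketch`, `setState`, `stateCopy`) are hypothetical; it is not the DL4J implementation, only a model of "copy the values, do not keep the reference".

```java
import java.util.Arrays;

// Hypothetical stand-in for an updater that owns a fixed state buffer.
// Plain double[] is used in place of INDArray; this is not DL4J code.
class AssignSemanticsSketch {
    private final double[] state;

    AssignSemanticsSketch(int length) {
        this.state = new double[length];
    }

    // "Assign": copy the values of newState into the existing buffer.
    // The caller's array is not retained, so later changes to it are not seen here.
    void setState(double[] newState) {
        if (newState.length != state.length) {
            throw new IllegalArgumentException(
                "Length mismatch: expected " + state.length + ", got " + newState.length);
        }
        System.arraycopy(newState, 0, state, 0, state.length);
    }

    // Returns a defensive copy of the internal buffer.
    double[] stateCopy() {
        return Arrays.copyOf(state, state.length);
    }
}
```

Under this model, modifying the provided array after the call has no effect on the updater's state, which is the practical consequence of the array not being stored internally.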
-
setStateViewArray
public void setStateViewArray(Trainable layer, INDArray viewArray, boolean initialize)
Description copied from interface: Updater
Set the internal (historical) state view array for this updater.
- Specified by:
- setStateViewArray in interface Updater
- Parameters:
- layer - Layer that this updater belongs to
- viewArray - View array
- initialize - Whether to initialize the array or not
-
getStateViewArray
public INDArray getStateViewArray()
- Specified by:
- getStateViewArray in interface Updater
- Returns:
- The view array for this updater
-
getStateViewArrayCopy
public INDArray getStateViewArrayCopy()
A synchronized version of getStateViewArray() that duplicates the view array internally. This should be used in preference to getStateViewArray() when the updater state is accessed in one thread while another thread is using the updater for training.
- Returns:
- A copy (duplicate) of the updater state
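The difference between the live view and the synchronized copy can be sketched as follows. This is a simplified model with `double[]` in place of `INDArray`; the names (`SynchronizedStateCopySketch`, `applyStep`, `stateView`, `stateCopy`) are hypothetical, and the assumption that the training step holds the same lock is made only for the sake of the illustration.

```java
import java.util.Arrays;

// Hypothetical model of an updater whose state is mutated in place during training.
// Not DL4J code: double[] stands in for INDArray.
class SynchronizedStateCopySketch {
    private final double[] state;

    SynchronizedStateCopySketch(int length) {
        this.state = new double[length];
    }

    // Training step: mutates the state in place.
    // Assumption for this sketch: it holds the same lock as stateCopy().
    synchronized void applyStep(double delta) {
        for (int i = 0; i < state.length; i++) {
            state[i] += delta;
        }
    }

    // Live view: returns the internal buffer itself, so later training steps are visible.
    double[] stateView() {
        return state;
    }

    // Synchronized snapshot: duplicates the buffer while holding the lock,
    // so the result does not change when training continues.
    synchronized double[] stateCopy() {
        return Arrays.copyOf(state, state.length);
    }
}
```

A thread that saves or inspects the updater state would take the snapshot, while the training thread keeps mutating the live buffer.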
-
update
public void update(Trainable layer, Gradient gradient, int iteration, int epoch, int batchSize, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Updater
Updater: updates the model
-
update
public void update(Gradient gradient, int iteration, int epoch, int batchSize, LayerWorkspaceMgr workspaceMgr)
Update the gradient for the model. This operates in 3 steps:
1. Pre-apply: gradient clipping, etc., on a per-layer basis
2. Execute the updater (Adam, Nesterov momentum, etc.), in blocks of layers at a time
3. Divide by minibatch size
- Parameters:
- gradient - Gradient to update
- iteration - The current iteration (i.e., number of parameter updates so far)
- batchSize - The current minibatch size (number of examples)
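The three steps listed above can be sketched in plain Java. This is an illustrative model, not the DL4J implementation: gradients are `double[]` arrays (one per layer), the pre-apply step is assumed to be element-wise clipping, the updater is assumed to be plain SGD (scaling by a learning rate), and each layer is treated as its own block. All names are hypothetical.

```java
// Hypothetical sketch of the three-step update described above.
// Not DL4J code: double[][] holds one gradient array per layer.
class ThreeStepUpdateSketch {

    // Step 1 (pre-apply): clip each element of one layer's gradient to [-limit, limit].
    static void preApply(double[] layerGradient, double limit) {
        for (int i = 0; i < layerGradient.length; i++) {
            layerGradient[i] = Math.max(-limit, Math.min(limit, layerGradient[i]));
        }
    }

    // Step 2 (updater): plain SGD assumed here, so the gradient is scaled in place
    // by the learning rate. Stateful updaters such as Adam would also read and write
    // their state array at this point.
    static void applyUpdater(double[] layerGradient, double learningRate) {
        for (int i = 0; i < layerGradient.length; i++) {
            layerGradient[i] *= learningRate;
        }
    }

    // Step 3: divide by the minibatch size.
    static void divideByMinibatch(double[] layerGradient, int batchSize) {
        for (int i = 0; i < layerGradient.length; i++) {
            layerGradient[i] /= batchSize;
        }
    }

    // Runs the steps in the order listed in the method description.
    static void update(double[][] gradientsPerLayer, double limit,
                       double learningRate, int batchSize) {
        for (double[] g : gradientsPerLayer) {
            preApply(g, limit);
        }
        for (double[] g : gradientsPerLayer) {
            applyUpdater(g, learningRate);
        }
        for (double[] g : gradientsPerLayer) {
            divideByMinibatch(g, batchSize);
        }
    }
}
```

The gradient arrays are modified in place, mirroring the idea that the updater rewrites the gradient into the final parameter update.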
-
divideByMinibatch
protected void divideByMinibatch(boolean isExternal, Gradient gradient, int batchSize)
-
isSingleLayerUpdater
protected boolean isSingleLayerUpdater()
-
preApply
public void preApply(Trainable layer, Gradient gradient, int iteration)
Pre-apply: Apply gradient normalization/clipping
- Parameters:
- layer - Layer to apply gradient normalization/clipping for
- gradient - Gradient to update
- iteration - The current iteration (i.e., number of parameter updates so far)
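One common form of per-layer gradient clipping rescales a layer's gradient when its L2 norm exceeds a threshold. The sketch below shows that technique on a plain `double[]`; the class and method names are hypothetical, and which normalization or clipping strategy is actually applied depends on the layer configuration, which this page does not describe.

```java
// Hypothetical sketch of clipping one layer's gradient by its L2 norm.
// Not DL4J code: double[] stands in for the layer's gradient array.
class ClipByL2NormSketch {

    // If the L2 norm of the gradient exceeds the threshold, rescale the gradient in place
    // so that its norm equals the threshold; otherwise leave it unchanged.
    static void clipByL2Norm(double[] gradient, double threshold) {
        double sumOfSquares = 0.0;
        for (double g : gradient) {
            sumOfSquares += g * g;
        }
        double norm = Math.sqrt(sumOfSquares);
        if (norm > threshold) {
            double scale = threshold / norm;
            for (int i = 0; i < gradient.length; i++) {
                gradient[i] *= scale;
            }
        }
    }
}
```

Rescaling preserves the direction of the gradient and only limits its magnitude, which is why it is often preferred over element-wise clipping.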
-
-