Class BaseWrapperLayer
java.lang.Object
    org.deeplearning4j.nn.layers.wrapper.BaseWrapperLayer
All Implemented Interfaces:
Serializable, Cloneable, Layer, Model, Trainable
Direct Known Subclasses:
FrozenLayer, FrozenLayerWithBackprop, LastTimeStepLayer, MaskZeroLayer, TimeDistributedLayer
public abstract class BaseWrapperLayer extends Object implements Layer
See Also:
Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.deeplearning4j.nn.api.Layer
Layer.TrainingMode, Layer.Type
-
-
Field Summary
Modifier and Type, Field
protected Layer underlying
-
Constructor Summary
Constructor
BaseWrapperLayer(@NonNull Layer underlying)
-
Method Summary
Modifier and Type, Method
    Description
INDArray activate(boolean training, LayerWorkspaceMgr workspaceMgr)
    Perform forward pass and return the activations array with the last set input.
INDArray activate(INDArray input, boolean training, LayerWorkspaceMgr workspaceMgr)
    Perform forward pass and return the activations array with the specified input.
void addListeners(TrainingListener... listener)
    Adds additional TrainingListeners to the existing listeners.
void allowInputModification(boolean allow)
    A performance optimization: mark whether the layer is allowed to modify its input array in-place.
void applyConstraints(int iteration, int epoch)
    Apply any constraints to the model.
Pair<Gradient,INDArray> backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)
    Calculate the gradient relative to the error in the next layer.
int batchSize()
    The current inputs batch size.
double calcRegularizationScore(boolean backpropParamsOnly)
    Calculate the regularization component of the score for the parameters in this layer, for example the L1, L2 and/or weight decay components of the loss function.
void clear()
    Clear input.
void clearNoiseWeightParams()
void close()
void computeGradientAndScore(LayerWorkspaceMgr workspaceMgr)
    Update the score.
NeuralNetConfiguration conf()
    The configuration for the neural network.
Pair<INDArray,MaskState> feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState, int minibatchSize)
    Feed forward the input mask array, setting in the layer as appropriate.
void fit()
    All models have a fit method.
void fit(INDArray data, LayerWorkspaceMgr workspaceMgr)
    Fit the model to the given data.
TrainingConfig getConfig()
int getEpochCount()
INDArray getGradientsViewArray()
LayerHelper getHelper()
int getIndex()
    Get the layer index.
int getInputMiniBatchSize()
    Get current/last input mini-batch size, as set by setInputMiniBatchSize(int).
int getIterationCount()
Collection<TrainingListener> getListeners()
    Get the iteration listeners for this layer.
INDArray getMaskArray()
ConvexOptimizer getOptimizer()
    Returns this model's optimizer.
INDArray getParam(String param)
    Get the parameter.
Gradient gradient()
    Get the gradient.
Pair<Gradient,Double> gradientAndScore()
    Get the gradient and score.
void init()
    Init the model.
INDArray input()
    The input/feature matrix for the model.
boolean isPretrainLayer()
    Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc).
long numParams()
    The number of parameters for the model.
long numParams(boolean backwards)
    The number of parameters for the model.
INDArray params()
    Parameters of the model (if any).
Map<String,INDArray> paramTable()
    The param table.
Map<String,INDArray> paramTable(boolean backpropParamsOnly)
    Table of parameters by key, for backprop. For many models (dense layers, etc.), all parameters are backprop parameters.
double score()
    The score for the model.
void setBackpropGradientsViewArray(INDArray gradients)
    Set the gradients array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
void setCacheMode(CacheMode mode)
    Sets the given CacheMode for the current layer.
void setConf(NeuralNetConfiguration conf)
    Setter for the configuration.
void setEpochCount(int epochCount)
    Set the current epoch count (number of epochs passed) for the layer/network.
void setIndex(int index)
    Set the layer index.
void setInput(INDArray input, LayerWorkspaceMgr workspaceMgr)
    Set the layer input.
void setInputMiniBatchSize(int size)
    Set current/last input mini-batch size. Used for score and gradient calculations.
void setIterationCount(int iterationCount)
    Set the current iteration count (number of parameter updates) for the layer/network.
void setListeners(Collection<TrainingListener> listeners)
    Set the TrainingListeners for this model.
void setListeners(TrainingListener... listeners)
    Set the TrainingListeners for this model.
void setMaskArray(INDArray maskArray)
    Set the mask array.
void setParam(String key, INDArray val)
    Set the parameter with a new ndarray.
void setParams(INDArray params)
    Set the parameters for this model.
void setParamsViewArray(INDArray params)
    Set the initial parameters array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
void setParamTable(Map<String,INDArray> paramTable)
    Setter for the param table.
Layer.Type type()
    Returns the layer type.
void update(Gradient gradient)
    Update layer weights and biases with gradient change.
void update(INDArray gradient, String paramType)
    Perform one update applying the gradient.
boolean updaterDivideByMinibatch(String paramName)
    DL4J layers typically produce the sum of the gradients during the backward pass for each layer, and if required (if minibatch=true) then divide by the minibatch size. However, there are some exceptions, such as the batch norm mean/variance estimate parameters: these "gradients" are actually not gradients, but are updates to be applied directly to the parameter vector.
-
-
-
Field Detail
-
underlying
protected Layer underlying
-
-
Constructor Detail
-
BaseWrapperLayer
public BaseWrapperLayer(@NonNull Layer underlying)
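The class combines a protected underlying field with a constructor that takes the wrapped layer. A wrapper of this shape usually forwards each call to the wrapped instance, and subclasses override only the methods they change. The following is a minimal sketch of that pattern, not the DL4J source: MiniLayer, MiniWrapperLayer, DoublingLayer and NegatingWrapper are hypothetical names, and the two-method interface is a cut-down stand-in for the real Layer interface.

```java
import java.util.Objects;

// Hypothetical, cut-down stand-in for a layer interface (not the DL4J Layer API).
interface MiniLayer {
    double[] activate(double[] input);
    long numParams();
}

// Base wrapper: stores the wrapped layer and forwards every call to it.
abstract class MiniWrapperLayer implements MiniLayer {
    protected final MiniLayer underlying;

    protected MiniWrapperLayer(MiniLayer underlying) {
        // Mirrors the @NonNull contract on the constructor argument.
        this.underlying = Objects.requireNonNull(underlying, "underlying");
    }

    @Override
    public double[] activate(double[] input) {
        return underlying.activate(input);
    }

    @Override
    public long numParams() {
        return underlying.numParams();
    }
}

// A parameter-free layer that doubles each input value.
class DoublingLayer implements MiniLayer {
    @Override
    public double[] activate(double[] input) {
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = 2.0 * input[i];
        }
        return out;
    }

    @Override
    public long numParams() {
        return 0L;
    }
}

// Example subclass: overrides one method and inherits the rest by delegation.
class NegatingWrapper extends MiniWrapperLayer {
    NegatingWrapper(MiniLayer underlying) {
        super(underlying);
    }

    @Override
    public double[] activate(double[] input) {
        double[] out = underlying.activate(input);
        for (int i = 0; i < out.length; i++) {
            out[i] = -out[i];
        }
        return out;
    }
}
```

In this sketch NegatingWrapper changes only activate, while numParams still reaches the wrapped layer through the base class.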
-
-
Method Detail
-
setCacheMode
public void setCacheMode(CacheMode mode)
Description copied from interface: Layer
Sets the given CacheMode for the current layer.
Specified by:
setCacheMode in interface Layer
-
calcRegularizationScore
public double calcRegularizationScore(boolean backpropParamsOnly)
Description copied from interface: Layer
Calculate the regularization component of the score for the parameters in this layer, for example the L1, L2 and/or weight decay components of the loss function.
Specified by:
calcRegularizationScore in interface Layer
Parameters:
backpropParamsOnly - If true: calculate regularization score based on backprop params only. If false: calculate based on all params (including pretrain params, if any)
Returns:
the regularization score of
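To make the L1 and L2 components concrete, here is a small sketch that sums both penalty terms over a flat parameter array. RegularizationSketch and its score method are hypothetical names, and the 0.5 factor on the L2 term is a common convention assumed here, not something this page specifies.

```java
// Hypothetical helper, not part of DL4J: sums L1 and L2 penalty terms over a parameter array.
final class RegularizationSketch {
    private RegularizationSketch() {}

    // Computes l1 * sum(|w|) + 0.5 * l2 * sum(w^2); the 0.5 factor is an assumed convention.
    static double score(double[] params, double l1, double l2) {
        double absSum = 0.0;
        double sqSum = 0.0;
        for (double w : params) {
            absSum += Math.abs(w);
            sqSum += w * w;
        }
        return l1 * absSum + 0.5 * l2 * sqSum;
    }
}
```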
-
type
public Layer.Type type()
Description copied from interface: Layer
Returns the layer type
-
backpropGradient
public Pair<Gradient,INDArray> backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Calculate the gradient relative to the error in the next layer.
Specified by:
backpropGradient in interface Layer
Parameters:
epsilon - w^(L+1)*delta^(L+1). Equivalently, dC/da, i.e., (dC/dz)*(dz/da) = dC/da, where C is the cost function and a=sigma(z) is the activation.
workspaceMgr - Workspace manager
Returns:
Pair<Gradient,INDArray>, where Gradient is the gradient for this layer and INDArray is the epsilon (activation gradient) needed by the next layer, before the element-wise multiply by sigmaPrime(z). So for a standard feed-forward layer, if this layer is L, then return.getSecond() == dL/dIn = (w^(L)*(delta^(L))^T)^T. Note that the returned array should be placed in the ArrayType.ACTIVATION_GRAD workspace via the workspace manager.
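The epsilon convention can be illustrated for a standard feed-forward layer with plain arrays. The sketch below is an assumption-laden illustration, not DL4J code: DenseBackpropSketch and Result are hypothetical names, the activation is assumed to be a sigmoid, and a single example is processed instead of a minibatch. It multiplies the incoming epsilon by sigmaPrime(z) to get delta, then returns the weight gradient together with the epsilon for the layer below.

```java
// Hypothetical single-example dense layer with sigmoid activation; not DL4J code.
final class DenseBackpropSketch {
    private DenseBackpropSketch() {}

    // Pair-like result: weight gradient for this layer plus the epsilon passed to the layer below.
    static final class Result {
        final double[][] weightGradient;
        final double[] epsilonOut;

        Result(double[][] weightGradient, double[] epsilonOut) {
            this.weightGradient = weightGradient;
            this.epsilonOut = epsilonOut;
        }
    }

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // input: [nIn], w: [nIn][nOut], b: [nOut], epsilon: dC/da for this layer, [nOut].
    static Result backprop(double[] input, double[][] w, double[] b, double[] epsilon) {
        int nIn = w.length;
        int nOut = b.length;

        // delta = epsilon * sigmaPrime(z), element-wise, with z = input * w + b.
        double[] delta = new double[nOut];
        for (int j = 0; j < nOut; j++) {
            double z = b[j];
            for (int i = 0; i < nIn; i++) {
                z += input[i] * w[i][j];
            }
            double a = sigmoid(z);
            delta[j] = epsilon[j] * a * (1.0 - a);
        }

        // Weight gradient is the outer product input^T * delta; epsilonOut is w * delta.
        double[][] weightGradient = new double[nIn][nOut];
        double[] epsilonOut = new double[nIn];
        for (int i = 0; i < nIn; i++) {
            for (int j = 0; j < nOut; j++) {
                weightGradient[i][j] = input[i] * delta[j];
                epsilonOut[i] += w[i][j] * delta[j];
            }
        }
        return new Result(weightGradient, epsilonOut);
    }
}
```

Note that epsilonOut has not yet been multiplied by the derivative of the lower layer's activation, which matches the "before element-wise multiply by sigmaPrime(z)" wording above.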
-
activate
public INDArray activate(boolean training, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Perform forward pass and return the activations array with the last set input.
Specified by:
activate in interface Layer
Parameters:
training - training or test mode
workspaceMgr - Workspace manager
Returns:
the activation (layer output) of the last specified input. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager.
-
activate
public INDArray activate(INDArray input, boolean training, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Perform forward pass and return the activations array with the specified input.
Specified by:
activate in interface Layer
Parameters:
input - the input to use
training - train or test mode
workspaceMgr - Workspace manager.
Returns:
Activations array. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager.
-
getListeners
public Collection<TrainingListener> getListeners()
Description copied from interface: Layer
Get the iteration listeners for this layer.
Specified by:
getListeners in interface Layer
-
setListeners
public void setListeners(TrainingListener... listeners)
Description copied from interface: Layer
Set the TrainingListeners for this model. If any listeners have previously been set, they will be replaced by this method.
Specified by:
setListeners in interface Layer
Specified by:
setListeners in interface Model
-
addListeners
public void addListeners(TrainingListener... listener)
Description copied from interface: Model
Adds additional TrainingListeners to the existing listeners.
Specified by:
addListeners in interface Model
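The difference between setListeners (replace) and addListeners (append) can be sketched with a small holder class. ListenerHolderSketch is a hypothetical name, and plain strings stand in for TrainingListener instances; this only illustrates the documented semantics and is not how DL4J stores listeners.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical listener holder illustrating replace-versus-append semantics; not DL4J code.
final class ListenerHolderSketch {
    private final List<String> listeners = new ArrayList<>();

    // Replaces any previously set listeners.
    void setListeners(String... newListeners) {
        listeners.clear();
        listeners.addAll(Arrays.asList(newListeners));
    }

    // Keeps the existing listeners and appends the new ones.
    void addListeners(String... extraListeners) {
        listeners.addAll(Arrays.asList(extraListeners));
    }

    // Returns a copy so callers cannot modify the internal list.
    List<String> getListeners() {
        return new ArrayList<>(listeners);
    }
}
```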
-
fit
public void fit()
Description copied from interface: Model
All models have a fit method
-
update
public void update(Gradient gradient)
Description copied from interface: Model
Update layer weights and biases with gradient change
-
update
public void update(INDArray gradient, String paramType)
Description copied from interface: Model
Perform one update applying the gradient
-
score
public double score()
Description copied from interface: Model
The score for the model
-
computeGradientAndScore
public void computeGradientAndScore(LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Model
Update the score.
Specified by:
computeGradientAndScore in interface Model
-
params
public INDArray params()
Description copied from interface: Model
Parameters of the model (if any)
-
numParams
public long numParams()
Description copied from interface: Model
The number of parameters for the model
-
numParams
public long numParams(boolean backwards)
Description copied from interface: Model
The number of parameters for the model
-
setParams
public void setParams(INDArray params)
Description copied from interface: Model
Set the parameters for this model. This expects a linear ndarray, which will then be unpacked internally according to the expected ordering of the model.
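As an illustration of unpacking a linear parameter array, the sketch below copies a flat vector into separate weight and bias arrays. ParamUnpackSketch is a hypothetical class, and the "weights first, then biases" ordering is an assumption made for the example; the actual ordering depends on the model.

```java
// Hypothetical model with two parameter blocks; the weights-then-biases ordering is assumed.
final class ParamUnpackSketch {
    final double[] weights;
    final double[] biases;

    ParamUnpackSketch(int nWeights, int nBiases) {
        this.weights = new double[nWeights];
        this.biases = new double[nBiases];
    }

    // Unpacks a flat vector into the parameter blocks in the expected order.
    void setParams(double[] flat) {
        if (flat.length != weights.length + biases.length) {
            throw new IllegalArgumentException("Expected " + (weights.length + biases.length)
                    + " values but got " + flat.length);
        }
        System.arraycopy(flat, 0, weights, 0, weights.length);
        System.arraycopy(flat, weights.length, biases, 0, biases.length);
    }
}
```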
-
setParamsViewArray
public void setParamsViewArray(INDArray params)
Description copied from interface: Model
Set the initial parameters array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
Specified by:
setParamsViewArray in interface Model
Parameters:
params - a 1 x nParams row vector that is a view of the larger (MLN/CG) parameters array
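A "view" here means the layer's parameters share storage with the network-wide array, so a write through either one is visible through the other. The sketch below shows the same idea with the standard java.nio.DoubleBuffer in place of INDArray; ParamViewSketch and its view method are hypothetical names used only for illustration.

```java
import java.nio.DoubleBuffer;

// Hypothetical illustration of view semantics using java.nio buffers instead of INDArray.
final class ParamViewSketch {
    private ParamViewSketch() {}

    // Returns a view of `length` elements starting at `offset`; writes go through to `full`.
    static DoubleBuffer view(DoubleBuffer full, int offset, int length) {
        DoubleBuffer window = full.duplicate(); // shares content, independent position and limit
        window.clear();                          // resets position and limit, leaves data untouched
        window.position(offset);
        window.limit(offset + length);
        return window.slice();
    }
}
```

For example, with a six-element network array, `view(full, 2, 3)` covers elements 2 to 4, and writing to index 0 of the view changes element 2 of the network array.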
-
getGradientsViewArray
public INDArray getGradientsViewArray()
Specified by:
getGradientsViewArray in interface Model
Specified by:
getGradientsViewArray in interface Trainable
Returns:
1D gradients view array
-
setBackpropGradientsViewArray
public void setBackpropGradientsViewArray(INDArray gradients)
Description copied from interface: Model
Set the gradients array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
Specified by:
setBackpropGradientsViewArray in interface Model
Parameters:
gradients - a 1 x nParams row vector that is a view of the larger (MLN/CG) gradients array
-
fit
public void fit(INDArray data, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Model
Fit the model to the given data
-
gradient
public Gradient gradient()
Description copied from interface: Model
Get the gradient. Note that this method does not calculate the gradient; it returns the gradient that was computed previously. To calculate the gradient, see Model.computeGradientAndScore(LayerWorkspaceMgr).
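The split between computing a gradient and reading it back can be sketched as a cached-value pattern. CachedGradientSketch is a hypothetical class, and the toy objective 0.5 * sum(p^2), whose gradient is simply p, is an assumption chosen to keep the example short.

```java
// Hypothetical model that caches its last gradient; not DL4J code.
final class CachedGradientSketch {
    private double[] lastGradient; // null until computeGradient has been called
    private int computeCalls;

    // Computes and stores the gradient of 0.5 * sum(p^2), which equals p itself.
    void computeGradient(double[] params) {
        lastGradient = params.clone();
        computeCalls++;
    }

    // Returns the stored gradient without recomputing anything.
    double[] gradient() {
        return lastGradient;
    }

    int computeCalls() {
        return computeCalls;
    }
}
```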
-
gradientAndScore
public Pair<Gradient,Double> gradientAndScore()
Description copied from interface: Model
Get the gradient and score.
Specified by:
gradientAndScore in interface Model
Returns:
the gradient and score
-
batchSize
public int batchSize()
Description copied from interface: Model
The current inputs batch size
-
conf
public NeuralNetConfiguration conf()
Description copied from interface: Model
The configuration for the neural network
-
setConf
public void setConf(NeuralNetConfiguration conf)
Description copied from interface: Model
Setter for the configuration
-
input
public INDArray input()
Description copied from interface: Model
The input/feature matrix for the model
-
getOptimizer
public ConvexOptimizer getOptimizer()
Description copied from interface: Model
Returns this model's optimizer.
Specified by:
getOptimizer in interface Model
Returns:
this model's optimizer
-
getParam
public INDArray getParam(String param)
Description copied from interface: Model
Get the parameter
-
paramTable
public Map<String,INDArray> paramTable()
Description copied from interface: Model
The param table.
Specified by:
paramTable in interface Model
Returns:
the param table
-
paramTable
public Map<String,INDArray> paramTable(boolean backpropParamsOnly)
Description copied from interface: Model
Table of parameters by key, for backprop. For many models (dense layers, etc.), all parameters are backprop parameters.
Specified by:
paramTable in interface Model
Specified by:
paramTable in interface Trainable
Parameters:
backpropParamsOnly - If true, return backprop params only. If false, return all params (equivalent to paramTable())
Returns:
Parameter table
-
setParamTable
public void setParamTable(Map<String,INDArray> paramTable)
Description copied from interface: Model
Setter for the param table.
Specified by:
setParamTable in interface Model
-
setParam
public void setParam(String key, INDArray val)
Description copied from interface: Model
Set the parameter with a new ndarray
-
clear
public void clear()
Description copied from interface: Model
Clear input
-
applyConstraints
public void applyConstraints(int iteration, int epoch)
Description copied from interface: Model
Apply any constraints to the model.
Specified by:
applyConstraints in interface Model
-
init
public void init()
Description copied from interface: Model
Init the model
-
setListeners
public void setListeners(Collection<TrainingListener> listeners)
Description copied from interface: Layer
Set the TrainingListeners for this model. If any listeners have previously been set, they will be replaced by this method.
Specified by:
setListeners in interface Layer
Specified by:
setListeners in interface Model
-
setIndex
public void setIndex(int index)
Description copied from interface: Layer
Set the layer index.
-
getIndex
public int getIndex()
Description copied from interface: Layer
Get the layer index.
-
getIterationCount
public int getIterationCount()
Specified by:
getIterationCount in interface Layer
Returns:
The current iteration count (number of parameter updates) for the layer/network
-
getEpochCount
public int getEpochCount()
Specified by:
getEpochCount in interface Layer
Returns:
The current epoch count (number of training epochs passed) for the layer/network
-
setIterationCount
public void setIterationCount(int iterationCount)
Description copied from interface: Layer
Set the current iteration count (number of parameter updates) for the layer/network.
Specified by:
setIterationCount in interface Layer
-
setEpochCount
public void setEpochCount(int epochCount)
Description copied from interface: Layer
Set the current epoch count (number of epochs passed) for the layer/network.
Specified by:
setEpochCount in interface Layer
-
setInput
public void setInput(INDArray input, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Set the layer input.
-
setInputMiniBatchSize
public void setInputMiniBatchSize(int size)
Description copied from interface: Layer
Set current/last input mini-batch size. Used for score and gradient calculations. Mini batch size may be different from getInput().size(0) due to reshaping operations, for example when using RNNs with DenseLayer and OutputLayer. Called automatically during forward pass.
Specified by:
setInputMiniBatchSize in interface Layer
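The reason the mini-batch size can differ from the first dimension of the input is easiest to see with a reshape. The sketch below flattens time-series data so that every time step becomes its own row, as a feed-forward layer inside a recurrent network would need. MiniBatchShapeSketch is a hypothetical class, and the [miniBatch][timeSteps][features] array layout is an assumption made for readability, not the layout DL4J uses.

```java
// Hypothetical reshape helper; the [miniBatch][timeSteps][features] layout is assumed.
final class MiniBatchShapeSketch {
    private MiniBatchShapeSketch() {}

    // Flattens [miniBatch][timeSteps][features] into [miniBatch * timeSteps][features].
    static double[][] flattenTimeSteps(double[][][] series) {
        int miniBatch = series.length;
        int timeSteps = miniBatch == 0 ? 0 : series[0].length;
        double[][] flat = new double[miniBatch * timeSteps][];
        for (int m = 0; m < miniBatch; m++) {
            for (int t = 0; t < timeSteps; t++) {
                flat[m * timeSteps + t] = series[m][t].clone();
            }
        }
        return flat;
    }
}
```

With 2 examples of 3 time steps each, the flattened array has 6 rows, while the mini-batch size that matters for score and gradient calculations is still 2.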
-
getInputMiniBatchSize
public int getInputMiniBatchSize()
Description copied from interface: Layer
Get current/last input mini-batch size, as set by setInputMiniBatchSize(int).
Specified by:
getInputMiniBatchSize in interface Layer
See Also:
Layer.setInputMiniBatchSize(int)
-
setMaskArray
public void setMaskArray(INDArray maskArray)
Description copied from interface: Layer
Set the mask array. Note: In general, Layer.feedForwardMaskArray(INDArray, MaskState, int) should be used in preference to this.
Specified by:
setMaskArray in interface Layer
Parameters:
maskArray - Mask array to set
-
getMaskArray
public INDArray getMaskArray()
Specified by:
getMaskArray in interface Layer
-
isPretrainLayer
public boolean isPretrainLayer()
Description copied from interface: Layer
Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc).
Specified by:
isPretrainLayer in interface Layer
Returns:
true if the layer can be pretrained (using fit(INDArray)), false otherwise
-
clearNoiseWeightParams
public void clearNoiseWeightParams()
Specified by:
clearNoiseWeightParams in interface Layer
-
feedForwardMaskArray
public Pair<INDArray,MaskState> feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState, int minibatchSize)
Description copied from interface: Layer
Feed forward the input mask array, setting in the layer as appropriate. This allows different layers to handle masks differently. For example, bidirectional RNNs and normal RNNs operate differently with masks: the former set activations to 0 outside of the data present region (and keep the mask active for future layers like dense layers), whereas normal RNNs don't zero out the activations/errors, instead relying on backpropagated error arrays to handle the variable length case.
This is also used, for example, for networks that contain global pooling layers, arbitrary preprocessors, etc.
Specified by:
feedForwardMaskArray in interface Layer
Parameters:
maskArray - Mask array to set
currentMaskState - Current state of the mask - see MaskState
minibatchSize - Current minibatch size. Needs to be known as it cannot always be inferred from the activations array due to reshaping (such as a DenseLayer within a recurrent neural network)
Returns:
New mask array after this layer, along with the new mask state.
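Setting activations to 0 outside of the data present region can be sketched as an element-wise multiply by a per-time-step mask. MaskSketch is a hypothetical class, and the [timeSteps][features] layout with one mask value per time step (1.0 for data, 0.0 for padding) is an assumption for this example only.

```java
// Hypothetical masking helper for one sequence; not DL4J code.
final class MaskSketch {
    private MaskSketch() {}

    // activations: [timeSteps][features]; mask: 1.0 where data is present, 0.0 for padding.
    static double[][] applyMask(double[][] activations, double[] mask) {
        if (activations.length != mask.length) {
            throw new IllegalArgumentException("Expected one mask value per time step");
        }
        double[][] out = new double[activations.length][];
        for (int t = 0; t < activations.length; t++) {
            out[t] = activations[t].clone(); // leave the caller's array untouched
            for (int f = 0; f < out[t].length; f++) {
                out[t][f] *= mask[t];
            }
        }
        return out;
    }
}
```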
-
allowInputModification
public void allowInputModification(boolean allow)
Description copied from interface: Layer
A performance optimization: mark whether the layer is allowed to modify its input array in-place. In many cases, this is totally safe. In others, the input array will be shared by multiple layers, and hence it's not safe to modify the input array. This is usually used by ops such as dropout.
Specified by:
allowInputModification in interface Layer
Parameters:
allow - If true: the input array is safe to modify. If false: the input array should be copied before it is modified (i.e., in-place modifications are unsafe)
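The trade-off behind this flag can be shown with a simple element-wise operation that either writes into its input or copies it first. InPlaceSketch and its scale method are hypothetical, and plain scaling stands in for an op such as dropout.

```java
// Hypothetical element-wise op showing in-place versus copy-then-modify behaviour.
final class InPlaceSketch {
    private InPlaceSketch() {}

    // Scales every element; reuses the input array only when modification is allowed.
    static double[] scale(double[] input, double factor, boolean allowInputModification) {
        double[] target = allowInputModification ? input : input.clone();
        for (int i = 0; i < target.length; i++) {
            target[i] *= factor;
        }
        return target;
    }
}
```

The in-place path avoids an allocation, but any other consumer holding the same input array would see the scaled values, which is why the copy path exists.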
-
getHelper
public LayerHelper getHelper()
-
getConfig
public TrainingConfig getConfig()
-
updaterDivideByMinibatch
public boolean updaterDivideByMinibatch(String paramName)
Description copied from interface: Trainable
DL4J layers typically produce the sum of the gradients during the backward pass for each layer, and if required (if minibatch=true) then divide by the minibatch size.
However, there are some exceptions, such as the batch norm mean/variance estimate parameters: these "gradients" are actually not gradients, but are updates to be applied directly to the parameter vector. Put another way, most gradients should be divided by the minibatch to get the average; some "gradients" are actually final updates already, and should not be divided by the minibatch size.
Specified by:
updaterDivideByMinibatch in interface Trainable
Parameters:
paramName - Name of the parameter
Returns:
True if gradients should be divided by minibatch (most params); false otherwise (edge cases like batch norm mean/variance estimates)
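The effect of the returned flag can be sketched as a single conditional division. MinibatchDivideSketch and prepareUpdate are hypothetical names; the sketch only illustrates the rule described above and is not the DL4J updater code.

```java
// Hypothetical helper: averages summed gradients unless the values are already final updates.
final class MinibatchDivideSketch {
    private MinibatchDivideSketch() {}

    static double[] prepareUpdate(double[] summed, int miniBatchSize, boolean divideByMinibatch) {
        double[] out = summed.clone();
        if (divideByMinibatch) {
            // Ordinary gradients: convert the sum over examples into an average.
            for (int i = 0; i < out.length; i++) {
                out[i] /= miniBatchSize;
            }
        }
        // Otherwise the values are treated as final updates and passed through unchanged.
        return out;
    }
}
```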
-
-