Class BidirectionalLayer
- java.lang.Object
-
- org.deeplearning4j.nn.layers.recurrent.BidirectionalLayer
-
- All Implemented Interfaces:
Serializable, Cloneable, Layer, RecurrentLayer, Model, Trainable
public class BidirectionalLayer extends Object implements RecurrentLayer
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.deeplearning4j.nn.api.Layer
Layer.TrainingMode, Layer.Type
-
-
Constructor Summary
BidirectionalLayer(@NonNull NeuralNetConfiguration conf, @NonNull Layer fwd, @NonNull Layer bwd, @NonNull INDArray paramsView)
-
Method Summary
All methods are concrete instance methods.
- INDArray activate(boolean training, LayerWorkspaceMgr workspaceMgr): Perform forward pass and return the activations array with the last set input
- INDArray activate(INDArray input, boolean training, LayerWorkspaceMgr workspaceMgr): Perform forward pass and return the activations array with the specified input
- void addListeners(TrainingListener... listener): Adds additional TrainingListeners to the existing listeners
- void allowInputModification(boolean allow): A performance optimization: mark whether the layer is allowed to modify its input array in-place
- void applyConstraints(int iteration, int epoch): Apply any constraints to the model
- Pair<Gradient,INDArray> backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr): Calculate the gradient relative to the error in the next layer
- int batchSize(): The current input's batch size
- double calcRegularizationScore(boolean backpropParamsOnly): Calculate the regularization component of the score for the parameters in this layer, for example the L1, L2 and/or weight decay components of the loss function
- void clear(): Clear input
- void clearNoiseWeightParams()
- void close()
- void computeGradientAndScore(LayerWorkspaceMgr workspaceMgr): Update the score
- NeuralNetConfiguration conf(): The configuration for the neural network
- Pair<INDArray,MaskState> feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState, int minibatchSize): Feed forward the input mask array, setting it in the layer as appropriate
- void fit(): All models have a fit method
- void fit(INDArray data, LayerWorkspaceMgr workspaceMgr): Fit the model to the given data
- TrainingConfig getConfig()
- int getEpochCount()
- INDArray getGradientsViewArray()
- LayerHelper getHelper()
- int getIndex(): Get the layer index
- int getInputMiniBatchSize(): Get the current/last input mini-batch size, as set by setInputMiniBatchSize(int)
- int getIterationCount()
- Collection<TrainingListener> getListeners(): Get the iteration listeners for this layer
- INDArray getMaskArray()
- ConvexOptimizer getOptimizer(): Returns this model's optimizer
- INDArray getParam(String param): Get the parameter
- Gradient gradient(): Get the gradient
- Pair<Gradient,Double> gradientAndScore(): Get the gradient and score
- void init(): Init the model
- INDArray input(): The input/feature matrix for the model
- boolean isPretrainLayer(): Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc.)
- long numParams(): The number of parameters for the model
- long numParams(boolean backwards): The number of parameters for the model
- INDArray params(): Parameters of the model (if any)
- Map<String,INDArray> paramTable(): The param table
- Map<String,INDArray> paramTable(boolean backpropParamsOnly): Table of parameters by key, for backprop; for many models (dense layers, etc.) all parameters are backprop parameters
- INDArray rnnActivateUsingStoredState(INDArray input, boolean training, boolean storeLastForTBPTT, LayerWorkspaceMgr workspaceMgr): Similar to rnnTimeStep; computes activations using the state stored in the stateMap as the initialization
- void rnnClearPreviousState(): Reset/clear the stateMap for rnnTimeStep() and the tBpttStateMap for rnnActivateUsingStoredState()
- Map<String,INDArray> rnnGetPreviousState(): Returns a shallow copy of the RNN stateMap (which contains the stored history for use in methods such as rnnTimeStep)
- Map<String,INDArray> rnnGetTBPTTState(): Get the RNN truncated backpropagation through time (TBPTT) state for the recurrent layer
- void rnnSetPreviousState(Map<String,INDArray> stateMap): Set the stateMap (stored history)
- void rnnSetTBPTTState(Map<String,INDArray> state): Set the RNN truncated backpropagation through time (TBPTT) state for the recurrent layer
- INDArray rnnTimeStep(INDArray input, LayerWorkspaceMgr workspaceMgr): Do one or more time steps using the previous time step state stored in stateMap; if stateMap is empty, default initialization (usually zeros) is used, and implementations update stateMap at the end of this method
- double score(): The score for the model
- void setBackpropGradientsViewArray(INDArray gradients): Set the gradients array as a view of the full (backprop) network parameters (intended for internal use in MultiLayerNetwork and ComputationGraph, not by users)
- void setCacheMode(CacheMode mode): Sets the given CacheMode for the current layer
- void setConf(NeuralNetConfiguration conf): Setter for the configuration
- void setEpochCount(int epochCount): Set the current epoch count (number of epochs passed) for the layer/network
- void setIndex(int index): Set the layer index
- void setInput(INDArray input, LayerWorkspaceMgr layerWorkspaceMgr): Set the layer input
- void setInputMiniBatchSize(int size): Set the current/last input mini-batch size; used for score and gradient calculations
- void setIterationCount(int iterationCount): Set the current iteration count (number of parameter updates) for the layer/network
- void setListeners(Collection<TrainingListener> listeners): Set the TrainingListeners for this model
- void setListeners(TrainingListener... listeners): Set the TrainingListeners for this model
- void setMaskArray(INDArray maskArray): Set the mask array
- void setParam(String key, INDArray val): Set the parameter with a new ndarray
- void setParams(INDArray params): Set the parameters for this model
- void setParamsViewArray(INDArray params): Set the initial parameters array as a view of the full (backprop) network parameters (intended for internal use in MultiLayerNetwork and ComputationGraph, not by users)
- void setParamTable(Map<String,INDArray> paramTable): Setter for the param table
- Pair<Gradient,INDArray> tbpttBackpropGradient(INDArray epsilon, int tbpttBackLength, LayerWorkspaceMgr workspaceMgr): Truncated BPTT equivalent of Layer.backpropGradient()
- Layer.Type type(): Returns the layer type
- void update(Gradient gradient): Update layer weights and biases with the gradient change
- void update(INDArray gradient, String paramType): Perform one update applying the gradient
- boolean updaterDivideByMinibatch(String paramName): Whether the updater should divide this parameter's gradient by the minibatch size (true for most parameters; false for edge cases such as batch norm mean/variance estimates)
-
-
-
Constructor Detail
-
BidirectionalLayer
public BidirectionalLayer(@NonNull NeuralNetConfiguration conf, @NonNull Layer fwd, @NonNull Layer bwd, @NonNull INDArray paramsView)
-
-
Method Detail
-
rnnTimeStep
public INDArray rnnTimeStep(INDArray input, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: RecurrentLayer
Do one or more time steps using the previous time step state stored in stateMap.
Can be used to efficiently do the forward pass one or n steps at a time (instead of always doing the forward pass from t=0).
If stateMap is empty, default initialization (usually zeros) is used.
Implementations also update stateMap at the end of this method.
- Specified by:
rnnTimeStep in interface RecurrentLayer
- Parameters:
input - Input to this layer
- Returns:
- activations
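The contract above (state carried between calls in a stateMap, zeros when the map is empty, state updated at the end of each call) can be sketched with a toy recurrent cell in plain Java. This is not DL4J code: `SimpleRnnCell`, its field names, and the recurrence out = 0.5*h + x are hypothetical, chosen only to make the statefulness visible.

```java
import java.util.HashMap;
import java.util.Map;

// Toy analogue of RecurrentLayer.rnnTimeStep: hidden state is kept in a
// stateMap between calls, initialized to zeros when the map is empty,
// and updated at the end of every call.
class SimpleRnnCell {
    private final Map<String, double[]> stateMap = new HashMap<>();

    // One time step: out = 0.5 * h + x (a stand-in for a real cell update).
    double[] timeStep(double[] input) {
        double[] h = stateMap.getOrDefault("h", new double[input.length]); // zeros if empty
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = 0.5 * h[i] + input[i];
        }
        stateMap.put("h", out); // implementations update stateMap at the end
        return out;
    }

    // Analogue of rnnClearPreviousState(): back to default (zero) initialization.
    void clearState() { stateMap.clear(); }
}
```

Because the state persists, calling timeStep twice over inputs x0 then x1 gives the same result as a single forward pass over the two-step sequence from t=0.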
-
rnnGetPreviousState
public Map<String,INDArray> rnnGetPreviousState()
Description copied from interface: RecurrentLayer
Returns a shallow copy of the RNN stateMap (which contains the stored history for use in methods such as rnnTimeStep).
- Specified by:
rnnGetPreviousState in interface RecurrentLayer
-
rnnSetPreviousState
public void rnnSetPreviousState(Map<String,INDArray> stateMap)
Description copied from interface: RecurrentLayer
Set the stateMap (stored history). Values set using this method will be used in the next call to rnnTimeStep().
- Specified by:
rnnSetPreviousState in interface RecurrentLayer
-
rnnClearPreviousState
public void rnnClearPreviousState()
Description copied from interface: RecurrentLayer
Reset/clear the stateMap for rnnTimeStep() and the tBpttStateMap for rnnActivateUsingStoredState().
- Specified by:
rnnClearPreviousState in interface RecurrentLayer
-
rnnActivateUsingStoredState
public INDArray rnnActivateUsingStoredState(INDArray input, boolean training, boolean storeLastForTBPTT, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: RecurrentLayer
Similar to rnnTimeStep, this method computes activations using the state stored in the stateMap as the initialization. However, unlike rnnTimeStep, this method does not alter the stateMap; therefore, multiple calls to this method (with identical input) will:
(a) result in the same output
(b) leave the state maps (both stateMap and tBpttStateMap) in an identical state
- Specified by:
rnnActivateUsingStoredState in interface RecurrentLayer
- Parameters:
input - Layer input
training - If true: training mode. Otherwise: test mode
storeLastForTBPTT - If true: store the final state in tBpttStateMap for use in truncated BPTT training
- Returns:
- Layer activations
-
rnnGetTBPTTState
public Map<String,INDArray> rnnGetTBPTTState()
Description copied from interface: RecurrentLayer
Get the RNN truncated backpropagation through time (TBPTT) state for the recurrent layer. The TBPTT state is used to store intermediate activations/state between parameter updates when doing TBPTT learning.
- Specified by:
rnnGetTBPTTState in interface RecurrentLayer
- Returns:
- State for the RNN layer
-
rnnSetTBPTTState
public void rnnSetTBPTTState(Map<String,INDArray> state)
Description copied from interface: RecurrentLayer
Set the RNN truncated backpropagation through time (TBPTT) state for the recurrent layer. The TBPTT state is used to store intermediate activations/state between parameter updates when doing TBPTT learning.
- Specified by:
rnnSetTBPTTState in interface RecurrentLayer
- Parameters:
state - TBPTT state to set
-
tbpttBackpropGradient
public Pair<Gradient,INDArray> tbpttBackpropGradient(INDArray epsilon, int tbpttBackLength, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: RecurrentLayer
Truncated BPTT equivalent of Layer.backpropGradient(). The primary difference is that the forward pass in the context of TBPTT is done using the stored state, rather than from zero initialization as in standard BPTT.
- Specified by:
tbpttBackpropGradient in interface RecurrentLayer
-
setCacheMode
public void setCacheMode(CacheMode mode)
Description copied from interface: Layer
This method sets the given CacheMode for the current layer.
- Specified by:
setCacheMode in interface Layer
-
calcRegularizationScore
public double calcRegularizationScore(boolean backpropParamsOnly)
Description copied from interface: Layer
Calculate the regularization component of the score for the parameters in this layer: for example, the L1, L2 and/or weight decay components of the loss function.
- Specified by:
calcRegularizationScore in interface Layer
- Parameters:
backpropParamsOnly - If true: calculate the regularization score based on backprop params only. If false: calculate based on all params (including pretrain params, if any)
- Returns:
- the regularization score
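As an arithmetic illustration of the L1 and L2 components mentioned above: for a parameter vector w, a common convention is an L1 term of l1 * sum(|w_i|) and an L2 term of 0.5 * l2 * sum(w_i^2). This is a toy sketch, not DL4J's scoring code, and the exact scaling constants are up to the library:

```java
// Generic L1/L2 regularization terms for a parameter vector.
// The 0.5 factor on the L2 term is one common convention, used here
// for illustration only.
class RegScore {
    static double l1Score(double[] w, double l1) {
        double s = 0;
        for (double v : w) s += Math.abs(v); // sum of absolute values
        return l1 * s;
    }

    static double l2Score(double[] w, double l2) {
        double s = 0;
        for (double v : w) s += v * v;       // sum of squares
        return 0.5 * l2 * s;
    }
}
```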
-
type
public Layer.Type type()
Description copied from interface: Layer
Returns the layer type
-
backpropGradient
public Pair<Gradient,INDArray> backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Calculate the gradient relative to the error in the next layer.
- Specified by:
backpropGradient in interface Layer
- Parameters:
epsilon - w^(L+1)*delta^(L+1). Or, equivalently: dC/da, i.e., (dC/dz)*(dz/da) = dC/da, where C is the cost function and a = sigma(z) is the activation.
workspaceMgr - Workspace manager
- Returns:
- Pair<Gradient,INDArray>, where Gradient is the gradient for this layer and INDArray is the epsilon (activation gradient) needed by the next layer, but before the element-wise multiply by sigmaPrime(z). So for a standard feed-forward layer, if this layer is L, then return.getSecond() == dL/dIn = (w^(L)*(delta^(L))^T)^T. Note that the returned array should be placed in the ArrayType.ACTIVATION_GRAD workspace via the workspace manager
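For a plain fully connected layer, the epsilon returned to the layer below can be written out directly: with weights stored as [nIn, nOut], the i-th component is dC/dIn_i = sum_j w[i][j] * delta[j]. A minimal sketch of just this matrix arithmetic (not DL4J code; the numbers in the usage note are made up):

```java
// Epsilon-out for a dense layer: dC/dIn = W * delta with W shaped [nIn, nOut],
// i.e. the activation gradient before the layer below multiplies
// element-wise by its own sigmaPrime(z).
class DenseBackprop {
    static double[] epsilonOut(double[][] w, double[] delta) {
        int nIn = w.length;
        int nOut = w[0].length;
        double[] dIn = new double[nIn];
        for (int i = 0; i < nIn; i++) {
            for (int j = 0; j < nOut; j++) {
                dIn[i] += w[i][j] * delta[j]; // chain rule through z = W^T x
            }
        }
        return dIn;
    }
}
```

For example, with w = {{1,2},{3,4}} and delta = {1,1}, the epsilon passed down is {3, 7}.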
-
activate
public INDArray activate(boolean training, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Perform forward pass and return the activations array with the last set input.
- Specified by:
activate in interface Layer
- Parameters:
training - training or test mode
workspaceMgr - Workspace manager
- Returns:
- the activation (layer output) of the last specified input. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager
-
activate
public INDArray activate(INDArray input, boolean training, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Perform forward pass and return the activations array with the specified input.
- Specified by:
activate in interface Layer
- Parameters:
input - the input to use
training - train or test mode
workspaceMgr - Workspace manager
- Returns:
- Activations array. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager
-
getListeners
public Collection<TrainingListener> getListeners()
Description copied from interface: Layer
Get the iteration listeners for this layer.
- Specified by:
getListeners in interface Layer
-
setListeners
public void setListeners(TrainingListener... listeners)
Description copied from interface: Layer
Set the TrainingListeners for this model. If any listeners have previously been set, they will be replaced by this method.
- Specified by:
setListeners in interface Layer
- Specified by:
setListeners in interface Model
-
addListeners
public void addListeners(TrainingListener... listener)
Description copied from interface: Model
This method adds additional TrainingListeners to the existing listeners.
- Specified by:
addListeners in interface Model
-
fit
public void fit()
Description copied from interface: Model
All models have a fit method
-
update
public void update(Gradient gradient)
Description copied from interface: Model
Update layer weights and biases with gradient change
-
update
public void update(INDArray gradient, String paramType)
Description copied from interface: Model
Perform one update applying the gradient
-
score
public double score()
Description copied from interface: Model
The score for the model
-
computeGradientAndScore
public void computeGradientAndScore(LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Model
Update the score.
- Specified by:
computeGradientAndScore in interface Model
-
params
public INDArray params()
Description copied from interface: Model
Parameters of the model (if any)
-
getConfig
public TrainingConfig getConfig()
-
numParams
public long numParams()
Description copied from interface: Model
The number of parameters for the model
-
numParams
public long numParams(boolean backwards)
Description copied from interface: Model
The number of parameters for the model
-
setParams
public void setParams(INDArray params)
Description copied from interface: Model
Set the parameters for this model. This expects a linear ndarray, which is then unpacked internally relative to the expected ordering of the model.
-
setParamsViewArray
public void setParamsViewArray(INDArray params)
Description copied from interface: Model
Set the initial parameters array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
- Specified by:
setParamsViewArray in interface Model
- Parameters:
params - a 1 x nParams row vector that is a view of the larger (MLN/CG) parameters array
-
getGradientsViewArray
public INDArray getGradientsViewArray()
- Specified by:
getGradientsViewArray in interface Model
- Specified by:
getGradientsViewArray in interface Trainable
- Returns:
- 1D gradients view array
-
setBackpropGradientsViewArray
public void setBackpropGradientsViewArray(INDArray gradients)
Description copied from interface: Model
Set the gradients array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
- Specified by:
setBackpropGradientsViewArray in interface Model
- Parameters:
gradients - a 1 x nParams row vector that is a view of the larger (MLN/CG) gradients array
-
fit
public void fit(INDArray data, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Model
Fit the model to the given data
-
gradient
public Gradient gradient()
Description copied from interface: Model
Get the gradient. Note that this method will not calculate the gradient; it will rather return the gradient that has been computed previously. For calculating the gradient, see Model.computeGradientAndScore(LayerWorkspaceMgr).
-
gradientAndScore
public Pair<Gradient,Double> gradientAndScore()
Description copied from interface: Model
Get the gradient and score.
- Specified by:
gradientAndScore in interface Model
- Returns:
- the gradient and score
-
batchSize
public int batchSize()
Description copied from interface: Model
The current input's batch size
-
conf
public NeuralNetConfiguration conf()
Description copied from interface: Model
The configuration for the neural network
-
setConf
public void setConf(NeuralNetConfiguration conf)
Description copied from interface: Model
Setter for the configuration
-
input
public INDArray input()
Description copied from interface: Model
The input/feature matrix for the model
-
getOptimizer
public ConvexOptimizer getOptimizer()
Description copied from interface: Model
Returns this model's optimizer.
- Specified by:
getOptimizer in interface Model
- Returns:
- this model's optimizer
-
getParam
public INDArray getParam(String param)
Description copied from interface: Model
Get the parameter
-
paramTable
public Map<String,INDArray> paramTable()
Description copied from interface: Model
The param table
- Specified by:
paramTable in interface Model
-
paramTable
public Map<String,INDArray> paramTable(boolean backpropParamsOnly)
Description copied from interface: Model
Table of parameters by key, for backprop. For many models (dense layers, etc.), all parameters are backprop parameters.
- Specified by:
paramTable in interface Model
- Specified by:
paramTable in interface Trainable
- Parameters:
backpropParamsOnly - If true, return backprop params only. If false: return all params (equivalent to paramTable())
- Returns:
- Parameter table
-
updaterDivideByMinibatch
public boolean updaterDivideByMinibatch(String paramName)
Description copied from interface: Trainable
DL4J layers typically produce the sum of the gradients during the backward pass for each layer, and if required (if minibatch=true) then divide by the minibatch size. However, there are some exceptions, such as the batch norm mean/variance estimate parameters: these "gradients" are actually not gradients, but updates to be applied directly to the parameter vector. Put another way, most gradients should be divided by the minibatch size to get the average; some "gradients" are actually final updates already, and should not be divided by the minibatch size.
- Specified by:
updaterDivideByMinibatch in interface Trainable
- Parameters:
paramName - Name of the parameter
- Returns:
- True if gradients should be divided by the minibatch size (most params); false otherwise (edge cases like batch norm mean/variance estimates)
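The distinction above is easy to state in code: a summed gradient is averaged over the minibatch before being applied, while a "direct update" parameter (such as a batch norm running mean) is applied as-is. A toy sketch of the scaling decision, not the DL4J updater itself:

```java
// Illustrates updaterDivideByMinibatch semantics: what value the updater
// should actually apply for a given parameter.
class MinibatchScaling {
    static double applyScaling(double summedGradient, int minibatchSize,
                               boolean divideByMinibatch) {
        // Most parameters: the gradient was summed over the minibatch during
        // the backward pass, so divide to get the average gradient.
        if (divideByMinibatch) {
            return summedGradient / minibatchSize;
        }
        // Edge cases (e.g. batch norm mean/variance estimates): the
        // "gradient" is already a final update, applied directly.
        return summedGradient;
    }
}
```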
-
setParamTable
public void setParamTable(Map<String,INDArray> paramTable)
Description copied from interface: Model
Setter for the param table.
- Specified by:
setParamTable in interface Model
-
setParam
public void setParam(String key, INDArray val)
Description copied from interface: Model
Set the parameter with a new ndarray
-
clear
public void clear()
Description copied from interface: Model
Clear input
-
applyConstraints
public void applyConstraints(int iteration, int epoch)
Description copied from interface: Model
Apply any constraints to the model.
- Specified by:
applyConstraints in interface Model
-
init
public void init()
Description copied from interface: Model
Init the model
-
setListeners
public void setListeners(Collection<TrainingListener> listeners)
Description copied from interface: Layer
Set the TrainingListeners for this model. If any listeners have previously been set, they will be replaced by this method.
- Specified by:
setListeners in interface Layer
- Specified by:
setListeners in interface Model
-
setIndex
public void setIndex(int index)
Description copied from interface: Layer
Set the layer index.
-
getIndex
public int getIndex()
Description copied from interface: Layer
Get the layer index.
-
getIterationCount
public int getIterationCount()
- Specified by:
getIterationCount in interface Layer
- Returns:
- The current iteration count (number of parameter updates) for the layer/network
-
getEpochCount
public int getEpochCount()
- Specified by:
getEpochCount in interface Layer
- Returns:
- The current epoch count (number of training epochs passed) for the layer/network
-
setIterationCount
public void setIterationCount(int iterationCount)
Description copied from interface: Layer
Set the current iteration count (number of parameter updates) for the layer/network.
- Specified by:
setIterationCount in interface Layer
-
setEpochCount
public void setEpochCount(int epochCount)
Description copied from interface: Layer
Set the current epoch count (number of epochs passed) for the layer/network.
- Specified by:
setEpochCount in interface Layer
-
setInput
public void setInput(INDArray input, LayerWorkspaceMgr layerWorkspaceMgr)
Description copied from interface: Layer
Set the layer input.
-
setInputMiniBatchSize
public void setInputMiniBatchSize(int size)
Description copied from interface: Layer
Set the current/last input mini-batch size.
Used for score and gradient calculations. The mini-batch size may be different from getInput().size(0) due to reshaping operations, for example when using RNNs with DenseLayer and OutputLayer. Called automatically during the forward pass.
- Specified by:
setInputMiniBatchSize in interface Layer
-
getInputMiniBatchSize
public int getInputMiniBatchSize()
Description copied from interface: Layer
Get the current/last input mini-batch size, as set by setInputMiniBatchSize(int).
- Specified by:
getInputMiniBatchSize in interface Layer
- See Also:
Layer.setInputMiniBatchSize(int)
-
setMaskArray
public void setMaskArray(INDArray maskArray)
Description copied from interface: Layer
Set the mask array. Note: in general, Layer.feedForwardMaskArray(INDArray, MaskState, int) should be used in preference to this.
- Specified by:
setMaskArray in interface Layer
- Parameters:
maskArray
- Mask array to set
-
getMaskArray
public INDArray getMaskArray()
- Specified by:
getMaskArray in interface Layer
-
isPretrainLayer
public boolean isPretrainLayer()
Description copied from interface: Layer
Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc.).
- Specified by:
isPretrainLayer in interface Layer
- Returns:
- true if the layer can be pretrained (using fit(INDArray)), false otherwise
-
clearNoiseWeightParams
public void clearNoiseWeightParams()
- Specified by:
clearNoiseWeightParams in interface Layer
-
allowInputModification
public void allowInputModification(boolean allow)
Description copied from interface: Layer
A performance optimization: mark whether the layer is allowed to modify its input array in-place. In many cases this is totally safe; in others, the input array will be shared by multiple layers, and hence it is not safe to modify the input array. This is usually used by ops such as dropout.
- Specified by:
allowInputModification in interface Layer
- Parameters:
allow - If true: the input array is safe to modify. If false: the input array should be copied before it is modified (i.e., in-place modifications are unsafe)
-
feedForwardMaskArray
public Pair<INDArray,MaskState> feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState, int minibatchSize)
Description copied from interface: Layer
Feed forward the input mask array, setting it in the layer as appropriate. This allows different layers to handle masks differently; for example, bidirectional RNNs and normal RNNs operate differently with masks: the former set activations to 0 outside of the data-present region (and keep the mask active for future layers, such as dense layers), whereas normal RNNs don't zero out the activations/errors, instead relying on backpropagated error arrays to handle the variable-length case.
This is also used, for example, for networks that contain global pooling layers, arbitrary preprocessors, etc.
- Specified by:
feedForwardMaskArray in interface Layer
- Parameters:
maskArray - Mask array to set
currentMaskState - Current state of the mask; see MaskState
minibatchSize - Current minibatch size. Needs to be known, as it cannot always be inferred from the activations array due to reshaping (such as a DenseLayer within a recurrent neural network)
- Returns:
- New mask array after this layer, along with the new mask state.
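The bidirectional-RNN behaviour described above (zero the activations outside the data-present region, and pass the mask on so later layers still see it) can be sketched over a simple [minibatch][time] layout in plain Java. This is a toy illustration, not the DL4J implementation, and the array layout is a simplification:

```java
// Toy mask feed-forward: zero activations at masked-out time steps and
// return the mask unchanged so subsequent layers (e.g. dense layers)
// can still use it.
class MaskSketch {
    // activations: [minibatch][time]; mask: [minibatch][time], 1.0 = data present.
    static double[][] applyMask(double[][] activations, double[][] mask) {
        for (int i = 0; i < activations.length; i++) {
            for (int t = 0; t < activations[i].length; t++) {
                if (mask[i][t] == 0.0) {
                    activations[i][t] = 0.0; // outside the data-present region
                }
            }
        }
        return mask; // mask stays active for subsequent layers
    }
}
```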
-
getHelper
public LayerHelper getHelper()
-
-