Class LSTM

java.lang.Object
  org.deeplearning4j.nn.layers.AbstractLayer<LayerConfT>
    org.deeplearning4j.nn.layers.BaseLayer<LayerConfT>
      org.deeplearning4j.nn.layers.recurrent.BaseRecurrentLayer<LSTM>
        org.deeplearning4j.nn.layers.recurrent.LSTM

All Implemented Interfaces:
Serializable, Cloneable, Layer, RecurrentLayer, Model, Trainable

public class LSTM extends BaseRecurrentLayer<LSTM>

See Also:
Serialized Form

Nested Class Summary
-
Nested classes/interfaces inherited from interface org.deeplearning4j.nn.api.Layer
Layer.TrainingMode, Layer.Type
-
Field Summary

Modifier and Type         Field
protected FwdPassReturn   cachedFwdPass
static String             CUDNN_LSTM_CLASS_NAME
protected LSTMHelper      helper
static String             STATE_KEY_PREV_ACTIVATION
static String             STATE_KEY_PREV_MEMCELL
-
Fields inherited from class org.deeplearning4j.nn.layers.recurrent.BaseRecurrentLayer
helperCountFail, stateMap, tBpttStateMap
-
Fields inherited from class org.deeplearning4j.nn.layers.BaseLayer
gradient, gradientsFlattened, gradientViews, optimizer, params, paramsFlattened, score, solver, weightNoiseParams
-
Fields inherited from class org.deeplearning4j.nn.layers.AbstractLayer
cacheMode, conf, dataType, dropoutApplied, epochCount, index, input, inputModificationAllowed, iterationCount, maskArray, maskState, preOutput, trainingListeners
-
Constructor Summary

Constructor
LSTM(NeuralNetConfiguration conf, DataType dataType)
-
Method Summary

Modifier and Type         Method and Description
INDArray                  activate(boolean training, LayerWorkspaceMgr workspaceMgr)
                          Perform forward pass and return the activations array with the last set input.
INDArray                  activate(INDArray input, boolean training, LayerWorkspaceMgr workspaceMgr)
                          Perform forward pass and return the activations array with the specified input.
Pair<Gradient,INDArray>   backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)
                          Calculate the gradient relative to the error in the next layer.
Pair<INDArray,MaskState>  feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState, int minibatchSize)
                          Feed forward the input mask array, setting it in the layer as appropriate.
LayerHelper               getHelper()
Gradient                  gradient()
                          Get the gradient.
boolean                   isPretrainLayer()
                          Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc.).
INDArray                  rnnActivateUsingStoredState(INDArray input, boolean training, boolean storeLastForTBPTT, LayerWorkspaceMgr workspaceMgr)
                          Similar to rnnTimeStep, this method is used for activations using the state stored in the stateMap as the initialization.
INDArray                  rnnTimeStep(INDArray input, LayerWorkspaceMgr workspaceMgr)
                          Do one or more time steps using the previous time step state stored in stateMap.
Pair<Gradient,INDArray>   tbpttBackpropGradient(INDArray epsilon, int tbpttBackwardLength, LayerWorkspaceMgr workspaceMgr)
                          Truncated BPTT equivalent of Layer.backpropGradient().
Layer.Type                type()
                          Returns the layer type.
-
Methods inherited from class org.deeplearning4j.nn.layers.recurrent.BaseRecurrentLayer
getDataFormat, permuteIfNWC, rnnClearPreviousState, rnnGetPreviousState, rnnGetTBPTTState, rnnSetPreviousState, rnnSetTBPTTState
-
Methods inherited from class org.deeplearning4j.nn.layers.BaseLayer
calcRegularizationScore, clear, clearNoiseWeightParams, clone, computeGradientAndScore, fit, fit, getGradientsViewArray, getOptimizer, getParam, getParamWithNoise, hasBias, hasLayerNorm, layerConf, numParams, params, paramTable, paramTable, preOutput, preOutputWithPreNorm, score, setBackpropGradientsViewArray, setParam, setParams, setParams, setParamsViewArray, setParamTable, setScoreWithZ, toString, update, update
-
Methods inherited from class org.deeplearning4j.nn.layers.AbstractLayer
addListeners, allowInputModification, applyConstraints, applyDropOutIfNecessary, applyMask, assertInputSet, backpropDropOutIfPresent, batchSize, close, conf, getConfig, getEpochCount, getIndex, getInput, getInputMiniBatchSize, getListeners, getMaskArray, gradientAndScore, init, input, layerId, numParams, setCacheMode, setConf, setEpochCount, setIndex, setInput, setInputMiniBatchSize, setListeners, setListeners, setMaskArray, updaterDivideByMinibatch
-
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface org.deeplearning4j.nn.api.Layer
allowInputModification, calcRegularizationScore, clearNoiseWeightParams, getEpochCount, getIndex, getInputMiniBatchSize, getIterationCount, getListeners, getMaskArray, setCacheMode, setEpochCount, setIndex, setInput, setInputMiniBatchSize, setIterationCount, setListeners, setListeners, setMaskArray
-
Methods inherited from interface org.deeplearning4j.nn.api.Model
addListeners, applyConstraints, batchSize, clear, close, computeGradientAndScore, conf, fit, fit, getGradientsViewArray, getOptimizer, getParam, gradientAndScore, init, input, numParams, numParams, params, paramTable, paramTable, score, setBackpropGradientsViewArray, setConf, setParam, setParams, setParamsViewArray, setParamTable, update, update
-
Methods inherited from interface org.deeplearning4j.nn.api.Trainable
getConfig, getGradientsViewArray, numParams, params, paramTable, updaterDivideByMinibatch
-
Field Detail
-
STATE_KEY_PREV_ACTIVATION
public static final String STATE_KEY_PREV_ACTIVATION
- See Also:
- Constant Field Values
-
STATE_KEY_PREV_MEMCELL
public static final String STATE_KEY_PREV_MEMCELL
- See Also:
- Constant Field Values
-
helper
protected LSTMHelper helper
-
cachedFwdPass
protected FwdPassReturn cachedFwdPass
-
CUDNN_LSTM_CLASS_NAME
public static final String CUDNN_LSTM_CLASS_NAME
- See Also:
- Constant Field Values
-
Constructor Detail
-
LSTM
public LSTM(NeuralNetConfiguration conf, DataType dataType)
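
This implementation class is normally created by the framework from a layer configuration rather than constructed directly. A minimal sketch of the usual route, via the org.deeplearning4j.nn.conf.layers.LSTM configuration class (layer sizes and loss function are illustrative):

    import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
    import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
    import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
    import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
    import org.nd4j.linalg.activations.Activation;
    import org.nd4j.linalg.lossfunctions.LossFunctions;

    public class LstmConfigSketch {
        public static void main(String[] args) {
            MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                    .list()
                    // The framework builds org.deeplearning4j.nn.layers.recurrent.LSTM
                    // instances from this configuration layer
                    .layer(new org.deeplearning4j.nn.conf.layers.LSTM.Builder()
                            .nIn(10).nOut(20)
                            .activation(Activation.TANH)
                            .build())
                    .layer(new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                            .nIn(20).nOut(5)
                            .activation(Activation.SOFTMAX)
                            .build())
                    .build();

            MultiLayerNetwork net = new MultiLayerNetwork(conf);
            net.init();
        }
    }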
-
Method Detail
-
gradient
public Gradient gradient()
Description copied from interface: Model
Get the gradient. Note that this method will not calculate the gradient; it returns the gradient that has been computed previously. To calculate the gradient, see Model.computeGradientAndScore(LayerWorkspaceMgr).
-
backpropGradient
public Pair<Gradient,INDArray> backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Calculate the gradient relative to the error in the next layer.
Specified by:
backpropGradient in interface Layer
Overrides:
backpropGradient in class BaseLayer<LSTM>
Parameters:
epsilon - w^(L+1)*delta^(L+1). Or, equivalently: dC/da, i.e., (dC/dz)*(dz/da) = dC/da, where C is the cost function and a = sigma(z) is the activation.
workspaceMgr - Workspace manager
Returns:
Pair<Gradient,INDArray>, where Gradient is the gradient for this layer and INDArray is the epsilon (activation gradient) needed by the next layer, but before the element-wise multiply by sigmaPrime(z). So for a standard feed-forward layer, if this layer is L, then return.getSecond() == dL/dIn = (w^(L)*(delta^(L))^T)^T. Note that the returned array should be placed in the ArrayType.ACTIVATION_GRAD workspace via the workspace manager.
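
A minimal sketch of a direct call, assuming `lstm` is an initialized LSTM layer whose input has already been set; the epsilon shape and the Pair import path (which varies across ND4J versions) are assumptions:

    import org.deeplearning4j.nn.gradient.Gradient;
    import org.deeplearning4j.nn.workspace.LayerWorkspaceMgr;
    import org.nd4j.common.primitives.Pair;
    import org.nd4j.linalg.api.ndarray.INDArray;
    import org.nd4j.linalg.factory.Nd4j;

    // Activation gradient from the layer above: [miniBatchSize, nOut, timeSeriesLength]
    INDArray epsilon = Nd4j.rand(new int[]{32, 20, 50});

    // noWorkspaces() keeps the sketch simple; real training passes a managed workspace
    Pair<Gradient, INDArray> result =
            lstm.backpropGradient(epsilon, LayerWorkspaceMgr.noWorkspaces());
    Gradient layerGradient = result.getFirst();  // parameter gradients for this layer
    INDArray epsilonOut = result.getSecond();    // dL/dIn, propagated to the layer below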
-
tbpttBackpropGradient
public Pair<Gradient,INDArray> tbpttBackpropGradient(INDArray epsilon, int tbpttBackwardLength, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: RecurrentLayer
Truncated BPTT equivalent of Layer.backpropGradient(). The primary difference is that the forward pass within truncated BPTT is done using the stored state, rather than from zero initialization as in standard BPTT.
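
In practice this method is invoked by the network during truncated-BPTT training, which is enabled at configuration time. A minimal sketch of that configuration (segment lengths are illustrative):

    import org.deeplearning4j.nn.conf.BackpropType;

    // Truncated BPTT is configured on the network; tbpttBackpropGradient(...) is then
    // called internally with the configured backward length
    MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
            .list()
            // ... layers as in the constructor example ...
            .backpropType(BackpropType.TruncatedBPTT)
            .tBPTTForwardLength(100)
            .tBPTTBackwardLength(100)
            .build();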
-
activate
public INDArray activate(INDArray input, boolean training, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Perform forward pass and return the activations array with the specified input.
Specified by:
activate in interface Layer
Overrides:
activate in class AbstractLayer<LSTM>
Parameters:
input - the input to use
training - train or test mode
workspaceMgr - Workspace manager
Returns:
Activations array. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager.
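
A minimal sketch, assuming `lstm` is an initialized LSTM layer; recurrent layers take rank-3 input of shape [miniBatchSize, nIn, timeSeriesLength]:

    // Rank-3 recurrent input: [miniBatchSize, nIn, timeSeriesLength]
    INDArray input = Nd4j.rand(new int[]{32, 10, 50});
    INDArray activations = lstm.activate(input, false, LayerWorkspaceMgr.noWorkspaces());
    // activations shape: [miniBatchSize, nOut, timeSeriesLength]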
-
activate
public INDArray activate(boolean training, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: Layer
Perform forward pass and return the activations array with the last set input.
Specified by:
activate in interface Layer
Overrides:
activate in class BaseLayer<LSTM>
Parameters:
training - training or test mode
workspaceMgr - Workspace manager
Returns:
The activation (layer output) for the last input set on the layer. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager.
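
A minimal sketch of the two-step variant, assuming the same `lstm` and `input` as above; setInput stores the input, and activate then uses it:

    lstm.setInput(input, LayerWorkspaceMgr.noWorkspaces());  // store the input first
    INDArray out = lstm.activate(false, LayerWorkspaceMgr.noWorkspaces());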
-
type
public Layer.Type type()
Description copied from interface: Layer
Returns the layer type.
Specified by:
type in interface Layer
Overrides:
type in class AbstractLayer<LSTM>
Returns:
The layer type
-
isPretrainLayer
public boolean isPretrainLayer()
Description copied from interface: Layer
Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc.).
Returns:
true if the layer can be pretrained (using fit(INDArray)), false otherwise
-
feedForwardMaskArray
public Pair<INDArray,MaskState> feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState, int minibatchSize)
Description copied from interface: Layer
Feed forward the input mask array, setting it in the layer as appropriate. This allows different layers to handle masks differently: for example, bidirectional RNNs and normal RNNs operate differently with masks. The former set activations to 0 outside of the data-present region (and keep the mask active for later layers such as dense layers), whereas normal RNNs don't zero out the activations/errors; instead they rely on backpropagated error arrays to handle the variable-length case. This is also used, for example, in networks that contain global pooling layers, arbitrary preprocessors, etc.
Specified by:
feedForwardMaskArray in interface Layer
Overrides:
feedForwardMaskArray in class AbstractLayer<LSTM>
Parameters:
maskArray - Mask array to set
currentMaskState - Current state of the mask; see MaskState
minibatchSize - Current minibatch size. This needs to be known, as it cannot always be inferred from the activations array due to reshaping (such as a DenseLayer within a recurrent neural network)
Returns:
New mask array after this layer, along with the new mask state.
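
A minimal sketch of the mask shape this method consumes, assuming `lstm` is an initialized LSTM layer; recurrent mask arrays are 2D, [miniBatchSize, timeSeriesLength], with 1.0 marking present time steps and 0.0 marking padding:

    import org.deeplearning4j.nn.api.MaskState;

    // Two examples, five time steps; the second example has only three valid steps
    INDArray mask = Nd4j.create(new float[][]{
            {1, 1, 1, 1, 1},
            {1, 1, 1, 0, 0}});
    Pair<INDArray, MaskState> maskOut = lstm.feedForwardMaskArray(mask, MaskState.Active, 2);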
-
rnnTimeStep
public INDArray rnnTimeStep(INDArray input, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: RecurrentLayer
Do one or more time steps using the previous time step state stored in stateMap.
Can be used to efficiently do forward pass one or n steps at a time (instead of always doing the forward pass from t=0).
If stateMap is empty, default initialization (usually zeros) is used.
Implementations also update stateMap at the end of this method.
Parameters:
input - Input to this layer
Returns:
Activations
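
A minimal sketch of step-by-step inference, assuming `lstm` is an initialized LSTM layer; each call consumes a single time step of shape [miniBatchSize, nIn, 1], and the stored state carries across calls:

    lstm.rnnClearPreviousState();  // start from the default (zero) state
    for (int t = 0; t < 50; t++) {
        INDArray stepInput = Nd4j.rand(new int[]{32, 10, 1});  // one time step
        INDArray stepOutput = lstm.rnnTimeStep(stepInput, LayerWorkspaceMgr.noWorkspaces());
        // stateMap now holds the previous activation and memory cell for the next call
    }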
-
rnnActivateUsingStoredState
public INDArray rnnActivateUsingStoredState(INDArray input, boolean training, boolean storeLastForTBPTT, LayerWorkspaceMgr workspaceMgr)
Description copied from interface: RecurrentLayer
Similar to rnnTimeStep, this method is used for activations using the state stored in the stateMap as the initialization. However, unlike rnnTimeStep this method does not alter the stateMap; therefore, multiple calls to this method (with identical input) will:
(a) result in the same output
(b) leave the state maps (both stateMap and tBpttStateMap) in an identical state
Parameters:
input - Layer input
training - If true: training mode. Otherwise: test mode
storeLastForTBPTT - If true: store the final state in tBpttStateMap for use in truncated BPTT training
Returns:
Layer activations
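
A minimal sketch, assuming `lstm` is an initialized LSTM layer; with storeLastForTBPTT = false, repeated calls with the same input return identical activations and leave both state maps untouched:

    INDArray seq = Nd4j.rand(new int[]{32, 10, 50});
    INDArray out1 = lstm.rnnActivateUsingStoredState(seq, false, false, LayerWorkspaceMgr.noWorkspaces());
    INDArray out2 = lstm.rnnActivateUsingStoredState(seq, false, false, LayerWorkspaceMgr.noWorkspaces());
    // out1 and out2 are identical; stateMap and tBpttStateMap are unchanged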
-
getHelper
public LayerHelper getHelper()
Specified by:
getHelper in interface Layer
Overrides:
getHelper in class AbstractLayer<LSTM>
Returns:
The layer helper, if any