LSTM (deeplearning4j-nn 1.0.0-beta7 API)

java.lang.Object
- org.deeplearning4j.nn.layers.AbstractLayer<LayerConfT>
  - org.deeplearning4j.nn.layers.BaseLayer<LayerConfT>
    - org.deeplearning4j.nn.layers.recurrent.BaseRecurrentLayer<LSTM>
      - org.deeplearning4j.nn.layers.recurrent.LSTM

All Implemented Interfaces:

Serializable, Cloneable, Layer, RecurrentLayer, Model, Trainable
```
public class LSTM
extends BaseRecurrentLayer<LSTM>
```
LSTM layer implementation. For the full, vectorized equations and a comparison with other LSTM variants, see Greff et al. 2015, "LSTM: A Search Space Odyssey", p. 11 (https://arxiv.org/pdf/1503.04069.pdf). This class implements the "no peephole" variant described in that paper.
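
For orientation, the forward pass of a no-peephole LSTM can be sketched as below. The notation follows the usual convention of the cited paper and is an assumption here, not a set of DL4J identifiers: x^t is the input at time t, y^t the layer output, c^t the memory cell, W and R the input and recurrent weights, b the biases, sigma the logistic sigmoid, and g and h the block-input and output activations (commonly tanh). How this class actually lays out its parameters is not stated on this page.

```latex
\begin{aligned}
z^t &= g\left(W_z x^t + R_z y^{t-1} + b_z\right) && \text{block input} \\
i^t &= \sigma\left(W_i x^t + R_i y^{t-1} + b_i\right) && \text{input gate} \\
f^t &= \sigma\left(W_f x^t + R_f y^{t-1} + b_f\right) && \text{forget gate} \\
c^t &= z^t \odot i^t + c^{t-1} \odot f^t && \text{memory cell} \\
o^t &= \sigma\left(W_o x^t + R_o y^{t-1} + b_o\right) && \text{output gate} \\
y^t &= h\left(c^t\right) \odot o^t && \text{layer output}
\end{aligned}
```

The peephole variant (see GravesLSTM) would add cell-state terms inside the three gate expressions; they are absent here.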

Author:

Alex Black

See Also:

GravesLSTM class (the version with peephole connections), Serialized Form

Nested Class Summary
- Nested classes/interfaces inherited from interface org.deeplearning4j.nn.api.Layer
  Layer.TrainingMode, Layer.Type

Field Summary

Fields
Modifier and Type	Field and Description
`protected FwdPassReturn`	`cachedFwdPass`
`protected LSTMHelper`	`helper`
`static String`	`STATE_KEY_PREV_ACTIVATION`
`static String`	`STATE_KEY_PREV_MEMCELL`

Fields inherited from class org.deeplearning4j.nn.layers.recurrent.BaseRecurrentLayer
helperCountFail, stateMap, tBpttStateMap

Fields inherited from class org.deeplearning4j.nn.layers.BaseLayer
gradient, gradientsFlattened, gradientViews, optimizer, params, paramsFlattened, score, solver, weightNoiseParams

Fields inherited from class org.deeplearning4j.nn.layers.AbstractLayer
cacheMode, conf, dataType, dropoutApplied, epochCount, index, input, inputModificationAllowed, iterationCount, maskArray, maskState, preOutput, trainingListeners

Constructor Summary

Constructors
Constructor and Description

`LSTM(NeuralNetConfiguration conf, DataType dataType)`

Method Summary

Modifier and Type	Method and Description
`INDArray`	`activate(boolean training, LayerWorkspaceMgr workspaceMgr)` Perform forward pass and return the activations array with the last set input
`INDArray`	`activate(INDArray input, boolean training, LayerWorkspaceMgr workspaceMgr)` Perform forward pass and return the activations array with the specified input
`Pair<Gradient,INDArray>`	`backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)` Calculate the gradient relative to the error in the next layer
`Pair<INDArray,MaskState>`	`feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState, int minibatchSize)` Feed forward the input mask array, setting in the layer as appropriate.
`LayerHelper`	`getHelper()`
`Gradient`	`gradient()` Get the gradient.
`boolean`	`isPretrainLayer()` Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc)
`INDArray`	`rnnActivateUsingStoredState(INDArray input, boolean training, boolean storeLastForTBPTT, LayerWorkspaceMgr workspaceMgr)` Similar to rnnTimeStep, this method is used for activations using the state stored in the stateMap as the initialization.
`INDArray`	`rnnTimeStep(INDArray input, LayerWorkspaceMgr workspaceMgr)` Do one or more time steps using the previous time step state stored in stateMap. Can be used to efficiently run the forward pass one or n steps at a time (instead of always running it from t=0). If stateMap is empty, the default initialization (usually zeros) is used. Implementations also update stateMap at the end of this method.
`Pair<Gradient,INDArray>`	`tbpttBackpropGradient(INDArray epsilon, int tbpttBackwardLength, LayerWorkspaceMgr workspaceMgr)` Truncated BPTT equivalent of Layer.backpropGradient().
`Layer.Type`	`type()` Returns the layer type

Methods inherited from class org.deeplearning4j.nn.layers.recurrent.BaseRecurrentLayer
getDataFormat, permuteIfNWC, rnnClearPreviousState, rnnGetPreviousState, rnnGetTBPTTState, rnnSetPreviousState, rnnSetTBPTTState

Methods inherited from class org.deeplearning4j.nn.layers.BaseLayer
calcRegularizationScore, clear, clearNoiseWeightParams, clone, computeGradientAndScore, fit, fit, getGradientsViewArray, getOptimizer, getParam, getParamWithNoise, hasBias, hasLayerNorm, layerConf, numParams, params, paramTable, paramTable, preOutput, preOutputWithPreNorm, score, setBackpropGradientsViewArray, setParam, setParams, setParams, setParamsViewArray, setParamTable, setScoreWithZ, toString, update, update

Methods inherited from class org.deeplearning4j.nn.layers.AbstractLayer
addListeners, allowInputModification, applyConstraints, applyDropOutIfNecessary, applyMask, assertInputSet, backpropDropOutIfPresent, batchSize, close, conf, getConfig, getEpochCount, getIndex, getInput, getInputMiniBatchSize, getListeners, getMaskArray, gradientAndScore, init, input, layerId, numParams, setCacheMode, setConf, setEpochCount, setIndex, setInput, setInputMiniBatchSize, setListeners, setListeners, setMaskArray, updaterDivideByMinibatch

Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Methods inherited from interface org.deeplearning4j.nn.api.Layer
allowInputModification, calcRegularizationScore, clearNoiseWeightParams, getEpochCount, getIndex, getInputMiniBatchSize, getIterationCount, getListeners, getMaskArray, setCacheMode, setEpochCount, setIndex, setInput, setInputMiniBatchSize, setIterationCount, setListeners, setListeners, setMaskArray

Methods inherited from interface org.deeplearning4j.nn.api.Model
addListeners, applyConstraints, batchSize, clear, close, computeGradientAndScore, conf, fit, fit, getGradientsViewArray, getOptimizer, getParam, gradientAndScore, init, input, numParams, numParams, params, paramTable, paramTable, score, setBackpropGradientsViewArray, setConf, setParam, setParams, setParamsViewArray, setParamTable, update, update

Methods inherited from interface org.deeplearning4j.nn.api.Trainable
getConfig, getGradientsViewArray, numParams, params, paramTable, updaterDivideByMinibatch

- Field Detail
  - STATE_KEY_PREV_ACTIVATION
```
public static final String STATE_KEY_PREV_ACTIVATION
```
    See Also:
    
    Constant Field Values
  - STATE_KEY_PREV_MEMCELL
```
public static final String STATE_KEY_PREV_MEMCELL
```
    See Also:
    
    Constant Field Values
  - helper
```
protected LSTMHelper helper
```
  - cachedFwdPass
```
protected FwdPassReturn cachedFwdPass
```
- Constructor Detail
  - LSTM
```
public LSTM(NeuralNetConfiguration conf,
            DataType dataType)
```
- Method Detail
  - gradient
```
public Gradient gradient()
```
    Description copied from interface: Model
    
    Get the gradient. Note that this method does not calculate the gradient; it returns the gradient that was computed previously. To calculate the gradient, see Model.computeGradientAndScore(LayerWorkspaceMgr).
    
    Specified by:
    
    gradient in interface Model
    
    Overrides:
    
    gradient in class BaseLayer<LSTM>
    
    Returns:
    
    the gradient for this model, as calculated before
  - backpropGradient
```
public Pair<Gradient,INDArray> backpropGradient(INDArray epsilon,
                                                LayerWorkspaceMgr workspaceMgr)
```
    Description copied from interface: Layer
    
    Calculate the gradient relative to the error in the next layer
    
    Specified by:
    
    backpropGradient in interface Layer
    
    Overrides:
    
    backpropGradient in class BaseLayer<LSTM>
    
    Parameters:
    
    epsilon - w^(L+1)*delta^(L+1). Or, equivalently: dC/da, i.e., (dC/dz)*(dz/da) = dC/da, where C is the cost function and a=sigma(z) is the activation.
    
    workspaceMgr - Workspace manager
    
    Returns:
    
    Pair where Gradient is the gradient for this layer and INDArray is the epsilon (activation gradient) needed by the next layer, before the element-wise multiply by sigmaPrime(z). So for a standard feed-forward layer, if this layer is L, then return.getSecond() == dL/dIn = (w^(L)*(delta^(L))^T)^T. Note that the returned array should be placed in the ArrayType.ACTIVATION_GRAD workspace via the workspace manager.
  - tbpttBackpropGradient
```
public Pair<Gradient,INDArray> tbpttBackpropGradient(INDArray epsilon,
                                                     int tbpttBackwardLength,
                                                     LayerWorkspaceMgr workspaceMgr)
```
    Description copied from interface: RecurrentLayer
    
    Truncated BPTT equivalent of Layer.backpropGradient(). The primary difference lies in the forward pass: for truncated BPTT it starts from the stored state, whereas for standard BPTT it starts from zero initialization.
  - activate
```
public INDArray activate(INDArray input,
                         boolean training,
                         LayerWorkspaceMgr workspaceMgr)
```
    Description copied from interface: Layer
    
    Perform forward pass and return the activations array with the specified input
    
    Specified by:
    
    activate in interface Layer
    
    Overrides:
    
    activate in class AbstractLayer<LSTM>
    
    Parameters:
    
    input - the input to use
    
    training - train or test mode
    
    workspaceMgr - Workspace manager.
    
    Returns:
    
    Activations array. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager
  - activate
```
public INDArray activate(boolean training,
                         LayerWorkspaceMgr workspaceMgr)
```
    Description copied from interface: Layer
    
    Perform forward pass and return the activations array with the last set input
    
    Specified by:
    
    activate in interface Layer
    
    Overrides:
    
    activate in class BaseLayer<LSTM>
    
    Parameters:
    
    training - training or test mode
    
    workspaceMgr - Workspace manager
    
    Returns:
    
    the activation (layer output) of the last specified input. Note that the returned array should be placed in the ArrayType.ACTIVATIONS workspace via the workspace manager
  - type
```
public Layer.Type type()
```
    Description copied from interface: Layer
    
    Returns the layer type
    
    Specified by:
    
    type in interface Layer
    
    Overrides:
    
    type in class AbstractLayer<LSTM>
    
    Returns:
    
    the layer type
  - isPretrainLayer
```
public boolean isPretrainLayer()
```
    Description copied from interface: Layer
    
    Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc)
    
    Returns:
    
    true if the layer can be pretrained (using fit(INDArray)), false otherwise
  - feedForwardMaskArray
```
public Pair<INDArray,MaskState> feedForwardMaskArray(INDArray maskArray,
                                                     MaskState currentMaskState,
                                                     int minibatchSize)
```
    Description copied from interface: Layer
    
    Feed forward the input mask array, setting it in the layer as appropriate. This allows different layers to handle masks differently. For example, bidirectional RNNs and normal RNNs operate differently with masks: the former set activations to 0 outside of the data-present region (and keep the mask active for future layers such as dense layers), whereas normal RNNs don't zero out the activations/errors, instead relying on backpropagated error arrays to handle the variable-length case.
    This is also used, for example, for networks that contain global pooling layers, arbitrary preprocessors, etc.
    
    Specified by:
    
    feedForwardMaskArray in interface Layer
    
    Overrides:
    
    feedForwardMaskArray in class AbstractLayer<LSTM>
    
    Parameters:
    
    maskArray - Mask array to set
    
    currentMaskState - Current state of the mask - see MaskState
    
    minibatchSize - Current minibatch size. Needs to be known as it cannot always be inferred from the activations array due to reshaping (such as a DenseLayer within a recurrent neural network)
    
    Returns:
    
    New mask array after this layer, along with the new mask state.
  - rnnTimeStep
```
public INDArray rnnTimeStep(INDArray input,
                            LayerWorkspaceMgr workspaceMgr)
```
    Description copied from interface: RecurrentLayer
    
    Do one or more time steps using the previous time step state stored in stateMap.
    Can be used to efficiently run the forward pass one or n steps at a time (instead of always running it from t=0).
    If stateMap is empty, the default initialization (usually zeros) is used.
    Implementations also update stateMap at the end of this method.
    
    Parameters:
    
    input - Input to this layer
    
    Returns:
    
    activations
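    
    The stateful behaviour described above can be illustrated with a toy model in plain Java. This is only a sketch under stated assumptions: ToyStatefulLstm, its state keys, its clearState helper, and its scalar weights are all hypothetical and are not part of DL4J. It shows that feeding a sequence in two chunks gives the same outputs as one pass from t=0, because the final state of each call is stored and reused by the next.

```java
import java.util.HashMap;
import java.util.Map;

// Toy scalar LSTM cell without peepholes. NOT the DL4J implementation:
// the class name, keys, and weights are hypothetical, for illustration only.
class ToyStatefulLstm {
    static final String PREV_ACT = "prevActivation"; // hypothetical key
    static final String PREV_MEM = "prevMemCell";    // hypothetical key

    // Stands in for the layer's stateMap: carries state between calls.
    final Map<String, Double> stateMap = new HashMap<>();

    static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Processes input.length time steps, starting from the stored state
    // (zeros if stateMap is empty), then stores the final state.
    double[] rnnTimeStep(double[] input) {
        double h = stateMap.getOrDefault(PREV_ACT, 0.0);
        double c = stateMap.getOrDefault(PREV_MEM, 0.0);
        double[] out = new double[input.length];
        for (int t = 0; t < input.length; t++) {
            double z = Math.tanh(0.5 * input[t] + 0.3 * h);     // block input
            double i = sigmoid(0.4 * input[t] + 0.2 * h);       // input gate
            double f = sigmoid(0.6 * input[t] + 0.1 * h + 1.0); // forget gate
            double o = sigmoid(0.7 * input[t] + 0.5 * h);       // output gate
            c = f * c + i * z;                                  // new memory cell
            h = o * Math.tanh(c);                               // new activation
            out[t] = h;
        }
        stateMap.put(PREV_ACT, h);
        stateMap.put(PREV_MEM, c);
        return out;
    }

    // Hypothetical helper: after this, the next call starts from zeros again.
    void clearState() {
        stateMap.clear();
    }

    public static void main(String[] args) {
        ToyStatefulLstm full = new ToyStatefulLstm();
        double[] all = full.rnnTimeStep(new double[] {1.0, 2.0, 3.0, 4.0});

        ToyStatefulLstm stepped = new ToyStatefulLstm();
        stepped.rnnTimeStep(new double[] {1.0, 2.0});
        double[] tail = stepped.rnnTimeStep(new double[] {3.0, 4.0});

        // Compare the last output of the full pass with the chunked pass.
        System.out.println(Math.abs(all[3] - tail[1]) < 1e-12);
    }
}
```

    In the real layer the state holds arrays rather than scalars, and the inherited rnnClearPreviousState method resets it.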
  - rnnActivateUsingStoredState
```
public INDArray rnnActivateUsingStoredState(INDArray input,
                                            boolean training,
                                            boolean storeLastForTBPTT,
                                            LayerWorkspaceMgr workspaceMgr)
```
    Description copied from interface: RecurrentLayer
    
    Similar to rnnTimeStep, this method is used for activations using the state stored in the stateMap as the initialization. However, unlike rnnTimeStep this method does not alter the stateMap; therefore, unlike rnnTimeStep, multiple calls to this method (with identical input) will:
    (a) result in the same output
    (b) leave the state maps (both stateMap and tBpttStateMap) in an identical state
    
    Parameters:
    
    input - Layer input
    
    training - if true: training. Otherwise: test
    
    storeLastForTBPTT - If true: store the final state in tBpttStateMap for use in truncated BPTT training
    
    Returns:
    
    Layer activations
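    
    The contrast with rnnTimeStep can be sketched with a second toy model in plain Java. Everything here is an assumption made for illustration: ToyStoredStateLayer, its key, and its simple tanh recurrence are hypothetical and do not reproduce the DL4J code. The point shown is that the method reads its initial state from stateMap but never writes to it, so repeated calls with the same input give the same output.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Toy recurrent layer with two state maps. NOT the DL4J implementation:
// the class name, key, and weights are hypothetical.
class ToyStoredStateLayer {
    static final String PREV_ACT = "prevActivation"; // hypothetical key

    final Map<String, Double> stateMap = new HashMap<>();
    final Map<String, Double> tBpttStateMap = new HashMap<>();

    // Reads the initial state from stateMap but never modifies stateMap.
    double[] activateUsingStoredState(double[] input, boolean storeLastForTBPTT) {
        double h = stateMap.getOrDefault(PREV_ACT, 0.0);
        double[] out = new double[input.length];
        for (int t = 0; t < input.length; t++) {
            h = Math.tanh(0.5 * input[t] + 0.3 * h); // simple recurrent update
            out[t] = h;
        }
        if (storeLastForTBPTT) {
            // Final state is kept separately for truncated BPTT.
            tBpttStateMap.put(PREV_ACT, h);
        }
        return out;
    }

    public static void main(String[] args) {
        ToyStoredStateLayer layer = new ToyStoredStateLayer();
        layer.stateMap.put(PREV_ACT, 0.25); // pretend an earlier step left this state

        double[] first = layer.activateUsingStoredState(new double[] {1.0, 2.0}, true);
        double[] second = layer.activateUsingStoredState(new double[] {1.0, 2.0}, true);

        // Same input, same stored state: identical outputs, stateMap untouched.
        System.out.println(Arrays.equals(first, second));
        System.out.println(layer.stateMap.get(PREV_ACT));
    }
}
```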
  - getHelper
```
public LayerHelper getHelper()
```
    Specified by:
    
    getHelper in interface Layer
    
    Overrides:
    
    getHelper in class AbstractLayer<LSTM>
    
    Returns:
    
    the layer helper, if any
