Class ComputationGraph

    • Field Detail

      • initCalled

        protected boolean initCalled
      • solver

        protected transient Solver solver
      • flattenedParams

        protected INDArray flattenedParams
      • flattenedGradients

        protected transient INDArray flattenedGradients
      • score

        protected double score
      • clearTbpttState

        protected boolean clearTbpttState
      • helperWorkspaces

        protected transient Map<String,​org.bytedeco.javacpp.Pointer> helperWorkspaces
      • WS_LAYER_WORKING_MEM

        protected static final String WS_LAYER_WORKING_MEM
        Workspace for working memory for a single layer: forward pass and backward pass. Note that this is opened/closed once per op (activate/backpropGradient call).
        See Also:
        Constant Field Values
      • WS_ALL_LAYERS_ACT

        protected static final String WS_ALL_LAYERS_ACT
        Workspace for storing all layers' activations. Used only to store activations (layer inputs) as part of backprop. Not used for inference.
        See Also:
        Constant Field Values
      • WS_RNN_LOOP_WORKING_MEM

        protected static final String WS_RNN_LOOP_WORKING_MEM
        Workspace for working memory in RNNs - opened and closed once per RNN time step
        See Also:
        Constant Field Values
      • WS_OUTPUT_MEM

        protected static final String WS_OUTPUT_MEM
        Workspace for output methods that use OutputAdapter
        See Also:
        Constant Field Values
      • WS_RNN_LOOP_WORKING_MEM_CONFIG

        protected static final WorkspaceConfiguration WS_RNN_LOOP_WORKING_MEM_CONFIG
      • vertices

        protected GraphVertex[] vertices
        All GraphVertex objects in the network.
      • topologicalOrder

        protected int[] topologicalOrder
        Indexes of graph vertices, in topological order. The topological order defines the order in which forward pass (and hence also backward pass, which is the opposite to this) is conducted in the network.
      • graphIndices

        protected GraphIndices graphIndices
        Topological sort and vertex index/name + name/index mapping
      • layers

        protected Layer[] layers
        A list of layers. Each of these layers is held in a GraphVertex, but they are also stored here for easy reference. This array also defines the order in which the getLayer(int) method returns layers.
    • Method Detail

      • setLastEtlTime

        public void setLastEtlTime​(long time)
        Sets the ETL time field, which is useful for performance tracking
        Parameters:
        time -
      • getLastEtlTime

        public long getLastEtlTime()
        Returns the value of the ETL time field
        Returns:
      • setCacheMode

        public void setCacheMode​(CacheMode mode)
        This method sets specified CacheMode for all layers within network
        Parameters:
        mode -
      • getNumLayers

        public int getNumLayers()
        Returns the number of layers in the ComputationGraph
      • getLayer

        public Layer getLayer​(int idx)
        Get the layer by its number, in the range 0 to getNumLayers()-1. NOTE: This is different from the internal GraphVertex index for the layer.
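        The difference between layer numbers and vertex indices can be pictured with a small stand-alone model. This is only a sketch and does not use the DL4J API: VertexInfo, layerVertexIndices and the example graph are hypothetical names, and it assumes that layers are numbered in the order in which their vertices appear.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone model (not DL4J code): a graph holds several kinds of vertices,
// but only some of them wrap a layer.
class LayerIndexSketch {

    // Hypothetical vertex description: a name plus a flag saying whether it wraps a layer.
    static final class VertexInfo {
        final String name;
        final boolean hasLayer;

        VertexInfo(String name, boolean hasLayer) {
            this.name = name;
            this.hasLayer = hasLayer;
        }
    }

    // For each layer number 0..numLayers-1, returns the index of the vertex holding it.
    // Assumption: layers are numbered in the order their vertices appear in the array.
    static int[] layerVertexIndices(VertexInfo[] vertices) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < vertices.length; i++) {
            if (vertices[i].hasLayer) {
                result.add(i);
            }
        }
        int[] out = new int[result.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = result.get(i);
        }
        return out;
    }

    public static void main(String[] args) {
        VertexInfo[] vertices = {
            new VertexInfo("in", false),     // input vertex: no layer
            new VertexInfo("dense0", true),
            new VertexInfo("dense1", true),
            new VertexInfo("merge", false),  // merge vertex: no layer
            new VertexInfo("out", true)
        };
        int[] map = layerVertexIndices(vertices);
        for (int layer = 0; layer < map.length; layer++) {
            System.out.println("layer " + layer + " -> vertex " + map[layer]
                    + " (" + vertices[map[layer]].name + ")");
        }
    }
}
```

        In this model, layer 2 lives at vertex 4, so a layer number cannot be used where a vertex index is expected.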
      • getLayers

        public Layer[] getLayers()
        Get all layers in the ComputationGraph
      • getLayer

        public Layer getLayer​(String name)
        Get a given layer by name.
      • getVertices

        public GraphVertex[] getVertices()
        Returns an array of all GraphVertex objects.
      • getVertex

        public GraphVertex getVertex​(String name)
        Return a given GraphVertex by name, or null if no vertex with that name exists
      • getNumInputArrays

        public int getNumInputArrays()
        The number of inputs to this network
      • getNumOutputArrays

        public int getNumOutputArrays()
        The number of output (arrays) for this network
      • setInput

        public void setInput​(int inputNum,
                             INDArray input)
        Set the specified input for the ComputationGraph
      • setInputs

        public void setInputs​(INDArray... inputs)
        Set all inputs for the ComputationGraph network
      • getInput

        public INDArray getInput​(int inputNum)
        Get the previously set input for the ComputationGraph
      • getInputs

        public INDArray[] getInputs()
        Get the previously set inputs for the ComputationGraph
      • getInputMaskArrays

        public INDArray[] getInputMaskArrays()
        Get the previously set feature/input mask arrays for the ComputationGraph
      • getLabelMaskArrays

        public INDArray[] getLabelMaskArrays()
        Get the previously set label/output mask arrays for the ComputationGraph
      • setLabel

        public void setLabel​(int labelNum,
                             INDArray label)
        Set the specified label for the ComputationGraph
      • setLabels

        public void setLabels​(INDArray... labels)
        Set all labels for the ComputationGraph network
      • setGradientsAccumulator

        public void setGradientsAccumulator​(GradientsAccumulator accumulator)
        This method allows you to specify the GradientsAccumulator instance to be used with this model

        PLEASE NOTE: Do not use this method unless you understand how to use GradientsAccumulator & updates sharing. PLEASE NOTE: Do not use this method on standalone model

        Parameters:
        accumulator -
      • init

        public void init()
        Initialize the ComputationGraph network
        Specified by:
        init in interface Model
        Specified by:
        init in interface NeuralNetwork
      • init

        public void init​(INDArray parameters,
                         boolean cloneParametersArray)
        Initialize the ComputationGraph, optionally with an existing parameters array. If an existing parameters array is specified, it will be used (and the values will not be modified) in the network; if no parameters array is specified, parameters will be initialized randomly according to the network configuration.
        Parameters:
        parameters - Network parameter. May be null. If null: randomly initialize.
        cloneParametersArray - Whether the parameter array (if any) should be cloned, or used directly
      • initGradientsView

        public void initGradientsView()
        This method: initializes the flattened gradients array (used in backprop) and sets the appropriate subset in all layers. As a general rule, this shouldn't ever need to be called manually when doing training via fit(DataSet), fit(DataSetIterator) or fit(MultiDataSet) methods
      • getOutputLayerIndices

        protected int[] getOutputLayerIndices()
      • pretrain

        public void pretrain​(DataSetIterator iter,
                             int numEpochs)
        Pretrain network with a single input and single output. DataSetIterators can only be used if the number of input arrays for the ComputationGraph is 1.
        This method performs layerwise pretraining on all pre-trainable layers in the network (VAEs, Autoencoders, etc), for the specified number of epochs each. For example, if numEpochs=3, then layer 0 will be fit for 3 epochs, followed by layer 1 for 3 epochs, and so on.
        For networks with more than one input use pretrain(MultiDataSetIterator)
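        The layerwise schedule described above (each pretrainable layer is fit for numEpochs epochs before the next one starts) can be sketched without the library. The class and method names below are hypothetical, and the boolean flags stand in for whatever test the network uses to decide that a layer is pretrainable.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone model (not DL4J code) of the layerwise pretraining schedule.
class PretrainOrderSketch {

    // Returns the sequence of "layerName:epoch" steps. Each pretrainable layer is
    // fit for numEpochs epochs before the next layer starts; other layers are skipped.
    static List<String> schedule(String[] layerNames, boolean[] pretrainable, int numEpochs) {
        List<String> steps = new ArrayList<>();
        for (int layer = 0; layer < layerNames.length; layer++) {
            if (!pretrainable[layer]) {
                continue; // layers that cannot be pretrained are skipped
            }
            for (int epoch = 0; epoch < numEpochs; epoch++) {
                steps.add(layerNames[layer] + ":" + epoch);
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        String[] names = {"vae0", "dense1", "ae2"};
        boolean[] pretrainable = {true, false, true};
        // Prints all three epochs of vae0 first, then all three epochs of ae2.
        System.out.println(schedule(names, pretrainable, 3));
    }
}
```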
      • pretrain

        public void pretrain​(MultiDataSetIterator iter)
        Pretrain network with multiple inputs and/or outputs
      • pretrain

        public void pretrain​(MultiDataSetIterator iter,
                             int numEpochs)
        Pretrain network with multiple inputs and/or outputs
        This method performs layerwise pretraining on all pre-trainable layers in the network (VAEs, Autoencoders, etc), for the specified number of epochs each. For example, if numEpochs=3, then layer 0 will be fit for 3 epochs, followed by layer 1 for 3 epochs, and so on.
        Non-pretrainable layers are ignored
        Parameters:
        iter - Training data
        numEpochs - Number of epochs to fit each layer with
        See Also:
        pretrainLayer(String, MultiDataSetIterator)
      • pretrainLayer

        public void pretrainLayer​(String layerName,
                                  DataSetIterator dataSetIterator)
        Pretrain a specified layer with the given DataSetIterator
        Parameters:
        layerName - Layer name
        dataSetIterator - Data
      • pretrainLayer

        public void pretrainLayer​(String layerName,
                                  MultiDataSetIterator iter)
        Pretrain a specified layer with the given MultiDataSetIterator
        Parameters:
        layerName - Layer name
        iter - Training data
      • fit

        public void fit​(DataSet dataSet)
        Fit the ComputationGraph using a DataSet. Note that this method can only be used with ComputationGraphs with 1 input and 1 output. For networks with more than one input or output, use fit(MultiDataSetIterator)
        Specified by:
        fit in interface NeuralNetwork
      • fit

        public void fit​(@NonNull
                        @NonNull DataSetIterator iterator,
                        int numEpochs)
        Perform minibatch training on all minibatches in the DataSetIterator, for the specified number of epochs. Equivalent to calling fit(DataSetIterator) numEpochs times in a loop
        Parameters:
        iterator - Training data (DataSetIterator). Iterator must support resetting
        numEpochs - Number of training epochs, >= 1
      • fit

        public void fit​(@NonNull
                        @NonNull DataSetIterator iterator)
        Fit the ComputationGraph using a DataSetIterator.
        Note that this method can only be used with ComputationGraphs with 1 input and 1 output
        Method doesn't do layerwise pretraining.
        For pretraining, use pretrain(DataSetIterator)
        Specified by:
        fit in interface NeuralNetwork
        Parameters:
        iterator - Training data (DataSetIterator)
      • fit

        public void fit​(MultiDataSet multiDataSet)
        Fit the ComputationGraph using a MultiDataSet
        Specified by:
        fit in interface NeuralNetwork
      • fit

        public void fit​(@NonNull
                        @NonNull MultiDataSetIterator iterator,
                        int numEpochs)
        Perform minibatch training on all minibatches in the MultiDataSetIterator, for the specified number of epochs. Equivalent to calling fit(MultiDataSetIterator) numEpochs times in a loop
        Parameters:
        iterator - Training data (MultiDataSetIterator). Iterator must support resetting
        numEpochs - Number of training epochs, >= 1
      • fit

        public void fit​(INDArray[] inputs,
                        INDArray[] labels)
        Fit the ComputationGraph given arrays of inputs and labels.
        Parameters:
        inputs - The network inputs
        labels - The labels
      • fit

        public void fit​(INDArray[] inputs,
                        INDArray[] labels,
                        INDArray[] featureMaskArrays,
                        INDArray[] labelMaskArrays)
        Fit the ComputationGraph using the specified inputs and labels (and mask arrays)
        Parameters:
        inputs - The network inputs (features)
        labels - The network labels
        featureMaskArrays - Mask arrays for inputs/features. Typically used for RNN training. May be null.
        labelMaskArrays - Mask arrays for the labels/outputs. Typically used for RNN training. May be null.
      • topologicalSortOrder

        public int[] topologicalSortOrder()
        Calculate a topological sort order for the vertices in the graph. Note that this is used for (a) working out the order of the forward pass, (b) the order of backprop (i.e., the reverse of this), and (c) the order in which parameters (and gradients) are flattened

        Specifically, gradients/params/forward pass are executed on vertex[topologicalSortOrder[i]], for i=0..nVertices-1
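        As an illustration of how such an order can be derived, the sketch below applies Kahn's algorithm to a small directed acyclic graph and then reverses the result to obtain a backprop order. It is a stand-alone model: whether the library uses this particular algorithm is not stated here, and the class, method names and edge-list representation are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Stand-alone model (not DL4J code): topological ordering with Kahn's algorithm.
class TopoSortSketch {

    // edges[i] = {from, to}. Returns vertex indices in a valid topological order.
    static int[] topologicalOrder(int nVertices, int[][] edges) {
        int[] inDegree = new int[nVertices];
        for (int[] e : edges) {
            inDegree[e[1]]++;
        }
        Deque<Integer> ready = new ArrayDeque<>();
        for (int v = 0; v < nVertices; v++) {
            if (inDegree[v] == 0) {
                ready.add(v); // vertices with no inputs can run first
            }
        }
        int[] order = new int[nVertices];
        int count = 0;
        while (!ready.isEmpty()) {
            int v = ready.remove();
            order[count++] = v;
            for (int[] e : edges) {
                // A vertex becomes ready once all of its inputs have been processed
                if (e[0] == v && --inDegree[e[1]] == 0) {
                    ready.add(e[1]);
                }
            }
        }
        if (count != nVertices) {
            throw new IllegalStateException("Graph contains a cycle");
        }
        return order;
    }

    // The backward pass visits vertices in the opposite order to the forward pass.
    static int[] reversed(int[] order) {
        int[] out = new int[order.length];
        for (int i = 0; i < order.length; i++) {
            out[i] = order[order.length - 1 - i];
        }
        return out;
    }

    public static void main(String[] args) {
        // 0 = input, 1 and 2 = parallel branches, 3 = merge/output
        int[][] edges = {{0, 1}, {0, 2}, {1, 3}, {2, 3}};
        int[] forward = topologicalOrder(4, edges);
        System.out.println(java.util.Arrays.toString(forward));
        System.out.println(java.util.Arrays.toString(reversed(forward)));
    }
}
```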

      • calculateIndices

        public GraphIndices calculateIndices()
        Calculate the indices needed for the network:
        (a) topological sort order
        (b) Map: vertex index -> vertex name
        (c) Map: vertex name -> vertex index
        Returns:
        Calculated indices
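        The two lookup maps in (b) and (c) are inverses of each other. A minimal stand-alone sketch, with hypothetical class and method names and assuming that vertex names are unique:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone model (not DL4J code): index -> name and name -> index lookups.
class VertexIndexSketch {

    static Map<Integer, String> indexToName(String[] names) {
        Map<Integer, String> map = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            map.put(i, names[i]);
        }
        return map;
    }

    static Map<String, Integer> nameToIndex(String[] names) {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 0; i < names.length; i++) {
            // Assumption: names are unique, otherwise the mapping is not invertible
            if (map.put(names[i], i) != null) {
                throw new IllegalArgumentException("Duplicate vertex name: " + names[i]);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        String[] names = {"in", "dense0", "out"};
        System.out.println(indexToName(names));
        System.out.println(nameToIndex(names));
    }
}
```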
      • computeGradientAndScore

        public void computeGradientAndScore()
      • feedForward

        public Map<String,​INDArray> feedForward​(INDArray input,
                                                      int layerTillIndex,
                                                      boolean train)
        Conduct forward pass using a single input array. Note that this method can only be used with ComputationGraphs with a single input array.
        Parameters:
        input - The input array
        layerTillIndex - the layer to feed forward to
        train - If true: do forward pass at training time
        Returns:
        A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
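        The shape of the returned map (layer name to activations, stopping at a given layer) can be sketched with a toy chain of scalar functions. This is a stand-alone model rather than library code: the names are hypothetical, real activations are arrays rather than single numbers, and treating layerTillIndex as inclusive is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Stand-alone model (not DL4J code): forward pass through a simple chain of "layers".
class FeedForwardSketch {

    // Applies layers 0..layerTillIndex in order and records each layer's output by name.
    // Assumption: layerTillIndex is inclusive.
    static Map<String, Double> feedForward(String[] names, DoubleUnaryOperator[] layers,
                                           double input, int layerTillIndex) {
        if (layerTillIndex < 0 || layerTillIndex >= layers.length) {
            throw new IllegalArgumentException("layerTillIndex out of range: " + layerTillIndex);
        }
        Map<String, Double> activations = new LinkedHashMap<>();
        double current = input;
        for (int i = 0; i <= layerTillIndex; i++) {
            current = layers[i].applyAsDouble(current);
            activations.put(names[i], current);
        }
        return activations;
    }

    public static void main(String[] args) {
        String[] names = {"scale", "shift", "relu"};
        DoubleUnaryOperator[] layers = {
            x -> 2 * x,
            x -> x - 5,
            x -> Math.max(0, x)
        };
        // Stops after "shift", so "relu" does not appear in the map
        System.out.println(feedForward(names, layers, 1.0, 1));
    }
}
```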
      • feedForward

        public Map<String,​INDArray> feedForward​(INDArray[] input,
                                                      int layerTillIndex,
                                                      boolean train,
                                                      boolean clearInputs)
        Conduct forward pass using an array of inputs. This overload allows the forward pass to be conducted, optionally (not) clearing the layer input arrays.
        Note: when using clearInputs=false, there can be some performance and memory overhead: this is because the arrays are defined outside of workspaces (which are enabled by default) - otherwise, old/invalidated arrays could still be accessed after calling this method. Consequently: Don't use clearInputs=false unless you have a use case that requires them to remain after feed-forward has been completed
        Parameters:
        input - An array of ComputationGraph inputs
        layerTillIndex - the index of the layer to feed forward to
        train - If true: do forward pass at training time; false: do forward pass at test time
        clearInputs - If true (default for other methods): clear the inputs of all layers after doing forward pass. If false: don't clear layer inputs.
        Returns:
        A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
      • feedForward

        public Map<String,​INDArray> feedForward​(INDArray[] input,
                                                      int layerTillIndex,
                                                      boolean train)
        Conduct forward pass using an array of inputs
        Parameters:
        input - An array of ComputationGraph inputs
        layerTillIndex - the index of the layer to feed forward to
        train - If true: do forward pass at training time; false: do forward pass at test time
        Returns:
        A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
      • feedForward

        public Map<String,​INDArray> feedForward​(boolean train,
                                                      int layerTillIndex)
        Conduct forward pass using the stored inputs
        Parameters:
        train - If true: do forward pass at training time; false: do forward pass at test time
        layerTillIndex - the index of the layer to feed forward to
        Returns:
        A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
      • feedForward

        public Map<String,​INDArray> feedForward​(INDArray input,
                                                      boolean train)
        Conduct forward pass using a single input array. Note that this method can only be used with ComputationGraphs with a single input array.
        Parameters:
        input - The input array
        train - If true: do forward pass at training time
        Returns:
        A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
      • feedForward

        public Map<String,​INDArray> feedForward​(INDArray[] input,
                                                      boolean train)
        Conduct forward pass using an array of inputs
        Parameters:
        input - An array of ComputationGraph inputs
        train - If true: do forward pass at training time; false: do forward pass at test time
        Returns:
        A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
      • feedForward

        public Map<String,​INDArray> feedForward​(INDArray[] input,
                                                      boolean train,
                                                      boolean clearInputs)
        Conduct forward pass using an array of inputs. This overload allows the forward pass to be conducted, optionally (not) clearing the layer input arrays.
        Note: this method should NOT be used with clearInputs = false, unless you know what you are doing. Specifically: when using clearInputs=false, in combination with workspaces, the layer input fields may leak outside of the workspaces in which they were defined - potentially causing a crash. See https://deeplearning4j.konduit.ai/config/config-memory/config-workspaces for more details
        Parameters:
        input - An array of ComputationGraph inputs
        train - If true: do forward pass at training time; false: do forward pass at test time
        clearInputs - If true (default for other methods): clear the inputs of all layers after doing forward pass. If false: don't clear layer inputs.
        Returns:
        A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
      • feedForward

        public Map<String,​INDArray> feedForward()
        Conduct forward pass using the stored inputs, at test time
        Returns:
        A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
      • feedForward

        public Map<String,​INDArray> feedForward​(boolean train)
        Conduct forward pass using the stored inputs
        Parameters:
        train - If true: do forward pass at training time; false: do forward pass at test time
        Returns:
        A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
      • feedForward

        public Map<String,​INDArray> feedForward​(boolean train,
                                                      boolean excludeOutputLayers,
                                                      boolean includeNonLayerVertexActivations)
        Parameters:
        train - True: training time. False: test time
        excludeOutputLayers - Should we exclude the output layers during forward pass? (usually: false)
        includeNonLayerVertexActivations - Include non-layer vertices in the output map?
        Returns:
        Map of activations. Key: vertex name. Value: activations.
      • output

        public INDArray[] output​(INDArray... input)
        Return an array of network outputs (predictions) at test time, given the specified network inputs. Network outputs are for output layers only.
        Parameters:
        input - Inputs to the network
        Returns:
        Output activations (order: same as defined in network configuration)
      • outputSingle

        public INDArray outputSingle​(INDArray... input)
        A convenience method that returns a single INDArray, instead of an INDArray[]. Useful for ComputationGraphs that have only a single output. Otherwise identical to output(INDArray...)
        Parameters:
        input - Inputs to the network
        Returns:
        Output activations array
      • output

        public INDArray[] output​(boolean train,
                                 INDArray... input)
        Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
        Parameters:
        train - If true: do forward pass at training time; false: do forward pass at test time
        input - Inputs to the network
        Returns:
        Output activations (order: same as defined in network configuration)
      • output

        public INDArray[] output​(boolean train,
                                 MemoryWorkspace outputWorkspace,
                                 INDArray... input)
        Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
        If no memory workspace is provided, the output will be detached (not in any workspace).
        If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace. This workspace must be opened by the user before calling this method - and the user is responsible for (a) closing this workspace, and (b) ensuring the output array is not used out of scope (i.e., not used after closing the workspace to which it belongs - as this is likely to cause either an exception when used, or a crash).
        Parameters:
        train - If true: do forward pass at training time; false: do forward pass at test time
        outputWorkspace - May be null. If not null: the workspace MUST be opened before calling this method.
        input - Inputs to the network
        Returns:
        Output activations (order: same as defined in network configuration)
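        The scoping rule above (an array placed in a workspace must not be used after that workspace is closed) can be pictured with a toy model built on try-with-resources. Nothing below is the ND4J workspace API: Scope, Buffer, allocate and detach are hypothetical stand-ins, and the model throws an exception where the real system might instead crash.

```java
// Stand-alone model (not the ND4J API): buffers are valid only while their scope is open.
class WorkspaceScopeSketch {

    // Hypothetical stand-in for a memory workspace.
    static final class Scope implements AutoCloseable {
        private boolean open = true;

        Buffer allocate(int length) {
            return new Buffer(this, new double[length]);
        }

        boolean isOpen() {
            return open;
        }

        @Override
        public void close() {
            open = false; // every buffer handed out by this scope is now invalid
        }
    }

    // Hypothetical stand-in for an array that lives inside a workspace.
    static final class Buffer {
        private final Scope owner;
        private final double[] data;

        Buffer(Scope owner, double[] data) {
            this.owner = owner;
            this.data = data;
        }

        double get(int i) {
            if (!owner.isOpen()) {
                throw new IllegalStateException("Buffer used after its scope was closed");
            }
            return data[i];
        }

        // Copies the contents out of the scope so they stay usable after it closes.
        double[] detach() {
            if (!owner.isOpen()) {
                throw new IllegalStateException("Cannot detach after the scope was closed");
            }
            return data.clone();
        }
    }

    public static void main(String[] args) {
        Buffer leaked = null;
        double[] safe = null;
        try (Scope ws = new Scope()) {
            Buffer out = ws.allocate(3);
            safe = out.detach(); // copy taken while the scope is still open
            leaked = out;        // reference escapes the scope
        }
        System.out.println("detached copy length: " + safe.length);
        try {
            leaked.get(0);
        } catch (IllegalStateException e) {
            System.out.println("out-of-scope access rejected: " + e.getMessage());
        }
    }
}
```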
      • output

        public INDArray[] output​(boolean train,
                                 @NonNull
                                 @NonNull INDArray[] input,
                                 INDArray[] inputMasks)
        Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
        Parameters:
        train - If true: forward pass for training mode. False: test mode
        input - Input arrays to the network
        inputMasks - Optional input mask arrays (may be null)
        Returns:
        Network output activations
      • output

        public INDArray[] output​(boolean train,
                                 @NonNull
                                 @NonNull INDArray[] input,
                                 INDArray[] inputMasks,
                                 INDArray[] labelMasks)
        Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
        Parameters:
        train - If true: forward pass for training mode. False: test mode
        input - Input arrays to the network
        inputMasks - Optional input mask arrays (may be null)
        labelMasks - Optional label mask arrays (may be null)
        Returns:
        Network output activations
      • output

        public <T> T output​(@NonNull
                            @NonNull INDArray[] inputs,
                            INDArray[] inputMasks,
                            INDArray[] labelMasks,
                            @NonNull
                            @NonNull OutputAdapter<T> outputAdapter)
        This method uses the provided OutputAdapter to return a custom object built from the output INDArrays. PLEASE NOTE: This method uses a dedicated workspace for output generation to avoid redundant allocations
        Type Parameters:
        T - T extends Object
        Parameters:
        inputs - Input arrays to the network
        inputMasks - Optional input mask arrays (may be null)
        labelMasks - Optional label mask arrays (may be null)
        outputAdapter - OutputAdapter instance
        Returns:
        T instance produced by OutputAdapter
      • output

        public INDArray[] output​(boolean train,
                                 @NonNull
                                 @NonNull INDArray[] input,
                                 INDArray[] inputMasks,
                                 INDArray[] labelMasks,
                                 MemoryWorkspace outputWorkspace)
        Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
        If no memory workspace is provided, the output will be detached (not in any workspace).
        If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace. This workspace must be opened by the user before calling this method - and the user is responsible for (a) closing this workspace, and (b) ensuring the output array is not used out of scope (i.e., not used after closing the workspace to which it belongs - as this is likely to cause either an exception when used, or a crash).
        Parameters:
        train - If true: forward pass for training mode. False: test mode
        input - Input arrays to the network
        inputMasks - Optional input mask arrays (may be null)
        labelMasks - Optional label mask arrays (may be null)
        outputWorkspace - May be null. If not null: the workspace MUST be opened before calling this method.
        Returns:
        Network output activations
      • outputSingle

        public INDArray outputSingle​(boolean train,
                                     INDArray... input)
        A convenience method that returns a single INDArray, instead of an INDArray[]. Useful for ComputationGraphs that have only a single output. Otherwise identical to output(boolean, INDArray...)
        Parameters:
        train - If true: do forward pass at training time; false: do forward pass at test time
        input - Inputs to the network
        Returns:
        Output activations array
      • output

        public INDArray[] output​(boolean train,
                                 boolean clearInputs,
                                 INDArray... input)
        An output method for the network, with optional clearing of the layer inputs.
        Note: most users should use output(boolean, INDArray...) or similar methods, unless they are doing non-standard operations (like providing the input arrays externally)
        Parameters:
        train - If true: output during training. False: output during testing. Affects some things such as dropout
        clearInputs - If true: clear the input arrays for all layers. False: leave the input arrays as-is - which can be useful for "external errors" (no output layer) backprop use cases
        input - Input to the network
        Returns:
        Output from the network
      • output

        public INDArray[] output​(DataSetIterator iterator)
        Generate the output for all examples/batches in the input iterator, and concatenate them into a single array per network output
        Parameters:
        iterator - Data to pass through the network
        Returns:
        output for all examples in the iterator
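        Concatenating per-batch outputs into a single array can be sketched as follows. This is a stand-alone model using plain double[][] in place of INDArray, with hypothetical names, and it assumes that the batches are joined along the example (row) dimension.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone model (not DL4J code): join per-batch outputs into one array.
class ConcatOutputSketch {

    // Each batch is [examples][outputSize]; the result stacks all examples in batch order.
    // Assumption: concatenation is along the example (row) dimension.
    static double[][] concatRows(List<double[][]> batches) {
        int total = 0;
        for (double[][] batch : batches) {
            total += batch.length;
        }
        double[][] out = new double[total][];
        int row = 0;
        for (double[][] batch : batches) {
            for (double[] example : batch) {
                out[row++] = example.clone(); // copy so the result does not alias the batch
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<double[][]> batches = new ArrayList<>();
        batches.add(new double[][]{{0.1, 0.9}, {0.8, 0.2}}); // batch of 2 examples
        batches.add(new double[][]{{0.5, 0.5}});             // final, smaller batch
        double[][] all = concatRows(batches);
        System.out.println("total examples: " + all.length);
    }
}
```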
      • output

        public INDArray[] output​(MultiDataSetIterator iterator)
        Generate the output for all examples/batches in the input iterator, and concatenate them into a single array per network output
        Parameters:
        iterator - Data to pass through the network
        Returns:
        output for all examples in the iterator
      • outputSingle

        public INDArray outputSingle​(DataSetIterator iterator)
        Generate the output for all examples/batches in the input iterator, and concatenate them into a single array. Can only be used with ComputationGraphs with 1 output
        Parameters:
        iterator - Data to pass through the network
        Returns:
        output for all examples in the iterator
      • outputSingle

        public INDArray outputSingle​(MultiDataSetIterator iterator)
        Generate the output for all examples/batches in the input iterator, and concatenate them into a single array. Can only be used with ComputationGraphs with 1 output
        Parameters:
        iterator - Data to pass through the network
        Returns:
        output for all examples in the iterator
      • output

        public INDArray[] output​(List<String> layers,
                                 boolean train,
                                 INDArray[] features,
                                 INDArray[] featureMasks)
        Get the activations for the specific layers only
        Parameters:
        layers - Layers to get the specified activations for
        train - If true: train mode. False: test (inference) mode
        features - Features array
        featureMasks - Feature masks array. May be null
        Returns:
        Activations of the selected layers, in the same order as the "layers" arg/list
      • ffToLayerActivationsDetached

        protected Map<String,​INDArray> ffToLayerActivationsDetached​(boolean train,
                                                                          @NonNull
                                                                          @NonNull FwdPassType fwdPassType,
                                                                          boolean storeLastForTBPTT,
                                                                          int layerIndex,
                                                                          int[] excludeIdxs,
                                                                          @NonNull
                                                                          @NonNull INDArray[] features,
                                                                          INDArray[] fMask,
                                                                          INDArray[] lMask,
                                                                          boolean clearLayers)
        Feed-forward through the network - returning all array activations detached from any workspace. Note that no workspace should be active externally when calling this method (an exception will be thrown if a workspace is open externally)
        Parameters:
        train - Training mode (true) or test/inference mode (false)
        fwdPassType - Type of forward pass to perform (STANDARD or RNN_ACTIVATE_WITH_STORED_STATE only)
        storeLastForTBPTT - ONLY used if fwdPassType == FwdPassType.RNN_ACTIVATE_WITH_STORED_STATE
        layerIndex - Index (inclusive) to stop forward pass at. For all layers, use numLayers-1
        excludeIdxs - Layers (vertices) to exclude from forward pass. These layers will be skipped, and hence are usually output layers or at the end of the network. May be null.
        features - Input feature arrays
        fMask - Feature mask arrays. May be null.
        lMask - Label mask array. May be null.
        clearLayers - Whether the layer inputs should be cleared
        Returns:
        Map of activations (including the input), detached from any workspace
      • ffToLayerActivationsInWS

        protected Map<String,​INDArray> ffToLayerActivationsInWS​(boolean train,
                                                                      int layerIndex,
                                                                      int[] excludeIdxs,
                                                                      FwdPassType fwdPassType,
                                                                      boolean storeLastForTBPTT,
                                                                      INDArray[] input,
                                                                      INDArray[] fMask,
                                                                      INDArray[] lMask,
                                                                      boolean clearInputs)
        Feed-forward through the network - if workspaces are used, all returned activations will be present in workspace WS_ALL_LAYERS_ACT.
        Note: if using workspaces for training, requires that WS_ALL_LAYERS_ACT is open externally. If using NO workspaces, requires that no external workspace is open
        Parameters:
        train - Training mode (true) or test/inference mode (false)
        layerIndex - Index (inclusive) to stop forward pass at. For all layers, use -1
        excludeIdxs - Layers (vertices) to exclude from forward pass. These layers will be skipped, and hence are usually output layers or at the end of the network. May be null.
        fwdPassType - Type of forward pass to perform (STANDARD or RNN_ACTIVATE_WITH_STORED_STATE only)
        storeLastForTBPTT - ONLY used if fwdPassType == FwdPassType.RNN_ACTIVATE_WITH_STORED_STATE
        input - Input feature arrays
        fMask - Feature mask arrays. May be null.
        lMask - Label mask array. May be null.
        clearInputs - Whether the layer inputs should be cleared
        Returns:
        Map of activations (including the input), in workspace WS_ALL_LAYERS_ACT if workspaces are used (detached otherwise)
      • outputOfLayersDetached

        protected INDArray[] outputOfLayersDetached​(boolean train,
                                                    @NonNull FwdPassType fwdPassType,
                                                    @NonNull int[] layerIndexes,
                                                    @NonNull INDArray[] features,
                                                    INDArray[] fMask,
                                                    INDArray[] lMasks,
                                                    boolean clearLayerInputs,
                                                    boolean detachedInputs,
                                                    MemoryWorkspace outputWorkspace)
        Provide the output of the specified layers, detached from any workspace. This is most commonly used at inference/test time, and is more memory efficient than ffToLayerActivationsDetached(boolean, FwdPassType, boolean, int, int[], INDArray[], INDArray[], INDArray[], boolean) and ffToLayerActivationsInWS(boolean, int, int[], FwdPassType, boolean, INDArray[], INDArray[], INDArray[], boolean).
        This method clears all layer inputs. NOTE: in general, no workspaces should be activated externally for this method! This method handles the workspace activation as required
        Parameters:
        train - Training mode (true) or test/inference mode (false)
        fwdPassType - Type of forward pass to perform (STANDARD or RNN_TIMESTEP only)
        layerIndexes - Indexes of the layers to get the activations for
        features - Input features for the network
        fMask - Input/feature mask array. May be null.
        lMasks - Labels mask array. May be null
        clearLayerInputs - If true: the layer input fields will be cleared
        detachedInputs - If true: the layer input fields will be detached. Usually used for external errors cases
        outputWorkspace - Optional - if provided, outputs should be placed in this workspace. NOTE: this workspace must be open
        Returns:
        Output of the specified layers, detached from any workspace
      • backpropGradient

        public Gradient backpropGradient​(INDArray... epsilons)
        Calculate the gradient of the network with respect to some external errors. Note that this is typically used for things like reinforcement learning, not typical networks that include an OutputLayer or RnnOutputLayer
        Parameters:
        epsilons - Epsilons (errors) at the output. Same order with which the output layers are defined in configuration setOutputs(String...)
        Returns:
        Gradient for the network
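        The idea of an externally supplied error can be sketched in plain Java, without DL4J. Assuming a single linear output y = W x, the caller provides epsilon = dL/dy and the weight gradient is the outer product of epsilon and the input. The class and method names below are hypothetical and are not part of the ComputationGraph API.

```java
/** Hypothetical plain-Java sketch (not DL4J code): gradient of a linear layer y = W x from an external error. */
final class ExternalErrorsSketch {

    /** Returns dL/dW, where dL/dW[i][j] = epsilon[i] * input[j]. */
    static double[][] weightGradient(double[] epsilon, double[] input) {
        double[][] grad = new double[epsilon.length][input.length];
        for (int i = 0; i < epsilon.length; i++) {
            for (int j = 0; j < input.length; j++) {
                grad[i][j] = epsilon[i] * input[j];
            }
        }
        return grad;
    }
}
```

        In a reinforcement learning setting, epsilon would typically come from the caller's own objective rather than from a loss function attached to an output layer.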
      • calcBackpropGradients

        protected void calcBackpropGradients​(boolean clearLayers,
                                             boolean truncatedBPTT,
                                             INDArray... externalEpsilons)
        Do backprop (gradient calculation)
        Parameters:
        truncatedBPTT - false: normal backprop. true: calculate gradients using truncated BPTT for RNN layers
        externalEpsilons - null usually (for typical supervised learning). If not null (and length > 0) then assume that the user has provided some errors externally, as they would do for example in reinforcement learning situations.
      • calcRegularizationScore

        public double calcRegularizationScore​(boolean backpropParamsOnly)
      • setListeners

        public void setListeners​(Collection<TrainingListener> listeners)
        Set the trainingListeners for the ComputationGraph (and all layers in the network)
        Specified by:
        setListeners in interface Model
      • setListeners

        public void setListeners​(TrainingListener... listeners)
        Set the trainingListeners for the ComputationGraph (and all layers in the network)
        Specified by:
        setListeners in interface Model
      • addListeners

        public void addListeners​(TrainingListener... listeners)
        This method ADDS additional TrainingListener to existing listeners
        Specified by:
        addListeners in interface Model
        Parameters:
        listeners - Listeners to add
      • getUpdater

        public ComputationGraphUpdater getUpdater()
        Get the ComputationGraphUpdater for the network. Creates one on demand, if required
      • getUpdater

        public ComputationGraphUpdater getUpdater​(boolean initializeIfAbsent)
        Get the ComputationGraphUpdater for this network
        Parameters:
        initializeIfAbsent - If true: create the updater if one is absent. False: return null if absent.
        Returns:
        Updater
      • setUpdater

        public void setUpdater​(ComputationGraphUpdater updater)
        Set the computationGraphUpdater for the network
      • getOutputLayer

        public Layer getOutputLayer​(int outputLayerIdx)
        Get the specified output layer, by index. The index of the output layer may be 0 to getNumOutputArrays()-1
      • score

        public double score​(DataSet dataSet)
        Sets the input and labels and returns a score for the prediction with respect to the true labels
        This is equivalent to score(DataSet, boolean) with training==true.
        NOTE: this version of the score function can only be used with ComputationGraph networks that have a single input and a single output.
        Parameters:
        dataSet - the data to score
        Returns:
        the score for the given input,label pairs
        See Also:
        score(DataSet, boolean)
      • score

        public double score​(DataSet dataSet,
                            boolean training)
        Sets the input and labels and returns a score for the prediction with respect to the true labels
        NOTE: this version of the score function can only be used with ComputationGraph networks that have a single input and a single output. Use score(MultiDataSet, boolean) for multiple input/output networks
        Parameters:
        dataSet - the data to score
        training - whether score is being calculated at training time (true) or test time (false)
        Returns:
        the score for the given input,label pairs
        See Also:
        score(DataSet, boolean)
      • score

        public double score​(MultiDataSet dataSet)
        Score the network given the MultiDataSet, at test time
      • score

        public double score​(MultiDataSet dataSet,
                            boolean training)
        Sets the input and labels and returns a score for the prediction with respect to the true labels
        Parameters:
        dataSet - the data to score
        training - whether score is being calculated at training time (true) or test time (false)
        Returns:
        the score for the given input,label pairs
      • scoreExamples

        public INDArray scoreExamples​(DataSet data,
                                      boolean addRegularizationTerms)
        Calculate the score for each example in a DataSet individually. Unlike score(DataSet) and score(DataSet, boolean) this method does not average/sum over examples. This method allows for examples to be scored individually (at test time only), which may be useful for example for autoencoder architectures and the like.
        Each row of the output (assuming addRegularizationTerms == true) is equivalent to calling score(DataSet) with a single example.
        Parameters:
        data - The data to score
        addRegularizationTerms - If true: add l1/l2 regularization terms (if any) to the score. If false: don't add regularization terms
        Returns:
        An INDArray (column vector) of size input.numRows(); the ith entry is the score (loss value) of the ith example
      • scoreExamples

        public INDArray scoreExamples​(MultiDataSet dataSet,
                                      boolean addRegularizationTerms)
        Calculate the score for each example in a MultiDataSet individually. Unlike score(MultiDataSet) and score(MultiDataSet, boolean) this method does not average/sum over examples. This method allows for examples to be scored individually (at test time only), which may be useful for example for autoencoder architectures and the like.
        Each row of the output (assuming addRegularizationTerms == true) is equivalent to calling score(MultiDataSet) with a single example.
        Parameters:
        dataSet - The data to score
        addRegularizationTerms - If true: add l1/l2 regularization terms (if any) to the score. If false: don't add regularization terms
        Returns:
        An INDArray (column vector) of size input.numRows(); the ith entry is the score (loss value) of the ith example
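        The difference between per-example scoring and a single averaged score can be sketched in plain Java. This is a simplified stand-in, not DL4J code: it assumes a squared-error loss and no regularization terms, whereas the real methods use the loss function configured on the output layer. The class and method names are hypothetical.

```java
/** Hypothetical plain-Java sketch: per-example loss values versus one averaged score. */
final class PerExampleScoreSketch {

    /** Squared-error loss for each example (row); no averaging over examples. */
    static double[] scoreExamples(double[][] predictions, double[][] labels) {
        double[] scores = new double[predictions.length];
        for (int i = 0; i < predictions.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < predictions[i].length; j++) {
                double diff = predictions[i][j] - labels[i][j];
                sum += diff * diff;
            }
            scores[i] = sum;
        }
        return scores;
    }

    /** Single scalar score: the mean of the per-example scores. */
    static double score(double[][] predictions, double[][] labels) {
        double[] perExample = scoreExamples(predictions, labels);
        double total = 0.0;
        for (double s : perExample) {
            total += s;
        }
        return total / perExample.length;
    }
}
```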
      • fit

        public void fit()
        Description copied from interface: Model
        All models have a fit method
        Specified by:
        fit in interface Model
      • update

        public void update​(INDArray gradient,
                           String paramType)
        Description copied from interface: Model
        Perform one update applying the gradient
        Specified by:
        update in interface Model
        Parameters:
        gradient - the gradient to apply
      • update

        public void update​(Gradient gradient)
        Description copied from interface: Model
        Update layer weights and biases with gradient change
        Specified by:
        update in interface Model
      • score

        public double score()
        Description copied from interface: Model
        The score for the model
        Specified by:
        score in interface Model
        Returns:
        the score for the model
      • setScore

        public void setScore​(double score)
      • params

        public INDArray params()
        Description copied from interface: Model
        Parameters of the model (if any)
        Specified by:
        params in interface Model
        Specified by:
        params in interface NeuralNetwork
        Returns:
        the parameters of the model
      • numParams

        public long numParams()
        Description copied from interface: Model
        the number of parameters for the model
        Specified by:
        numParams in interface Model
        Returns:
        the number of parameters for the model
      • numParams

        public long numParams​(boolean backwards)
        Description copied from interface: Model
        the number of parameters for the model
        Specified by:
        numParams in interface Model
        Returns:
        the number of parameters for the model
      • setParams

        public void setParams​(INDArray params)
        Description copied from interface: Model
        Set the parameters for this model. This expects a linear ndarray, which is then unpacked internally according to the expected parameter ordering of the model
        Specified by:
        setParams in interface Model
        Parameters:
        params - the parameters for the model
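        Packing parameters into one linear array and unpacking them again can be sketched in plain Java. The ordering used here (weights row by row, then the bias) is an assumption made for illustration only; the ordering DL4J actually expects depends on the layers in the network. All names are hypothetical.

```java
/** Hypothetical plain-Java sketch: flatten a weight matrix and bias into one array, then unpack them. */
final class FlatParamsSketch {

    /** Packs the weight matrix (row by row) followed by the bias vector into one linear array. */
    static double[] flatten(double[][] weights, double[] bias) {
        int rows = weights.length;
        int cols = rows == 0 ? 0 : weights[0].length;
        double[] flat = new double[rows * cols + bias.length];
        for (int r = 0; r < rows; r++) {
            System.arraycopy(weights[r], 0, flat, r * cols, cols);
        }
        System.arraycopy(bias, 0, flat, rows * cols, bias.length);
        return flat;
    }

    /** Reads rows * cols weights from the start of the flat array. */
    static double[][] unpackWeights(double[] flat, int rows, int cols) {
        double[][] weights = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            System.arraycopy(flat, r * cols, weights[r], 0, cols);
        }
        return weights;
    }

    /** Reads the bias values stored directly after the weights. */
    static double[] unpackBias(double[] flat, int rows, int cols, int biasLength) {
        double[] bias = new double[biasLength];
        System.arraycopy(flat, rows * cols, bias, 0, biasLength);
        return bias;
    }
}
```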
      • setParamsViewArray

        public void setParamsViewArray​(INDArray gradient)
        Description copied from interface: Model
        Set the initial parameters array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
        Specified by:
        setParamsViewArray in interface Model
        Parameters:
        gradient - a 1 x nParams row vector that is a view of the larger (MLN/CG) parameters array
      • setBackpropGradientsViewArray

        public void setBackpropGradientsViewArray​(INDArray gradient)
        Description copied from interface: Model
        Set the gradients array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
        Specified by:
        setBackpropGradientsViewArray in interface Model
        Parameters:
        gradient - a 1 x nParams row vector that is a view of the larger (MLN/CG) gradients array
      • fit

        public void fit​(INDArray data,
                        LayerWorkspaceMgr workspaceMgr)
        Description copied from interface: Model
        Fit the model to the given data
        Specified by:
        fit in interface Model
        Parameters:
        data - the data to fit the model to
      • gradient

        public Gradient gradient()
        Description copied from interface: Model
        Get the gradient. Note that this method will not calculate the gradient; it returns the gradient that has been computed before. For calculating the gradient, see Model.computeGradientAndScore(LayerWorkspaceMgr).
        Specified by:
        gradient in interface Model
        Returns:
        the gradient for this model, as calculated before
      • gradientAndScore

        public Pair<Gradient,​Double> gradientAndScore()
        Description copied from interface: Model
        Get the gradient and score
        Specified by:
        gradientAndScore in interface Model
        Returns:
        the gradient and score
      • batchSize

        public int batchSize()
        Description copied from interface: Model
        The current inputs batch size
        Specified by:
        batchSize in interface Model
        Returns:
        the current inputs batch size
      • conf

        public NeuralNetConfiguration conf()
        Description copied from interface: Model
        The configuration for the neural network
        Specified by:
        conf in interface Model
        Returns:
        the configuration for the neural network
      • input

        public INDArray input()
        Description copied from interface: Model
        The input/feature matrix for the model
        Specified by:
        input in interface Model
        Returns:
        the input/feature matrix for the model
      • getParam

        public INDArray getParam​(String paramName)
        Description copied from interface: Model
        Get the parameter
        Specified by:
        getParam in interface Model
        Parameters:
        paramName - the key of the parameter
        Returns:
        the parameter vector/matrix with that particular key
      • paramTable

        public Map<String,​INDArray> paramTable​(boolean backpropParamsOnly)
        Description copied from interface: Model
        Table of parameters by key, for backprop. For many models (dense layers, etc.), all parameters are backprop parameters
        Specified by:
        paramTable in interface Model
        Parameters:
        backpropParamsOnly - If true, return backprop params only. If false: return all params (equivalent to paramTable())
      • setParamTable

        public void setParamTable(@NonNull Map<String,INDArray> paramTable)
        Description copied from interface: Model
        Setter for the param table
        Specified by:
        setParamTable in interface Model
      • setParam

        public void setParam​(String key,
                             INDArray val)
        Description copied from interface: Model
        Set the parameter with a new ndarray
        Specified by:
        setParam in interface Model
        Parameters:
        key - the key to set
        val - the new ndarray
      • clear

        public void clear()
        Description copied from interface: Model
        Clear input
        Specified by:
        clear in interface Model
      • applyConstraints

        public void applyConstraints​(int iteration,
                                     int epoch)
        Description copied from interface: Model
        Apply any constraints to the model
        Specified by:
        applyConstraints in interface Model
      • rnnTimeStep

        public INDArray[] rnnTimeStep​(INDArray... inputs)
        If this ComputationGraph contains one or more RNN layers: conduct forward pass (prediction) but using previous stored state for any RNN layers. The activations for the final step are also stored in the RNN layers for use next time rnnTimeStep() is called.
        This method can be used to generate output one or more steps at a time instead of always having to do forward pass from t=0. Example uses are for streaming data, and for generating samples from network output one step at a time (where samples are then fed back into the network as input)
        If no previous state is present in RNN layers (i.e., initially or after calling rnnClearPreviousState()), the default initialization (usually 0) is used.
        Supports mini-batch (i.e., multiple predictions/forward pass in parallel) as well as for single examples.
        Parameters:
        inputs - Input to network. May be for one or multiple time steps. For single time step: input has shape [miniBatchSize,inputSize] or [miniBatchSize,inputSize,1]. miniBatchSize=1 for single example.
        For multiple time steps: [miniBatchSize,inputSize,inputTimeSeriesLength]
        Returns:
        Output activations. If output is RNN layer (such as RnnOutputLayer): if all inputs have shape [miniBatchSize,inputSize] i.e., is 2d, then outputs have shape [miniBatchSize,outputSize] (i.e., also 2d) instead of [miniBatchSize,outputSize,1].
        Otherwise output is 3d [miniBatchSize,outputSize,inputTimeSeriesLength] when using RnnOutputLayer (or unmodified otherwise).
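        The stored-state behaviour described above can be sketched with a toy scalar recurrent cell in plain Java. This is an illustration only and not how DL4J implements RNN layers: the cell h_t = tanh(wIn * x_t + wRec * h_{t-1}), the zero initial state, and all names are assumptions.

```java
/** Hypothetical plain-Java sketch: a scalar RNN cell that keeps its hidden state between calls. */
final class StatefulRnnSketch {
    private final double wIn;
    private final double wRec;
    private double state; // stored hidden state; 0.0 until a step has been run

    StatefulRnnSketch(double wIn, double wRec) {
        this.wIn = wIn;
        this.wRec = wRec;
    }

    /** Processes one or more time steps starting from the stored state, then stores the final state. */
    double[] timeStep(double... inputs) {
        double[] out = new double[inputs.length];
        double h = state;
        for (int t = 0; t < inputs.length; t++) {
            h = Math.tanh(wIn * inputs[t] + wRec * h);
            out[t] = h;
        }
        state = h;
        return out;
    }

    /** Resets the stored state to the default initialization (0.0 in this sketch). */
    void clearPreviousState() {
        state = 0.0;
    }
}
```

        Feeding a sequence one step at a time gives the same activations as feeding it in a single call, because each call continues from the state left by the previous one.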
      • rnnTimeStep

        public INDArray[] rnnTimeStep​(MemoryWorkspace outputWorkspace,
                                      INDArray... inputs)
        See rnnTimeStep(INDArray...) for details.
        If no memory workspace is provided, the output will be detached (not in any workspace).
        If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace. This workspace must be opened by the user before calling this method - and the user is responsible for (a) closing this workspace, and (b) ensuring the output array is not used out of scope (i.e., not used after closing the workspace to which it belongs - as this is likely to cause either an exception when used, or a crash).
        Parameters:
        inputs - Input activations
        outputWorkspace - Output workspace. May be null
        Returns:
        The output/activations from the network (either detached or in the specified workspace if provided)
      • rnnGetPreviousState

        public Map<String,​INDArray> rnnGetPreviousState​(int layer)
        Get the state of the RNN layer, as used in rnnTimeStep(INDArray...).
        Parameters:
        layer - Number/index of the layer.
        Returns:
        Hidden state, or null if layer is not an RNN layer
      • rnnGetPreviousState

        public Map<String,​INDArray> rnnGetPreviousState​(String layerName)
        Get the state of the RNN layer, as used in rnnTimeStep(INDArray...).
        Parameters:
        layerName - name of the layer
        Returns:
        Hidden state, or null if layer is not an RNN layer
      • rnnSetPreviousState

        public void rnnSetPreviousState​(int layer,
                                        Map<String,​INDArray> state)
        Set the state of the RNN layer, for use in rnnTimeStep(INDArray...)
        Parameters:
        layer - The number/index of the layer.
        state - The state to set the specified layer to
      • rnnSetPreviousState

        public void rnnSetPreviousState​(String layerName,
                                        Map<String,​INDArray> state)
        Set the state of the RNN layer, for use in rnnTimeStep(INDArray...)
        Parameters:
        layerName - The name of the layer.
        state - The state to set the specified layer to
      • rnnClearPreviousState

        public void rnnClearPreviousState()
        Clear the previous state of the RNN layers (if any), used in rnnTimeStep(INDArray...)
      • rnnActivateUsingStoredState

        public Map<String,​INDArray> rnnActivateUsingStoredState​(INDArray[] inputs,
                                                                      boolean training,
                                                                      boolean storeLastForTBPTT)
        Similar to rnnTimeStep and feedForward() methods. Difference here is that this method:
        (a) like rnnTimeStep does forward pass using stored state for RNN layers, and
        (b) unlike rnnTimeStep does not modify the RNN layer state
        Therefore multiple calls to this method with the same input should have the same output.
        Typically used during training only. Use rnnTimeStep for prediction/forward pass at test time.
        Parameters:
        inputs - Input to network
        training - Whether training or not
        storeLastForTBPTT - set to true if used as part of truncated BPTT training
        Returns:
        Activations for each layer (including input, as per feedforward() etc)
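        The property in (b), that the stored state is read but not modified, can be sketched in plain Java with a toy scalar cell. The cell equation and every name here are assumptions for illustration and do not reflect DL4J internals.

```java
/** Hypothetical plain-Java sketch: a forward pass that reads the stored RNN state without changing it. */
final class StoredStateActivationSketch {
    private final double wIn;
    private final double wRec;
    private double state;

    StoredStateActivationSketch(double wIn, double wRec) {
        this.wIn = wIn;
        this.wRec = wRec;
    }

    void setPreviousState(double newState) {
        state = newState;
    }

    double previousState() {
        return state;
    }

    /** Starts from the stored state but works on a local copy, so the stored state is left untouched. */
    double[] activateUsingStoredState(double... inputs) {
        double[] out = new double[inputs.length];
        double h = state; // local copy only
        for (int t = 0; t < inputs.length; t++) {
            h = Math.tanh(wIn * inputs[t] + wRec * h);
            out[t] = h;
        }
        return out;
    }
}
```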
      • setLayerMaskArrays

        public void setLayerMaskArrays​(INDArray[] featureMaskArrays,
                                       INDArray[] labelMaskArrays)
        Set the mask arrays for features and labels. Mask arrays are typically used in situations such as one-to-many and many-to-one learning with recurrent neural networks, as well as for supporting time series of varying lengths within the same minibatch.
        For example, with RNN data sets with input of shape [miniBatchSize,nIn,timeSeriesLength] and outputs of shape [miniBatchSize,nOut,timeSeriesLength], the feature and label mask arrays will have shape [miniBatchSize,timeSeriesLength] and contain values 0 or 1 at each element (to specify whether a given input/example is present - or merely padding - at a given time step).
        NOTE: This method is not usually used directly. Instead, the various feedForward and fit methods handle setting of masking internally.
        Parameters:
        featureMaskArrays - Mask array for features (input)
        labelMaskArrays - Mask array for labels (output)
        See Also:
        clearLayerMaskArrays()
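        A mask of shape [miniBatchSize,timeSeriesLength] for sequences of varying length can be sketched in plain Java as follows. The sketch assumes that every sequence starts at time step 0 and is padded at the end; the class and method names are hypothetical, and plain arrays stand in for INDArray.

```java
/** Hypothetical plain-Java sketch: build a 0/1 mask for variable-length sequences in one minibatch. */
final class SequenceMaskSketch {

    /** Returns a [miniBatchSize][timeSeriesLength] mask: 1.0 where data is present, 0.0 for padding. */
    static double[][] maskFromLengths(int[] lengths, int timeSeriesLength) {
        double[][] mask = new double[lengths.length][timeSeriesLength];
        for (int i = 0; i < lengths.length; i++) {
            int valid = Math.min(lengths[i], timeSeriesLength);
            for (int t = 0; t < valid; t++) {
                mask[i][t] = 1.0;
            }
        }
        return mask;
    }
}
```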
      • rnnUpdateStateWithTBPTTState

        protected void rnnUpdateStateWithTBPTTState()
        Update the internal state of RNN layers after a truncated BPTT fit call
      • evaluate

        public <T extends Evaluation> T evaluate​(DataSetIterator iterator)
        Evaluate the network (classification performance - single output ComputationGraphs only)
        Parameters:
        iterator - Iterator to evaluate on
        Returns:
        Evaluation object; results of evaluation on all examples in the data set
      • evaluate

        public <T extends Evaluation> T evaluate​(MultiDataSetIterator iterator)
        Evaluate the network (classification performance - single output ComputationGraphs only)
        Parameters:
        iterator - Iterator to evaluate on
        Returns:
        Evaluation object; results of evaluation on all examples in the data set
      • evaluate

        public <T extends Evaluation> T evaluate​(DataSetIterator iterator,
                                                 List<String> labelsList)
        Evaluate the network on the provided data set (single output ComputationGraphs only). Used for evaluating the performance of classifiers
        Parameters:
        iterator - Data to undertake evaluation on
        Returns:
        Evaluation object, summarizing the results of the evaluation on the provided DataSetIterator
      • evaluate

        public <T extends Evaluation> T evaluate​(MultiDataSetIterator iterator,
                                                 List<String> labelsList)
        Evaluate the network on the provided data set (single output ComputationGraphs only). Used for evaluating the performance of classifiers
        Parameters:
        iterator - Data to undertake evaluation on
        Returns:
        Evaluation object, summarizing the results of the evaluation on the provided MultiDataSetIterator
      • evaluate

        public <T extends Evaluation> T evaluate​(DataSetIterator iterator,
                                                 List<String> labelsList,
                                                 int topN)
        Evaluate the network (for classification) on the provided data set, with top N accuracy in addition to standard accuracy. For 'standard' accuracy evaluation only, use topN = 1
        Parameters:
        iterator - Iterator (data) to evaluate on
        labelsList - List of labels. May be null.
        topN - N value for top N accuracy evaluation
        Returns:
        Evaluation object, summarizing the results of the evaluation on the provided DataSetIterator
      • evaluate

        public <T extends Evaluation> T evaluate​(MultiDataSetIterator iterator,
                                                 List<String> labelsList,
                                                 int topN)
        Evaluate the network (for classification) on the provided data set, with top N accuracy in addition to standard accuracy. For 'standard' accuracy evaluation only, use topN = 1
        Parameters:
        iterator - Iterator (data) to evaluate on
        labelsList - List of labels. May be null.
        topN - N value for top N accuracy evaluation
        Returns:
        Evaluation object, summarizing the results of the evaluation on the provided MultiDataSetIterator
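        Top N accuracy can be sketched in plain Java: an example counts as correct when its true class is among the N highest-scoring classes. This is a simplified stand-in, not the DL4J Evaluation class; the tie handling (ties count in favour of the true class) and all names are assumptions.

```java
/** Hypothetical plain-Java sketch: top N accuracy from class probabilities. */
final class TopNAccuracySketch {

    /** Fraction of examples whose true class is among the n highest-scoring classes. */
    static double topNAccuracy(double[][] probabilities, int[] trueClasses, int n) {
        int correct = 0;
        for (int i = 0; i < probabilities.length; i++) {
            double trueScore = probabilities[i][trueClasses[i]];
            int higher = 0; // classes scoring strictly higher than the true class
            for (double p : probabilities[i]) {
                if (p > trueScore) {
                    higher++;
                }
            }
            if (higher < n) {
                correct++;
            }
        }
        return (double) correct / probabilities.length;
    }
}
```

        With n = 1 this reduces to standard accuracy, which matches the note that topN = 1 gives 'standard' accuracy evaluation.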
      • evaluateRegression

        public <T extends RegressionEvaluation> T evaluateRegression​(DataSetIterator iterator)
        Evaluate the (single output layer only) network for regression performance
        Parameters:
        iterator - Data to evaluate on
        Returns:
        Regression evaluation
      • evaluateRegression

        public <T extends RegressionEvaluation> T evaluateRegression​(MultiDataSetIterator iterator)
        Evaluate the (single output layer only) network for regression performance
        Parameters:
        iterator - Data to evaluate on
        Returns:
        Regression evaluation
      • evaluateRegression

        public <T extends RegressionEvaluation> T evaluateRegression​(DataSetIterator iterator,
                                                                     List<String> columnNames)
        Evaluate the (single output layer only) network for regression performance
        Parameters:
        iterator - Data to evaluate on
        columnNames - Column names for the regression evaluation. May be null.
        Returns:
        Regression evaluation
      • evaluateRegression

        public <T extends RegressionEvaluation> T evaluateRegression​(MultiDataSetIterator iterator,
                                                                     List<String> columnNames)
        Evaluate the (single output layer only) network for regression performance
        Parameters:
        iterator - Data to evaluate on
        columnNames - Column names for the regression evaluation. May be null.
        Returns:
        Regression evaluation
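        The kind of per-column statistic a regression evaluation reports can be sketched in plain Java. Mean squared error per output column is used here as one example; this is not the RegressionEvaluation implementation, and the names are hypothetical.

```java
/** Hypothetical plain-Java sketch: mean squared error for each output column. */
final class RegressionMetricsSketch {

    /** Returns one MSE value per column, averaged over all examples (rows). */
    static double[] columnMse(double[][] predictions, double[][] labels) {
        int rows = predictions.length;
        int cols = rows == 0 ? 0 : predictions[0].length;
        double[] mse = new double[cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                double diff = predictions[i][j] - labels[i][j];
                mse[j] += diff * diff;
            }
        }
        for (int j = 0; j < cols; j++) {
            mse[j] /= rows;
        }
        return mse;
    }
}
```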
      • evaluateROC

        public <T extends ROC> T evaluateROC​(DataSetIterator iterator,
                                             int rocThresholdSteps)
        Evaluate the network (must be a binary classifier) on the specified data, using the ROC class
        Parameters:
        iterator - Data to evaluate on
        rocThresholdSteps - Number of threshold steps to use with ROC
        Returns:
        ROC evaluation on the given dataset
      • evaluateROC

        public <T extends ROC> T evaluateROC​(MultiDataSetIterator iterator,
                                             int rocThresholdSteps)
        Evaluate the network (must be a binary classifier) on the specified data, using the ROC class
        Parameters:
        iterator - Data to evaluate on
        rocThresholdSteps - Number of threshold steps to use with ROC
        Returns:
        ROC evaluation on the given dataset
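        How a number of threshold steps leads to points on a ROC curve can be sketched in plain Java. The sketch assumes evenly spaced thresholds i / steps for i = 0..steps and a "probability >= threshold" decision rule; whether the ROC class uses exactly this spacing and rule is an assumption, and all names are hypothetical.

```java
/** Hypothetical plain-Java sketch: ROC points for a binary classifier at evenly spaced thresholds. */
final class RocThresholdSketch {

    /** Returns steps + 1 rows of {falsePositiveRate, truePositiveRate}, one per threshold i / steps. */
    static double[][] rocPoints(double[] probabilities, boolean[] positive, int steps) {
        int totalPos = 0;
        int totalNeg = 0;
        for (boolean p : positive) {
            if (p) {
                totalPos++;
            } else {
                totalNeg++;
            }
        }
        double[][] points = new double[steps + 1][2];
        for (int i = 0; i <= steps; i++) {
            double threshold = (double) i / steps;
            int tp = 0;
            int fp = 0;
            for (int k = 0; k < probabilities.length; k++) {
                if (probabilities[k] >= threshold) {
                    if (positive[k]) {
                        tp++;
                    } else {
                        fp++;
                    }
                }
            }
            points[i][0] = totalNeg == 0 ? 0.0 : (double) fp / totalNeg;
            points[i][1] = totalPos == 0 ? 0.0 : (double) tp / totalPos;
        }
        return points;
    }
}
```

        More threshold steps give a finer curve at the cost of more counting work per evaluation.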
      • evaluateROCMultiClass

        public <T extends ROCMultiClass> T evaluateROCMultiClass​(DataSetIterator iterator,
                                                                 int rocThresholdSteps)
        Evaluate the network on the specified data, using the ROCMultiClass class
        Parameters:
        iterator - Data to evaluate on
        rocThresholdSteps - Number of threshold steps to use with ROCMultiClass
        Returns:
        Multi-class ROC evaluation on the given dataset
      • evaluateROCMultiClass

        public <T extends ROCMultiClass> T evaluateROCMultiClass​(MultiDataSetIterator iterator,
                                                                 int rocThresholdSteps)
        Evaluate the network on the specified data, using the ROCMultiClass class
        Parameters:
        iterator - Data to evaluate on
        rocThresholdSteps - Number of threshold steps to use with ROCMultiClass
        Returns:
        Multi-class ROC evaluation on the given dataset
      • doEvaluation

        public <T extends IEvaluation> T[] doEvaluation​(DataSetIterator iterator,
                                                        T... evaluations)
        Perform evaluation on the given data (DataSetIterator) with the given IEvaluation instance
        Specified by:
        doEvaluation in interface NeuralNetwork
        Type Parameters:
        T - Type of the IEvaluation instance
        Parameters:
        iterator - Test data to evaluate on
        evaluations - IEvaluation instances
        Returns:
        The input IEvaluation instance, after performing evaluation on the test data
      • doEvaluation

        public <T extends IEvaluation> T[] doEvaluation​(MultiDataSetIterator iterator,
                                                        T... evaluations)
        Perform evaluation on the given data (MultiDataSetIterator) with the given IEvaluation instance
        Specified by:
        doEvaluation in interface NeuralNetwork
        Type Parameters:
        T - Type of the IEvaluation instance
        Parameters:
        iterator - Test data to evaluate on
        evaluations - IEvaluation instances
        Returns:
        The input IEvaluation instance, after performing evaluation on the test data
      • evaluate

        public <T extends IEvaluation> Map<Integer,T[]> evaluate(DataSetIterator iterator,
                                                                 Map<Integer,T[]> evaluations)
        Perform evaluation for networks with multiple outputs.
        Parameters:
        iterator - Data to evaluate
        evaluations - Evaluation instances. Key: the network output number (0 to numOutputs-1). Value: the IEvaluation instances to perform evaluation with, for that output only. Note that not every output needs to have an IEvaluation[] defined.
        Returns:
        The same evaluation map, after performing evaluation
      • evaluate

        public <T extends IEvaluation> Map<Integer,​T[]> evaluate​(MultiDataSetIterator iterator,
                                                                         Map<Integer,​T[]> evaluations)
        Perform evaluation for networks with multiple outputs.
        Parameters:
        iterator - Data to evaluate
        evaluations - Evaluation instances. Key: the network output number (0 to numOutputs-1). Value: the IEvaluation instances to perform evaluation with, for that output only. Note that not every output needs to have an IEvaluation[] defined.
        Returns:
        The same evaluation map, after performing evaluation
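        The map contract described above (output index as key, an array of evaluators as value, unlisted outputs skipped) can be sketched in plain Java. This is an illustration only: Scorer, AccuracyScorer and evaluateOutputs are hypothetical stand-ins and are not part of the DL4J API. Real code would pass IEvaluation instances to evaluate(...).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an evaluation object; not a DL4J type. */
interface Scorer {
    void eval(int label, int prediction);
}

/** Hypothetical accuracy counter used only for this illustration. */
final class AccuracyScorer implements Scorer {
    int correct;
    int total;

    @Override
    public void eval(int label, int prediction) {
        if (label == prediction) {
            correct++;
        }
        total++;
    }

    double accuracy() {
        return total == 0 ? 0.0 : (double) correct / total;
    }
}

final class MultiOutputEvalSketch {
    /**
     * labels[o][i] and predictions[o][i] hold example i of network output o.
     * Only outputs that appear as keys in the map are evaluated.
     */
    static <T extends Scorer> Map<Integer, T[]> evaluateOutputs(
            int[][] labels, int[][] predictions, Map<Integer, T[]> evaluations) {
        for (Map.Entry<Integer, T[]> entry : evaluations.entrySet()) {
            int out = entry.getKey();
            for (int i = 0; i < labels[out].length; i++) {
                for (T scorer : entry.getValue()) {
                    scorer.eval(labels[out][i], predictions[out][i]);
                }
            }
        }
        return evaluations; // the same map instance, now populated
    }

    public static void main(String[] args) {
        int[][] labels = {{0, 1, 1, 0}, {2, 2, 1, 0}};
        int[][] predictions = {{0, 1, 0, 0}, {2, 0, 1, 0}};
        Map<Integer, AccuracyScorer[]> evals = new HashMap<>();
        // Only output 1 gets an evaluator; output 0 is left out on purpose.
        evals.put(1, new AccuracyScorer[] {new AccuracyScorer()});
        evaluateOutputs(labels, predictions, evals);
        System.out.println("Output 1 accuracy: " + evals.get(1)[0].accuracy());
    }
}
```

        Returning the same map keeps the call chainable and avoids copying the evaluator arrays, which matches the "same evaluation map" wording of the Returns section.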
      • summary

        public String summary()
        String detailing the architecture of the computation graph. Vertices are printed in topological order. Columns are: vertex name with layer/vertex type, nIn, nOut, total number of parameters, the shapes of the parameters, and the inputs to the vertex. Also gives information about frozen layers/vertices, if any.
        Returns:
        Summary as a string
        See Also:
        memoryInfo(int, InputType...)
      • summary

        public String summary​(InputType... inputTypes)
        String detailing the architecture of the computation graph. Also displays activation sizes when given an input type. Vertices are printed in topological order. Columns are: vertex name with layer/vertex type, nIn, nOut, total number of parameters, the shapes of the parameters, and the inputs to the vertex. Also gives information about frozen layers/vertices, if any.
        Returns:
        Summary as a string
        See Also:
        memoryInfo(int, InputType...)
      • memoryInfo

        public String memoryInfo​(int minibatch,
                                 InputType... inputTypes)
        Generate information regarding memory use for the network, for the given input types and minibatch size. Note that when using workspaces or CuDNN, the network should be trained for some iterations so that the memory workspaces have time to initialize. Without this, the memory requirements during training may be underestimated. Note also that this is the same information that is generated during an OOM crash when training or performing inference.
        Parameters:
        minibatch - Minibatch size to estimate memory for
        inputTypes - Input types to the network
        Returns:
        A String with information about network memory use
      • clearLayersStates

        public void clearLayersStates()
        This method just makes sure there's no state preserved within layers
      • incrementEpochCount

        public void incrementEpochCount()
        Increment the epoch count (in the underlying ComputationGraphConfiguration) by 1. Note that this is done automatically when using iterator-based fitting methods, such as fit(DataSetIterator) or fit(MultiDataSetIterator). However, when using non-iterator fit methods (DataSet, MultiDataSet, INDArrays etc), the network has no way to know when one epoch ends and another starts. In such situations, this method can be used to increment the epoch counter.
        Note that the epoch counter is used for situations such as some learning rate schedules, and the like. The current epoch count can be obtained using ComputationGraph.getConfiguration().getEpochCount()
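        When batches are fed one at a time, the caller has to mark the epoch boundaries. A minimal sketch of that pattern follows. CountingModel is a hypothetical stand-in for a network (not a DL4J class), and its fitBatch and incrementEpochCount methods only mimic the counters described above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a network that tracks iteration and epoch counters. */
final class CountingModel {
    private int iterationCount;
    private int epochCount;

    /** Mimics a non-iterator fit call: one parameter update, no epoch information. */
    void fitBatch(double[] batch) {
        iterationCount++;
    }

    /** Called by the training loop, since fitBatch cannot see epoch boundaries. */
    void incrementEpochCount() {
        epochCount++;
    }

    int getIterationCount() {
        return iterationCount;
    }

    int getEpochCount() {
        return epochCount;
    }
}

final class ManualEpochSketch {
    /** Runs numEpochs passes over the batches and marks the end of each pass. */
    static CountingModel train(List<double[]> batches, int numEpochs) {
        CountingModel model = new CountingModel();
        for (int epoch = 0; epoch < numEpochs; epoch++) {
            for (double[] batch : batches) {
                model.fitBatch(batch);
            }
            model.incrementEpochCount(); // one full pass over the data is complete
        }
        return model;
    }

    public static void main(String[] args) {
        List<double[]> batches = Arrays.asList(
                new double[] {1, 2}, new double[] {3, 4}, new double[] {5, 6});
        CountingModel model = train(batches, 2);
        System.out.println(model.getIterationCount() + " iterations, "
                + model.getEpochCount() + " epochs");
    }
}
```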
      • synchronizeIterEpochCounts

        protected void synchronizeIterEpochCounts()
      • getIterationCount

        public int getIterationCount()
        Returns the number of iterations (parameter updates) that the ComputationGraph has done
        Returns:
        Number of iterations
      • convertDataType

        public ComputationGraph convertDataType​(@NonNull DataType dataType)
        Return a copy of the network with the parameters and activations set to use the specified (floating point) data type. If the existing datatype is the same as the requested datatype, the original network will be returned unchanged. Only floating point datatypes (DOUBLE, FLOAT, HALF) may be used.
        Parameters:
        dataType - Datatype to convert the network to
        Returns:
        The network, set to use the specified datatype for the parameters and activations
      • setLearningRate

        public void setLearningRate​(double newLr)
        Set the learning rate for all layers in the network to the specified value. Note that if any learning rate schedules are currently present, these will be removed in favor of the new (fixed) learning rate.

        Note: This method is not free from a performance point of view: a proper learning rate schedule should be used in preference to calling this method at every iteration.
        Parameters:
        newLr - New learning rate for all layers
        See Also:
        setLearningRate(ISchedule), setLearningRate(String, double)
      • setLearningRate

        public void setLearningRate​(ISchedule newLr)
        Set the learning rate schedule for all layers in the network to the specified schedule. This schedule will replace any/all existing schedules, and also any fixed learning rate values.
        Note that the iteration/epoch counts will not be reset. Use ComputationGraphConfiguration#setIterationCount(int) and ComputationGraphConfiguration#setEpochCount(int) if this is required
        Parameters:
        newLr - New learning rate schedule for all layers
        See Also:
        setLearningRate(ISchedule), setLearningRate(String, double)
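        Because the iteration and epoch counters are not reset, a newly installed schedule is evaluated at the current counter value rather than from zero. The sketch below illustrates this with a hypothetical step-decay schedule written in plain Java. StepDecaySketch and valueAt are illustrative names and do not correspond to DL4J's ISchedule implementations.

```java
/** Hypothetical step-decay schedule; DL4J's own ISchedule classes are not used here. */
final class StepDecaySketch {
    private final double initialLr;
    private final double decayRate;
    private final int stepSize;

    StepDecaySketch(double initialLr, double decayRate, int stepSize) {
        if (stepSize <= 0) {
            throw new IllegalArgumentException("stepSize must be positive");
        }
        this.initialLr = initialLr;
        this.decayRate = decayRate;
        this.stepSize = stepSize;
    }

    /** Learning rate as a pure function of the iteration counter. */
    double valueAt(int iteration) {
        int completedSteps = iteration / stepSize; // integer division is intended
        return initialLr * Math.pow(decayRate, completedSteps);
    }

    public static void main(String[] args) {
        StepDecaySketch schedule = new StepDecaySketch(0.1, 0.5, 100);
        // A schedule installed at iteration 250 without resetting the counter
        // is evaluated at 250, so it does not start from the initial rate.
        System.out.println("lr at iteration 0:   " + schedule.valueAt(0));
        System.out.println("lr at iteration 250: " + schedule.valueAt(250));
    }
}
```

        Expressing the rate as a function of the counter is also why a schedule is cheaper than calling setLearningRate(double) on every iteration: the configuration is changed once rather than at each step.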
      • setLearningRate

        public void setLearningRate​(String layerName,
                                    double newLr)
        Set the learning rate for a single layer in the network to the specified value. Note that if any learning rate schedules are currently present, these will be removed in favor of the new (fixed) learning rate.

        Note: This method is not free from a performance point of view: a proper learning rate schedule should be used in preference to calling this method at every iteration. Note also that setLearningRate(double) should be used in preference when all layers need to be set to a new LR.
        Parameters:
        layerName - Name of the layer to set the LR for
        newLr - New learning rate for a single layer
        See Also:
        setLearningRate(ISchedule), setLearningRate(String, double)
      • setLearningRate

        public void setLearningRate​(String layerName,
                                    ISchedule newLr)
        Set the learning rate schedule for a single layer in the network to the specified schedule.
        Note that setLearningRate(ISchedule) should be used in preference when all layers need to be set to a new LR schedule.
        This schedule will replace any/all existing schedules, and also any fixed learning rate values.
        Note also that the iteration/epoch counts will not be reset. Use ComputationGraphConfiguration#setIterationCount(int) and ComputationGraphConfiguration#setEpochCount(int) if this is required.
        Parameters:
        layerName - Name of the layer to set the LR schedule for
        newLr - New learning rate schedule for a single layer
        See Also:
        setLearningRate(ISchedule), setLearningRate(String, double)
      • getLearningRate

        public Double getLearningRate​(String layerName)
        Get the current learning rate, for the specified layer, from the network. Note: If the layer has no learning rate (no parameters, or an updater without a learning rate) then null is returned
        Parameters:
        layerName - Layer name
        Returns:
        Learning rate for the specified layer, or null
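        Since the return type is the boxed Double, assigning the result directly to a primitive double would throw a NullPointerException for layers without a learning rate. A minimal sketch of null-safe handling follows. The map-based lookupLearningRate is a hypothetical stand-in for the network call and only mirrors the "Double or null" contract.

```java
import java.util.HashMap;
import java.util.Map;

final class NullableLearningRateSketch {
    /** Hypothetical lookup mirroring the "Double or null" contract described above. */
    static Double lookupLearningRate(Map<String, Double> ratesByLayer, String layerName) {
        return ratesByLayer.get(layerName); // null when the layer has no learning rate
    }

    /** Formats the result without unboxing a possible null. */
    static String describe(String layerName, Double lr) {
        return lr == null
                ? layerName + ": no learning rate"
                : layerName + ": lr=" + lr;
    }

    public static void main(String[] args) {
        Map<String, Double> rates = new HashMap<>();
        rates.put("dense0", 0.01); // a layer with parameters
        // "pool0" has no entry, standing in for a parameterless layer.
        for (String name : new String[] {"dense0", "pool0"}) {
            System.out.println(describe(name, lookupLearningRate(rates, name)));
        }
    }
}
```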
      • layerSize

        public long layerSize​(int layer)
        Return the layer size (number of units) for the specified layer. Note that the meaning of the "layer size" can depend on the type of layer. For example:
        - DenseLayer, OutputLayer, recurrent layers: number of units (nOut configuration option)
        - ConvolutionLayer: the number of output channels (nOut configuration option)
        - Subsampling layers, global pooling layers, etc: size of 0 is always returned
        Parameters:
        layer - Index of the layer to get the size of. Must be in range 0 to nLayers-1 inclusive
        Returns:
        Size of the layer
      • layerInputSize

        public long layerInputSize​(int layer)
        Return the input size (number of inputs) for the specified layer.
        Note that the meaning of the "input size" can depend on the type of layer. For example:
        - DenseLayer, OutputLayer, etc: the feature vector size (nIn configuration option)
        - Recurrent layers: the feature vector size per time step (nIn configuration option)
        - ConvolutionLayer: the number of input channels (nIn configuration option)
        - Subsampling layers, global pooling layers, etc: size of 0 is always returned
        Parameters:
        layer - Index of the layer to get the input size of. Must be in range 0 to nLayers-1 inclusive
        Returns:
        Input size of the layer
      • layerSize

        public long layerSize​(String layerName)
        Return the layer size (number of units) for the specified layer.
        Note that the meaning of the "layer size" can depend on the type of layer. For example:
        - DenseLayer, OutputLayer, recurrent layers: number of units (nOut configuration option)
        - ConvolutionLayer: the number of output channels (nOut configuration option)
        - Subsampling layers, global pooling layers, etc: size of 0 is always returned
        Parameters:
        layerName - Name of the layer to get the size of
        Returns:
        Size of the layer
      • layerInputSize

        public long layerInputSize​(String layerName)
        Return the input size (number of inputs) for the specified layer.
        Note that the meaning of the "input size" can depend on the type of layer. For example:
        - DenseLayer, OutputLayer, etc: the feature vector size (nIn configuration option)
        - Recurrent layers: the feature vector size per time step (nIn configuration option)
        - ConvolutionLayer: the number of input channels (nIn configuration option)
        - Subsampling layers, global pooling layers, etc: size of 0 is always returned
        Parameters:
        layerName - Name of the layer to get the input size of
        Returns:
        Input size of the layer
      • equals

        public boolean equals​(Object obj)
        Indicates whether some other object is "equal to" this one.

        The equals method implements an equivalence relation on non-null object references:

        • It is reflexive: for any non-null reference value x, x.equals(x) should return true.
        • It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
        • It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
        • It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
        • For any non-null reference value x, x.equals(null) should return false.

        The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true).

        Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

        Overrides:
        equals in class Object
        Parameters:
        obj - the reference object with which to compare.
        Returns:
        true if this object is the same as the obj argument; false otherwise.
        See Also:
        Object.hashCode(), HashMap
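        The general contract above can be illustrated with a small value class that overrides equals and hashCode together. LayerKey is a hypothetical example type. It says nothing about how ComputationGraph itself implements equals.

```java
import java.util.Objects;

/** Hypothetical value class used only to illustrate the equals/hashCode contract. */
final class LayerKey {
    private final String name;
    private final int index;

    LayerKey(String name, int index) {
        this.name = Objects.requireNonNull(name, "name");
        this.index = index;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true; // reflexive
        }
        if (!(obj instanceof LayerKey)) {
            return false; // also covers obj == null
        }
        LayerKey other = (LayerKey) obj;
        return index == other.index && name.equals(other.name);
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals, so equal objects share a hash code.
        return Objects.hash(name, index);
    }

    public static void main(String[] args) {
        LayerKey a = new LayerKey("dense0", 0);
        LayerKey b = new LayerKey("dense0", 0);
        System.out.println("a.equals(b): " + a.equals(b));
        System.out.println("same hash:   " + (a.hashCode() == b.hashCode()));
    }
}
```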
      • close

        public void close()
        Close the network and deallocate all native memory, including parameters, gradients, updater memory and workspaces. Note that the network should not be used again for any purpose after it has been closed.
        Specified by:
        close in interface Model