Class ComputationGraph
- java.lang.Object
-
- org.deeplearning4j.nn.graph.ComputationGraph
-
- All Implemented Interfaces:
Serializable, Model, NeuralNetwork
public class ComputationGraph extends Object implements Serializable, Model, NeuralNetwork
- See Also:
- Serialized Form
-
-
Field Summary
Modifier and Type, Field, Description
protected boolean clearTbpttState
protected ComputationGraphConfiguration configuration
protected INDArray flattenedGradients
protected INDArray flattenedParams
protected Gradient gradient
protected GraphIndices graphIndices
Topological sort and vertex index/name + name/index mapping
protected Map<String,org.bytedeco.javacpp.Pointer> helperWorkspaces
protected boolean initCalled
protected ThreadLocal<Long> lastEtlTime
protected Layer[] layers
A list of layers.
protected double score
protected Solver solver
protected int[] topologicalOrder
Indexes of graph vertices, in topological order.
protected GraphVertex[] vertices
All GraphVertex objects in the network.
protected Map<String,GraphVertex> verticesMap
Map of vertices by name
protected static String WS_ALL_LAYERS_ACT
Workspace for storing all layers' activations - used only to store activations (layer inputs) as part of backprop. Not used for inference.
protected static WorkspaceConfiguration WS_ALL_LAYERS_ACT_CONFIG
protected WorkspaceConfiguration WS_LAYER_ACT_X_CONFIG
protected static String WS_LAYER_WORKING_MEM
Workspace for working memory for a single layer: forward pass and backward pass. Note that this is opened/closed once per op (activate/backpropGradient call).
protected WorkspaceConfiguration WS_LAYER_WORKING_MEM_CONFIG
protected static String WS_OUTPUT_MEM
Workspace for output methods that use OutputAdapter
protected static String WS_RNN_LOOP_WORKING_MEM
Workspace for working memory in RNNs - opened and closed once per RNN time step
protected static WorkspaceConfiguration WS_RNN_LOOP_WORKING_MEM_CONFIG
-
Constructor Summary
Constructor, Description
ComputationGraph(ComputationGraphConfiguration configuration)
-
Method Summary
Modifier and Type, Method, Description
void addListeners(TrainingListener... listeners)
Adds the given TrainingListener instances to the existing listeners
void applyConstraints(int iteration, int epoch)
Apply any constraints to the model
Gradient backpropGradient(INDArray... epsilons)
Calculate the gradient of the network with respect to some external errors.
int batchSize()
The current inputs batch size
protected void calcBackpropGradients(boolean clearLayers, boolean truncatedBPTT, INDArray... externalEpsilons)
Do backprop (gradient calculation)
double calcRegularizationScore(boolean backpropParamsOnly)
GraphIndices calculateIndices()
Calculate the indices needed for the network:
(a) topological sort order
(b) Map: vertex index -> vertex name
(c) Map: vertex name -> vertex index
void clear()
Clear input
void clearLayerMaskArrays()
Remove the mask arrays from all layers.
See setLayerMaskArrays(INDArray[], INDArray[]) for details on mask arrays.
void clearLayersStates()
Ensures that no state is preserved within layers
ComputationGraph clone()
void close()
Close the network and deallocate all native memory, including: parameters, gradients, updater memory and workspaces. Note that the network should not be used again for any purpose after it has been closed.
void computeGradientAndScore()
void computeGradientAndScore(LayerWorkspaceMgr workspaceMgr)
Update the score
NeuralNetConfiguration conf()
The configuration for the neural network
ComputationGraph convertDataType(@NonNull DataType dataType)
Return a copy of the network with the parameters and activations set to use the specified (floating point) data type.
<T extends IEvaluation> T[] doEvaluation(DataSetIterator iterator, T... evaluations)
Perform evaluation on the given data (DataSetIterator) with the given IEvaluation instance
<T extends IEvaluation> T[] doEvaluation(MultiDataSetIterator iterator, T... evaluations)
Perform evaluation on the given data (MultiDataSetIterator) with the given IEvaluation instance
protected void doTruncatedBPTT(INDArray[] inputs, INDArray[] labels, INDArray[] featureMasks, INDArray[] labelMasks, LayerWorkspaceMgr workspaceMgr)
Fit the network using truncated BPTT
boolean equals(Object obj)
Indicates whether some other object is "equal to" this one.
<T extends Evaluation> T evaluate(DataSetIterator iterator)
Evaluate the network (classification performance - single output ComputationGraphs only)
<T extends Evaluation> T evaluate(DataSetIterator iterator, List<String> labelsList)
Evaluate the network on the provided data set (single output ComputationGraphs only).
<T extends Evaluation> T evaluate(DataSetIterator iterator, List<String> labelsList, int topN)
Evaluate the network (for classification) on the provided data set, with top N accuracy in addition to standard accuracy.
<T extends IEvaluation> Map<Integer,T[]> evaluate(DataSetIterator iterator, Map<Integer,T[]> evaluations)
Perform evaluation for networks with multiple outputs.
<T extends Evaluation> T evaluate(MultiDataSetIterator iterator)
Evaluate the network (classification performance - single output ComputationGraphs only)
<T extends Evaluation> T evaluate(MultiDataSetIterator iterator, List<String> labelsList)
Evaluate the network on the provided data set (single output ComputationGraphs only).
<T extends Evaluation> T evaluate(MultiDataSetIterator iterator, List<String> labelsList, int topN)
Evaluate the network (for classification) on the provided data set, with top N accuracy in addition to standard accuracy.
<T extends IEvaluation> Map<Integer,T[]> evaluate(MultiDataSetIterator iterator, Map<Integer,T[]> evaluations)
Perform evaluation for networks with multiple outputs.
<T extends RegressionEvaluation> T evaluateRegression(DataSetIterator iterator)
Evaluate the (single output layer only) network for regression performance
<T extends RegressionEvaluation> T evaluateRegression(DataSetIterator iterator, List<String> columnNames)
Evaluate the (single output layer only) network for regression performance
<T extends RegressionEvaluation> T evaluateRegression(MultiDataSetIterator iterator)
Evaluate the (single output layer only) network for regression performance
<T extends RegressionEvaluation> T evaluateRegression(MultiDataSetIterator iterator, List<String> columnNames)
Evaluate the (single output layer only) network for regression performance
<T extends ROC> T evaluateROC(DataSetIterator iterator)
Deprecated. To be removed - use evaluateROC(DataSetIterator, int) to enforce selection of appropriate ROC/threshold configuration
<T extends ROC> T evaluateROC(DataSetIterator iterator, int rocThresholdSteps)
Evaluate the network (must be a binary classifier) on the specified data, using the ROC class
<T extends ROC> T evaluateROC(MultiDataSetIterator iterator)
Deprecated. To be removed - use evaluateROC(DataSetIterator, int) to enforce selection of appropriate ROC/threshold configuration
<T extends ROC> T evaluateROC(MultiDataSetIterator iterator, int rocThresholdSteps)
Evaluate the network (must be a binary classifier) on the specified data, using the ROC class
<T extends ROCMultiClass> T evaluateROCMultiClass(DataSetIterator iterator)
Deprecated. To be removed - use evaluateROCMultiClass(DataSetIterator, int) to enforce selection of appropriate ROC/threshold configuration
<T extends ROCMultiClass> T evaluateROCMultiClass(DataSetIterator iterator, int rocThresholdSteps)
Evaluate the network on the specified data, using the ROCMultiClass class
<T extends ROCMultiClass> T evaluateROCMultiClass(MultiDataSetIterator iterator, int rocThresholdSteps)
Evaluate the network on the specified data, using the ROCMultiClass class
Map<String,INDArray> feedForward()
Conduct forward pass using the stored inputs, at test time
Map<String,INDArray> feedForward(boolean train)
Conduct forward pass using the stored inputs
Map<String,INDArray> feedForward(boolean train, boolean excludeOutputLayers, boolean includeNonLayerVertexActivations)
Map<String,INDArray> feedForward(boolean train, int layerTillIndex)
Conduct forward pass using the stored inputs
Map<String,INDArray> feedForward(INDArray[] input, boolean train)
Conduct forward pass using an array of inputs
Map<String,INDArray> feedForward(INDArray[] input, boolean train, boolean clearInputs)
Conduct forward pass using an array of inputs.
Map<String,INDArray> feedForward(INDArray[] input, int layerTillIndex, boolean train)
Conduct forward pass using an array of inputs
Map<String,INDArray> feedForward(INDArray[] input, int layerTillIndex, boolean train, boolean clearInputs)
Conduct forward pass using an array of inputs.
Map<String,INDArray> feedForward(INDArray input, boolean train)
Conduct forward pass using a single input array.
Map<String,INDArray> feedForward(INDArray input, int layerTillIndex, boolean train)
Conduct forward pass using a single input array.
protected Map<String,INDArray> ffToLayerActivationsDetached(boolean train, @NonNull FwdPassType fwdPassType, boolean storeLastForTBPTT, int layerIndex, int[] excludeIdxs, @NonNull INDArray[] features, INDArray[] fMask, INDArray[] lMask, boolean clearLayers)
Feed-forward through the network - returning all array activations detached from any workspace.
protected Map<String,INDArray> ffToLayerActivationsInWS(boolean train, int layerIndex, int[] excludeIdxs, FwdPassType fwdPassType, boolean storeLastForTBPTT, INDArray[] input, INDArray[] fMask, INDArray[] lMask, boolean clearInputs)
Feed-forward through the network - if workspaces are used, all returned activations will be present in workspace WS_ALL_LAYERS_ACT.
Note: if using workspaces for training, requires that WS_ALL_LAYERS_ACT is open externally.
void fit()
All models have a fit method
void fit(@NonNull DataSetIterator iterator)
Fit the ComputationGraph using a DataSetIterator.
Note that this method can only be used with ComputationGraphs with 1 input and 1 output.
This method does not do layerwise pretraining.
For pretraining, use the pretrain method.
void fit(@NonNull DataSetIterator iterator, int numEpochs)
Perform minibatch training on all minibatches in the DataSetIterator, for the specified number of epochs.
void fit(@NonNull MultiDataSetIterator iterator, int numEpochs)
Perform minibatch training on all minibatches in the MultiDataSetIterator, for the specified number of epochs.
void fit(INDArray[] inputs, INDArray[] labels)
Fit the ComputationGraph given arrays of inputs and labels.
void fit(INDArray[] inputs, INDArray[] labels, INDArray[] featureMaskArrays, INDArray[] labelMaskArrays)
Fit the ComputationGraph using the specified inputs and labels (and mask arrays)
void fit(INDArray data, LayerWorkspaceMgr workspaceMgr)
Fit the model to the given data
void fit(DataSet dataSet)
Fit the ComputationGraph using a DataSet.
void fit(MultiDataSetIterator multi)
Fit the ComputationGraph using a MultiDataSetIterator. This method does not do layerwise pretraining.
For pretraining, use the pretrain method.
void fit(MultiDataSet multiDataSet)
Fit the ComputationGraph using a MultiDataSet
ComputationGraphConfiguration getConfiguration()
Returns the configuration of this ComputationGraph
int getEpochCount()
Returns the number of epochs that the ComputationGraph has done.
INDArray getGradientsViewArray()
INDArray getInput(int inputNum)
Get the previously set input for the ComputationGraph
INDArray[] getInputMaskArrays()
Get the previously set feature/input mask arrays for the ComputationGraph
INDArray[] getInputs()
Get the previously set inputs for the ComputationGraph
int getIterationCount()
Returns the number of iterations (parameter updates) that the ComputationGraph has done
INDArray[] getLabelMaskArrays()
Get the previously set label/output mask arrays for the ComputationGraph
long getLastEtlTime()
Returns the ETL time field value
Layer getLayer(int idx)
Get the layer by the number of that layer, in range 0 to getNumLayers()-1. NOTE: This is different from the internal GraphVertex index for the layer.
Layer getLayer(String name)
Get a given layer by name.
Layer[] getLayers()
Get all layers in the ComputationGraph
Double getLearningRate(String layerName)
Get the current learning rate, for the specified layer, from the network.
Collection<TrainingListener> getListeners()
Get the trainingListeners for the ComputationGraph
int getNumInputArrays()
The number of inputs to this network
int getNumLayers()
Returns the number of layers in the ComputationGraph
int getNumOutputArrays()
The number of output (arrays) for this network
ConvexOptimizer getOptimizer()
Returns this model's optimizer
Layer getOutputLayer(int outputLayerIdx)
Get the specified output layer, by index.
protected int[] getOutputLayerIndices()
INDArray getParam(String paramName)
Get the parameter
ComputationGraphUpdater getUpdater()
Get the ComputationGraphUpdater for the network.
ComputationGraphUpdater getUpdater(boolean initializeIfAbsent)
Get the ComputationGraphUpdater for this network
GraphVertex getVertex(String name)
Return a given GraphVertex by name, or null if no vertex with that name exists
GraphVertex[] getVertices()
Returns an array of all GraphVertex objects.
Gradient gradient()
Get the gradient.
Pair<Gradient,Double> gradientAndScore()
Get the gradient and score
void incrementEpochCount()
Increment the epoch count (in the underlying ComputationGraphConfiguration) by 1.
void init()
Initialize the ComputationGraph network
void init(INDArray parameters, boolean cloneParametersArray)
Initialize the ComputationGraph, optionally with an existing parameters array.
void initGradientsView()
Initializes the flattened gradients array (used in backprop) and sets the appropriate subset in all layers.
INDArray input()
The input/feature matrix for the model
long layerInputSize(int layer)
Return the input size (number of inputs) for the specified layer.
Note that the meaning of the "input size" can depend on the type of layer.
long layerInputSize(String layerName)
Return the input size (number of inputs) for the specified layer.
Note that the meaning of the "input size" can depend on the type of layer.
long layerSize(int layer)
Return the layer size (number of units) for the specified layer.
long layerSize(String layerName)
Return the layer size (number of units) for the specified layer.
Note that the meaning of the "layer size" can depend on the type of layer.
static ComputationGraph load(File f, boolean loadUpdater)
Restore a ComputationGraph from a file, saved using save(File) or ModelSerializer
String memoryInfo(int minibatch, InputType... inputTypes)
Generate information regarding memory use for the network, for the given input types and minibatch size.
long numParams()
The number of parameters for the model
long numParams(boolean backwards)
The number of parameters for the model
INDArray[] output(boolean train, boolean clearInputs, INDArray... input)
An output method for the network, with optional clearing of the layer inputs.
Note: most users should use output(boolean, INDArray...) or similar methods, unless they are doing non-standard operations (like providing the input arrays externally)
INDArray[] output(boolean train, @NonNull INDArray[] input, INDArray[] inputMasks)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
INDArray[] output(boolean train, @NonNull INDArray[] input, INDArray[] inputMasks, INDArray[] labelMasks)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
INDArray[] output(boolean train, @NonNull INDArray[] input, INDArray[] inputMasks, INDArray[] labelMasks, MemoryWorkspace outputWorkspace)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
If no memory workspace is provided, the output will be detached (not in any workspace).
If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace.
INDArray[] output(boolean train, MemoryWorkspace outputWorkspace, INDArray... input)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
If no memory workspace is provided, the output will be detached (not in any workspace).
If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace.
INDArray[] output(boolean train, INDArray... input)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
<T> T output(@NonNull INDArray[] inputs, INDArray[] inputMasks, INDArray[] labelMasks, @NonNull OutputAdapter<T> outputAdapter)
Uses the provided OutputAdapter to return a custom object built from the output INDArrays. PLEASE NOTE: This method uses a dedicated Workspace for output generation to avoid redundant allocations.
INDArray[] output(List<String> layers, boolean train, INDArray[] features, INDArray[] featureMasks)
Get the activations for the specific layers only
INDArray[] output(INDArray... input)
Return an array of network outputs (predictions) at test time, given the specified network inputs. Network outputs are for output layers only.
INDArray[] output(DataSetIterator iterator)
Generate the output for all examples/batches in the input iterator, and concatenate them into a single array per network output
INDArray[] output(MultiDataSetIterator iterator)
Generate the output for all examples/batches in the input iterator, and concatenate them into a single array per network output
protected INDArray[] outputOfLayersDetached(boolean train, @NonNull FwdPassType fwdPassType, @lombok.NonNull int[] layerIndexes, @NonNull INDArray[] features, INDArray[] fMask, INDArray[] lMasks, boolean clearLayerInputs, boolean detachedInputs, MemoryWorkspace outputWorkspace)
Provide the output of the specified layers, detached from any workspace.
INDArray outputSingle(boolean train, boolean clearInputs, INDArray... input)
Identical to outputSingle(boolean, INDArray...) but has the option of not clearing the input arrays (useful when later backpropagating external errors).
INDArray outputSingle(boolean train, INDArray... input)
A convenience method that returns a single INDArray, instead of an INDArray[].
INDArray outputSingle(INDArray... input)
A convenience method that returns a single INDArray, instead of an INDArray[].
INDArray outputSingle(DataSetIterator iterator)
Generate the output for all examples/batches in the input iterator, and concatenate them into a single array.
INDArray outputSingle(MultiDataSetIterator iterator)
Generate the output for all examples/batches in the input iterator, and concatenate them into a single array.
INDArray params()
Parameters of the model (if any)
INDArray params(boolean backwardOnly)
Deprecated. To be removed.
Map<String,INDArray> paramTable()
The param table
Map<String,INDArray> paramTable(boolean backpropParamsOnly)
Table of parameters by key, for backprop. For many models (dense layers, etc) - all parameters are backprop parameters
void pretrain(DataSetIterator iter)
Perform layerwise pretraining for one epoch - see pretrain(DataSetIterator, int)
void pretrain(DataSetIterator iter, int numEpochs)
Pretrain network with a single input and single output.
void pretrain(MultiDataSetIterator iter)
Pretrain network with multiple inputs and/or outputs
void pretrain(MultiDataSetIterator iter, int numEpochs)
Pretrain network with multiple inputs and/or outputs.
This method performs layerwise pretraining on all pre-trainable layers in the network (VAEs, Autoencoders, etc), for the specified number of epochs each.
void pretrainLayer(String layerName, DataSetIterator dataSetIterator)
Pretrain a specified layer with the given DataSetIterator
void pretrainLayer(String layerName, MultiDataSetIterator iter)
Pretrain a specified layer with the given MultiDataSetIterator
Map<String,INDArray> rnnActivateUsingStoredState(INDArray[] inputs, boolean training, boolean storeLastForTBPTT)
Similar to rnnTimeStep and feedForward() methods.
void rnnClearPreviousState()
Clear the previous state of the RNN layers (if any), used in rnnTimeStep(INDArray...)
Map<String,INDArray> rnnGetPreviousState(int layer)
Get the state of the RNN layer, as used in rnnTimeStep(INDArray...).
Map<String,INDArray> rnnGetPreviousState(String layerName)
Get the state of the RNN layer, as used in rnnTimeStep(INDArray...).
Map<String,Map<String,INDArray>> rnnGetPreviousStates()
Get a map of states for ALL RNN layers, as used in rnnTimeStep(INDArray...).
void rnnSetPreviousState(int layer, Map<String,INDArray> state)
Set the state of the RNN layer, for use in rnnTimeStep(INDArray...)
void rnnSetPreviousState(String layerName, Map<String,INDArray> state)
Set the state of the RNN layer, for use in rnnTimeStep(INDArray...)
void rnnSetPreviousStates(Map<String,Map<String,INDArray>> previousStates)
Set the states for all RNN layers, for use in rnnTimeStep(INDArray...)
INDArray[] rnnTimeStep(MemoryWorkspace outputWorkspace, INDArray... inputs)
See rnnTimeStep(INDArray...) for details.
If no memory workspace is provided, the output will be detached (not in any workspace).
If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace.
INDArray[] rnnTimeStep(INDArray... inputs)
If this ComputationGraph contains one or more RNN layers: conduct forward pass (prediction) but using previous stored state for any RNN layers.
protected void rnnUpdateStateWithTBPTTState()
Update the internal state of RNN layers after a truncated BPTT fit call
void save(File f)
Save the ComputationGraph to a file.
void save(File f, boolean saveUpdater)
Save the ComputationGraph to a file.
double score()
The score for the model
double score(DataSet dataSet)
Sets the input and labels and returns a score for the prediction with respect to the true labels.
This is equivalent to score(DataSet, boolean) with training==true.
NOTE: this version of the score function can only be used with ComputationGraph networks that have a single input and a single output.
double score(DataSet dataSet, boolean training)
Sets the input and labels and returns a score for the prediction with respect to the true labels.
NOTE: this version of the score function can only be used with ComputationGraph networks that have a single input and a single output.
double score(MultiDataSet dataSet)
Score the network given the MultiDataSet, at test time
double score(MultiDataSet dataSet, boolean training)
Sets the input and labels and returns a score for the prediction with respect to the true labels
INDArray scoreExamples(DataSet data, boolean addRegularizationTerms)
Calculate the score for each example in a DataSet individually.
INDArray scoreExamples(MultiDataSet dataSet, boolean addRegularizationTerms)
Calculate the score for each example in a DataSet individually.
void setBackpropGradientsViewArray(INDArray gradient)
Set the gradients array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
void setCacheMode(CacheMode mode)
Sets the specified CacheMode for all layers within the network
void setConf(NeuralNetConfiguration conf)
Setter for the configuration
void setGradientsAccumulator(GradientsAccumulator accumulator)
Specifies the GradientsAccumulator instance to be used with this model
void setInput(int inputNum, INDArray input)
Set the specified input for the ComputationGraph
void setInputs(INDArray... inputs)
Set all inputs for the ComputationGraph network
void setLabel(int labelNum, INDArray label)
Set the specified label for the ComputationGraph
void setLabels(INDArray... labels)
Set all labels for the ComputationGraph network
void setLastEtlTime(long time)
Sets the ETL time field; useful for performance tracking
void setLayerMaskArrays(INDArray[] featureMaskArrays, INDArray[] labelMaskArrays)
Set the mask arrays for features and labels.
void setLearningRate(double newLr)
Set the learning rate for all layers in the network to the specified value.
void setLearningRate(String layerName, double newLr)
Set the learning rate for a single layer in the network to the specified value.
void setLearningRate(String layerName, ISchedule newLr)
Set the learning rate schedule for a single layer in the network to the specified value.
Note that setLearningRate(ISchedule) should be used in preference when all layers need to be set to a new LR schedule.
This schedule will replace any/all existing schedules, and also any fixed learning rate values.
Note also that the iteration/epoch counts will not be reset.
void setLearningRate(ISchedule newLr)
Set the learning rate schedule for all layers in the network to the specified schedule.
void setListeners(Collection<TrainingListener> listeners)
Set the trainingListeners for the ComputationGraph (and all layers in the network)
void setListeners(TrainingListener... listeners)
Set the trainingListeners for the ComputationGraph (and all layers in the network)
void setParam(String key, INDArray val)
Set the parameter with a new ndarray
void setParams(INDArray params)
Set the parameters for this model.
void setParamsViewArray(INDArray gradient)
Set the initial parameters array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
void setParamTable(@NonNull Map<String,INDArray> paramTable)
Setter for the param table
void setScore(double score)
void setUpdater(ComputationGraphUpdater updater)
Set the computationGraphUpdater for the network
String summary()
String detailing the architecture of the computation graph.
String summary(InputType... inputTypes)
String detailing the architecture of the computation graph.
protected void synchronizeIterEpochCounts()
int[] topologicalSortOrder()
Calculate a topological sort order for the vertices in the graph.
void update(Gradient gradient)
Update layer weights and biases with gradient change
void update(INDArray gradient, String paramType)
Perform one update applying the gradient
INDArray updaterState()
Returns the updater state (if applicable), null otherwise
protected void validateArrayWorkspaces(LayerWorkspaceMgr mgr, INDArray array, ArrayType arrayType, String vertexName, boolean isInputVertex, String op)
-
-
-
Field Detail
-
configuration
protected ComputationGraphConfiguration configuration
-
initCalled
protected boolean initCalled
-
solver
protected transient Solver solver
-
flattenedParams
protected INDArray flattenedParams
-
flattenedGradients
protected transient INDArray flattenedGradients
-
gradient
protected Gradient gradient
-
score
protected double score
-
clearTbpttState
protected boolean clearTbpttState
-
WS_LAYER_WORKING_MEM
protected static final String WS_LAYER_WORKING_MEM
Workspace for working memory for a single layer: forward pass and backward pass. Note that this is opened/closed once per op (activate/backpropGradient call).
- See Also:
- Constant Field Values
-
WS_ALL_LAYERS_ACT
protected static final String WS_ALL_LAYERS_ACT
Workspace for storing all layers' activations - used only to store activations (layer inputs) as part of backprop. Not used for inference.
- See Also:
- Constant Field Values
-
WS_RNN_LOOP_WORKING_MEM
protected static final String WS_RNN_LOOP_WORKING_MEM
Workspace for working memory in RNNs - opened and closed once per RNN time step
- See Also:
- Constant Field Values
-
WS_OUTPUT_MEM
protected static final String WS_OUTPUT_MEM
Workspace for output methods that use OutputAdapter
- See Also:
- Constant Field Values
-
WS_LAYER_WORKING_MEM_CONFIG
protected final WorkspaceConfiguration WS_LAYER_WORKING_MEM_CONFIG
-
WS_ALL_LAYERS_ACT_CONFIG
protected static final WorkspaceConfiguration WS_ALL_LAYERS_ACT_CONFIG
-
WS_LAYER_ACT_X_CONFIG
protected final WorkspaceConfiguration WS_LAYER_ACT_X_CONFIG
-
WS_RNN_LOOP_WORKING_MEM_CONFIG
protected static final WorkspaceConfiguration WS_RNN_LOOP_WORKING_MEM_CONFIG
-
lastEtlTime
protected transient ThreadLocal<Long> lastEtlTime
-
vertices
protected GraphVertex[] vertices
All GraphVertex objects in the network.
-
verticesMap
protected Map<String,GraphVertex> verticesMap
Map of vertices by name
-
topologicalOrder
protected int[] topologicalOrder
Indexes of graph vertices, in topological order. The topological order defines the order in which forward pass (and hence also backward pass, which is the opposite to this) is conducted in the network.
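The relationship between topological order and pass direction can be illustrated with a small, library-independent sketch. It applies Kahn's algorithm to a hypothetical four-vertex graph; the class and method names (TopologicalOrderSketch, topologicalOrder, reversed) are illustrative assumptions and are not part of ComputationGraph, whose actual sorting implementation is not shown in this reference.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not a DL4J class: shows what "topological order" means for a graph of vertices.
final class TopologicalOrderSketch {

    /**
     * Kahn's algorithm. Each edge {from, to} means vertex "from" feeds vertex "to".
     * The result lists every vertex after all of the vertices it takes input from.
     */
    static int[] topologicalOrder(int numVertices, int[][] edges) {
        List<List<Integer>> outgoing = new ArrayList<>();
        int[] inDegree = new int[numVertices];
        for (int i = 0; i < numVertices; i++) {
            outgoing.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            outgoing.get(e[0]).add(e[1]);
            inDegree[e[1]]++;
        }

        // Vertices with no unprocessed inputs are ready to be visited
        Deque<Integer> ready = new ArrayDeque<>();
        for (int v = 0; v < numVertices; v++) {
            if (inDegree[v] == 0) {
                ready.add(v);
            }
        }

        int[] order = new int[numVertices];
        int count = 0;
        while (!ready.isEmpty()) {
            int v = ready.remove();
            order[count++] = v;
            for (int next : outgoing.get(v)) {
                if (--inDegree[next] == 0) {
                    ready.add(next);
                }
            }
        }
        if (count != numVertices) {
            throw new IllegalStateException("Graph contains a cycle");
        }
        return order;
    }

    /** The backward pass visits vertices in the opposite order to the forward pass. */
    static int[] reversed(int[] order) {
        int[] out = new int[order.length];
        for (int i = 0; i < order.length; i++) {
            out[i] = order[order.length - 1 - i];
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical graph: 0 = input, 1 and 2 = parallel layers, 3 = merge/output
        int[][] edges = {{0, 1}, {0, 2}, {1, 3}, {2, 3}};
        int[] forward = topologicalOrder(4, edges);
        System.out.println(Arrays.toString(forward));           // [0, 1, 2, 3]
        System.out.println(Arrays.toString(reversed(forward))); // [3, 2, 1, 0]
    }
}
```

In this sketch the forward order is what a field like topologicalOrder would hold, and the reversed array corresponds to the order of the backward pass.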
-
graphIndices
protected GraphIndices graphIndices
Topological sort and vertex index/name + name/index mapping
-
layers
protected Layer[] layers
A list of layers. Each of these layers is present in a GraphVertex, but they are listed here for easy reference. This array also defines the order in which the getLayer(int) method returns layers.
-
-
Constructor Detail
-
ComputationGraph
public ComputationGraph(ComputationGraphConfiguration configuration)
-
-
Method Detail
-
setLastEtlTime
public void setLastEtlTime(long time)
Sets the ETL time field; useful for performance tracking.
- Parameters:
time
-
-
getLastEtlTime
public long getLastEtlTime()
Returns the ETL time field value.
-
setCacheMode
public void setCacheMode(CacheMode mode)
Sets the specified CacheMode for all layers within the network.
- Parameters:
mode
-
-
getConfiguration
public ComputationGraphConfiguration getConfiguration()
Returns the configuration of this ComputationGraph.
-
getNumLayers
public int getNumLayers()
Returns the number of layers in the ComputationGraph
-
getLayer
public Layer getLayer(int idx)
Get the layer by the number of that layer, in range 0 to getNumLayers()-1.
NOTE: This is different from the internal GraphVertex index for the layer.
-
getLayers
public Layer[] getLayers()
Get all layers in the ComputationGraph
-
getVertices
public GraphVertex[] getVertices()
Returns an array of all GraphVertex objects.
-
getVertex
public GraphVertex getVertex(String name)
Return a given GraphVertex by name, or null if no vertex with that name exists
-
getNumInputArrays
public int getNumInputArrays()
The number of inputs to this network
-
getNumOutputArrays
public int getNumOutputArrays()
The number of output (arrays) for this network
-
setInput
public void setInput(int inputNum, INDArray input)
Set the specified input for the ComputationGraph
-
setInputs
public void setInputs(INDArray... inputs)
Set all inputs for the ComputationGraph network
-
getInput
public INDArray getInput(int inputNum)
Get the previously set input for the ComputationGraph
-
getInputs
public INDArray[] getInputs()
Get the previously set inputs for the ComputationGraph
-
getInputMaskArrays
public INDArray[] getInputMaskArrays()
Get the previously set feature/input mask arrays for the ComputationGraph
-
getLabelMaskArrays
public INDArray[] getLabelMaskArrays()
Get the previously set label/output mask arrays for the ComputationGraph
-
setLabel
public void setLabel(int labelNum, INDArray label)
Set the specified label for the ComputationGraph
-
setLabels
public void setLabels(INDArray... labels)
Set all labels for the ComputationGraph network
-
setGradientsAccumulator
public void setGradientsAccumulator(GradientsAccumulator accumulator)
This method allows you to specify the GradientsAccumulator instance to be used with this model.
PLEASE NOTE: Do not use this method unless you understand how to use GradientsAccumulator & updates sharing.
PLEASE NOTE: Do not use this method on a standalone model.
- Parameters:
accumulator
-
-
init
public void init()
Initialize the ComputationGraph network
- Specified by:
init in interface Model
- Specified by:
init in interface NeuralNetwork
-
init
public void init(INDArray parameters, boolean cloneParametersArray)
Initialize the ComputationGraph, optionally with an existing parameters array. If an existing parameters array is specified, it will be used (and the values will not be modified) in the network; if no parameters array is specified, parameters will be initialized randomly according to the network configuration.
- Parameters:
parameters - Network parameters. May be null. If null: randomly initialize.
cloneParametersArray - Whether the parameter array (if any) should be cloned, or used directly
-
initGradientsView
public void initGradientsView()
This method: initializes the flattened gradients array (used in backprop) and sets the appropriate subset in all layers. As a general rule, this shouldn't ever need to be called manually when doing training via fit(DataSet), fit(DataSetIterator) or fit(MultiDataSet) methods
-
getOutputLayerIndices
protected int[] getOutputLayerIndices()
-
pretrain
public void pretrain(DataSetIterator iter)
Perform layerwise pretraining for one epoch - see pretrain(DataSetIterator, int)
-
pretrain
public void pretrain(DataSetIterator iter, int numEpochs)
Pretrain network with a single input and single output. DataSetIterators can only be used if the number of input arrays for the ComputationGraph is 1.
This method performs layerwise pretraining on all pre-trainable layers in the network (VAEs, Autoencoders, etc), for the specified number of epochs each. For example, if numEpochs=3, then layer 0 will be fit for 3 epochs, followed by layer 1 for 3 epochs, and so on.
For networks with more than one input, use pretrain(MultiDataSetIterator)
-
pretrain
public void pretrain(MultiDataSetIterator iter)
Pretrain network with multiple inputs and/or outputs
-
pretrain
public void pretrain(MultiDataSetIterator iter, int numEpochs)
Pretrain network with multiple inputs and/or outputs
This method performs layerwise pretraining on all pre-trainable layers in the network (VAEs, Autoencoders, etc), for the specified number of epochs each. For example, if numEpochs=3, then layer 0 will be fit for 3 epochs, followed by layer 1 for 3 epochs, and so on.
Non-pretrainable layers are ignored.
- Parameters:
iter
- Training data
numEpochs
- Number of epochs to fit each layer with
- See Also:
pretrainLayer(String, MultiDataSetIterator)
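The layerwise schedule described above (each pretrainable layer is fit for numEpochs epochs before moving on to the next, and non-pretrainable layers are skipped) can be sketched without any DL4J dependency. The class and method names below are hypothetical stand-ins for illustration only; this is not how ComputationGraph implements pretraining internally.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: records the order in which layers would be pretrained.
// It models only the schedule described in the text, not DL4J's implementation.
class PretrainScheduleSketch {

    /** Returns entries like "layer0/epoch1" in the order they would be run. */
    static List<String> schedule(boolean[] pretrainable, int numEpochs) {
        List<String> steps = new ArrayList<>();
        for (int layer = 0; layer < pretrainable.length; layer++) {
            if (!pretrainable[layer]) {
                continue; // non-pretrainable layers are ignored
            }
            // All epochs for this layer finish before the next layer starts
            for (int epoch = 1; epoch <= numEpochs; epoch++) {
                steps.add("layer" + layer + "/epoch" + epoch);
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // Layers 0 and 2 are pretrainable (e.g. autoencoders); layer 1 is not.
        System.out.println(schedule(new boolean[] {true, false, true}, 3));
    }
}
```

The point of the sketch is the loop nesting: epochs are the inner loop, layers the outer loop.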
-
pretrainLayer
public void pretrainLayer(String layerName, DataSetIterator dataSetIterator)
Pretrain a specified layer with the given DataSetIterator
- Parameters:
layerName
- Layer name
dataSetIterator
- Data
-
pretrainLayer
public void pretrainLayer(String layerName, MultiDataSetIterator iter)
Pretrain a specified layer with the given MultiDataSetIterator
- Parameters:
layerName
- Layer name
iter
- Training data
-
fit
public void fit(DataSet dataSet)
Fit the ComputationGraph using a DataSet. Note that this method can only be used with ComputationGraphs with 1 input and 1 output. For networks with more than one input or output, use fit(MultiDataSetIterator)
- Specified by:
fit
in interfaceNeuralNetwork
-
fit
public void fit(@NonNull DataSetIterator iterator, int numEpochs)
Perform minibatch training on all minibatches in the DataSetIterator, for the specified number of epochs. Equivalent to calling fit(DataSetIterator) numEpochs times in a loop.
- Parameters:
iterator
- Training data (DataSetIterator). Iterator must support resetting
numEpochs
- Number of training epochs, >= 1
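To illustrate why the iterator must support resetting, here is a minimal dependency-free sketch of the "call fit numEpochs times in a loop" equivalence. ResettableBatches and CountingModel are hypothetical stand-ins, not DL4J classes, and resetting at the start of each pass is an assumption made for the sketch.

```java
import java.util.List;

// Hypothetical stand-ins for a resettable iterator and a model; not DL4J classes.
class EpochLoopSketch {

    /** Minimal resettable iterator over minibatches. */
    static class ResettableBatches {
        private final List<double[]> batches;
        private int cursor = 0;

        ResettableBatches(List<double[]> batches) { this.batches = batches; }

        boolean hasNext() { return cursor < batches.size(); }
        double[] next() { return batches.get(cursor++); }
        void reset() { cursor = 0; }
    }

    /** Counts minibatches seen; a real model would update parameters instead. */
    static class CountingModel {
        int minibatchesSeen = 0;

        /** One epoch: a single pass over all minibatches. */
        void fit(ResettableBatches it) {
            it.reset(); // assumption: rewind before each pass
            while (it.hasNext()) {
                it.next();
                minibatchesSeen++;
            }
        }

        /** numEpochs epochs: the single-epoch fit, repeated in a loop. */
        void fit(ResettableBatches it, int numEpochs) {
            for (int epoch = 0; epoch < numEpochs; epoch++) {
                fit(it);
            }
        }
    }
}
```

Without reset(), the second and later epochs would see an exhausted iterator and process no data.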
-
fit
public void fit(@NonNull DataSetIterator iterator)
Fit the ComputationGraph using a DataSetIterator.
Note that this method can only be used with ComputationGraphs with 1 input and 1 output.
This method does not perform layerwise pretraining.
For pretraining, use pretrain(DataSetIterator)
- Specified by:
fit
in interfaceNeuralNetwork
- Parameters:
iterator
- Training data (DataSetIterator)
-
fit
public void fit(MultiDataSet multiDataSet)
Fit the ComputationGraph using a MultiDataSet
- Specified by:
fit
in interfaceNeuralNetwork
-
fit
public void fit(@NonNull MultiDataSetIterator iterator, int numEpochs)
Perform minibatch training on all minibatches in the MultiDataSetIterator, for the specified number of epochs. Equivalent to calling fit(MultiDataSetIterator) numEpochs times in a loop.
- Parameters:
iterator
- Training data (MultiDataSetIterator). Iterator must support resetting
numEpochs
- Number of training epochs, >= 1
-
fit
public void fit(MultiDataSetIterator multi)
Fit the ComputationGraph using a MultiDataSetIterator. This method does not perform layerwise pretraining.
For pretraining, use pretrain(MultiDataSetIterator)
- Specified by:
fit
in interfaceNeuralNetwork
- Parameters:
multi
- Training data (MultiDataSetIterator)
-
fit
public void fit(INDArray[] inputs, INDArray[] labels)
Fit the ComputationGraph given arrays of inputs and labels.
- Parameters:
inputs
- The network inputs
labels
- The labels
-
fit
public void fit(INDArray[] inputs, INDArray[] labels, INDArray[] featureMaskArrays, INDArray[] labelMaskArrays)
Fit the ComputationGraph using the specified inputs and labels (and mask arrays)
- Parameters:
inputs
- The network inputs (features)
labels
- The network labels
featureMaskArrays
- Mask arrays for inputs/features. Typically used for RNN training. May be null.
labelMaskArrays
- Mask arrays for the labels/outputs. Typically used for RNN training. May be null.
-
topologicalSortOrder
public int[] topologicalSortOrder()
Calculate a topological sort order for the vertices in the graph. This order determines (a) the order of the forward pass, (b) the order of backprop (i.e., the reverse of this order), and (c) the order in which parameters (and gradients) are flattened. Specifically, gradients/params/forward pass are executed on vertex[topologicalSortOrder[i]], for i=0..nVertices-1
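As an illustration of what a topological sort order is, the following dependency-free sketch uses Kahn's algorithm on a graph given as an edge list. Kahn's algorithm is chosen here for clarity; the source does not state which algorithm ComputationGraph uses, and the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of a topological sort (Kahn's algorithm); not DL4J code.
class TopoSortSketch {

    /** edges[i] = {from, to} means the output of vertex 'from' feeds vertex 'to'. */
    static int[] topologicalOrder(int nVertices, int[][] edges) {
        List<List<Integer>> outgoing = new ArrayList<>();
        int[] inDegree = new int[nVertices];
        for (int i = 0; i < nVertices; i++) {
            outgoing.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            outgoing.get(e[0]).add(e[1]);
            inDegree[e[1]]++;
        }

        // Vertices with no unprocessed inputs are ready (network inputs come first)
        Deque<Integer> ready = new ArrayDeque<>();
        for (int v = 0; v < nVertices; v++) {
            if (inDegree[v] == 0) {
                ready.add(v);
            }
        }

        int[] order = new int[nVertices];
        int count = 0;
        while (!ready.isEmpty()) {
            int v = ready.poll();
            order[count++] = v;
            for (int next : outgoing.get(v)) {
                if (--inDegree[next] == 0) {
                    ready.add(next);
                }
            }
        }
        if (count != nVertices) {
            throw new IllegalStateException("Graph contains a cycle");
        }
        return order;
    }

    /** Backprop visits vertices in the reverse of the forward order. */
    static int[] reversed(int[] order) {
        int[] r = new int[order.length];
        for (int i = 0; i < order.length; i++) {
            r[i] = order[order.length - 1 - i];
        }
        return r;
    }
}
```

With such an order in hand, a forward pass would process vertex[order[i]] for i = 0..nVertices-1, and backprop would walk reversed(order).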
-
calculateIndices
public GraphIndices calculateIndices()
Calculate the indices needed for the network:
(a) topological sort order
(b) Map: vertex index -> vertex name
(c) Map: vertex name -> vertex index
- Returns:
- Calculated indices
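The two lookup maps listed in (b) and (c) can be pictured with a small dependency-free sketch. The class below is a hypothetical illustration of a paired index/name mapping; it is not the GraphIndices class and makes no claim about its fields or methods.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of paired index <-> name lookups; not DL4J's GraphIndices.
class GraphIndexMapsSketch {
    final Map<Integer, String> idxToName = new HashMap<>();
    final Map<String, Integer> nameToIdx = new HashMap<>();

    GraphIndexMapsSketch(String[] vertexNames) {
        for (int i = 0; i < vertexNames.length; i++) {
            // Vertex names must be unique for the reverse lookup to be well defined
            if (nameToIdx.put(vertexNames[i], i) != null) {
                throw new IllegalArgumentException("Duplicate vertex name: " + vertexNames[i]);
            }
            idxToName.put(i, vertexNames[i]);
        }
    }
}
```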
-
computeGradientAndScore
public void computeGradientAndScore(LayerWorkspaceMgr workspaceMgr)
Description copied from interface:Model
Update the score
- Specified by:
computeGradientAndScore
in interfaceModel
-
computeGradientAndScore
public void computeGradientAndScore()
-
feedForward
public Map<String,INDArray> feedForward(INDArray input, int layerTillIndex, boolean train)
Conduct forward pass using a single input array. Note that this method can only be used with ComputationGraphs with a single input array.
- Parameters:
input
- The input array
layerTillIndex
- the index of the layer to feed forward to
train
- If true: do forward pass at training time; false: do forward pass at test time
- Returns:
- A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
-
feedForward
public Map<String,INDArray> feedForward(INDArray[] input, int layerTillIndex, boolean train, boolean clearInputs)
Conduct forward pass using an array of inputs. This overload allows the forward pass to be conducted, optionally (not) clearing the layer input arrays.
Note: when using clearInputs=false, there can be some performance and memory overhead: this is because the arrays are defined outside of workspaces (which are enabled by default) - otherwise, old/invalidated arrays could still be accessed after calling this method. Consequently: don't use clearInputs=false unless you have a use case that requires the inputs to remain after feed-forward has been completed.
- Parameters:
input
- An array of ComputationGraph inputs
layerTillIndex
- the index of the layer to feed forward to
train
- If true: do forward pass at training time; false: do forward pass at test time
clearInputs
- If true (default for other methods): clear the inputs of all layers after doing forward pass. If false: don't clear layer inputs.
- Returns:
- A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
-
feedForward
public Map<String,INDArray> feedForward(INDArray[] input, int layerTillIndex, boolean train)
Conduct forward pass using an array of inputs
- Parameters:
input
- An array of ComputationGraph inputs
layerTillIndex
- the index of the layer to feed forward to
train
- If true: do forward pass at training time; false: do forward pass at test time
- Returns:
- A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
-
feedForward
public Map<String,INDArray> feedForward(boolean train, int layerTillIndex)
Conduct forward pass using the stored inputs
- Parameters:
train
- If true: do forward pass at training time; false: do forward pass at test time
layerTillIndex
- the index of the layer to feed forward to
- Returns:
- A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
-
feedForward
public Map<String,INDArray> feedForward(INDArray input, boolean train)
Conduct forward pass using a single input array. Note that this method can only be used with ComputationGraphs with a single input array.
- Parameters:
input
- The input array
train
- If true: do forward pass at training time; false: do forward pass at test time
- Returns:
- A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
-
feedForward
public Map<String,INDArray> feedForward(INDArray[] input, boolean train)
Conduct forward pass using an array of inputs
- Parameters:
input
- An array of ComputationGraph inputs
train
- If true: do forward pass at training time; false: do forward pass at test time
- Returns:
- A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
-
feedForward
public Map<String,INDArray> feedForward(INDArray[] input, boolean train, boolean clearInputs)
Conduct forward pass using an array of inputs. This overload allows the forward pass to be conducted, optionally (not) clearing the layer input arrays.
Note: this method should NOT be used with clearInputs = false, unless you know what you are doing. Specifically: when using clearInputs=false, in combination with workspaces, the layer input fields may leak outside of the workspaces in which they were defined - potentially causing a crash. See https://deeplearning4j.konduit.ai/config/config-memory/config-workspaces for more details.
- Parameters:
input
- An array of ComputationGraph inputs
train
- If true: do forward pass at training time; false: do forward pass at test time
clearInputs
- If true (default for other methods): clear the inputs of all layers after doing forward pass. If false: don't clear layer inputs.
- Returns:
- A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
-
feedForward
public Map<String,INDArray> feedForward()
Conduct forward pass using the stored inputs, at test time
- Returns:
- A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
-
feedForward
public Map<String,INDArray> feedForward(boolean train)
Conduct forward pass using the stored inputs
- Parameters:
train
- If true: do forward pass at training time; false: do forward pass at test time
- Returns:
- A map of activations for each layer (not each GraphVertex). Keys = layer name, values = layer activations
-
feedForward
public Map<String,INDArray> feedForward(boolean train, boolean excludeOutputLayers, boolean includeNonLayerVertexActivations)
- Parameters:
train
- True: training time. False: test time
excludeOutputLayers
- Should we exclude the output layers during forward pass? (usually: false)
includeNonLayerVertexActivations
- Include non-layer vertices in the output map?
- Returns:
- Map of activations. Key: vertex name. Value: activations.
-
output
public INDArray[] output(INDArray... input)
Return an array of network outputs (predictions) at test time, given the specified network inputs. Network outputs are for output layers only.
- Parameters:
input
- Inputs to the network
- Returns:
- Output activations (order: same as defined in network configuration)
-
outputSingle
public INDArray outputSingle(INDArray... input)
A convenience method that returns a single INDArray, instead of an INDArray[]. Useful for ComputationGraphs that have only a single output. Otherwise identical to output(INDArray...)
- Parameters:
input
- Inputs to the network
- Returns:
- Output activations array
-
output
public INDArray[] output(boolean train, INDArray... input)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
- Parameters:
train
- If true: do forward pass at training time; false: do forward pass at test time
input
- Inputs to the network
- Returns:
- Output activations (order: same as defined in network configuration)
-
output
public INDArray[] output(boolean train, MemoryWorkspace outputWorkspace, INDArray... input)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
If no memory workspace is provided, the output will be detached (not in any workspace).
If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace. This workspace must be opened by the user before calling this method - and the user is responsible for (a) closing this workspace, and (b) ensuring the output array is not used out of scope (i.e., not used after closing the workspace to which it belongs - as this is likely to cause either an exception when used, or a crash).
- Parameters:
train
- If true: do forward pass at training time; false: do forward pass at test time
outputWorkspace
- May be null. If not null: the workspace MUST be opened before calling this method.
input
- Inputs to the network
- Returns:
- Output activations (order: same as defined in network configuration)
-
output
public INDArray[] output(boolean train, @NonNull INDArray[] input, INDArray[] inputMasks)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
- Parameters:
train
- If true: forward pass for training mode. False: test mode
input
- Input arrays to the network
inputMasks
- Optional input mask arrays (may be null)
- Returns:
- Network output activations
-
output
public INDArray[] output(boolean train, @NonNull INDArray[] input, INDArray[] inputMasks, INDArray[] labelMasks)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
- Parameters:
train
- If true: forward pass for training mode. False: test mode
input
- Input arrays to the network
inputMasks
- Optional input mask arrays (may be null)
labelMasks
- Optional label mask arrays (may be null)
- Returns:
- Network output activations
-
output
public <T> T output(@NonNull INDArray[] inputs, INDArray[] inputMasks, INDArray[] labelMasks, @NonNull OutputAdapter<T> outputAdapter)
This method uses the provided OutputAdapter to return a custom object built from the output INDArrays. PLEASE NOTE: This method uses a dedicated workspace for output generation to avoid redundant allocations.
- Type Parameters:
T
- T extends Object
- Parameters:
inputs
- Input arrays to the network
inputMasks
- Optional input mask arrays (may be null)
labelMasks
- Optional label mask arrays (may be null)
outputAdapter
- OutputAdapter instance
- Returns:
- T instance produced by OutputAdapter
-
output
public INDArray[] output(boolean train, @NonNull INDArray[] input, INDArray[] inputMasks, INDArray[] labelMasks, MemoryWorkspace outputWorkspace)
Return an array of network outputs (predictions), given the specified network inputs. Network outputs are for output layers only.
If no memory workspace is provided, the output will be detached (not in any workspace).
If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace. This workspace must be opened by the user before calling this method - and the user is responsible for (a) closing this workspace, and (b) ensuring the output array is not used out of scope (i.e., not used after closing the workspace to which it belongs - as this is likely to cause either an exception when used, or a crash).
- Parameters:
train
- If true: forward pass for training mode. False: test mode
input
- Input arrays to the network
inputMasks
- Optional input mask arrays (may be null)
labelMasks
- Optional label mask arrays (may be null)
outputWorkspace
- May be null. If not null: the workspace MUST be opened before calling this method.
- Returns:
- Network output activations
-
outputSingle
public INDArray outputSingle(boolean train, INDArray... input)
A convenience method that returns a single INDArray, instead of an INDArray[]. Useful for ComputationGraphs that have only a single output. Otherwise identical to output(boolean, INDArray...)
- Parameters:
train
- If true: do forward pass at training time; false: do forward pass at test time
input
- Inputs to the network
- Returns:
- Output activations array
-
outputSingle
public INDArray outputSingle(boolean train, boolean clearInputs, INDArray... input)
Identical to outputSingle(boolean, INDArray...) but has the option of not clearing the input arrays (useful when later backpropagating external errors). Most users should use outputSingle(boolean, INDArray...) in preference to this method.
-
output
public INDArray[] output(boolean train, boolean clearInputs, INDArray... input)
An output method for the network, with optional clearing of the layer inputs.
Note: most users should use output(boolean, INDArray...) or similar methods, unless they are doing non-standard operations (like providing the input arrays externally)
- Parameters:
train
- If true: output during training. False: output during testing. Affects some things such as dropout
clearInputs
- If true: clear the input arrays for all layers. False: leave the input arrays as-is - which can be useful for "external errors" (no output layer) backprop use cases
input
- Input to the network
- Returns:
- Output from the network
-
output
public INDArray[] output(DataSetIterator iterator)
Generate the output for all examples/batches in the input iterator, and concatenate them into a single array per network output
- Parameters:
iterator
- Data to pass through the network
- Returns:
- output for all examples in the iterator
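The "concatenate them into a single array" step can be pictured with plain Java arrays, where each minibatch output is a matrix with one row per example. This is a hypothetical sketch using double[][] in place of INDArray; the row-wise (example dimension) concatenation is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: stacks per-minibatch outputs along the example dimension.
class ConcatOutputsSketch {

    /** Joins per-minibatch outputs (rows = examples) into one array, in iteration order. */
    static double[][] concatRows(List<double[][]> batchOutputs) {
        List<double[]> rows = new ArrayList<>();
        for (double[][] batch : batchOutputs) {
            for (double[] row : batch) {
                rows.add(row);
            }
        }
        return rows.toArray(new double[0][]);
    }
}
```

For a network with several outputs, the same stacking would be done once per output, giving one combined array per network output.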
-
output
public INDArray[] output(MultiDataSetIterator iterator)
Generate the output for all examples/batches in the input iterator, and concatenate them into a single array per network output
- Parameters:
iterator
- Data to pass through the network
- Returns:
- output for all examples in the iterator
-
outputSingle
public INDArray outputSingle(DataSetIterator iterator)
Generate the output for all examples/batches in the input iterator, and concatenate them into a single array. Can only be used with ComputationGraphs with 1 output
- Parameters:
iterator
- Data to pass through the network
- Returns:
- output for all examples in the iterator
-
outputSingle
public INDArray outputSingle(MultiDataSetIterator iterator)
Generate the output for all examples/batches in the input iterator, and concatenate them into a single array. Can only be used with ComputationGraphs with 1 output
- Parameters:
iterator
- Data to pass through the network
- Returns:
- output for all examples in the iterator
-
output
public INDArray[] output(List<String> layers, boolean train, INDArray[] features, INDArray[] featureMasks)
Get the activations for the specified layers only
- Parameters:
layers
- Layers to get the activations for
train
- If true: train mode. False: test (inference) mode
features
- Features array
featureMasks
- Feature masks array. May be null
- Returns:
- Activations of the selected layers, in the same order as the "layers" arg/list
-
validateArrayWorkspaces
protected void validateArrayWorkspaces(LayerWorkspaceMgr mgr, INDArray array, ArrayType arrayType, String vertexName, boolean isInputVertex, String op)
-
ffToLayerActivationsDetached
protected Map<String,INDArray> ffToLayerActivationsDetached(boolean train, @NonNull FwdPassType fwdPassType, boolean storeLastForTBPTT, int layerIndex, int[] excludeIdxs, @NonNull INDArray[] features, INDArray[] fMask, INDArray[] lMask, boolean clearLayers)
Feed-forward through the network - returning all array activations detached from any workspace. Note that no workspace should be active externally when calling this method (an exception will be thrown if a workspace is open externally)
- Parameters:
train
- Training mode (true) or test/inference mode (false)
fwdPassType
- Type of forward pass to perform (STANDARD or RNN_ACTIVATE_WITH_STORED_STATE only)
storeLastForTBPTT
- ONLY used if fwdPassType == FwdPassType.RNN_ACTIVATE_WITH_STORED_STATE
layerIndex
- Index (inclusive) to stop forward pass at. For all layers, use numLayers-1
excludeIdxs
- Layers (vertices) to exclude from forward pass. These layers will be skipped, and hence are usually output layers or at the end of the network. May be null.
features
- Input feature arrays
fMask
- Feature mask arrays. May be null.
lMask
- Label mask array. May be null.
clearLayers
- Whether the layer inputs should be cleared
- Returns:
- Map of activations (including the input), detached from any workspace
-
ffToLayerActivationsInWS
protected Map<String,INDArray> ffToLayerActivationsInWS(boolean train, int layerIndex, int[] excludeIdxs, FwdPassType fwdPassType, boolean storeLastForTBPTT, INDArray[] input, INDArray[] fMask, INDArray[] lMask, boolean clearInputs)
Feed-forward through the network - if workspaces are used, all returned activations will be present in workspace WS_ALL_LAYERS_ACT.
Note: if using workspaces for training, requires that WS_ALL_LAYERS_ACT is open externally. If using NO workspaces, requires that no external workspace is open
- Parameters:
train
- Training mode (true) or test/inference mode (false)
layerIndex
- Index (inclusive) to stop forward pass at. For all layers, use -1
excludeIdxs
- Layers (vertices) to exclude from forward pass. These layers will be skipped, and hence are usually output layers or at the end of the network. May be null.
fwdPassType
- Type of forward pass to perform (STANDARD or RNN_ACTIVATE_WITH_STORED_STATE only)
storeLastForTBPTT
- ONLY used if fwdPassType == FwdPassType.RNN_ACTIVATE_WITH_STORED_STATE
input
- Input feature arrays
fMask
- Feature mask arrays. May be null.
lMask
- Label mask array. May be null.
clearInputs
- Whether the layer inputs should be cleared
- Returns:
- Map of activations (including the input), in workspace WS_ALL_LAYERS_ACT if workspaces are used (detached otherwise)
-
outputOfLayersDetached
protected INDArray[] outputOfLayersDetached(boolean train, @NonNull FwdPassType fwdPassType, @NonNull int[] layerIndexes, @NonNull INDArray[] features, INDArray[] fMask, INDArray[] lMasks, boolean clearLayerInputs, boolean detachedInputs, MemoryWorkspace outputWorkspace)
Provide the output of the specified layers, detached from any workspace. This is most commonly used at inference/test time, and is more memory efficient than ffToLayerActivationsDetached(boolean, FwdPassType, boolean, int, int[], INDArray[], INDArray[], INDArray[], boolean) and ffToLayerActivationsInWS(boolean, int, int[], FwdPassType, boolean, INDArray[], INDArray[], INDArray[], boolean).
This method clears all layer inputs. NOTE: in general, no workspaces should be activated externally for this method! This method handles the workspace activation as required
- Parameters:
train
- Training mode (true) or test/inference mode (false)
fwdPassType
- Type of forward pass to perform (STANDARD or RNN_TIMESTEP only)
layerIndexes
- Indexes of the layers to get the activations for
features
- Input features for the network
fMask
- Input/feature mask array. May be null.
lMasks
- Labels mask array. May be null
clearLayerInputs
- If true: the layer input fields will be cleared
detachedInputs
- If true: the layer input fields will be detached. Usually used for external errors cases
outputWorkspace
- Optional - if provided, outputs should be placed in this workspace. NOTE: this workspace must be open
- Returns:
- Output of the specified layers, detached from any workspace
-
backpropGradient
public Gradient backpropGradient(INDArray... epsilons)
Calculate the gradient of the network with respect to some external errors. Note that this is typically used for things like reinforcement learning, not typical networks that include an OutputLayer or RnnOutputLayer
- Parameters:
epsilons
- Epsilons (errors) at the output, in the same order as the output layers are defined in the configuration via setOutputs(String...)
- Returns:
- Gradient for the network
-
calcBackpropGradients
protected void calcBackpropGradients(boolean clearLayers, boolean truncatedBPTT, INDArray... externalEpsilons)
Do backprop (gradient calculation)
- Parameters:
truncatedBPTT
- false: normal backprop. true: calculate gradients using truncated BPTT for RNN layers
externalEpsilons
- null usually (for typical supervised learning). If not null (and length > 0) then assume that the user has provided some errors externally, as they would do for example in reinforcement learning situations.
-
clone
public ComputationGraph clone()
-
calcRegularizationScore
public double calcRegularizationScore(boolean backpropParamsOnly)
-
setListeners
public void setListeners(Collection<TrainingListener> listeners)
Set the trainingListeners for the ComputationGraph (and all layers in the network)
- Specified by:
setListeners
in interfaceModel
-
setListeners
public void setListeners(TrainingListener... listeners)
Set the trainingListeners for the ComputationGraph (and all layers in the network)
- Specified by:
setListeners
in interfaceModel
-
addListeners
public void addListeners(TrainingListener... listeners)
This method ADDS additional TrainingListeners to the existing listeners
- Specified by:
addListeners
in interfaceModel
- Parameters:
listeners
- Listeners to add
-
getListeners
public Collection<TrainingListener> getListeners()
Get the trainingListeners for the ComputationGraph
-
getUpdater
public ComputationGraphUpdater getUpdater()
Get the ComputationGraphUpdater for the network. Creates one on demand, if required
-
getUpdater
public ComputationGraphUpdater getUpdater(boolean initializeIfAbsent)
Get the ComputationGraphUpdater for this network
- Parameters:
initializeIfAbsent
- If true: create the updater if one is absent. False: return null if absent.
- Returns:
- Updater
-
setUpdater
public void setUpdater(ComputationGraphUpdater updater)
Set the computationGraphUpdater for the network
-
getOutputLayer
public Layer getOutputLayer(int outputLayerIdx)
Get the specified output layer, by index. The index of the output layer may be 0 to getNumOutputArrays() - 1
-
params
@Deprecated public INDArray params(boolean backwardOnly)
Deprecated. To be removed. Use params()
-
score
public double score(DataSet dataSet)
Sets the input and labels and returns a score for the prediction with respect to the true labels
This is equivalent to score(DataSet, boolean) with training==true.
NOTE: this version of the score function can only be used with ComputationGraph networks that have a single input and a single output.
- Parameters:
dataSet
- the data to score
- Returns:
- the score for the given input,label pairs
- See Also:
score(DataSet, boolean)
-
score
public double score(DataSet dataSet, boolean training)
Sets the input and labels and returns a score for the prediction with respect to the true labels
NOTE: this version of the score function can only be used with ComputationGraph networks that have a single input and a single output. Use score(MultiDataSet, boolean) for multiple input/output networks
- Parameters:
dataSet
- the data to score
training
- whether score is being calculated at training time (true) or test time (false)
- Returns:
- the score for the given input,label pairs
- See Also:
score(DataSet, boolean)
-
score
public double score(MultiDataSet dataSet)
Score the network given the MultiDataSet, at test time
-
score
public double score(MultiDataSet dataSet, boolean training)
Sets the input and labels and returns a score for the prediction with respect to the true labels
- Parameters:
dataSet
- the data to score
training
- whether score is being calculated at training time (true) or test time (false)
- Returns:
- the score for the given input,label pairs
-
scoreExamples
public INDArray scoreExamples(DataSet data, boolean addRegularizationTerms)
Calculate the score for each example in a DataSet individually. Unlike score(DataSet) and score(DataSet, boolean), this method does not average/sum over examples. This method allows for examples to be scored individually (at test time only), which may be useful for example for autoencoder architectures and the like.
Each row of the output (assuming addRegularizationTerms == true) is equivalent to calling score(DataSet) with a single example.
- Parameters:
data
- The data to score
addRegularizationTerms
- If true: add l1/l2 regularization terms (if any) to the score. If false: don't add regularization terms
- Returns:
- An INDArray (column vector) of size input.numRows(); the ith entry is the score (loss value) of the ith example
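The difference between a per-example score and an aggregated score can be shown with plain arrays. The sketch below is hypothetical: it uses a sum-of-squared-errors loss as an assumed stand-in for whatever loss the network's output layer is configured with, uses a mean as the aggregate, and leaves out regularization terms entirely.

```java
// Hypothetical sketch: per-example loss vs. a single averaged score.
// Squared error is an assumed loss; regularization terms are not modelled.
class PerExampleScoreSketch {

    /** One loss value per example (row), without averaging over examples. */
    static double[] scoreExamples(double[][] predictions, double[][] labels) {
        double[] scores = new double[predictions.length];
        for (int i = 0; i < predictions.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < predictions[i].length; j++) {
                double diff = predictions[i][j] - labels[i][j];
                sum += diff * diff;
            }
            scores[i] = sum;
        }
        return scores;
    }

    /** A single score: the mean of the per-example losses. */
    static double averageScore(double[][] predictions, double[][] labels) {
        double[] scores = scoreExamples(predictions, labels);
        double total = 0.0;
        for (double s : scores) {
            total += s;
        }
        return total / scores.length;
    }
}
```

Per-example values are what make uses such as anomaly detection with autoencoders possible: a single averaged score hides which examples were reconstructed poorly.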
-
scoreExamples
public INDArray scoreExamples(MultiDataSet dataSet, boolean addRegularizationTerms)
Calculate the score for each example in a MultiDataSet individually. Unlike score(MultiDataSet) and score(MultiDataSet, boolean), this method does not average/sum over examples. This method allows for examples to be scored individually (at test time only), which may be useful for example for autoencoder architectures and the like.
Each row of the output (assuming addRegularizationTerms == true) is equivalent to calling score(MultiDataSet) with a single example.
- Parameters:
dataSet
- The data to score
addRegularizationTerms
- If true: add l1/l2 regularization terms (if any) to the score. If false: don't add regularization terms
- Returns:
- An INDArray (column vector) of size input.numRows(); the ith entry is the score (loss value) of the ith example
-
fit
public void fit()
Description copied from interface:Model
All models have a fit method
-
update
public void update(INDArray gradient, String paramType)
Description copied from interface:Model
Perform one update applying the gradient
-
update
public void update(Gradient gradient)
Description copied from interface:Model
Update layer weights and biases with gradient change
-
score
public double score()
Description copied from interface:Model
The score for the model
-
setScore
public void setScore(double score)
-
params
public INDArray params()
Description copied from interface:Model
Parameters of the model (if any)
- Specified by:
params
in interfaceModel
- Specified by:
params
in interfaceNeuralNetwork
- Returns:
- the parameters of the model
-
updaterState
public INDArray updaterState()
Description copied from interface:NeuralNetwork
This method returns updater state (if applicable), null otherwise
- Specified by:
updaterState
in interfaceNeuralNetwork
- Returns:
-
numParams
public long numParams()
Description copied from interface:Model
the number of parameters for the model
-
numParams
public long numParams(boolean backwards)
Description copied from interface:Model
the number of parameters for the model
-
setParams
public void setParams(INDArray params)
Description copied from interface:Model
Set the parameters for this model. This expects a linear ndarray, which will then be unpacked internally according to the expected parameter ordering of the model
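To make "a linear array unpacked according to the expected ordering" concrete, here is a dependency-free sketch that splits one flat vector into consecutive per-layer chunks. It is hypothetical: the real network works with views of a single INDArray rather than copies, and the per-layer counts here are made-up example values.

```java
import java.util.Arrays;

// Hypothetical sketch: split a flat parameter vector into per-layer chunks.
// Copies are used for simplicity; this is not how DL4J stores parameters.
class FlatParamsSketch {

    /** paramsPerLayer[i] is the number of parameters of the i-th layer, in flattening order. */
    static double[][] unpack(double[] flat, int[] paramsPerLayer) {
        int total = 0;
        for (int n : paramsPerLayer) {
            total += n;
        }
        if (total != flat.length) {
            throw new IllegalArgumentException(
                    "Expected " + total + " parameters but got " + flat.length);
        }
        double[][] perLayer = new double[paramsPerLayer.length][];
        int offset = 0;
        for (int i = 0; i < paramsPerLayer.length; i++) {
            perLayer[i] = Arrays.copyOfRange(flat, offset, offset + paramsPerLayer[i]);
            offset += paramsPerLayer[i];
        }
        return perLayer;
    }
}
```

The length check mirrors the practical requirement that the supplied array has exactly as many elements as the model has parameters.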
-
setParamsViewArray
public void setParamsViewArray(INDArray gradient)
Description copied from interface:Model
Set the initial parameters array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
- Specified by:
setParamsViewArray
in interfaceModel
- Parameters:
gradient
- a 1 x nParams row vector that is a view of the larger (MLN/CG) parameters array
-
getGradientsViewArray
public INDArray getGradientsViewArray()
- Specified by:
getGradientsViewArray
in interfaceModel
-
setBackpropGradientsViewArray
public void setBackpropGradientsViewArray(INDArray gradient)
Description copied from interface:Model
Set the gradients array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
- Specified by:
setBackpropGradientsViewArray
in interfaceModel
- Parameters:
gradient
- a 1 x nParams row vector that is a view of the larger (MLN/CG) gradients array
-
fit
public void fit(INDArray data, LayerWorkspaceMgr workspaceMgr)
Description copied from interface:Model
Fit the model to the given data
-
gradient
public Gradient gradient()
Description copied from interface:Model
Get the gradient. Note that this method does not calculate the gradient; it returns the gradient that was computed previously. To calculate the gradient, see Model.computeGradientAndScore(LayerWorkspaceMgr).
-
gradientAndScore
public Pair<Gradient,Double> gradientAndScore()
Description copied from interface:Model
Get the gradient and score- Specified by:
gradientAndScore
in interfaceModel
- Returns:
- the gradient and score
-
batchSize
public int batchSize()
Description copied from interface:Model
The current inputs batch size
-
conf
public NeuralNetConfiguration conf()
Description copied from interface:Model
The configuration for the neural network
-
setConf
public void setConf(NeuralNetConfiguration conf)
Description copied from interface:Model
Setter for the configuration
-
input
public INDArray input()
Description copied from interface:Model
The input/feature matrix for the model
-
getOptimizer
public ConvexOptimizer getOptimizer()
Description copied from interface:Model
Returns this model's optimizer
- Specified by:
getOptimizer
in interfaceModel
- Specified by:
getOptimizer
in interfaceNeuralNetwork
- Returns:
- this model's optimizer
-
getParam
public INDArray getParam(String paramName)
Description copied from interface:Model
Get the parameter
-
paramTable
public Map<String,INDArray> paramTable()
Description copied from interface:Model
The param table- Specified by:
paramTable
in interfaceModel
- Returns:
-
paramTable
public Map<String,INDArray> paramTable(boolean backpropParamsOnly)
Description copied from interface:Model
Table of parameters by key, for backprop For many models (dense layers, etc) - all parameters are backprop parameters- Specified by:
paramTable
in interfaceModel
- Parameters:
backpropParamsOnly
- If true, return backprop params only. If false: return all params (equivalent to paramTable())
-
setParamTable
public void setParamTable(@NonNull Map<String,INDArray> paramTable)
Description copied from interface:Model
Setter for the param table- Specified by:
setParamTable
in interfaceModel
-
setParam
public void setParam(String key, INDArray val)
Description copied from interface:Model
Set the parameter with a new ndarray
-
clear
public void clear()
Description copied from interface:Model
Clear input
-
applyConstraints
public void applyConstraints(int iteration, int epoch)
Description copied from interface:Model
Apply any constraints to the model- Specified by:
applyConstraints
in interfaceModel
-
rnnTimeStep
public INDArray[] rnnTimeStep(INDArray... inputs)
If this ComputationGraph contains one or more RNN layers: conduct forward pass (prediction) but using previous stored state for any RNN layers. The activations for the final step are also stored in the RNN layers for use next time rnnTimeStep() is called.
This method can be used to generate output one or more steps at a time instead of always having to do forward pass from t=0. Example uses are for streaming data, and for generating samples from network output one step at a time (where samples are then fed back into the network as input)
If no previous state is present in RNN layers (i.e., initially or after calling rnnClearPreviousState()), the default initialization (usually 0) is used.
Supports mini-batches (i.e., multiple predictions/forward passes in parallel) as well as single examples.
- Parameters:
inputs
- Input to network. May be for one or multiple time steps. For single time step: input has shape [miniBatchSize,inputSize] or [miniBatchSize,inputSize,1]. miniBatchSize=1 for single example.
For multiple time steps: [miniBatchSize,inputSize,inputTimeSeriesLength]
- Returns:
- Output activations. If output is RNN layer (such as RnnOutputLayer): if all inputs have shape [miniBatchSize,inputSize]
i.e., is 2d, then outputs have shape [miniBatchSize,outputSize] (i.e., also 2d) instead of [miniBatchSize,outputSize,1].
Otherwise output is 3d [miniBatchSize,outputSize,inputTimeSeriesLength] when using RnnOutputLayer (or unmodified otherwise).
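The stored-state behaviour described above can be illustrated with a toy scalar recurrence. This is a plain-Java sketch, not DL4J code: the class StatefulRnnSketch, its tanh update rule, and the weights are assumptions chosen only to show that stepping one input at a time gives the same outputs as one pass over the whole sequence, provided the state is carried over between calls.

```java
// Hypothetical illustration only: not part of DL4J.
final class StatefulRnnSketch {
    private final double inputWeight;
    private final double recurrentWeight;
    private double storedState = 0.0; // default initialization when no state is present

    StatefulRnnSketch(double inputWeight, double recurrentWeight) {
        this.inputWeight = inputWeight;
        this.recurrentWeight = recurrentWeight;
    }

    // Forward pass over one or more time steps, starting from the stored state.
    // The state after the final step is kept for the next call.
    double[] timeStep(double... inputs) {
        double h = storedState;
        double[] out = new double[inputs.length];
        for (int t = 0; t < inputs.length; t++) {
            h = Math.tanh(inputWeight * inputs[t] + recurrentWeight * h);
            out[t] = h;
        }
        storedState = h;
        return out;
    }

    // Resets the stored state to the default initialization.
    void clearPreviousState() {
        storedState = 0.0;
    }
}
```

In this sketch, calling timeStep(1.0, -2.0, 3.0) once and calling timeStep three times with one value each produce identical outputs, as long as clearPreviousState() is called between the two experiments.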
-
rnnTimeStep
public INDArray[] rnnTimeStep(MemoryWorkspace outputWorkspace, INDArray... inputs)
See rnnTimeStep(INDArray...) for details.
If no memory workspace is provided, the output will be detached (not in any workspace).
If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace. This workspace must be opened by the user before calling this method - and the user is responsible for (a) closing this workspace, and (b) ensuring the output array is not used out of scope (i.e., not used after closing the workspace to which it belongs - as this is likely to cause either an exception when used, or a crash).- Parameters:
inputs
- Input activations
outputWorkspace
- Output workspace. May be null
- Returns:
- The output/activations from the network (either detached or in the specified workspace if provided)
-
rnnGetPreviousState
public Map<String,INDArray> rnnGetPreviousState(int layer)
Get the state of the RNN layer, as used in rnnTimeStep(INDArray...).
- Parameters:
layer
- Number/index of the layer.
- Returns:
- Hidden state, or null if layer is not an RNN layer
-
rnnGetPreviousState
public Map<String,INDArray> rnnGetPreviousState(String layerName)
Get the state of the RNN layer, as used in rnnTimeStep(INDArray...).
- Parameters:
layerName
- name of the layer
- Returns:
- Hidden state, or null if layer is not an RNN layer
-
rnnGetPreviousStates
public Map<String,Map<String,INDArray>> rnnGetPreviousStates()
Get a map of states for ALL RNN layers, as used in rnnTimeStep(INDArray...). Layers that are not RNN layers will not have an entry in the returned map
- Returns:
- Map of states (keyed by layer name) or null if layer is not an RNN layer
- See Also:
rnnSetPreviousStates(Map)
-
rnnSetPreviousState
public void rnnSetPreviousState(int layer, Map<String,INDArray> state)
Set the state of the RNN layer, for use in rnnTimeStep(INDArray...)
- Parameters:
layer
- The number/index of the layer.
state
- The state to set the specified layer to
-
rnnSetPreviousState
public void rnnSetPreviousState(String layerName, Map<String,INDArray> state)
Set the state of the RNN layer, for use in rnnTimeStep(INDArray...)
- Parameters:
layerName
- The name of the layer.
state
- The state to set the specified layer to
-
rnnSetPreviousStates
public void rnnSetPreviousStates(Map<String,Map<String,INDArray>> previousStates)
Set the states for all RNN layers, for use in rnnTimeStep(INDArray...)
- Parameters:
previousStates
- The previous time step states for all layers (key: layer name. Value: layer states)
- See Also:
rnnGetPreviousStates()
-
rnnClearPreviousState
public void rnnClearPreviousState()
Clear the previous state of the RNN layers (if any), used in rnnTimeStep(INDArray...)
-
doTruncatedBPTT
protected void doTruncatedBPTT(INDArray[] inputs, INDArray[] labels, INDArray[] featureMasks, INDArray[] labelMasks, LayerWorkspaceMgr workspaceMgr)
Fit the network using truncated BPTT
-
rnnActivateUsingStoredState
public Map<String,INDArray> rnnActivateUsingStoredState(INDArray[] inputs, boolean training, boolean storeLastForTBPTT)
Similar to rnnTimeStep and feedForward() methods. Difference here is that this method:
(a) like rnnTimeStep does forward pass using stored state for RNN layers, and
(b) unlike rnnTimeStep does not modify the RNN layer state
Therefore multiple calls to this method with the same input should have the same output.
Typically used during training only. Use rnnTimeStep for prediction/forward pass at test time.
- Parameters:
inputs
- Input to network
training
- Whether training or not
storeLastForTBPTT
- set to true if used as part of truncated BPTT training
- Returns:
- Activations for each layer (including input, as per feedforward() etc)
-
setLayerMaskArrays
public void setLayerMaskArrays(INDArray[] featureMaskArrays, INDArray[] labelMaskArrays)
Set the mask arrays for features and labels. Mask arrays are typically used in situations such as one-to-many and many-to-one learning with recurrent neural networks, as well as for supporting time series of varying lengths within the same minibatch.
For example, with RNN data sets with input of shape [miniBatchSize,nIn,timeSeriesLength] and outputs of shape [miniBatchSize,nOut,timeSeriesLength], the feature and label mask arrays will have shape [miniBatchSize,timeSeriesLength] and contain values 0 or 1 at each element (to specify whether a given input/example is present - or merely padding - at a given time step).
NOTE: This method is not usually used directly. Instead, the various feedForward and fit methods handle setting of masking internally.
- Parameters:
featureMaskArrays
- Mask array for features (input)
labelMaskArrays
- Mask array for labels (output)
- See Also:
clearLayerMaskArrays()
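To make the meaning of a [miniBatchSize,timeSeriesLength] mask concrete, the sketch below averages a per-time-step loss while ignoring padded steps. It is plain Java and not how DL4J applies masks internally: the class MaskSketch, the method maskedMean, and the use of 2d double arrays are assumptions for illustration.

```java
// Hypothetical illustration only: not part of DL4J.
final class MaskSketch {

    // Averages perStepLoss over the entries where mask is 1.
    // Both arrays have shape [miniBatchSize][timeSeriesLength];
    // a mask value of 0 marks a padded time step that must not contribute.
    static double maskedMean(double[][] perStepLoss, double[][] mask) {
        double sum = 0.0;
        double count = 0.0;
        for (int example = 0; example < perStepLoss.length; example++) {
            for (int t = 0; t < perStepLoss[example].length; t++) {
                sum += perStepLoss[example][t] * mask[example][t];
                count += mask[example][t];
            }
        }
        // With no unmasked steps there is nothing to average.
        return count == 0.0 ? 0.0 : sum / count;
    }
}
```

For a minibatch of two series where the second has only one real time step, the second row of the mask would be {1, 0, 0}, so only four of the six loss entries are counted.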
-
clearLayerMaskArrays
public void clearLayerMaskArrays()
Remove the mask arrays from all layers.
See setLayerMaskArrays(INDArray[], INDArray[]) for details on mask arrays.
-
rnnUpdateStateWithTBPTTState
protected void rnnUpdateStateWithTBPTTState()
Update the internal state of RNN layers after a truncated BPTT fit call
-
evaluate
public <T extends Evaluation> T evaluate(DataSetIterator iterator)
Evaluate the network (classification performance - single output ComputationGraphs only)- Parameters:
iterator
- Iterator to evaluate on- Returns:
- Evaluation object; results of evaluation on all examples in the data set
-
evaluate
public <T extends Evaluation> T evaluate(MultiDataSetIterator iterator)
Evaluate the network (classification performance - single output ComputationGraphs only)- Parameters:
iterator
- Iterator to evaluate on- Returns:
- Evaluation object; results of evaluation on all examples in the data set
-
evaluate
public <T extends Evaluation> T evaluate(DataSetIterator iterator, List<String> labelsList)
Evaluate the network on the provided data set (single output ComputationGraphs only). Used for evaluating the performance of classifiers- Parameters:
iterator
- Data to undertake evaluation on- Returns:
- Evaluation object, summarizing the results of the evaluation on the provided DataSetIterator
-
evaluate
public <T extends Evaluation> T evaluate(MultiDataSetIterator iterator, List<String> labelsList)
Evaluate the network on the provided data set (single output ComputationGraphs only). Used for evaluating the performance of classifiers- Parameters:
iterator
- Data to undertake evaluation on- Returns:
- Evaluation object, summarizing the results of the evaluation on the provided DataSetIterator
-
evaluate
public <T extends Evaluation> T evaluate(DataSetIterator iterator, List<String> labelsList, int topN)
Evaluate the network (for classification) on the provided data set, with top N accuracy in addition to standard accuracy. For 'standard' accuracy evaluation only, use topN = 1
- Parameters:
iterator
- Iterator (data) to evaluate on
labelsList
- List of labels. May be null.
topN
- N value for top N accuracy evaluation
- Returns:
- Evaluation object, summarizing the results of the evaluation on the provided DataSetIterator
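For readers unfamiliar with top N accuracy, the following plain-Java sketch shows one common way to compute it. It is not the Evaluation class: the name TopNAccuracySketch and the tie-handling rule (the true class counts as a hit when fewer than N classes score strictly higher) are assumptions made for this example.

```java
// Hypothetical illustration only: not part of DL4J.
final class TopNAccuracySketch {

    // True if the labelled class is among the n highest-scoring classes.
    // Assumption: ties are resolved in favour of the true class.
    static boolean inTopN(double[] scores, int label, int n) {
        int strictlyHigher = 0;
        for (double score : scores) {
            if (score > scores[label]) {
                strictlyHigher++;
            }
        }
        return strictlyHigher < n;
    }

    // Fraction of examples whose true class is within the top n predictions.
    // scores has shape [numExamples][numClasses].
    static double topNAccuracy(double[][] scores, int[] labels, int n) {
        int hits = 0;
        for (int i = 0; i < labels.length; i++) {
            if (inTopN(scores[i], labels[i], n)) {
                hits++;
            }
        }
        return labels.length == 0 ? 0.0 : (double) hits / labels.length;
    }
}
```

With n = 1 this reduces to standard accuracy, which matches the note above about using topN = 1.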
-
evaluate
public <T extends Evaluation> T evaluate(MultiDataSetIterator iterator, List<String> labelsList, int topN)
Evaluate the network (for classification) on the provided data set, with top N accuracy in addition to standard accuracy. For 'standard' accuracy evaluation only, use topN = 1
- Parameters:
iterator
- Iterator (data) to evaluate on
labelsList
- List of labels. May be null.
topN
- N value for top N accuracy evaluation
- Returns:
- Evaluation object, summarizing the results of the evaluation on the provided DataSetIterator
-
evaluateRegression
public <T extends RegressionEvaluation> T evaluateRegression(DataSetIterator iterator)
Evaluate the (single output layer only) network for regression performance- Parameters:
iterator
- Data to evaluate on- Returns:
- Regression evaluation
-
evaluateRegression
public <T extends RegressionEvaluation> T evaluateRegression(MultiDataSetIterator iterator)
Evaluate the (single output layer only) network for regression performance- Parameters:
iterator
- Data to evaluate on- Returns:
- Regression evaluation
-
evaluateRegression
public <T extends RegressionEvaluation> T evaluateRegression(DataSetIterator iterator, List<String> columnNames)
Evaluate the (single output layer only) network for regression performance- Parameters:
iterator
- Data to evaluate on
columnNames
- Column names for the regression evaluation. May be null.
- Returns:
- Regression evaluation
-
evaluateRegression
public <T extends RegressionEvaluation> T evaluateRegression(MultiDataSetIterator iterator, List<String> columnNames)
Evaluate the (single output layer only) network for regression performance- Parameters:
iterator
- Data to evaluate on- Returns:
- Regression evaluation
-
evaluateROC
@Deprecated public <T extends ROC> T evaluateROC(DataSetIterator iterator)
Deprecated. To be removed - use evaluateROC(DataSetIterator, int) to enforce selection of appropriate ROC/threshold configuration
-
evaluateROC
public <T extends ROC> T evaluateROC(DataSetIterator iterator, int rocThresholdSteps)
Evaluate the network (must be a binary classifier) on the specified data, using the ROC class
- Parameters:
iterator
- Data to evaluate on
rocThresholdSteps
- Number of threshold steps to use with ROC
- Returns:
- ROC evaluation on the given dataset
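The role of a threshold-step count can be sketched in plain Java as follows. This is not the ROC class: the name RocThresholdSketch, the choice of steps + 1 evenly spaced thresholds in [0, 1], and the use of >= for the positive decision are assumptions for illustration. The library's exact threshold placement may differ.

```java
// Hypothetical illustration only: not part of DL4J.
final class RocThresholdSketch {

    // Returns one {falsePositiveRate, truePositiveRate} pair per threshold,
    // for thresholds 0/steps, 1/steps, ..., steps/steps.
    static double[][] rocPoints(double[] probPositive, boolean[] isPositive, int steps) {
        int totalPositive = 0;
        int totalNegative = 0;
        for (boolean positive : isPositive) {
            if (positive) {
                totalPositive++;
            } else {
                totalNegative++;
            }
        }
        double[][] points = new double[steps + 1][2];
        for (int i = 0; i <= steps; i++) {
            double threshold = (double) i / steps;
            int truePositives = 0;
            int falsePositives = 0;
            for (int j = 0; j < probPositive.length; j++) {
                if (probPositive[j] >= threshold) {
                    if (isPositive[j]) {
                        truePositives++;
                    } else {
                        falsePositives++;
                    }
                }
            }
            points[i][0] = totalNegative == 0 ? 0.0 : (double) falsePositives / totalNegative;
            points[i][1] = totalPositive == 0 ? 0.0 : (double) truePositives / totalPositive;
        }
        return points;
    }
}
```

More steps give a finer curve at the cost of more counting work, which is the trade-off the rocThresholdSteps parameter exposes.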
-
evaluateROC
@Deprecated public <T extends ROC> T evaluateROC(MultiDataSetIterator iterator)
Deprecated. To be removed - use evaluateROC(MultiDataSetIterator, int) to enforce selection of appropriate ROC/threshold configuration
-
evaluateROC
public <T extends ROC> T evaluateROC(MultiDataSetIterator iterator, int rocThresholdSteps)
Evaluate the network (must be a binary classifier) on the specified data, using the ROC class
- Parameters:
iterator
- Data to evaluate on
rocThresholdSteps
- Number of threshold steps to use with ROC
- Returns:
- ROC evaluation on the given dataset
-
evaluateROCMultiClass
@Deprecated public <T extends ROCMultiClass> T evaluateROCMultiClass(DataSetIterator iterator)
Deprecated. To be removed - use evaluateROCMultiClass(DataSetIterator, int) to enforce selection of appropriate ROC/threshold configuration
-
evaluateROCMultiClass
public <T extends ROCMultiClass> T evaluateROCMultiClass(DataSetIterator iterator, int rocThresholdSteps)
Evaluate the network on the specified data, using the ROCMultiClass class
- Parameters:
iterator
- Data to evaluate on
rocThresholdSteps
- Number of threshold steps to use with ROCMultiClass
- Returns:
- Multi-class ROC evaluation on the given dataset
-
evaluateROCMultiClass
public <T extends ROCMultiClass> T evaluateROCMultiClass(MultiDataSetIterator iterator, int rocThresholdSteps)
Evaluate the network on the specified data, using the ROCMultiClass class
- Parameters:
iterator
- Data to evaluate on
rocThresholdSteps
- Number of threshold steps to use with ROCMultiClass
- Returns:
- Multi-class ROC evaluation on the given dataset
-
doEvaluation
public <T extends IEvaluation> T[] doEvaluation(DataSetIterator iterator, T... evaluations)
Perform evaluation on the given data (DataSetIterator) with the given IEvaluation instance
- Specified by:
doEvaluation
in interfaceNeuralNetwork
- Type Parameters:
T
- Type of the IEvaluation instance
- Parameters:
iterator
- Test data to evaluate on
evaluations
- IEvaluation instances
- Returns:
- The input IEvaluation instance, after performing evaluation on the test data
-
doEvaluation
public <T extends IEvaluation> T[] doEvaluation(MultiDataSetIterator iterator, T... evaluations)
Perform evaluation on the given data (MultiDataSetIterator) with the given IEvaluation instance
- Specified by:
doEvaluation
in interfaceNeuralNetwork
- Type Parameters:
T
- Type of the IEvaluation instance
- Parameters:
iterator
- Test data to evaluate on
evaluations
- IEvaluation instances
- Returns:
- The input IEvaluation instance, after performing evaluation on the test data
-
evaluate
public <T extends IEvaluation> Map<Integer,T[]> evaluate(DataSetIterator iterator, Map<Integer,T[]> evaluations)
Perform evaluation for networks with multiple outputs.
- Parameters:
iterator
- Data to evaluate
evaluations
- Evaluation instances. Key: the network output number (0 to numOutputs-1). Value: the IEvaluation instances to perform evaluation with, for that output only. Note that not every output needs to have an IEvaluation[] defined.
- Returns:
- The same evaluation map, after performing evaluation
-
evaluate
public <T extends IEvaluation> Map<Integer,T[]> evaluate(MultiDataSetIterator iterator, Map<Integer,T[]> evaluations)
Perform evaluation for networks with multiple outputs.
- Parameters:
iterator
- Data to evaluate
evaluations
- Evaluation instances. Key: the network output number (0 to numOutputs-1). Value: the IEvaluation instances to perform evaluation with, for that output only. Note that not every output needs to have an IEvaluation[] defined.
- Returns:
- The same evaluation map, after performing evaluation
-
summary
public String summary()
String detailing the architecture of the computation graph. Vertices are printed in topological sort order. Columns are: vertex name with layer/vertex type, nIn, nOut, total number of parameters, the shapes of the parameters, and the inputs to the vertex. Also gives information about frozen layers/vertices, if any.
- Returns:
- Summary as a string
- See Also:
memoryInfo(int, InputType...)
-
summary
public String summary(InputType... inputTypes)
String detailing the architecture of the computation graph. Also displays activation sizes when given an input type. Vertices are printed in topological sort order. Columns are: vertex name with layer/vertex type, nIn, nOut, total number of parameters, the shapes of the parameters, and the inputs to the vertex. Also gives information about frozen layers/vertices, if any.
- Returns:
- Summary as a string
- See Also:
memoryInfo(int, InputType...)
-
memoryInfo
public String memoryInfo(int minibatch, InputType... inputTypes)
Generate information regarding memory use for the network, for the given input types and minibatch size. Note that when using workspaces or CuDNN, the network should be trained for some iterations so that the memory workspaces have time to initialize. Without this, the memory requirements during training may be underestimated. Note also that this is the same information that is generated during an OOM crash when training or performing inference.- Parameters:
minibatch
- Minibatch size to estimate memory for
inputTypes
- Input types to the network
- Returns:
- A String with information about network memory use
-
clearLayersStates
public void clearLayersStates()
This method just makes sure there's no state preserved within layers
-
incrementEpochCount
public void incrementEpochCount()
Increment the epoch count (in the underlying ComputationGraphConfiguration) by 1. Note that this is done automatically when using iterator-based fitting methods, such as fit(DataSetIterator) or fit(MultiDataSetIterator). However, when using non-iterator fit methods (DataSet, MultiDataSet, INDArrays etc), the network has no way to know when one epoch ends and another starts. In such situations, this method can be used to increment the epoch counter.
Note that the epoch counter is used for situations such as some learning rate schedules, and the like. The current epoch count can be obtained using ComputationGraph.getConfiguration().getEpochCount()
-
synchronizeIterEpochCounts
protected void synchronizeIterEpochCounts()
-
getIterationCount
public int getIterationCount()
Returns the number of iterations (parameter updates) that the ComputationGraph has done- Returns:
- Number of iterations
-
getEpochCount
public int getEpochCount()
Returns the number of epochs that the ComputationGraph has done. Note that the epoch count is incremented only when fit(DataSetIterator), fit(MultiDataSetIterator), fit(DataSetIterator, int) or fit(MultiDataSetIterator, int) are used. The epoch count can also be manually incremented using incrementEpochCount()
- Returns:
- Number of epochs
-
save
public void save(File f) throws IOException
Save the ComputationGraph to a file. Restore using load(File, boolean). Note that this saves the updater (i.e., the state array for momentum/Adam/rmsprop etc), which is desirable if further training will be undertaken.
- Parameters:
f
- File to save the network to- Throws:
IOException
- See Also:
ModelSerializer for more details (and saving/loading via streams), save(File, boolean)
-
save
public void save(File f, boolean saveUpdater) throws IOException
Save the ComputationGraph to a file. Restore using load(File, boolean).
- Parameters:
f
- File to save the network to
saveUpdater
- If true: save the updater (i.e., the state array for momentum/Adam/rmsprop etc), which should usually be saved if further training is required
- Throws:
IOException
- See Also:
ModelSerializer for more details (and saving/loading via streams), save(File, boolean)
-
load
public static ComputationGraph load(File f, boolean loadUpdater) throws IOException
Restore a ComputationGraph from a file, saved using save(File) or ModelSerializer
- Parameters:
f
- File to load the network from
loadUpdater
- If true: load the updater if it is available (i.e., the state array for momentum/Adam/rmsprop etc) - use false if no further training is required, or true if further training will be undertaken
- Throws:
IOException
- See Also:
ModelSerializer for more details (and saving/loading via streams)
-
convertDataType
public ComputationGraph convertDataType(@NonNull DataType dataType)
Return a copy of the network with the parameters and activations set to use the specified (floating point) data type. If the existing datatype is the same as the requested datatype, the original network will be returned unchanged. Only floating point datatypes (DOUBLE, FLOAT, HALF) may be used.
- Parameters:
dataType
- Datatype to convert the network to- Returns:
- The network, set to use the specified datatype for the parameters and activations
-
setLearningRate
public void setLearningRate(double newLr)
Set the learning rate for all layers in the network to the specified value. Note that if any learning rate schedules are currently present, these will be removed in favor of the new (fixed) learning rate.
Note: This method is not free from a performance point of view: a proper learning rate schedule should be used in preference to calling this method at every iteration.
- Parameters:
newLr
- New learning rate for all layers- See Also:
setLearningRate(ISchedule)
,setLearningRate(String, double)
-
setLearningRate
public void setLearningRate(ISchedule newLr)
Set the learning rate schedule for all layers in the network to the specified schedule. This schedule will replace any/all existing schedules, and also any fixed learning rate values.
Note that the iteration/epoch counts will not be reset. Use ComputationGraphConfiguration#setIterationCount(int) and ComputationGraphConfiguration#setEpochCount(int) if this is required
- Parameters:
newLr
- New learning rate schedule for all layers- See Also:
setLearningRate(ISchedule)
,setLearningRate(String, double)
-
setLearningRate
public void setLearningRate(String layerName, double newLr)
Set the learning rate for a single layer in the network to the specified value. Note that if any learning rate schedules are currently present, these will be removed in favor of the new (fixed) learning rate.
Note: This method is not free from a performance point of view: a proper learning rate schedule should be used in preference to calling this method at every iteration. Note also that setLearningRate(double) should be used in preference when all layers need to be set to a new LR
- Parameters:
layerName
- Name of the layer to set the LR for
newLr
- New learning rate for a single layer
- See Also:
setLearningRate(ISchedule)
,setLearningRate(String, double)
-
setLearningRate
public void setLearningRate(String layerName, ISchedule newLr)
Set the learning rate schedule for a single layer in the network to the specified schedule.
Note also that setLearningRate(ISchedule) should be used in preference when all layers need to be set to a new LR schedule.
This schedule will replace any/all existing schedules, and also any fixed learning rate values.
Note also that the iteration/epoch counts will not be reset. Use ComputationGraphConfiguration#setIterationCount(int) and ComputationGraphConfiguration#setEpochCount(int) if this is required
- Parameters:
layerName
- Name of the layer to set the LR schedule for
newLr
- New learning rate schedule for a single layer
- See Also:
setLearningRate(ISchedule)
,setLearningRate(String, double)
-
getLearningRate
public Double getLearningRate(String layerName)
Get the current learning rate, for the specified layer, from the network. Note: If the layer has no learning rate (no parameters, or an updater without a learning rate) then null is returned- Parameters:
layerName
- Layer name- Returns:
- Learning rate for the specified layer, or null
-
layerSize
public long layerSize(int layer)
Return the layer size (number of units) for the specified layer. Note that the meaning of the "layer size" can depend on the type of layer. For example:
- DenseLayer, OutputLayer, recurrent layers: number of units (nOut configuration option)
- ConvolutionLayer: the channels (number of channels)
- Subsampling layers, global pooling layers, etc: size of 0 is always returned- Parameters:
layer
- Index of the layer to get the size of. Must be in range 0 to nLayers-1 inclusive- Returns:
- Size of the layer
-
layerInputSize
public long layerInputSize(int layer)
Return the input size (number of inputs) for the specified layer.
Note that the meaning of the "input size" can depend on the type of layer. For example:
- DenseLayer, OutputLayer, etc: the feature vector size (nIn configuration option)
- Recurrent layers: the feature vector size per time step (nIn configuration option)
- ConvolutionLayer: the channels (number of channels)
- Subsampling layers, global pooling layers, etc: size of 0 is always returned- Parameters:
layer
- Index of the layer to get the size of. Must be in range 0 to nLayers-1 inclusive- Returns:
- Size of the layer
-
layerSize
public long layerSize(String layerName)
Return the layer size (number of units) for the specified layer.
Note that the meaning of the "layer size" can depend on the type of layer. For example:
- DenseLayer, OutputLayer, recurrent layers: number of units (nOut configuration option)
- ConvolutionLayer: the channels (number of channels)
- Subsampling layers, global pooling layers, etc: size of 0 is always returned- Parameters:
layerName
- Name of the layer to get the size of- Returns:
- Size of the layer
-
layerInputSize
public long layerInputSize(String layerName)
Return the input size (number of inputs) for the specified layer.
Note that the meaning of the "input size" can depend on the type of layer. For example:
- DenseLayer, OutputLayer, etc: the feature vector size (nIn configuration option)
- Recurrent layers: the feature vector size per time step (nIn configuration option)
- ConvolutionLayer: the channels (number of channels)
- Subsampling layers, global pooling layers, etc: size of 0 is always returned- Parameters:
layerName
- Name of the layer to get the size of- Returns:
- Size of the layer
-
equals
public boolean equals(Object obj)
Indicates whether some other object is "equal to" this one.
The equals method implements an equivalence relation on non-null object references:
- It is reflexive: for any non-null reference value x, x.equals(x) should return true.
- It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
- It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
- It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
- For any non-null reference value x, x.equals(null) should return false.
The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true).
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
- Overrides:
equals
in class Object
- Parameters:
obj
- the reference object with which to compare.
- Returns:
true if this object is the same as the obj argument; false otherwise.
- See Also:
Object.hashCode(), HashMap
-
-