Package org.deeplearning4j.nn.multilayer
Class MultiLayerNetwork
- java.lang.Object
- org.deeplearning4j.nn.multilayer.MultiLayerNetwork

All Implemented Interfaces:
Serializable, Cloneable, Classifier, Layer, Model, NeuralNetwork, Trainable
public class MultiLayerNetwork extends Object implements Serializable, Classifier, Layer, NeuralNetwork
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.deeplearning4j.nn.api.Layer
Layer.TrainingMode, Layer.Type
-
-
Field Summary
Fields

protected boolean clearTbpttState
protected NeuralNetConfiguration defaultConfiguration
protected INDArray flattenedGradients
protected INDArray flattenedParams
protected Gradient gradient
protected Map<String,org.bytedeco.javacpp.Pointer> helperWorkspaces
protected boolean initCalled
protected boolean initDone
protected INDArray input
protected INDArray labels
protected ThreadLocal<Long> lastEtlTime
protected int layerIndex
protected LinkedHashMap<String,Layer> layerMap
protected Layer[] layers
protected MultiLayerConfiguration layerWiseConfigurations
protected INDArray mask
protected double score
protected Solver solver
protected Collection<TrainingListener> trainingListeners
protected static String WS_ALL_LAYERS_ACT
  Workspace for storing all layers' activations - used only to store activations (layer inputs) as part of backprop. Not used for inference.
protected static WorkspaceConfiguration WS_ALL_LAYERS_ACT_CONFIG
protected static String WS_LAYER_ACT_1
  Next 2 workspaces, used for: (a) inference: holds activations for one layer only; (b) backprop: holds activation gradients for one layer only. In both cases, they are opened and closed on every second layer.
protected static String WS_LAYER_ACT_2
protected WorkspaceConfiguration WS_LAYER_ACT_X_CONFIG
protected static String WS_LAYER_WORKING_MEM
  Workspace for working memory for a single layer: forward pass and backward pass. Note that this is opened/closed once per op (activate/backpropGradient call).
protected WorkspaceConfiguration WS_LAYER_WORKING_MEM_CONFIG
protected static String WS_OUTPUT_MEM
  Workspace for output methods that use OutputAdapter.
protected static String WS_RNN_LOOP_WORKING_MEM
  Workspace for working memory in RNNs - opened and closed once per RNN time step.
protected static WorkspaceConfiguration WS_RNN_LOOP_WORKING_MEM_CONFIG
-
Constructor Summary
Constructors

MultiLayerNetwork(String conf, INDArray params)
  Initialize the network based on the configuration (a MultiLayerConfiguration in JSON format) and parameters array
MultiLayerNetwork(MultiLayerConfiguration conf)
MultiLayerNetwork(MultiLayerConfiguration conf, INDArray params)
  Initialize the network based on the configuration and parameters array
-
Method Summary
INDArray activate(boolean training, LayerWorkspaceMgr mgr)
  Perform forward pass and return the activations array with the last set input
INDArray activate(Layer.TrainingMode training)
  Equivalent to output(INDArray) using the input set via setInput(INDArray)
INDArray activate(INDArray input, boolean training, LayerWorkspaceMgr mgr)
  Perform forward pass and return the activations array with the specified input
INDArray activate(INDArray input, Layer.TrainingMode training)
  Equivalent to output(INDArray, TrainingMode)
INDArray activateSelectedLayers(int from, int to, INDArray input)
  Calculate activations for several layers at once.
protected INDArray activationFromPrevLayer(int curr, INDArray input, boolean training, LayerWorkspaceMgr mgr)
void addListeners(TrainingListener... listeners)
  This method ADDS the given TrainingListeners to the existing listeners
void allowInputModification(boolean allow)
  A performance optimization: mark whether the layer is allowed to modify its input array in-place.
void applyConstraints(int iteration, int epoch)
  Apply any constraints to the model
Pair<Gradient,INDArray> backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)
  Calculate the gradient relative to the error in the next layer
int batchSize()
  The current input's batch size
protected Pair<Gradient,INDArray> calcBackpropGradients(INDArray epsilon, boolean withOutputLayer, boolean tbptt, boolean returnInputActGrad)
  Calculate gradients and errors.
double calcRegularizationScore(boolean backpropParamsOnly)
  Calculate the regularization component of the score for the parameters in this layer - for example, the L1, L2 and/or weight decay components of the loss function
Pair<Gradient,INDArray> calculateGradients(@NonNull INDArray features, @NonNull INDArray label, INDArray fMask, INDArray labelMask)
  Calculate parameter gradients and input activation gradients given the input and labels, and optionally mask arrays
void clear()
  Clear the inputs.
void clearLayerMaskArrays()
  Remove the mask arrays from all layers. See setLayerMaskArrays(INDArray, INDArray) for details on mask arrays.
void clearLayersStates()
  This method just makes sure there's no state preserved within layers
void clearNoiseWeightParams()
MultiLayerNetwork clone()
  Clone the MultiLayerNetwork
void close()
  Close the network and deallocate all native memory, including parameters, gradients, updater memory and workspaces. Note that the network should not be used again for any purpose after it has been closed
void computeGradientAndScore()
void computeGradientAndScore(LayerWorkspaceMgr layerWorkspaceMgr)
  Update the score
NeuralNetConfiguration conf()
  The configuration for the neural network
MultiLayerNetwork convertDataType(@NonNull DataType dataType)
  Return a copy of the network with the parameters and activations set to use the specified (floating point) data type.
<T extends IEvaluation> T[] doEvaluation(DataSetIterator iterator, T... evaluations)
  Perform evaluation using an arbitrary IEvaluation instance.
<T extends IEvaluation> T[] doEvaluation(MultiDataSetIterator iterator, T[] evaluations)
  This method executes evaluation of the model against the given iterator and evaluation implementations
<T extends IEvaluation> T[] doEvaluationHelper(DataSetIterator iterator, T... evaluations)
protected void doTruncatedBPTT(INDArray input, INDArray labels, INDArray featuresMaskArray, INDArray labelsMaskArray, LayerWorkspaceMgr workspaceMgr)
boolean equals(Object obj)
  Indicates whether some other object is "equal to" this one.
<T extends Evaluation> T evaluate(@NonNull DataSetIterator iterator)
  Evaluate the network (classification performance)
Evaluation evaluate(@NonNull MultiDataSetIterator iterator)
  Evaluate the network (classification performance).
Evaluation evaluate(DataSetIterator iterator, List<String> labelsList)
  Evaluate the network on the provided data set.
Evaluation evaluate(DataSetIterator iterator, List<String> labelsList, int topN)
  Evaluate the network (for classification) on the provided data set, with top-N accuracy in addition to standard accuracy.
<T extends RegressionEvaluation> T evaluateRegression(DataSetIterator iterator)
  Evaluate the network for regression performance
RegressionEvaluation evaluateRegression(MultiDataSetIterator iterator)
  Evaluate the network for regression performance. Can only be used with MultiDataSetIterator instances with a single input/output array
<T extends ROC> T evaluateROC(DataSetIterator iterator)
  Deprecated. To be removed - use evaluateROC(DataSetIterator, int) to enforce selection of an appropriate ROC/threshold configuration
<T extends ROC> T evaluateROC(DataSetIterator iterator, int rocThresholdSteps)
  Evaluate the network (must be a binary classifier) on the specified data, using the ROC class
<T extends ROCMultiClass> T evaluateROCMultiClass(DataSetIterator iterator)
  Deprecated. To be removed - use evaluateROCMultiClass(DataSetIterator, int) to enforce selection of an appropriate ROC/threshold configuration
<T extends ROCMultiClass> T evaluateROCMultiClass(DataSetIterator iterator, int rocThresholdSteps)
  Evaluate the network on the specified data, using the ROCMultiClass class
double f1Score(INDArray input, INDArray labels)
  Perform inference and then calculate the F1 score of the output(input) vs. the labels
double f1Score(DataSet data)
  Sets the input and labels and returns the F1 score for the prediction with respect to the true labels
List<INDArray> feedForward()
  Compute activations of all layers from input (inclusive) to the output of the final/output layer.
List<INDArray> feedForward(boolean train)
  Compute activations from input to the output of the output layer.
List<INDArray> feedForward(boolean train, boolean clearInputs)
  Perform feed-forward, optionally (not) clearing the layer input arrays. Note: when using clearInputs=false, there can be some performance and memory overhead: this is because the arrays are defined outside of workspaces (which are enabled by default) - otherwise, old/invalidated arrays could still be accessed after calling this method.
List<INDArray> feedForward(INDArray input)
  Compute activations of all layers from input (inclusive) to the output of the final/output layer.
List<INDArray> feedForward(INDArray input, boolean train)
  Compute all layer activations, from input to the output of the output layer.
List<INDArray> feedForward(INDArray input, INDArray featuresMask, INDArray labelsMask)
  Compute the activations from the input to the output layer, given mask arrays (which may be null). The masking arrays are used in situations such as one-to-many and many-to-one recurrent neural network (RNN) designs, as well as for supporting time series of varying lengths within the same minibatch for RNNs.
Pair<INDArray,MaskState> feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState, int minibatchSize)
  Feed forward the input mask array, setting it in the layers as appropriate.
List<INDArray> feedForwardToLayer(int layerNum, boolean train)
  Compute the activations from the input to the specified layer, using the currently set input for the network. To compute activations for all layers, use the feedForward(...) methods. Note: the output list includes the original input.
List<INDArray> feedForwardToLayer(int layerNum, INDArray input)
  Compute the activations from the input to the specified layer. To compute activations for all layers, use the feedForward(...) methods. Note: the output list includes the original input.
List<INDArray> feedForwardToLayer(int layerNum, INDArray input, boolean train)
  Compute the activations from the input to the specified layer. To compute activations for all layers, use the feedForward(...) methods. Note: the output list includes the original input.
protected List<INDArray> ffToLayerActivationsDetached(boolean train, @NonNull FwdPassType fwdPassType, boolean storeLastForTBPTT, int layerIndex, @NonNull INDArray input, INDArray fMask, INDArray lMask, boolean clearInputs)
  Feed-forward through the network, returning all activation arrays in a list, detached from any workspace.
protected List<INDArray> ffToLayerActivationsInWs(int layerIndex, @NonNull FwdPassType fwdPassType, boolean storeLastForTBPTT, @NonNull INDArray input, INDArray fMask, INDArray lMask)
  Feed-forward through the network at training time, returning a list of all activations in a workspace (WS_ALL_LAYERS_ACT) if workspaces are enabled for training, or detached if no workspaces are used. Note: if using workspaces for training, this method requires that WS_ALL_LAYERS_ACT is open externally. If using NO workspaces, it requires that no external workspace is open. Note that this method does NOT clear the inputs to each layer - instead, they are kept in the WS_ALL_LAYERS_ACT workspace for use in later backprop.
void fit()
  All models have a fit method
void fit(@NonNull DataSetIterator iterator, int numEpochs)
  Perform minibatch training on all minibatches in the DataSetIterator, for the specified number of epochs.
void fit(@NonNull MultiDataSetIterator iterator, int numEpochs)
  Perform minibatch training on all minibatches in the MultiDataSetIterator, for the specified number of epochs.
void fit(INDArray examples, int[] labels)
  Fit the model for one iteration on the provided data
void fit(INDArray data, LayerWorkspaceMgr workspaceMgr)
  Fit the model to the given data
void fit(INDArray data, INDArray labels)
  Fit the model for one iteration on the provided data
void fit(INDArray features, INDArray labels, INDArray featuresMask, INDArray labelsMask)
  Fit the model for one iteration on the provided data
void fit(DataSet data)
  Fit the model for one iteration on the provided data
void fit(DataSetIterator iterator)
  Perform minibatch training on all minibatches in the DataSetIterator for 1 epoch. Note that this method does not do layerwise pretraining; for pretraining, use the pretrain methods.
void fit(MultiDataSetIterator iterator)
  Perform minibatch training on all minibatches in the MultiDataSetIterator. Note: the MultiDataSets in the MultiDataSetIterator must have exactly 1 input and output array (as MultiLayerNetwork only supports 1 input and 1 output)
void fit(MultiDataSet dataSet)
  This method fits the model with a given MultiDataSet
TrainingConfig getConfig()
NeuralNetConfiguration getDefaultConfiguration()
  Intended for internal/developer use
int getEpochCount()
INDArray getGradientsViewArray()
LayerHelper getHelper()
int getIndex()
  Get the layer index.
INDArray getInput()
int getInputMiniBatchSize()
  Get current/last input mini-batch size, as set by setInputMiniBatchSize(int)
int getIterationCount()
INDArray getLabels()
long getLastEtlTime()
  Get the last ETL time.
Layer getLayer(int i)
Layer getLayer(String name)
protected static WorkspaceConfiguration getLayerActivationWSConfig(int numLayers)
List<String> getLayerNames()
Layer[] getLayers()
MultiLayerConfiguration getLayerWiseConfigurations()
  Get the configuration for the network
protected static WorkspaceConfiguration getLayerWorkingMemWSConfig(int numWorkingMemCycles)
Double getLearningRate(int layerNumber)
  Get the current learning rate, for the specified layer, from the network.
Collection<TrainingListener> getListeners()
  Get the TrainingListeners set for this network, if any
INDArray getMask()
INDArray getMaskArray()
int getnLayers()
  Get the number of layers in the network
ConvexOptimizer getOptimizer()
  Returns this model's optimizer
Layer getOutputLayer()
  Get the output layer - i.e., the last layer in the network
INDArray getParam(String param)
  Get one parameter array for the network. In MultiLayerNetwork, parameters are keyed like "0_W" and "0_b" to mean "weights of layer index 0" and "biases of layer index 0" respectively.
Collection<TrainingListener> getTrainingListeners()
  Deprecated. Use getListeners()
Updater getUpdater()
  Get the updater for this MultiLayerNetwork
Updater getUpdater(boolean initializeIfReq)
Gradient gradient()
  Get the gradient.
Pair<Gradient,Double> gradientAndScore()
  Get the gradient and score
protected boolean hasAFrozenLayer()
void incrementEpochCount()
  Increment the epoch count (in the underlying MultiLayerConfiguration) by 1.
void init()
  Initialize the MultiLayerNetwork.
void init(INDArray parameters, boolean cloneParametersArray)
  Initialize the MultiLayerNetwork, optionally with an existing parameters array.
void initGradientsView()
  This method initializes the flattened gradients array (used in backprop) and sets the appropriate subset in all layers.
INDArray input()
  The input/feature matrix for the model
protected void intializeConfigurations()
boolean isInitCalled()
boolean isPretrainLayer()
  Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc.)
int layerInputSize(int layer)
  Return the input size (number of inputs) for the specified layer. Note that the meaning of the "input size" can depend on the type of layer.
int layerSize(int layer)
  Return the layer size (number of units) for the specified layer. Note that the meaning of the "layer size" can depend on the type of layer.
static MultiLayerNetwork load(File f, boolean loadUpdater)
  Restore a MultiLayerNetwork from a file, saved using save(File) or ModelSerializer
String memoryInfo(int minibatch, InputType inputType)
  Generate information regarding memory use for the network, for the given input type and minibatch size.
int numLabels()
  Deprecated. Will be removed in a future release
long numParams()
  Returns the number of parameters in the network
long numParams(boolean backwards)
  Returns the number of parameters in the network
<T> T output(@NonNull INDArray inputs, INDArray inputMasks, INDArray labelMasks, @NonNull OutputAdapter<T> outputAdapter)
  This method uses the provided OutputAdapter to return a custom object built from the INDArray. PLEASE NOTE: this method uses a dedicated workspace for output generation to avoid redundant allocations
INDArray output(INDArray input)
  Perform inference on the provided input/features - i.e., perform a forward pass using the provided input/features and return the output of the final layer.
INDArray output(INDArray input, boolean train)
  Perform inference on the provided input/features - i.e., perform a forward pass using the provided input/features and return the output of the final layer.
INDArray output(INDArray input, boolean train, MemoryWorkspace outputWorkspace)
  Get the network output, which is optionally placed in the specified memory workspace. If no memory workspace is provided, the output will be detached (not in any workspace). If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace.
INDArray output(INDArray input, boolean train, INDArray featuresMask, INDArray labelsMask)
  Calculate the output of the network, with masking arrays.
INDArray output(INDArray input, boolean train, INDArray featuresMask, INDArray labelsMask, MemoryWorkspace outputWorkspace)
  Get the network output, which is optionally placed in the specified memory workspace. If no memory workspace is provided, the output will be detached (not in any workspace). If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace.
INDArray output(INDArray input, Layer.TrainingMode train)
  Perform inference on the provided input/features - i.e., perform a forward pass using the provided input/features and return the output of the final layer.
INDArray output(DataSetIterator iterator)
  Equivalent to output(DataSetIterator, boolean) with train=false
INDArray output(DataSetIterator iterator, boolean train)
  Generate the output for all examples/batches in the input iterator, and concatenate them into a single array.
protected INDArray outputOfLayerDetached(boolean train, @NonNull FwdPassType fwdPassType, int layerIndex, @NonNull INDArray input, INDArray featureMask, INDArray labelsMask, MemoryWorkspace outputWorkspace)
  Provide the output of the specified layer, detached from any workspace.
INDArray params()
  Returns a 1 x m vector where the vector is composed of a flattened vector of all of the parameters in the network. See getParam(String) and paramTable() for a more useful/interpretable representation of the parameters. Note that the parameter vector is not a copy, and changes to the returned INDArray will impact the network parameters.
INDArray params(boolean backwardOnly)
  Deprecated. To be removed.
Map<String,INDArray> paramTable()
  Return a map of all parameters in the network.
Map<String,INDArray> paramTable(boolean backpropParamsOnly)
  Returns a map of all parameters in the network as per paramTable(). Optionally (with backpropParamsOnly=true) only the 'backprop' parameters are returned - that is, any parameters involved only in unsupervised layerwise pretraining, not standard inference/backprop, are excluded from the returned list.
int[] predict(INDArray d)
  Usable only for classification networks in conjunction with OutputLayer.
List<String> predict(DataSet dataSet)
  As per predict(INDArray), but the returned values are looked up from the list of label names in the provided DataSet
void pretrain(DataSetIterator iter)
  Perform layerwise pretraining for one epoch - see pretrain(DataSetIterator, int)
void pretrain(DataSetIterator iter, int numEpochs)
  Perform layerwise unsupervised training on all pre-trainable layers in the network (VAEs, autoencoders, etc.), for the specified number of epochs each.
void pretrainLayer(int layerIdx, INDArray features)
  Perform layerwise unsupervised training on a single pre-trainable layer in the network (VAEs, autoencoders, etc.). If the specified layer index (0 to numLayers - 1) is not a pretrainable layer, this is a no-op.
void pretrainLayer(int layerIdx, DataSetIterator iter)
  Fit for one epoch - see pretrainLayer(int, DataSetIterator, int)
void pretrainLayer(int layerIdx, DataSetIterator iter, int numEpochs)
  Perform layerwise unsupervised training on a single pre-trainable layer in the network (VAEs, autoencoders, etc.) for the specified number of epochs. If the specified layer index (0 to numLayers - 1) is not a pretrainable layer, this is a no-op.
List<INDArray> rnnActivateUsingStoredState(INDArray input, boolean training, boolean storeLastForTBPTT)
  Similar to the rnnTimeStep and feedForward() methods.
void rnnClearPreviousState()
  Clear the previous state of the RNN layers (if any).
Map<String,INDArray> rnnGetPreviousState(int layer)
  Get the state of the RNN layer, as used in rnnTimeStep().
void rnnSetPreviousState(int layer, Map<String,INDArray> state)
  Set the state of the RNN layer.
INDArray rnnTimeStep(INDArray input)
  If this MultiLayerNetwork contains one or more RNN layers: conduct a forward pass (prediction), using the previously stored state for any RNN layers.
INDArray rnnTimeStep(INDArray input, MemoryWorkspace outputWorkspace)
  See rnnTimeStep(INDArray) for details. If no memory workspace is provided, the output will be detached (not in any workspace). If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace.
void save(File f)
  Save the MultiLayerNetwork to a file.
void save(File f, boolean saveUpdater)
  Save the MultiLayerNetwork to a file.
double score()
  Score of the model (relative to the objective function) - previously calculated on the last minibatch
double score(DataSet data)
  Sets the input and labels and calculates the score (value of the output layer loss function plus l1/l2 if applicable) for the prediction with respect to the true labels. This is equivalent to score(DataSet, boolean) with training==false.
double score(DataSet data, boolean training)
  Sets the input and labels and calculates the score (value of the output layer loss function plus l1/l2 if applicable) for the prediction with respect to the true labels
INDArray scoreExamples(DataSetIterator iter, boolean addRegularizationTerms)
  As per scoreExamples(DataSet, boolean) - the outputs (example scores) for all DataSets in the iterator are concatenated
INDArray scoreExamples(DataSet data, boolean addRegularizationTerms)
  Calculate the score for each example in a DataSet individually.
void setBackpropGradientsViewArray(INDArray gradients)
  Set the gradients array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
void setCacheMode(CacheMode mode)
  This method sets the specified CacheMode for all layers within the network
void setConf(NeuralNetConfiguration conf)
  Setter for the configuration
void setEpochCount(int epochCount)
  Set the current epoch count (number of epochs passed) for the layer/network
void setGradientsAccumulator(GradientsAccumulator accumulator)
  This method allows you to specify the GradientsAccumulator instance to be used with this model. PLEASE NOTE: do not use this method unless you understand how to use GradientsAccumulator and updates sharing. PLEASE NOTE: do not use this method on a standalone model
void setIndex(int index)
  Set the layer index.
void setInput(INDArray input)
  Set the input array for the network
void setInput(INDArray input, LayerWorkspaceMgr mgr)
  Set the layer input.
void setInputMiniBatchSize(int size)
  Set current/last input mini-batch size. Used for score and gradient calculations.
void setIterationCount(int iterationCount)
  Set the current iteration count (number of parameter updates) for the layer/network
void setLabels(INDArray labels)
void setLastEtlTime(long time)
  Set the last ETL time in milliseconds, for informational/reporting purposes.
void setLayerMaskArrays(INDArray featuresMaskArray, INDArray labelsMaskArray)
  Set the mask arrays for features and labels.
void setLayers(Layer[] layers)
void setLayerWiseConfigurations(MultiLayerConfiguration layerWiseConfigurations)
  This method is intended for internal/developer use only.
void setLearningRate(double newLr)
  Set the learning rate for all layers in the network to the specified value.
void setLearningRate(int layerNumber, double newLr)
  Set the learning rate for a single layer in the network to the specified value.
void setLearningRate(int layerNumber, ISchedule newLr)
  Set the learning rate schedule for a single layer in the network to the specified schedule. Note that setLearningRate(ISchedule) should be used in preference when all layers need to be set to a new LR schedule. This schedule will replace any/all existing schedules, and also any fixed learning rate values. Note also that the iteration/epoch counts will not be reset.
void setLearningRate(ISchedule newLr)
  Set the learning rate schedule for all layers in the network to the specified schedule.
void setListeners(Collection<TrainingListener> listeners)
  Set the TrainingListeners for the network (and all layers in the network)
void setListeners(TrainingListener... listeners)
  Set the TrainingListeners for the network (and all layers in the network)
void setMask(INDArray mask)
void setMaskArray(INDArray maskArray)
  Set the mask array.
void setParam(String key, INDArray val)
  Set the values of a single parameter.
void setParameters(INDArray params)
void setParams(INDArray params)
  Set the parameters for this model.
void setParamsViewArray(INDArray params)
  Set the initial parameters array as a view of the full (backprop) network parameters. NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.
void setParamTable(Map<String,INDArray> paramTable)
  Set the parameters of the network.
void setScore(double score)
  Intended for developer/internal use
void setUpdater(Updater updater)
  Set the updater for the MultiLayerNetwork
String summary()
  String detailing the architecture of the MultiLayerNetwork.
String summary(InputType inputType)
  String detailing the architecture of the MultiLayerNetwork.
protected void synchronizeIterEpochCounts()
ComputationGraph toComputationGraph()
  Convert this MultiLayerNetwork to a ComputationGraph
Layer.Type type()
  Returns the layer type
void update(Gradient gradient)
  Update layer weights and biases with gradient change
void update(INDArray gradient, String paramType)
  Perform one update, applying the gradient
protected void update(Task task)
boolean updaterDivideByMinibatch(String paramName)
  Intended for internal use
void updateRnnStateWithTBPTTState()
  Intended for internal/developer use
INDArray updaterState()
  This method returns the updater state (if applicable), null otherwise
protected void validateArrayWorkspaces(LayerWorkspaceMgr mgr, INDArray array, ArrayType arrayType, int layerIdx, boolean isPreprocessor, String op)
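The fit, evaluate and output methods summarized above combine into the usual train/evaluate/predict workflow. A minimal sketch (the `net`, `trainData`, `testData` and `newFeatures` variables are assumptions supplied by the caller, not part of this API):

```java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class WorkflowSketch {
    // Train for a few epochs, report classification metrics, then run inference.
    static INDArray trainEvaluatePredict(MultiLayerNetwork net,
                                         DataSetIterator trainData,
                                         DataSetIterator testData,
                                         INDArray newFeatures) {
        net.fit(trainData, 5);                    // 5 epochs of minibatch training
        Evaluation eval = net.evaluate(testData); // accuracy, precision, recall, F1
        System.out.println(eval.stats());
        return net.output(newFeatures);           // forward pass; one row per example
    }
}
```

In older releases the `Evaluation` class lives at `org.deeplearning4j.eval.Evaluation` instead; otherwise the calls are the same.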
-
-
-
Field Detail
-
layers
protected Layer[] layers
-
layerMap
protected LinkedHashMap<String,Layer> layerMap
-
input
protected INDArray input
-
labels
protected INDArray labels
-
initCalled
protected boolean initCalled
-
trainingListeners
protected Collection<TrainingListener> trainingListeners
-
defaultConfiguration
protected NeuralNetConfiguration defaultConfiguration
-
layerWiseConfigurations
protected MultiLayerConfiguration layerWiseConfigurations
-
gradient
protected Gradient gradient
-
score
protected double score
-
initDone
protected boolean initDone
-
flattenedParams
protected INDArray flattenedParams
-
flattenedGradients
protected transient INDArray flattenedGradients
-
clearTbpttState
protected boolean clearTbpttState
-
lastEtlTime
protected transient ThreadLocal<Long> lastEtlTime
-
mask
protected INDArray mask
-
layerIndex
protected int layerIndex
-
solver
protected transient Solver solver
-
WS_LAYER_WORKING_MEM
protected static final String WS_LAYER_WORKING_MEM
Workspace for working memory for a single layer: forward pass and backward pass. Note that this is opened/closed once per op (activate/backpropGradient call)
- See Also:
- Constant Field Values
-
WS_ALL_LAYERS_ACT
protected static final String WS_ALL_LAYERS_ACT
Workspace for storing all layers' activations - used only to store activations (layer inputs) as part of backprop. Not used for inference
- See Also:
- Constant Field Values
-
WS_LAYER_ACT_1
protected static final String WS_LAYER_ACT_1
Next 2 workspaces, used for: (a) inference: holds activations for one layer only; (b) backprop: holds activation gradients for one layer only. In both cases, they are opened and closed on every second layer
- See Also:
- Constant Field Values
-
WS_LAYER_ACT_2
protected static final String WS_LAYER_ACT_2
- See Also:
- Constant Field Values
-
WS_OUTPUT_MEM
protected static final String WS_OUTPUT_MEM
Workspace for output methods that use OutputAdapter
- See Also:
- Constant Field Values
-
WS_RNN_LOOP_WORKING_MEM
protected static final String WS_RNN_LOOP_WORKING_MEM
Workspace for working memory in RNNs - opened and closed once per RNN time step
- See Also:
- Constant Field Values
-
WS_LAYER_WORKING_MEM_CONFIG
protected WorkspaceConfiguration WS_LAYER_WORKING_MEM_CONFIG
-
WS_ALL_LAYERS_ACT_CONFIG
protected static final WorkspaceConfiguration WS_ALL_LAYERS_ACT_CONFIG
-
WS_LAYER_ACT_X_CONFIG
protected WorkspaceConfiguration WS_LAYER_ACT_X_CONFIG
-
WS_RNN_LOOP_WORKING_MEM_CONFIG
protected static final WorkspaceConfiguration WS_RNN_LOOP_WORKING_MEM_CONFIG
-
-
Constructor Detail
-
MultiLayerNetwork
public MultiLayerNetwork(MultiLayerConfiguration conf)
-
MultiLayerNetwork
public MultiLayerNetwork(String conf, INDArray params)
Initialize the network based on the configuration (a MultiLayerConfiguration in JSON format) and parameters array
- Parameters:
  conf - the configuration JSON
  params - the parameters for the network
-
MultiLayerNetwork
public MultiLayerNetwork(MultiLayerConfiguration conf, INDArray params)
Initialize the network based on the configuration and parameters array
- Parameters:
  conf - the configuration
  params - the parameters
-
-
Method Detail
-
getLayerWorkingMemWSConfig
protected static WorkspaceConfiguration getLayerWorkingMemWSConfig(int numWorkingMemCycles)
-
getLayerActivationWSConfig
protected static WorkspaceConfiguration getLayerActivationWSConfig(int numLayers)
-
setCacheMode
public void setCacheMode(CacheMode mode)
This method sets the specified CacheMode for all layers within the network
- Specified by:
  setCacheMode in interface Layer
- Parameters:
  mode
-
setLastEtlTime
public void setLastEtlTime(long time)
Set the last ETL time in milliseconds, for informational/reporting purposes. Generally used internally.
- Parameters:
  time - ETL time
-
getLastEtlTime
public long getLastEtlTime()
Get the last ETL time. This is informational: the amount of time in milliseconds that was required to obtain the last DataSet/MultiDataSet during fitting. A value consistently above 0 may indicate a data feeding bottleneck, or that asynchronous data prefetching is not being used (async prefetch is enabled by default).
- Returns:
- The last ETL time in milliseconds, if available (or 0 if not)
-
intializeConfigurations
protected void intializeConfigurations()
-
pretrain
public void pretrain(DataSetIterator iter)
Perform layerwise pretraining for one epoch - see pretrain(DataSetIterator, int)
-
pretrain
public void pretrain(DataSetIterator iter, int numEpochs)
Perform layerwise unsupervised training on all pre-trainable layers in the network (VAEs, Autoencoders, etc), for the specified number of epochs each. For example, if numEpochs=3, then layer 0 will be fit for 3 epochs, followed by layer 1 for 3 epochs, and so on.
Note that pretraining will be performed on one layer after the other. To perform unsupervised training on a single layer, usepretrainLayer(int, DataSetIterator)
- Parameters:
iter
- Training datanumEpochs
- Number of epochs to fit each layer for
-
pretrainLayer
public void pretrainLayer(int layerIdx, DataSetIterator iter)
Fit for one epoch - seepretrainLayer(int, DataSetIterator, int)
-
pretrainLayer
public void pretrainLayer(int layerIdx, DataSetIterator iter, int numEpochs)
Perform layerwise unsupervised training on a single pre-trainable layer in the network (VAEs, Autoencoders, etc) for the specified number of epochs
If the specified layer index (0 to numLayers - 1) is not a pretrainable layer, this is a no-op.- Parameters:
layerIdx
- Index of the layer to train (0 to numLayers-1)iter
- Training datanumEpochs
- Number of epochs to fit the specified layer for
-
pretrainLayer
public void pretrainLayer(int layerIdx, INDArray features)
Perform layerwise unsupervised training on a single pre-trainable layer in the network (VAEs, Autoencoders, etc)
If the specified layer index (0 to numLayers - 1) is not a pretrainable layer, this is a no-op.- Parameters:
layerIdx
- Index of the layer to train (0 to numLayers-1)features
- Training data array
-
batchSize
public int batchSize()
Description copied from interface:Model
The current inputs batch size
-
conf
public NeuralNetConfiguration conf()
Description copied from interface:Model
The configuration for the neural network
-
setConf
public void setConf(NeuralNetConfiguration conf)
Description copied from interface:Model
Setter for the configuration
-
input
public INDArray input()
Description copied from interface:Model
The input/feature matrix for the model
-
getOptimizer
public ConvexOptimizer getOptimizer()
Description copied from interface:Model
Returns this models optimizer- Specified by:
getOptimizer
in interfaceModel
- Specified by:
getOptimizer
in interfaceNeuralNetwork
- Returns:
- this models optimizer
-
getParam
public INDArray getParam(String param)
Get one parameter array for the network.
In MultiLayerNetwork, parameters are keyed like "0_W" and "0_b" to mean "weights of layer index 0" and "biases of layer index 0" respectively. Numbers increment sequentially, and the suffixes ("W", "b" etc) depend on the layer type, and are defined in the relevant parameter initializers for each layer.
Note that the returned INDArrays are views of the underlying network parameters, so modifications of the returned arrays will impact the parameters of the network.- Specified by:
getParam
in interfaceModel
- Parameters:
param
- the key of the parameter- Returns:
- The specified parameter array for the network
- See Also:
paramTable() method, for a map of all parameters
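The "layerIndex_suffix" key convention described above can be illustrated without the library (a sketch; the `paramKey` helper and the string-valued mock table are hypothetical, used only to show the key format that getParam(String) expects):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ParamKeyDemo {
    // Hypothetical helper: builds a parameter key in the "layerIndex_suffix"
    // format described above, e.g. paramKey(0, "W") -> "0_W"
    static String paramKey(int layerIndex, String suffix) {
        return layerIndex + "_" + suffix;
    }

    public static void main(String[] args) {
        // Mock parameter table keyed the same way MultiLayerNetwork keys its params
        Map<String, String> paramTable = new LinkedHashMap<>();
        paramTable.put(paramKey(0, "W"), "weights of layer 0");
        paramTable.put(paramKey(0, "b"), "biases of layer 0");
        paramTable.put(paramKey(1, "W"), "weights of layer 1");

        System.out.println(paramTable.containsKey("0_W"));
    }
}
```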
-
paramTable
public Map<String,INDArray> paramTable()
Return a map of all parameters in the network. Parameter names are as described ingetParam(String)
. As pergetParam(String)
the returned arrays are views - modifications to these will impact the underlying network parameters- Specified by:
paramTable
in interfaceModel
- Returns:
- A map of all parameters in the network
-
paramTable
public Map<String,INDArray> paramTable(boolean backpropParamsOnly)
Returns a map of all parameters in the network as perparamTable()
.
Optionally (with backpropParamsOnly=true) only the 'backprop' parameters are returned - that is, any parameters involved only in unsupervised layerwise pretraining not standard inference/backprop are excluded from the returned list.- Specified by:
paramTable
in interfaceModel
- Specified by:
paramTable
in interfaceTrainable
- Parameters:
backpropParamsOnly
- If true, return backprop params only. If false: return all params- Returns:
- Parameters for the network
-
updaterDivideByMinibatch
public boolean updaterDivideByMinibatch(String paramName)
Intended for internal use- Specified by:
updaterDivideByMinibatch
in interfaceTrainable
- Parameters:
paramName
- Name of the parameter- Returns:
- True if gradients should be divided by minibatch (most params); false otherwise (edge cases like batch norm mean/variance estimates)
-
setParamTable
public void setParamTable(Map<String,INDArray> paramTable)
Set the parameters of the network. Note that the parameter keys must match the format as described ingetParam(String)
andparamTable()
. Note that the values of the parameters used as an argument to this method are copied - i.e., it is safe to later modify/reuse the values in the provided paramTable without this impacting the network.- Specified by:
setParamTable
in interfaceModel
- Parameters:
paramTable
- Parameters to set
-
setParam
public void setParam(String key, INDArray val)
Set the values of a single parameter. SeesetParamTable(Map)
andgetParam(String)
for more details.
-
getLayerWiseConfigurations
public MultiLayerConfiguration getLayerWiseConfigurations()
Get the configuration for the network- Returns:
- Network configuration
-
setLayerWiseConfigurations
public void setLayerWiseConfigurations(MultiLayerConfiguration layerWiseConfigurations)
This method is intended for internal/developer use only.
-
init
public void init()
Initialize the MultiLayerNetwork. This should be called once before the network is used. This is functionally equivalent to callinginit(null, false)
.- Specified by:
init
in interfaceModel
- Specified by:
init
in interfaceNeuralNetwork
- See Also:
init(INDArray, boolean)
-
init
public void init(INDArray parameters, boolean cloneParametersArray)
Initialize the MultiLayerNetwork, optionally with an existing parameters array. If an existing parameters array is specified, it will be used (and the values will not be modified) in the network; if no parameters array is specified, parameters will be initialized randomly according to the network configuration.- Parameters:
parameters
- Network parameter. May be null. If null: randomly initialize.cloneParametersArray
- Whether the parameter array (if any) should be cloned, or used directly
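A minimal configuration-and-init sketch, assuming DL4J/ND4J are on the classpath; the layer sizes, seed and hyperparameters are illustrative only, not a recommended configuration:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Build a small illustrative configuration, then initialize the network
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)                       // reproducible random initialization
        .list()
        .layer(new DenseLayer.Builder().nIn(4).nOut(10)
                .activation(Activation.RELU).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(10).nOut(3).activation(Activation.SOFTMAX).build())
        .build();

MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();   // parameters initialized randomly, per the configuration

// Alternatively, reuse an existing flattened parameters array:
// net.init(existingParams, /* cloneParametersArray = */ true);
```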
-
setGradientsAccumulator
public void setGradientsAccumulator(GradientsAccumulator accumulator)
This method allows you to specify a GradientsAccumulator instance to be used with this model
PLEASE NOTE: Do not use this method unless you understand how to use GradientsAccumulator & updates sharing.
PLEASE NOTE: Do not use this method on standalone model- Parameters:
accumulator
- Gradient accumulator to use for the network
-
isInitCalled
public boolean isInitCalled()
-
initGradientsView
public void initGradientsView()
This method initializes the flattened gradients array (used in backprop) and sets the appropriate subset in all layers. As a general rule, this shouldn't ever need to be called manually when doing training via fit(DataSet) or fit(DataSetIterator)
-
activationFromPrevLayer
protected INDArray activationFromPrevLayer(int curr, INDArray input, boolean training, LayerWorkspaceMgr mgr)
-
activateSelectedLayers
public INDArray activateSelectedLayers(int from, int to, INDArray input)
Calculate activations for a few layers at once. Suitable for autoencoder partial activation. For example: in a 10-layer deep autoencoder, layers 0 - 4 inclusive are used for the encoding part, and layers 5 - 9 inclusive are used for the decoding part.- Parameters:
from
- first layer to be activated, inclusiveto
- last layer to be activated, inclusive- Returns:
- the activation from the last layer
-
feedForward
public List<INDArray> feedForward(INDArray input, boolean train)
Compute all layer activations, from input to output of the output layer. Note that the input is included in the list: thus feedForward(in,train).get(0) is the inputs, .get(1) is the activations of layer 0, and so on.- Parameters:
train
- Training: if true, perform forward pass/inference at training time. Usually, inference is performed with train = false. This impacts whether dropout etc is applied or not.- Returns:
- The list of activations for each layer, including the input
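The indexing convention (input at index 0, layer i's activations at index i+1) can be illustrated without the library, using a toy "network" of element-wise scaling layers; all names here are hypothetical and stand in for the real forward pass:

```java
import java.util.ArrayList;
import java.util.List;

public class FeedForwardDemo {
    // Toy "network": each layer multiplies its input by a fixed scale factor
    static List<double[]> feedForward(double[] input, double[] layerScales) {
        List<double[]> activations = new ArrayList<>();
        activations.add(input);                    // index 0: the input itself
        double[] current = input;
        for (double scale : layerScales) {
            double[] next = new double[current.length];
            for (int i = 0; i < current.length; i++) {
                next[i] = current[i] * scale;
            }
            activations.add(next);                 // index i+1: output of layer i
            current = next;
        }
        return activations;
    }

    public static void main(String[] args) {
        List<double[]> acts = feedForward(new double[]{1.0, 2.0}, new double[]{2.0, 3.0});
        // acts.get(0) is the input, acts.get(1) is layer 0's output, acts.get(2) is layer 1's
        System.out.println(acts.size());
    }
}
```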
-
feedForward
public List<INDArray> feedForward(boolean train)
Compute activations from input to output of the output layer. As perfeedForward(INDArray, boolean)
but using the inputs that have previously been set usingsetInput(INDArray)
- Returns:
- the list of activations for each layer
-
feedForward
public List<INDArray> feedForward(boolean train, boolean clearInputs)
Perform feed-forward, optionally (not) clearing the layer input arrays.
Note: when using clearInputs=false, there can be some performance and memory overhead: this is because the arrays are defined outside of workspaces (which are enabled by default) - otherwise, old/invalidated arrays could still be accessed after calling this method. Consequently: Don't use clearInputs=false unless you have a use case that requires them to remain after feed-forward has been completed- Parameters:
train
- training mode (true) or test mode (false)clearInputs
- If false: don't clear the layer inputs- Returns:
- Activations from feed-forward
-
feedForwardToLayer
public List<INDArray> feedForwardToLayer(int layerNum, INDArray input)
Compute the activations from the input to the specified layer.
To compute activations for all layers, use feedForward(...) methods
Note: output list includes the original input. So list.get(0) is always the original input, and list.get(i+1) is the activations of the ith layer.- Parameters:
layerNum
- Index of the last layer to calculate activations for. Layers are zero-indexed. feedForwardToLayer(i,input) will return the activations for layers 0..i (inclusive)input
- Input to the network- Returns:
- list of activations.
-
feedForwardToLayer
public List<INDArray> feedForwardToLayer(int layerNum, INDArray input, boolean train)
Compute the activations from the input to the specified layer.
To compute activations for all layers, use feedForward(...) methods
Note: output list includes the original input. So list.get(0) is always the original input, and list.get(i+1) is the activations of the ith layer.- Parameters:
layerNum
- Index of the last layer to calculate activations for. Layers are zero-indexed. feedForwardToLayer(i,input) will return the activations for layers 0..i (inclusive)input
- Input to the networktrain
- true for training, false for test (i.e., false if using network after training)- Returns:
- list of activations.
-
feedForwardToLayer
public List<INDArray> feedForwardToLayer(int layerNum, boolean train)
Compute the activations from the input to the specified layer, using the currently set input for the network.
To compute activations for all layers, use feedForward(...) methods
Note: output list includes the original input. So list.get(0) is always the original input, and list.get(i+1) is the activations of the ith layer.- Parameters:
layerNum
- Index of the last layer to calculate activations for. Layers are zero-indexed. feedForwardToLayer(i,input) will return the activations for layers 0..i (inclusive)train
- true for training, false for test (i.e., false if using network after training)- Returns:
- list of activations.
-
validateArrayWorkspaces
protected void validateArrayWorkspaces(LayerWorkspaceMgr mgr, INDArray array, ArrayType arrayType, int layerIdx, boolean isPreprocessor, String op)
-
ffToLayerActivationsDetached
protected List<INDArray> ffToLayerActivationsDetached(boolean train, @NonNull FwdPassType fwdPassType, boolean storeLastForTBPTT, int layerIndex, @NonNull INDArray input, INDArray fMask, INDArray lMask, boolean clearInputs)
Feed-forward through the network - returning all array activations in a list, detached from any workspace. Note that no workspace should be active externally when calling this method (an exception will be thrown if a workspace is open externally)- Parameters:
train
- Training mode (true) or test/inference mode (false)fwdPassType
- Type of forward pass to perform (STANDARD or RNN_ACTIVATE_WITH_STORED_STATE only)storeLastForTBPTT
- ONLY used if fwdPassType == FwdPassType.RNN_ACTIVATE_WITH_STORED_STATElayerIndex
- Index (inclusive) to stop forward pass at. For all layers, use numLayers-1input
- Input to the networkfMask
- Feature mask array. May be null.lMask
- Label mask array. May be null.clearInputs
- Whether the layer inputs should be cleared- Returns:
- List of activations (including the input), detached from any workspace
-
ffToLayerActivationsInWs
protected List<INDArray> ffToLayerActivationsInWs(int layerIndex, @NonNull FwdPassType fwdPassType, boolean storeLastForTBPTT, @NonNull INDArray input, INDArray fMask, INDArray lMask)
Feed-forward through the network at training time - returning a list of all activations in a workspace (WS_ALL_LAYERS_ACT) if workspaces are enabled for training; or detached if no workspaces are used.
Note: if using workspaces for training, this method requires that WS_ALL_LAYERS_ACT is open externally.
If using NO workspaces, requires that no external workspace is open
Note that this method does NOT clear the inputs to each layer - instead, they are in the WS_ALL_LAYERS_ACT workspace for use in later backprop.- Parameters:
layerIndex
- Index (inclusive) to stop forward pass at. For all layers, use numLayers-1fwdPassType
- Type of forward pass to perform (STANDARD or RNN_ACTIVATE_WITH_STORED_STATE only)storeLastForTBPTT
- ONLY used if fwdPassType == FwdPassType.RNN_ACTIVATE_WITH_STORED_STATEinput
- Input to networkfMask
- Feature mask array. May be nulllMask
- Label mask array. May be null.- Returns:
-
outputOfLayerDetached
protected INDArray outputOfLayerDetached(boolean train, @NonNull FwdPassType fwdPassType, int layerIndex, @NonNull INDArray input, INDArray featureMask, INDArray labelsMask, MemoryWorkspace outputWorkspace)
Provide the output of the specified layer, detached from any workspace. This is most commonly used at inference/test time, and is more memory efficient thanffToLayerActivationsDetached(boolean, FwdPassType, boolean, int, INDArray, INDArray, INDArray, boolean)
andffToLayerActivationsInWs(int, FwdPassType, boolean, INDArray, INDArray, INDArray)
.
This method clears all layer inputs. NOTE: in general, no workspaces should be activated externally for this method! This method handles the workspace activation as required- Parameters:
train
- Training mode (true) or test/inference mode (false)fwdPassType
- Type of forward pass to perform (STANDARD, RNN_TIMESTEP or RNN_ACTIVATE_WITH_STORED_STATE)layerIndex
- Index (inclusive) to stop forward pass at. For all layers, use numLayers-1input
- Input to the networkfeatureMask
- Input/feature mask array. May be null.labelsMask
- Labels mask array. May be nulloutputWorkspace
- Optional - if provided, outputs should be placed in this workspace. NOTE: this workspace must be open- Returns:
- Output of the specified layer, detached from any workspace
-
feedForward
public List<INDArray> feedForward()
Compute activations of all layers from input (inclusive) to output of the final/output layer. Equivalent to callingfeedForward(boolean)
with train=false- Returns:
- the list of activations for each layer, including the input
-
feedForward
public List<INDArray> feedForward(INDArray input)
Compute activations of all layers from input (inclusive) to output of the final/output layer. Equivalent to callingfeedForward(INDArray, boolean)
with train = false- Returns:
- the list of activations for each layer, including the input
-
feedForward
public List<INDArray> feedForward(INDArray input, INDArray featuresMask, INDArray labelsMask)
Compute the activations from the input to the output layer, given mask arrays (that may be null) The masking arrays are used in situations such as one-to-many and many-to-one recurrent neural network (RNN) designs, as well as for supporting time series of varying lengths within the same minibatch for RNNs. Other than mask arrays, this is equivalent to callingfeedForward(INDArray, boolean)
with train = false
-
gradient
public Gradient gradient()
Description copied from interface:Model
Get the gradient. Note that this method will not calculate the gradient, it will rather return the gradient that has been computed before. For calculating the gradient, seeModel.computeGradientAndScore(LayerWorkspaceMgr)
} .
-
gradientAndScore
public Pair<Gradient,Double> gradientAndScore()
Description copied from interface:Model
Get the gradient and score- Specified by:
gradientAndScore
in interfaceModel
- Returns:
- the gradient and score
-
clone
public MultiLayerNetwork clone()
Clone the MultiLayerNetwork
-
hasAFrozenLayer
protected boolean hasAFrozenLayer()
-
params
@Deprecated public INDArray params(boolean backwardOnly)
Deprecated.To be removed. Useparams()
instead
-
params
public INDArray params()
Returns a 1 x m row vector containing all of the parameters in the network, flattened into a single vector.
SeegetParam(String)
andparamTable()
for a more useful/interpretable representation of the parameters.
Note that the parameter vector is not a copy, and changes to the returned INDArray will impact the network parameters.
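DL4J implements this with ND4J views over one flattened array; the same aliasing idea can be sketched in plain Java with `java.nio` buffer slices (the layer split points here are arbitrary, for illustration only):

```java
import java.nio.DoubleBuffer;

public class ParamsViewDemo {
    // Demonstrates that a "layer view" and the flattened array share storage,
    // analogous to params() returning a view of the network parameters
    static double demoSharedStorage() {
        // Flattened parameter storage for a toy 2-layer "network"
        DoubleBuffer flattened = DoubleBuffer.allocate(5);

        // Per-layer view into the same underlying storage (elements 0..2)
        flattened.position(0).limit(3);
        DoubleBuffer layer0 = flattened.slice();
        flattened.clear();

        // Writing through the flattened view is visible in the layer view
        flattened.put(0, 42.0);
        return layer0.get(0);   // same storage, so this reads 42.0
    }

    public static void main(String[] args) {
        System.out.println(demoSharedStorage());
    }
}
```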
-
setParams
public void setParams(INDArray params)
Set the parameters for this model. This expects a linear ndarray which is then unpacked internally relative to the expected ordering of the model.
See also:setParamTable(Map)
andsetParam(String, INDArray)
-
setParamsViewArray
public void setParamsViewArray(INDArray params)
Description copied from interface:Model
Set the initial parameters array as a view of the full (backprop) network parameters NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.- Specified by:
setParamsViewArray
in interfaceModel
- Parameters:
params
- a 1 x nParams row vector that is a view of the larger (MLN/CG) parameters array
-
getGradientsViewArray
public INDArray getGradientsViewArray()
- Specified by:
getGradientsViewArray
in interfaceModel
- Specified by:
getGradientsViewArray
in interfaceTrainable
- Returns:
- 1D gradients view array
-
setBackpropGradientsViewArray
public void setBackpropGradientsViewArray(INDArray gradients)
Description copied from interface:Model
Set the gradients array as a view of the full (backprop) network parameters NOTE: this is intended to be used internally in MultiLayerNetwork and ComputationGraph, not by users.- Specified by:
setBackpropGradientsViewArray
in interfaceModel
- Parameters:
gradients
- a 1 x nParams row vector that is a view of the larger (MLN/CG) gradients array
-
getConfig
public TrainingConfig getConfig()
-
numParams
public long numParams()
Returns the number of parameters in the network
-
numParams
public long numParams(boolean backwards)
Returns the number of parameters in the network
-
f1Score
public double f1Score(DataSet data)
Sets the input and labels and returns the F1 score for the prediction with respect to the true labels- Specified by:
f1Score
in interfaceClassifier
- Parameters:
data
- the data to score- Returns:
- the score for the given input,label pairs
-
fit
public void fit(@NonNull DataSetIterator iterator, int numEpochs)
Perform minibatch training on all minibatches in the DataSetIterator, for the specified number of epochs. Equivalent to callingfit(DataSetIterator)
numEpochs times in a loop- Parameters:
iterator
- Training data (DataSetIterator). Iterator must support resettingnumEpochs
- Number of training epochs, >= 1
-
fit
public void fit(DataSetIterator iterator)
Perform minibatch training on all minibatches in the DataSetIterator for 1 epoch.
Note that this method does not do layerwise pretraining.
For pretraining, usepretrain(DataSetIterator)
- Specified by:
fit
in interfaceClassifier
- Specified by:
fit
in interfaceNeuralNetwork
- Parameters:
iterator
- Training data (DataSetIterator)
-
calculateGradients
public Pair<Gradient,INDArray> calculateGradients(@NonNull INDArray features, @NonNull INDArray label, INDArray fMask, INDArray labelMask)
Calculate parameter gradients and input activation gradients given the input and labels, and optionally mask arrays- Parameters:
features
- Features for gradient calculationlabel
- Labels for gradientfMask
- Features mask array (may be null)labelMask
- Label mask array (may be null)- Returns:
- A pair of gradient arrays: parameter gradients (in Gradient object) and input activation gradients
-
calcBackpropGradients
protected Pair<Gradient,INDArray> calcBackpropGradients(INDArray epsilon, boolean withOutputLayer, boolean tbptt, boolean returnInputActGrad)
Calculate gradients and errors. Used in two places: (a) backprop (for standard multi layer network learning) (b) backpropGradient (layer method, for when MultiLayerNetwork is used as a layer)- Parameters:
epsilon
- Errors (technically errors .* activations). Not used if withOutputLayer = truewithOutputLayer
- if true: assume last layer is output layer, and calculate errors based on labels. In this case, the epsilon input is not used (may/should be null). If false: calculate backprop gradientsreturnInputActGrad
- If true: return the input activation gradients (detached). If false: don't return- Returns:
- Gradients and the error (epsilon) at the input
-
doTruncatedBPTT
protected void doTruncatedBPTT(INDArray input, INDArray labels, INDArray featuresMaskArray, INDArray labelsMaskArray, LayerWorkspaceMgr workspaceMgr)
-
updateRnnStateWithTBPTTState
public void updateRnnStateWithTBPTTState()
Intended for internal/developer use
-
getListeners
public Collection<TrainingListener> getListeners()
Get theTrainingListener
s set for this network, if any- Specified by:
getListeners
in interfaceLayer
- Returns:
- listeners set for this network
-
getTrainingListeners
@Deprecated public Collection<TrainingListener> getTrainingListeners()
Deprecated.UsegetListeners()
-
setListeners
public void setListeners(Collection<TrainingListener> listeners)
Description copied from interface:Model
Set the trainingListeners for the ComputationGraph (and all layers in the network)- Specified by:
setListeners
in interfaceLayer
- Specified by:
setListeners
in interfaceModel
-
addListeners
public void addListeners(TrainingListener... listeners)
This method ADDS additional TrainingListener to existing listeners- Specified by:
addListeners
in interfaceModel
- Parameters:
listeners
-
-
setListeners
public void setListeners(TrainingListener... listeners)
Description copied from interface:Model
Set the trainingListeners for the ComputationGraph (and all layers in the network)- Specified by:
setListeners
in interfaceLayer
- Specified by:
setListeners
in interfaceModel
-
predict
public int[] predict(INDArray d)
Usable only for classification networks in conjunction with OutputLayer. Cannot be used with RnnOutputLayer, CnnLossLayer, or networks used for regression.
To get the raw output activations of the output layer, useoutput(INDArray)
or similar.
Equivalent to argmax(this.output(input)): Returns the predicted class indices corresponding to the predictions for each example in the features array.- Specified by:
predict
in interfaceClassifier
- Parameters:
d
- The input features to perform inference on- Returns:
- The predicted class index for each example
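The argmax reduction described above can be shown on a plain 2D array standing in for the output-layer activations (the helper name is hypothetical; predict itself operates on INDArrays):

```java
public class PredictDemo {
    // Same reduction as argmax(this.output(input)): the index of the largest
    // activation in each row is the predicted class for that example
    static int[] argmaxPerRow(double[][] outputActivations) {
        int[] classes = new int[outputActivations.length];
        for (int r = 0; r < outputActivations.length; r++) {
            int best = 0;
            for (int c = 1; c < outputActivations[r].length; c++) {
                if (outputActivations[r][c] > outputActivations[r][best]) {
                    best = c;
                }
            }
            classes[r] = best;
        }
        return classes;
    }

    public static void main(String[] args) {
        // Two examples, three classes: softmax-like output activations
        double[][] out = {
            {0.1, 0.7, 0.2},   // largest activation at index 1
            {0.6, 0.3, 0.1}    // largest activation at index 0
        };
        int[] predicted = argmaxPerRow(out);
        System.out.println(predicted[0] + " " + predicted[1]);
    }
}
```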
-
predict
public List<String> predict(DataSet dataSet)
As perpredict(INDArray)
but the returned values are looked up from the list of label names in the provided DataSet- Specified by:
predict
in interfaceClassifier
- Parameters:
dataSet
- the examples to classify- Returns:
- the labels for each example
-
fit
public void fit(INDArray data, INDArray labels)
Fit the model for one iteration on the provided data- Specified by:
fit
in interfaceClassifier
- Parameters:
data
- the examples to classify (one example in each row)labels
- the example labels(a binary outcome matrix)
-
fit
public void fit(INDArray features, INDArray labels, INDArray featuresMask, INDArray labelsMask)
Fit the model for one iteration on the provided data- Parameters:
features
- the examples to classify (one example in each row)labels
- the example labels(a binary outcome matrix)featuresMask
- The mask array for the features (used for variable length time series, etc). May be null.labelsMask
- The mask array for the labels (used for variable length time series, etc). May be null.
-
fit
public void fit(INDArray data, LayerWorkspaceMgr workspaceMgr)
Description copied from interface:Model
Fit the model to the given data
-
fit
public void fit(DataSet data)
Fit the model for one iteration on the provided data- Specified by:
fit
in interfaceClassifier
- Specified by:
fit
in interfaceNeuralNetwork
- Parameters:
data
- the data to train on
-
fit
public void fit(INDArray examples, int[] labels)
Fit the model for one iteration on the provided data- Specified by:
fit
in interfaceClassifier
- Parameters:
examples
- the examples to classify (one example in each row)labels
- the labels for each example (the number of labels must match the number of rows in the examples array)
-
output
public INDArray output(INDArray input, Layer.TrainingMode train)
Perform inference on the provided input/features - i.e., perform forward pass using the provided input/features and return the output of the final layer.- Parameters:
input
- Input to the networktrain
- whether the output is test or train. This mainly affects hyperparameters such as dropout and batch normalization, which have different behaviour for test vs. train- Returns:
- The network predictions - i.e., the activations of the final layer
-
output
public INDArray output(INDArray input, boolean train)
Perform inference on the provided input/features - i.e., perform forward pass using the provided input/features and return the output of the final layer.- Parameters:
input
- Input to the networktrain
- whether the output is test or train. This mainly affects hyperparameters such as dropout and batch normalization, which have different behaviour for test vs. train- Returns:
- The network predictions - i.e., the activations of the final layer
-
output
public INDArray output(INDArray input, boolean train, INDArray featuresMask, INDArray labelsMask)
Calculate the output of the network, with masking arrays. The masking arrays are used in situations such as one-to-many and many-to-one recurrent neural network (RNN) designs, as well as for supporting time series of varying lengths within the same minibatch.
-
output
public INDArray output(INDArray input, boolean train, MemoryWorkspace outputWorkspace)
Get the network output, which is optionally placed in the specified memory workspace.
If no memory workspace is provided, the output will be detached (not in any workspace).
If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace. This workspace must be opened by the user before calling this method - and the user is responsible for (a) closing this workspace, and (b) ensuring the output array is not used out of scope (i.e., not used after closing the workspace to which it belongs - as this is likely to cause either an exception when used, or a crash).- Parameters:
input
- Input to the networktrain
- True for train, false otherwiseoutputWorkspace
- May be null. If not null: the workspace MUST be opened before calling this method.- Returns:
- The output/activations from the network (either detached or in the specified workspace if provided)
-
output
public INDArray output(INDArray input, boolean train, INDArray featuresMask, INDArray labelsMask, MemoryWorkspace outputWorkspace)
Get the network output, which is optionally placed in the specified memory workspace.
If no memory workspace is provided, the output will be detached (not in any workspace).
If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace. This workspace must be opened by the user before calling this method - and the user is responsible for (a) closing this workspace, and (b) ensuring the output array is not used out of scope (i.e., not used after closing the workspace to which it belongs - as this is likely to cause either an exception when used, or a crash).- Parameters:
input
- Input to the networktrain
- True for train, false otherwiseoutputWorkspace
- May be null. If not null: the workspace MUST be opened before calling this method.- Returns:
- The output/activations from the network (either detached or in the specified workspace if provided)
-
output
public <T> T output(@NonNull INDArray inputs, INDArray inputMasks, INDArray labelMasks, @NonNull OutputAdapter<T> outputAdapter)
This method uses provided OutputAdapter to return custom object built from INDArray PLEASE NOTE: This method uses dedicated Workspace for output generation to avoid redundant allocations- Type Parameters:
T
- T extends Object- Parameters:
inputs
- Input arrays to the networkinputMasks
- Optional input mask arrays (may be null)labelMasks
- Optional label mask arrays (may be null)outputAdapter
- OutputAdapter instance- Returns:
- T instance produced by OutputAdapter
-
output
public INDArray output(INDArray input)
Perform inference on the provided input/features - i.e., perform forward pass using the provided input/features and return the output of the final layer. Equivalent tooutput(INDArray, boolean)
with train=false - i.e., this method is used for inference.- Parameters:
input
- Input to the network- Returns:
- The network predictions - i.e., the activations of the final layer
-
output
public INDArray output(DataSetIterator iterator, boolean train)
Generate the output for all examples/batches in the input iterator, and concatenate them into a single array. Seeoutput(INDArray)
NOTE 1: The output array can require a considerable amount of memory for iterators with a large number of examples
NOTE 2: This method cannot be used for variable length time series outputs, as this would require padding arrays for some outputs, or returning a mask array (which cannot be done with this method). For variable length time series applications, use one of the other output methods. This method also cannot be used with fully convolutional networks with different output sizes (for example, segmentation on different input image sizes).- Parameters:
iterator
- Data to pass through the network- Returns:
- output for all examples in the iterator, concatenated into a single array
-
output
public INDArray output(DataSetIterator iterator)
Equivalent tooutput(DataSetIterator, boolean)
with train=false
-
f1Score
public double f1Score(INDArray input, INDArray labels)
Perform inference and then calculate the F1 score of the output(input) vs. the labels.- Specified by:
f1Score
in interfaceClassifier
- Parameters:
input
- the input to perform inference withlabels
- the true labels- Returns:
- the score for the given input,label pairs
-
numLabels
@Deprecated public int numLabels()
Deprecated.Will be removed in a future release. Description copied from interface:Classifier
Returns the number of possible labels- Specified by:
numLabels
in interfaceClassifier
- Returns:
- the number of possible labels for this classifier
-
score
public double score(DataSet data)
Sets the input and labels and calculates the score (value of the output layer loss function plus l1/l2 if applicable) for the prediction with respect to the true labels
This is equivalent toscore(DataSet, boolean)
with training==false.- Parameters:
data
- the data to score- Returns:
- the score for the given input,label pairs
- See Also:
score(DataSet, boolean)
-
score
public double score(DataSet data, boolean training)
Sets the input and labels and calculates the score (value of the output layer loss function plus l1/l2 if applicable) for the prediction with respect to the true labels- Parameters:
data
- data to calculate score fortraining
- If true: score during training. If false: score at test time. This can affect the application of certain features, such as dropout and dropconnect (which are applied at training time only)- Returns:
- the score (value of the loss function)
-
scoreExamples
public INDArray scoreExamples(DataSetIterator iter, boolean addRegularizationTerms)
As perscoreExamples(DataSet, boolean)
- the outputs (example scores) for all DataSets in the iterator are concatenated
-
scoreExamples
public INDArray scoreExamples(DataSet data, boolean addRegularizationTerms)
Calculate the score for each example in a DataSet individually. Unlike score(DataSet)
and score(DataSet, boolean)
, this method does not average/sum over examples. This method allows examples to be scored individually (at test time only), which may be useful, for example, for autoencoder architectures and the like.
Each row of the output (assuming addRegularizationTerms == true) is equivalent to calling score(DataSet) with a single example. - Parameters:
data
- The data to score
addRegularizationTerms
- If true: add l1/l2 regularization terms (if any) to the score. If false: don't add regularization terms - Returns:
- An INDArray (column vector) of size input.numRows(); the ith entry is the score (loss value) of the ith example
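As a concrete sketch (assuming DL4J and ND4J are on the classpath; the tiny network and random data below are illustrative, not part of this API doc), per-example scoring looks like:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Tiny 4-in / 3-out classifier, purely for illustration
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .list()
        .layer(new DenseLayer.Builder().nIn(4).nOut(8).activation(Activation.TANH).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(8).nOut(3).activation(Activation.SOFTMAX).build())
        .build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();

// 5 random examples with one-hot labels
INDArray features = Nd4j.rand(5, 4);
INDArray labels = Nd4j.zeros(5, 3);
for (int i = 0; i < 5; i++) {
    labels.putScalar(i, i % 3, 1.0);
}

// One loss value per example (regularization terms included): column vector [5, 1]
INDArray exampleScores = net.scoreExamples(new DataSet(features, labels), true);
```

Unlike score(DataSet), no averaging is performed, so unusually high entries can be used to flag anomalous examples (e.g., with autoencoder architectures).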
-
fit
public void fit()
Description copied from interface:Model
All models have a fit method
-
update
public void update(INDArray gradient, String paramType)
Description copied from interface:Model
Perform one update applying the gradient
-
score
public double score()
Score of the model (relative to the objective function) - previously calculated on the last minibatch
-
setScore
public void setScore(double score)
Intended for developer/internal use
-
computeGradientAndScore
public void computeGradientAndScore(LayerWorkspaceMgr layerWorkspaceMgr)
Description copied from interface:Model
Update the score- Specified by:
computeGradientAndScore
in interfaceModel
-
computeGradientAndScore
public void computeGradientAndScore()
-
clear
public void clear()
Clear the inputs. Clears optimizer state.
-
applyConstraints
public void applyConstraints(int iteration, int epoch)
Description copied from interface:Model
Apply any constraints to the model- Specified by:
applyConstraints
in interfaceModel
-
setInput
public void setInput(INDArray input)
Set the input array for the network- Parameters:
input
- Input array to set
-
setInput
public void setInput(INDArray input, LayerWorkspaceMgr mgr)
Description copied from interface:Layer
Set the layer input.
-
getOutputLayer
public Layer getOutputLayer()
Get the output layer - i.e., the last layer in the network - Returns:
-
setParameters
public void setParameters(INDArray params)
-
getDefaultConfiguration
public NeuralNetConfiguration getDefaultConfiguration()
Intended for internal/developer use
-
getLabels
public INDArray getLabels()
-
getInput
public INDArray getInput()
-
setLabels
public void setLabels(INDArray labels)
- Parameters:
labels
- Labels to set
-
getnLayers
public int getnLayers()
Get the number of layers in the network- Returns:
- the number of layers in the network
-
getLayers
public Layer[] getLayers()
- Returns:
- The layers in the network
-
getLayer
public Layer getLayer(int i)
-
setLayers
public void setLayers(Layer[] layers)
-
getMask
public INDArray getMask()
-
setMask
public void setMask(INDArray mask)
-
getMaskArray
public INDArray getMaskArray()
- Specified by:
getMaskArray
in interfaceLayer
-
isPretrainLayer
public boolean isPretrainLayer()
Description copied from interface:Layer
Returns true if the layer can be trained in an unsupervised/pretrain manner (AE, VAE, etc)- Specified by:
isPretrainLayer
in interfaceLayer
- Returns:
- true if the layer can be pretrained (using fit(INDArray)), false otherwise
-
clearNoiseWeightParams
public void clearNoiseWeightParams()
- Specified by:
clearNoiseWeightParams
in interfaceLayer
-
allowInputModification
public void allowInputModification(boolean allow)
Description copied from interface:Layer
A performance optimization: mark whether the layer is allowed to modify its input array in-place. In many cases, this is totally safe - in others, the input array will be shared by multiple layers, and hence it's not safe to modify the input array. This is usually used by ops such as dropout.- Specified by:
allowInputModification
in interfaceLayer
- Parameters:
allow
- If true: the input array is safe to modify. If false: the input array should be copied before it is modified (i.e., in-place modifications are unsafe)
-
feedForwardMaskArray
public Pair<INDArray,MaskState> feedForwardMaskArray(INDArray maskArray, MaskState currentMaskState, int minibatchSize)
Description copied from interface:Layer
Feed forward the input mask array, setting it in the layer as appropriate. This allows different layers to handle masks differently - for example, bidirectional RNNs and normal RNNs operate differently with masks (the former sets activations to 0 outside of the data-present region, and keeps the mask active for future layers such as dense layers, whereas normal RNNs don't zero out the activations/errors, instead relying on backpropagated error arrays to handle the variable-length case).
This is also used, for example, for networks that contain global pooling layers, arbitrary preprocessors, etc. - Specified by:
feedForwardMaskArray
in interfaceLayer
- Parameters:
maskArray
- Mask array to set
currentMaskState
- Current state of the mask - see MaskState
minibatchSize
- Current minibatch size. Needs to be known as it cannot always be inferred from the activations array due to reshaping (such as a DenseLayer within a recurrent neural network) - Returns:
- New mask array after this layer, along with the new mask state.
-
getHelper
public LayerHelper getHelper()
-
type
public Layer.Type type()
Description copied from interface:Layer
Returns the layer type
-
activate
public INDArray activate(Layer.TrainingMode training)
Equivalent to output(INDArray)
using the input set via setInput(INDArray)
-
activate
public INDArray activate(INDArray input, Layer.TrainingMode training)
Equivalent to output(INDArray, TrainingMode)
-
backpropGradient
public Pair<Gradient,INDArray> backpropGradient(INDArray epsilon, LayerWorkspaceMgr workspaceMgr)
Description copied from interface:Layer
Calculate the gradient relative to the error in the next layer- Specified by:
backpropGradient
in interfaceLayer
- Parameters:
epsilon
- w^(L+1)*delta^(L+1). Or equivalently dC/da, i.e., (dC/dz)*(dz/da) = dC/da, where C is the cost function and a = sigma(z) is the activation.
workspaceMgr
- Workspace manager - Returns:
- Pair<Gradient, INDArray>, where Gradient is the gradient for this layer and INDArray is the epsilon (activation gradient) needed by the next layer, but before the element-wise multiply by sigmaPrime(z). So for a standard feed-forward layer, if this layer is L, then return.getSecond() == dL/dIn = (w^(L)*(delta^(L))^T)^T. Note that the returned array should be placed in the ArrayType.ACTIVATION_GRAD
workspace via the workspace manager
-
setIndex
public void setIndex(int index)
Description copied from interface:Layer
Set the layer index.
-
getIndex
public int getIndex()
Description copied from interface:Layer
Get the layer index.
-
getIterationCount
public int getIterationCount()
- Specified by:
getIterationCount
in interfaceLayer
- Returns:
- The current iteration count (number of parameter updates) for the layer/network
-
getEpochCount
public int getEpochCount()
- Specified by:
getEpochCount
in interfaceLayer
- Returns:
- The current epoch count (number of training epochs passed) for the layer/network
-
setIterationCount
public void setIterationCount(int iterationCount)
Description copied from interface:Layer
Set the current iteration count (number of parameter updates) for the layer/network- Specified by:
setIterationCount
in interfaceLayer
-
setEpochCount
public void setEpochCount(int epochCount)
Description copied from interface:Layer
Set the current epoch count (number of epochs passed) for the layer/network - Specified by:
setEpochCount
in interfaceLayer
-
calcRegularizationScore
public double calcRegularizationScore(boolean backpropParamsOnly)
Description copied from interface:Layer
Calculate the regularization component of the score, for the parameters in this layer
For example, the L1, L2 and/or weight decay components of the loss function- Specified by:
calcRegularizationScore
in interfaceLayer
- Parameters:
backpropParamsOnly
- If true: calculate the regularization score based on backprop params only. If false: calculate based on all params (including pretrain params, if any) - Returns:
- the regularization score for the parameters in this layer
-
update
public void update(Gradient gradient)
Description copied from interface:Model
Update layer weights and biases with gradient change
-
activate
public INDArray activate(boolean training, LayerWorkspaceMgr mgr)
Description copied from interface:Layer
Perform forward pass and return the activations array with the last set input- Specified by:
activate
in interfaceLayer
- Parameters:
training
- training or test mode
mgr
- Workspace manager - Returns:
- the activation (layer output) of the last specified input. Note that the returned array should be placed
in the
ArrayType.ACTIVATIONS
workspace via the workspace manager
-
activate
public INDArray activate(INDArray input, boolean training, LayerWorkspaceMgr mgr)
Description copied from interface:Layer
Perform forward pass and return the activations array with the specified input- Specified by:
activate
in interfaceLayer
- Parameters:
input
- the input to use
training
- train or test mode
mgr
- Workspace manager. - Returns:
- Activations array. Note that the returned array should be placed in the
ArrayType.ACTIVATIONS
workspace via the workspace manager
-
setInputMiniBatchSize
public void setInputMiniBatchSize(int size)
Description copied from interface:Layer
Set current/last input mini-batch size.
Used for score and gradient calculations. Mini batch size may be different from getInput().size(0) due to reshaping operations - for example, when using RNNs with DenseLayer and OutputLayer. Called automatically during forward pass.- Specified by:
setInputMiniBatchSize
in interfaceLayer
-
getInputMiniBatchSize
public int getInputMiniBatchSize()
Description copied from interface:Layer
Get current/last input mini-batch size, as set by setInputMiniBatchSize(int)- Specified by:
getInputMiniBatchSize
in interfaceLayer
- See Also:
Layer.setInputMiniBatchSize(int)
-
setMaskArray
public void setMaskArray(INDArray maskArray)
Description copied from interface:Layer
Set the mask array. Note: in general, Layer.feedForwardMaskArray(INDArray, MaskState, int)
should be used in preference to this. - Specified by:
setMaskArray
in interfaceLayer
- Parameters:
maskArray
- Mask array to set
-
rnnTimeStep
public INDArray rnnTimeStep(INDArray input)
If this MultiLayerNetwork contains one or more RNN layers: conduct forward pass (prediction) but using previous stored state for any RNN layers. The activations for the final step are also stored in the RNN layers for use next time rnnTimeStep() is called.
This method can be used to generate output one or more steps at a time instead of always having to do forward pass from t=0. Example uses are for streaming data, and for generating samples from network output one step at a time (where samples are then fed back into the network as input)
If no previous state is present in RNN layers (i.e., initially or after calling rnnClearPreviousState()), the default initialization (usually 0) is used.
Supports mini-batch (i.e., multiple predictions/forward pass in parallel) as well as for single examples.- Parameters:
input
- Input to network. May be for one or multiple time steps. For single time step: input has shape [miniBatchSize,inputSize] or [miniBatchSize,inputSize,1]. miniBatchSize=1 for single example.
For multiple time steps: [miniBatchSize,inputSize,inputTimeSeriesLength]- Returns:
- Output activations. If output is RNN layer (such as RnnOutputLayer): if input has shape [miniBatchSize,inputSize]
i.e., is 2d, output has shape [miniBatchSize,outputSize] (i.e., also 2d).
Otherwise output is 3d [miniBatchSize,outputSize,inputTimeSeriesLength] when using RnnOutputLayer. - See Also:
rnnTimeStep(INDArray, MemoryWorkspace) - for outputting the activations in a specified workspace
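A minimal generative loop over rnnTimeStep() might look like the following sketch (assumes DL4J on the classpath; the LSTM network and random seed input are illustrative only, not part of this API doc):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Minimal LSTM network: 3 inputs -> 3 softmax outputs, for illustration only
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .list()
        .layer(new LSTM.Builder().nIn(3).nOut(8).activation(Activation.TANH).build())
        .layer(new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(8).nOut(3).activation(Activation.SOFTMAX).build())
        .build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();

// Generate 4 steps one at a time, feeding each output back in as the next
// input; the LSTM keeps its internal state between rnnTimeStep() calls
INDArray step = Nd4j.rand(1, 3);      // single example, single time step: [1, 3]
INDArray out = null;
for (int t = 0; t < 4; t++) {
    out = net.rnnTimeStep(step);      // 2d input -> 2d output: [1, 3]
    step = out;
}
net.rnnClearPreviousState();          // reset stored state before a new sequence
```

Because the layer state persists between calls, each call advances the sequence by exactly the number of time steps supplied.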
-
rnnTimeStep
public INDArray rnnTimeStep(INDArray input, MemoryWorkspace outputWorkspace)
See rnnTimeStep(INDArray)
for details
If no memory workspace is provided, the output will be detached (not in any workspace).
If a memory workspace is provided, the output activation array (i.e., the INDArray returned by this method) will be placed in the specified workspace. This workspace must be opened by the user before calling this method - and the user is responsible for (a) closing this workspace, and (b) ensuring the output array is not used out of scope (i.e., not used after closing the workspace to which it belongs - as this is likely to cause either an exception when used, or a crash).- Parameters:
input
- Input activations
outputWorkspace
- Output workspace. May be null - Returns:
- The output/activations from the network (either detached or in the specified workspace if provided)
-
rnnGetPreviousState
public Map<String,INDArray> rnnGetPreviousState(int layer)
Get the state of the RNN layer, as used in rnnTimeStep().- Parameters:
layer
- Number/index of the layer.- Returns:
- Hidden state, or null if layer is not an RNN layer
-
rnnSetPreviousState
public void rnnSetPreviousState(int layer, Map<String,INDArray> state)
Set the state of the RNN layer.- Parameters:
layer
- The number/index of the layer.
state
- The state to set the specified layer to
-
rnnClearPreviousState
public void rnnClearPreviousState()
Clear the previous state of the RNN layers (if any).
-
rnnActivateUsingStoredState
public List<INDArray> rnnActivateUsingStoredState(INDArray input, boolean training, boolean storeLastForTBPTT)
Similar to rnnTimeStep and feedForward() methods. Difference here is that this method:
(a) like rnnTimeStep does forward pass using stored state for RNN layers, and
(b) unlike rnnTimeStep does not modify the RNN layer state
Therefore multiple calls to this method with the same input should have the same output.
Typically used during training only. Use rnnTimeStep for prediction/forward pass at test time.- Parameters:
input
- Input to network
training
- Whether training or not
storeLastForTBPTT
- set to true if used as part of truncated BPTT training - Returns:
- Activations for each layer (including input, as per feedforward() etc)
-
getUpdater
public Updater getUpdater()
Get the updater for this MultiLayerNetwork- Returns:
- Updater for MultiLayerNetwork
-
getUpdater
public Updater getUpdater(boolean initializeIfReq)
-
setUpdater
public void setUpdater(Updater updater)
Set the updater for the MultiLayerNetwork
-
setLayerMaskArrays
public void setLayerMaskArrays(INDArray featuresMaskArray, INDArray labelsMaskArray)
Set the mask arrays for features and labels. Mask arrays are typically used in situations such as one-to-many and many-to-one learning with recurrent neural networks, as well as for supporting time series of varying lengths within the same minibatch.
For example, with RNN data sets with input of shape [miniBatchSize,nIn,timeSeriesLength] and outputs of shape [miniBatchSize,nOut,timeSeriesLength], the features and mask arrays will have shape [miniBatchSize,timeSeriesLength] and contain values 0 or 1 at each element (to specify whether a given input/example is present - or merely padding - at a given time step).
NOTE: This method is not usually used directly. Instead, methods such as feedForward(INDArray, INDArray, INDArray)
and output(INDArray, boolean, INDArray, INDArray)
handle setting of masking internally. - Parameters:
featuresMaskArray
- Mask array for features (input)
labelsMaskArray
- Mask array for labels (output)- See Also:
clearLayerMaskArrays()
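Since masking is normally handled internally, a typical variable-length-sequence call goes through the output(...) overload rather than this method. A sketch (assumes DL4J on the classpath; the LSTM network and sizes are illustrative, not from this API doc):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Illustrative LSTM network: 3 inputs -> 3 softmax outputs per time step
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .list()
        .layer(new LSTM.Builder().nIn(3).nOut(8).activation(Activation.TANH).build())
        .layer(new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(8).nOut(3).activation(Activation.SOFTMAX).build())
        .build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();

// Two sequences padded to length 5; the second is genuinely only 3 steps long
INDArray features = Nd4j.rand(new int[]{2, 3, 5});   // [miniBatchSize, nIn, timeSeriesLength]
INDArray featuresMask = Nd4j.ones(2, 5);             // [miniBatchSize, timeSeriesLength]
featuresMask.putScalar(1, 3, 0.0);                   // time steps 3 and 4 of example 1
featuresMask.putScalar(1, 4, 0.0);                   // are padding, not real data

// The mask arrays are set (and cleared again) internally by this overload
INDArray out = net.output(features, false, featuresMask, null);
```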
-
clearLayerMaskArrays
public void clearLayerMaskArrays()
Remove the mask arrays from all layers.
See setLayerMaskArrays(INDArray, INDArray)
for details on mask arrays.
-
evaluate
public <T extends Evaluation> T evaluate(@NonNull DataSetIterator iterator)
Evaluate the network (classification performance)- Parameters:
iterator
- Iterator to evaluate on- Returns:
- Evaluation object; results of evaluation on all examples in the data set
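An end-to-end evaluation sketch (assumes DL4J on the classpath; the random data is illustrative - on real data you would use a proper DataSetIterator. Note the Evaluation class moved to org.nd4j.evaluation.classification in recent releases; older versions use org.deeplearning4j.eval):

```java
import org.deeplearning4j.datasets.iterator.impl.ListDataSetIterator;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.evaluation.classification.Evaluation;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Illustrative 4-in / 3-out classifier
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .list()
        .layer(new DenseLayer.Builder().nIn(4).nOut(8).activation(Activation.TANH).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(8).nOut(3).activation(Activation.SOFTMAX).build())
        .build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();

// 30 random examples with one-hot labels, served in minibatches of 10
INDArray features = Nd4j.rand(30, 4);
INDArray labels = Nd4j.zeros(30, 3);
for (int i = 0; i < 30; i++) {
    labels.putScalar(i, i % 3, 1.0);
}
DataSetIterator iter = new ListDataSetIterator<>(new DataSet(features, labels).asList(), 10);

net.fit(iter, 2);                   // train for 2 epochs
iter.reset();
Evaluation eval = net.evaluate(iter);
System.out.println(eval.stats());   // accuracy, precision, recall, F1, confusion matrix
double acc = eval.accuracy();
```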
-
evaluate
public Evaluation evaluate(@NonNull MultiDataSetIterator iterator)
Evaluate the network (classification performance). Can only be used with MultiDataSetIterator instances with a single input/output array- Parameters:
iterator
- Iterator to evaluate on- Returns:
- Evaluation object; results of evaluation on all examples in the data set
-
evaluateRegression
public <T extends RegressionEvaluation> T evaluateRegression(DataSetIterator iterator)
Evaluate the network for regression performance- Parameters:
iterator
- Data to evaluate on- Returns:
- Regression evaluation
-
evaluateRegression
public RegressionEvaluation evaluateRegression(MultiDataSetIterator iterator)
Evaluate the network for regression performance Can only be used with MultiDataSetIterator instances with a single input/output array- Parameters:
iterator
- Data to evaluate on
-
evaluateROC
@Deprecated public <T extends ROC> T evaluateROC(DataSetIterator iterator)
Deprecated. To be removed - use evaluateROC(DataSetIterator, int)
to enforce selection of appropriate ROC/threshold configuration
-
evaluateROC
public <T extends ROC> T evaluateROC(DataSetIterator iterator, int rocThresholdSteps)
Evaluate the network (must be a binary classifier) on the specified data, using the ROC
class - Parameters:
iterator
- Data to evaluate on
rocThresholdSteps
- Number of threshold steps to use with ROC
- see that class for details. - Returns:
- ROC evaluation on the given dataset
-
evaluateROCMultiClass
@Deprecated public <T extends ROCMultiClass> T evaluateROCMultiClass(DataSetIterator iterator)
Deprecated. To be removed - use evaluateROCMultiClass(DataSetIterator, int)
to enforce selection of appropriate ROC/threshold configuration
-
evaluateROCMultiClass
public <T extends ROCMultiClass> T evaluateROCMultiClass(DataSetIterator iterator, int rocThresholdSteps)
Evaluate the network on the specified data, using the ROCMultiClass
class - Parameters:
iterator
- Data to evaluate on
rocThresholdSteps
- Number of threshold steps to use with ROCMultiClass
- Returns:
- Multi-class ROC evaluation on the given dataset
-
doEvaluation
public <T extends IEvaluation> T[] doEvaluation(DataSetIterator iterator, T... evaluations)
Perform evaluation using an arbitrary IEvaluation instance.- Specified by:
doEvaluation
in interfaceNeuralNetwork
- Parameters:
iterator
- data to evaluate on
-
doEvaluationHelper
public <T extends IEvaluation> T[] doEvaluationHelper(DataSetIterator iterator, T... evaluations)
-
evaluate
public Evaluation evaluate(DataSetIterator iterator, List<String> labelsList)
Evaluate the network on the provided data set. Used for evaluating the performance of classifiers- Parameters:
iterator
- Data to undertake evaluation on- Returns:
- Evaluation object, summarizing the results of the evaluation on the provided DataSetIterator
-
updaterState
public INDArray updaterState()
Description copied from interface:NeuralNetwork
This method returns updater state (if applicable), null otherwise- Specified by:
updaterState
in interfaceNeuralNetwork
- Returns:
-
fit
public void fit(MultiDataSet dataSet)
Description copied from interface:NeuralNetwork
This method fits model with a given MultiDataSet- Specified by:
fit
in interfaceNeuralNetwork
-
fit
public void fit(@NonNull MultiDataSetIterator iterator, int numEpochs)
Perform minibatch training on all minibatches in the MultiDataSetIterator, for the specified number of epochs. Equivalent to calling fit(MultiDataSetIterator)
numEpochs times in a loop - Parameters:
iterator
- Training data (MultiDataSetIterator). Iterator must support resetting
numEpochs
- Number of training epochs, >= 1
-
fit
public void fit(MultiDataSetIterator iterator)
Perform minibatch training on all minibatches in the MultiDataSetIterator.
Note: The MultiDataSets in the MultiDataSetIterator must have exactly 1 input and output array (as MultiLayerNetwork only supports 1 input and 1 output)- Specified by:
fit
in interfaceNeuralNetwork
- Parameters:
iterator
- Training data (MultiDataSetIterator). Iterator must support resetting
-
doEvaluation
public <T extends IEvaluation> T[] doEvaluation(MultiDataSetIterator iterator, T[] evaluations)
Description copied from interface:NeuralNetwork
This method executes evaluation of the model against given iterator and evaluation implementations- Specified by:
doEvaluation
in interfaceNeuralNetwork
-
evaluate
public Evaluation evaluate(DataSetIterator iterator, List<String> labelsList, int topN)
Evaluate the network (for classification) on the provided data set, with top N accuracy in addition to standard accuracy. For 'standard' accuracy evaluation only, use topN = 1- Parameters:
iterator
- Iterator (data) to evaluate on
labelsList
- List of labels. May be null.
topN
- N value for top N accuracy evaluation- Returns:
- Evaluation object, summarizing the results of the evaluation on the provided DataSetIterator
-
update
protected void update(Task task)
-
summary
public String summary()
String detailing the architecture of the MultiLayerNetwork. Columns are: LayerIndex with layer type, nIn, nOut, total number of parameters, and the shapes of the parameters. Will also give information about frozen layers, if any. - Returns:
- Summary as a string
- See Also:
memoryInfo(int, InputType)
-
summary
public String summary(InputType inputType)
String detailing the architecture of the MultiLayerNetwork. Will also display activation sizes when given an input type. Columns are: LayerIndex with layer type, nIn, nOut, total number of parameters, shapes of the parameters, input activation shape, output activation shape. Will also give information about frozen layers, if any. - Returns:
- Summary as a string
- See Also:
memoryInfo(int, InputType)
-
memoryInfo
public String memoryInfo(int minibatch, InputType inputType)
Generate information regarding memory use for the network, for the given input type and minibatch size. Note that when using workspaces or CuDNN, the network should be trained for some iterations so that the memory workspaces have time to initialize. Without this, the memory requirements during training may be underestimated. Note also that this is the same information that is generated during an OOM crash when training or performing inference.- Parameters:
minibatch
- Minibatch size to estimate memory for
inputType
- Input type to the network- Returns:
- A String with information about network memory use information
-
clearLayersStates
public void clearLayersStates()
This method just makes sure there's no state preserved within layers
-
incrementEpochCount
public void incrementEpochCount()
Increment the epoch count (in the underlying MultiLayerConfiguration
) by 1. Note that this is done automatically when using iterator-based fitting methods, such as fit(DataSetIterator)
. However, when using non-iterator fit methods (DataSet, INDArray/INDArray etc), the network has no way to know when one epoch ends and another starts. In such situations, this method can be used to increment the epoch counter.
Note that the epoch counter is used for situations such as some learning rate schedules, and the like. The current epoch count can be obtained using MultiLayerConfiguration.getLayerwiseConfiguration().getEpochCount()
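A sketch of manual epoch counting with the non-iterator fit(DataSet) method (assumes DL4J on the classpath; the network and random data are illustrative, not from this API doc):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Illustrative 4-in / 3-out classifier
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .list()
        .layer(new DenseLayer.Builder().nIn(4).nOut(8).activation(Activation.TANH).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(8).nOut(3).activation(Activation.SOFTMAX).build())
        .build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();

// The whole training set as a single DataSet (12 random labelled examples)
INDArray features = Nd4j.rand(12, 4);
INDArray labels = Nd4j.zeros(12, 3);
for (int i = 0; i < 12; i++) {
    labels.putScalar(i, i % 3, 1.0);
}
DataSet all = new DataSet(features, labels);

for (int epoch = 0; epoch < 3; epoch++) {
    net.fit(all);               // fit(DataSet) cannot detect epoch boundaries...
    net.incrementEpochCount();  // ...so advance the counter manually
}
int epochs = net.getEpochCount();
```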
-
synchronizeIterEpochCounts
protected void synchronizeIterEpochCounts()
-
save
public void save(File f) throws IOException
Save the MultiLayerNetwork to a file. Restore using load(File, boolean)
. Note that this saves the updater (i.e., the state array for momentum/Adam/rmsprop etc), which is desirable if further training will be undertaken.- Parameters:
f
- File to save the network to- Throws:
IOException
- See Also:
ModelSerializer for more details (and saving/loading via streams)
, save(File, boolean)
-
save
public void save(File f, boolean saveUpdater) throws IOException
Save the MultiLayerNetwork to a file. Restore using load(File, boolean)
.- Parameters:
f
- File to save the network to
saveUpdater
- If true: save the updater (i.e., the state array for momentum/Adam/rmsprop etc), which should usually be saved if further training is required- Throws:
IOException
- See Also:
ModelSerializer for more details (and saving/loading via streams)
, save(File, boolean)
-
load
public static MultiLayerNetwork load(File f, boolean loadUpdater) throws IOException
Restore a MultiLayerNetwork from a file, saved using save(File)
or ModelSerializer
- Parameters:
f
- File to load the network from
loadUpdater
- If true: load the updater if it is available (i.e., the state array for momentum/Adam/rmsprop etc) - use false if no further training is required, or true if further training will be undertaken- Throws:
IOException
- See Also:
ModelSerializer for more details (and saving/loading via streams)
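A save/load round trip might look like the following sketch (assumes DL4J on the classpath and a context where IOException may propagate, e.g. a method declaring throws IOException; the network is illustrative):

```java
import java.io.File;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Illustrative 4-in / 3-out classifier
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .list()
        .layer(new DenseLayer.Builder().nIn(4).nOut(8).activation(Activation.TANH).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(8).nOut(3).activation(Activation.SOFTMAX).build())
        .build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();

File f = File.createTempFile("mln-roundtrip", ".zip");
net.save(f, true);   // keep updater state so training can resume later
MultiLayerNetwork restored = MultiLayerNetwork.load(f, true);

// The restored network carries identical parameters
boolean sameParams = restored.params().equals(net.params());
f.delete();
```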
-
toComputationGraph
public ComputationGraph toComputationGraph()
Convert this MultiLayerNetwork to a ComputationGraph- Returns:
- ComputationGraph equivalent to this network (including parameters and updater state)
-
convertDataType
public MultiLayerNetwork convertDataType(@NonNull DataType dataType)
Return a copy of the network with the parameters and activations set to use the specified (floating point) data type. If the existing datatype is the same as the requested datatype, the original network will be returned unchanged. Only floating point datatypes (DOUBLE, FLOAT, HALF) may be used. - Parameters:
dataType
- Datatype to convert the network to- Returns:
- The network, set to use the specified datatype for the parameters and activations
-
setLearningRate
public void setLearningRate(double newLr)
Set the learning rate for all layers in the network to the specified value. Note that if any learning rate schedules are currently present, these will be removed in favor of the new (fixed) learning rate.
Note: This method is not free from a performance point of view: a proper learning rate schedule should be used in preference to calling this method at every iteration. - Parameters:
newLr
- New learning rate for all layers- See Also:
setLearningRate(ISchedule)
, setLearningRate(int, double)
-
setLearningRate
public void setLearningRate(ISchedule newLr)
Set the learning rate schedule for all layers in the network to the specified schedule. This schedule will replace any/all existing schedules, and also any fixed learning rate values.
Note that the iteration/epoch counts will not be reset. Use MultiLayerConfiguration#setIterationCount(int)
and MultiLayerConfiguration.setEpochCount(int)
if this is required - Parameters:
newLr
- New learning rate schedule for all layers- See Also:
setLearningRate(ISchedule)
, setLearningRate(int, double)
-
setLearningRate
public void setLearningRate(int layerNumber, double newLr)
Set the learning rate for a single layer in the network to the specified value. Note that if any learning rate schedules are currently present, these will be removed in favor of the new (fixed) learning rate.
Note: This method is not free from a performance point of view: a proper learning rate schedule should be used in preference to calling this method at every iteration. Note also that setLearningRate(double)
should be used in preference when all layers need to be set to a new LR - Parameters:
layerNumber
- Number of the layer to set the LR for
newLr
- New learning rate for a single layer- See Also:
setLearningRate(ISchedule)
, setLearningRate(int, double)
-
setLearningRate
public void setLearningRate(int layerNumber, ISchedule newLr)
Set the learning rate schedule for a single layer in the network to the specified value.
Note also that setLearningRate(ISchedule)
should be used in preference when all layers need to be set to a new LR schedule.
This schedule will replace any/all existing schedules, and also any fixed learning rate values.
Note also that the iteration/epoch counts will not be reset. Use MultiLayerConfiguration#setIterationCount(int)
and MultiLayerConfiguration.setEpochCount(int)
if this is required- Parameters:
layerNumber
- Number of the layer to set the LR schedule for
newLr
- New learning rate for a single layer- See Also:
setLearningRate(ISchedule)
, setLearningRate(int, double)
-
getLearningRate
public Double getLearningRate(int layerNumber)
Get the current learning rate, for the specified layer, from the network. Note: If the layer has no learning rate (no parameters, or an updater without a learning rate) then null is returned- Parameters:
layerNumber
- Layer number to get the learning rate for- Returns:
- Learning rate for the specified layer, or null
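For example (a sketch assuming DL4J on the classpath; the two-layer network and the learning rate values are illustrative):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// Illustrative two-layer network
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .list()
        .layer(new DenseLayer.Builder().nIn(4).nOut(8).activation(Activation.TANH).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(8).nOut(3).activation(Activation.SOFTMAX).build())
        .build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();

net.setLearningRate(0.005);          // fixed LR for every layer; removes any schedules
net.setLearningRate(1, 0.001);       // override just layer 1 (the output layer)
Double lr0 = net.getLearningRate(0); // would be null if the layer had no learning rate
Double lr1 = net.getLearningRate(1);
```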
-
layerSize
public int layerSize(int layer)
Return the layer size (number of units) for the specified layer.
Note that the meaning of the "layer size" can depend on the type of layer. For example:
- DenseLayer, OutputLayer, recurrent layers: number of units (nOut configuration option)
- ConvolutionLayer: the number of channels
- Subsampling layers, global pooling layers, etc: a size of 0 is always returned - Parameters:
layer
- Index of the layer to get the size of. Must be in range 0 to nLayers-1 inclusive- Returns:
- Size of the layer
-
layerInputSize
public int layerInputSize(int layer)
Return the input size (number of inputs) for the specified layer.
Note that the meaning of the "input size" can depend on the type of layer. For example:
- DenseLayer, OutputLayer, etc: the feature vector size (nIn configuration option)
- Recurrent layers: the feature vector size per time step (nIn configuration option)
- ConvolutionLayer: the number of channels
- Subsampling layers, global pooling layers, etc: a size of 0 is always returned - Parameters:
layer
- Index of the layer to get the size of. Must be in range 0 to nLayers-1 inclusive- Returns:
- Size of the layer
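For the illustrative two-layer network below (a sketch assuming DL4J on the classpath), these accessors return the configured nIn/nOut values:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.lossfunctions.LossFunctions;

// DenseLayer(4 -> 8) followed by OutputLayer(8 -> 3)
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .list()
        .layer(new DenseLayer.Builder().nIn(4).nOut(8).activation(Activation.TANH).build())
        .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .nIn(8).nOut(3).activation(Activation.SOFTMAX).build())
        .build();
MultiLayerNetwork net = new MultiLayerNetwork(conf);
net.init();

int hiddenSize = net.layerSize(0);       // nOut of the DenseLayer
int hiddenIn   = net.layerInputSize(0);  // nIn of the DenseLayer
int outSize    = net.layerSize(1);       // nOut of the OutputLayer
```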
-
equals
public boolean equals(Object obj)
Indicates whether some other object is "equal to" this one. The equals method implements an equivalence relation on non-null object references:
- It is reflexive: for any non-null reference value x, x.equals(x) should return true.
- It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.
- It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true.
- It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified.
- For any non-null reference value x, x.equals(null) should return false.
The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true).
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
- Overrides:
equals
in class Object
- Parameters:
obj
- the reference object with which to compare. - Returns:
true if this object is the same as the obj argument; false otherwise. - See Also:
Object.hashCode()
, HashMap
-
-