Class FineTuneConfiguration.Builder
- java.lang.Object
  - org.deeplearning4j.nn.transferlearning.FineTuneConfiguration.Builder
- Enclosing class: FineTuneConfiguration

public static class FineTuneConfiguration.Builder extends Object
-
-
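Typical usage (a minimal sketch, not part of the original page; the hyperparameter values are illustrative and "pretrained" is a hypothetical network loaded elsewhere): the builder produces a FineTuneConfiguration, which is applied to a pretrained model via the TransferLearning API.

    import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
    import org.deeplearning4j.nn.transferlearning.FineTuneConfiguration;
    import org.deeplearning4j.nn.transferlearning.TransferLearning;
    import org.nd4j.linalg.activations.Activation;
    import org.nd4j.linalg.learning.config.Adam;

    // Overrides applied on top of the pretrained network's configuration:
    FineTuneConfiguration fineTune = new FineTuneConfiguration.Builder()
            .updater(new Adam(1e-4))           // small learning rate for fine-tuning
            .activation(Activation.LEAKYRELU)  // replace the activation function
            .seed(12345)                       // RNG seed for reproducibility
            .build();

    // "pretrained" is a placeholder MultiLayerNetwork loaded elsewhere:
    MultiLayerNetwork fineTuned = new TransferLearning.Builder(pretrained)
            .fineTuneConfiguration(fineTune)
            .build();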
Field Summary
Fields (modifier and type, field name):
- protected List<Regularization> regularization
- protected List<Regularization> regularizationBias
- protected boolean removeL1
- protected boolean removeL1Bias
- protected boolean removeL2
- protected boolean removeL2Bias
- protected boolean removeWD
- protected boolean removeWDBias
-
Constructor Summary
Constructors:
- Builder()
-
Method Summary
All Methods | Instance Methods | Concrete Methods | Deprecated Methods
All methods below return FineTuneConfiguration.Builder, except build(), which returns FineTuneConfiguration.
- activation(Activation activation)
  Activation function / neuron non-linearity.
- activation(IActivation activationFn)
  Activation function / neuron non-linearity.
- backprop(boolean backprop)
- backpropType(BackpropType backpropType)
  The type of backprop.
- biasInit(double biasInit)
  Constant for bias initialization.
- biasUpdater(IUpdater biasUpdater)
  Gradient updater configuration, for the biases only.
- build()
- constraints(List<LayerConstraint> constraints)
  Set constraints to be applied to all layers.
- convolutionMode(ConvolutionMode convolutionMode)
  Sets the convolution mode for convolutional layers, which impacts padding and output sizes.
- cudnnAlgoMode(ConvolutionLayer.AlgoMode cudnnAlgoMode)
  Sets the cuDNN algo mode for convolutional layers, which impacts performance and memory usage of cuDNN.
- dist(Distribution dist)
  Deprecated.
- dropout(IDropout dropout)
  Set the dropout.
- dropOut(double inputRetainProbability)
  Dropout probability (the probability of retaining each input activation value for a layer).
- gradientNormalization(GradientNormalization gradientNormalization)
  Gradient normalization strategy.
- gradientNormalizationThreshold(double gradientNormalizationThreshold)
  Threshold for gradient normalization; only used for GradientNormalization.ClipL2PerLayer, GradientNormalization.ClipL2PerParamType, and GradientNormalization.ClipElementWiseAbsoluteValue. This is the L2 threshold for the first two types of clipping, or the absolute value threshold for the last type.
- inferenceWorkspaceMode(WorkspaceMode inferenceWorkspaceMode)
  Defines the workspace mode used during inference (NONE: workspace won't be used; ENABLED: workspaces will be used, with reduced memory and better performance).
- l1(double l1)
  L1 regularization coefficient for the weights (excluding biases).
- l1Bias(double l1Bias)
  L1 regularization coefficient for the bias parameters.
- l2(double l2)
  L2 regularization coefficient for the weights (excluding biases). Note: generally, WeightDecay (set via weightDecay(double, boolean)) should be preferred to L2 regularization.
- l2Bias(double l2Bias)
  L2 regularization coefficient for the bias parameters. Note: generally, WeightDecay (set via weightDecayBias(double, boolean)) should be preferred to L2 regularization.
- maxNumLineSearchIterations(int maxNumLineSearchIterations)
- miniBatch(boolean miniBatch)
  Whether scores and gradients should be divided by the minibatch size. Most users should leave this as the default value of true.
- minimize(boolean minimize)
- optimizationAlgo(OptimizationAlgorithm optimizationAlgo)
- pretrain(boolean pretrain)
- seed(int seed)
  RNG seed for reproducibility.
- seed(long seed)
  RNG seed for reproducibility.
- stepFunction(StepFunction stepFunction)
- tbpttBackLength(int tbpttBackLength)
  When doing truncated BPTT: how many steps backward should be taken? Only applicable with backpropType(BackpropType.TruncatedBPTT). This is the k2 parameter on page 23 of http://www.cs.utoronto.ca/~ilya/pubs/ilya_sutskever_phd_thesis.pdf
- tbpttFwdLength(int tbpttFwdLength)
  When doing truncated BPTT: how many forward-pass steps should be taken before each (truncated) backprop? Only applicable with backpropType(BackpropType.TruncatedBPTT). This is the k1 parameter on page 23 of the thesis linked above.
- trainingWorkspaceMode(WorkspaceMode trainingWorkspaceMode)
  Defines the workspace mode used during training (NONE: workspace won't be used; ENABLED: workspaces will be used, with reduced memory and better performance).
- updater(Updater updater)
  Deprecated.
- updater(IUpdater updater)
  Gradient updater configuration.
- weightDecay(double coefficient)
  Add weight decay regularization for the network parameters (excluding biases). This applies weight decay with the learning rate multiplied in; see WeightDecay for more details.
- weightDecay(double coefficient, boolean applyLR)
  Add weight decay regularization for the network parameters (excluding biases).
- weightDecayBias(double coefficient)
  Weight decay for the biases only; see weightDecay(double) for more details.
- weightDecayBias(double coefficient, boolean applyLR)
  Weight decay for the biases only; see weightDecay(double) for more details.
- weightInit(Distribution distribution)
  Set weight initialization scheme to random sampling via the specified distribution.
- weightInit(IWeightInit weightInit)
  Weight initialization scheme to use, for initial weight values.
- weightInit(WeightInit weightInit)
  Weight initialization scheme to use, for initial weight values.
- weightNoise(IWeightNoise weightNoise)
  Set the weight noise (such as DropConnect and WeightNoise).
-
-
-
Field Detail
-
regularization
protected List<Regularization> regularization
-
regularizationBias
protected List<Regularization> regularizationBias
-
removeL2
protected boolean removeL2
-
removeL2Bias
protected boolean removeL2Bias
-
removeL1
protected boolean removeL1
-
removeL1Bias
protected boolean removeL1Bias
-
removeWD
protected boolean removeWD
-
removeWDBias
protected boolean removeWDBias
-
-
Method Detail
-
activation
public FineTuneConfiguration.Builder activation(IActivation activationFn)
Activation function / neuron non-linearity
-
activation
public FineTuneConfiguration.Builder activation(Activation activation)
Activation function / neuron non-linearity
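For illustration (a sketch, not from the original page), the two overloads in use; ActivationLReLU and its alpha argument are assumed to be the IActivation form of leaky ReLU:

    import org.nd4j.linalg.activations.Activation;
    import org.nd4j.linalg.activations.impl.ActivationLReLU;

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .activation(Activation.RELU)            // enum overload
            //.activation(new ActivationLReLU(0.1)) // IActivation overload (leaky ReLU, alpha = 0.1)
            .build();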
-
weightInit
public FineTuneConfiguration.Builder weightInit(IWeightInit weightInit)
Weight initialization scheme to use, for initial weight values.
- See Also: IWeightInit
-
weightInit
public FineTuneConfiguration.Builder weightInit(WeightInit weightInit)
Weight initialization scheme to use, for initial weight values.
- See Also: WeightInit
-
weightInit
public FineTuneConfiguration.Builder weightInit(Distribution distribution)
Set weight initialization scheme to random sampling via the specified distribution. Equivalent to: .weightInit(new WeightInitDistribution(distribution))
- Parameters: distribution - Distribution to use for weight initialization
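A brief sketch of the overloads (values illustrative, not part of the original page):

    import org.deeplearning4j.nn.conf.distribution.NormalDistribution;
    import org.deeplearning4j.nn.weights.WeightInit;

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .weightInit(WeightInit.XAVIER)                 // enum-based scheme
            //.weightInit(new NormalDistribution(0, 0.01)) // or: sample from a distribution
            .build();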
-
biasInit
public FineTuneConfiguration.Builder biasInit(double biasInit)
Constant for bias initialization. Default: 0.0
- Parameters: biasInit - Constant for bias initialization
-
dist
@Deprecated public FineTuneConfiguration.Builder dist(Distribution dist)
Deprecated. Distribution to sample initial weights from. Equivalent to: .weightInit(new WeightInitDistribution(distribution))
-
l1
public FineTuneConfiguration.Builder l1(double l1)
L1 regularization coefficient for the weights (excluding biases)
-
l2
public FineTuneConfiguration.Builder l2(double l2)
L2 regularization coefficient for the weights (excluding biases)
Note: Generally, WeightDecay (set via weightDecay(double, boolean)) should be preferred to L2 regularization. See the WeightDecay javadoc for further details.
-
l1Bias
public FineTuneConfiguration.Builder l1Bias(double l1Bias)
L1 regularization coefficient for the bias parameters
-
l2Bias
public FineTuneConfiguration.Builder l2Bias(double l2Bias)
L2 regularization coefficient for the bias parameters
Note: Generally, WeightDecay (set via weightDecayBias(double, boolean)) should be preferred to L2 regularization. See the WeightDecay javadoc for further details.
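For illustration (coefficients are arbitrary placeholders, not recommendations), separate penalties for weights and biases:

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .l1(1e-5)     // L1 penalty on weights
            .l2(1e-4)     // L2 penalty on weights (weightDecay is generally preferred; see above)
            .l1Bias(0.0)  // no L1 penalty on biases
            .l2Bias(0.0)  // no L2 penalty on biases
            .build();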
-
weightDecay
public FineTuneConfiguration.Builder weightDecay(double coefficient)
Add weight decay regularization for the network parameters (excluding biases).
This applies weight decay with the learning rate multiplied in - see WeightDecay for more details.
- Parameters: coefficient - Weight decay regularization coefficient
- See Also: weightDecay(double, boolean)
-
weightDecay
public FineTuneConfiguration.Builder weightDecay(double coefficient, boolean applyLR)
Add weight decay regularization for the network parameters (excluding biases). See WeightDecay for more details.
- Parameters: coefficient - Weight decay regularization coefficient; applyLR - Whether the learning rate should be multiplied in when performing weight decay updates. See WeightDecay for more details.
- See Also: weightDecay(double)
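A sketch of the two forms (the coefficient value is an arbitrary placeholder). Per the javadoc above, the one-argument form applies the coefficient with the learning rate multiplied in:

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .weightDecay(0.05)          // learning rate multiplied in (applyLR = true behaviour)
            //.weightDecay(0.05, false) // or: decouple the decay from the learning rate
            .build();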
-
weightDecayBias
public FineTuneConfiguration.Builder weightDecayBias(double coefficient)
Weight decay for the biases only - see weightDecay(double) for more details. This applies weight decay with the learning rate multiplied in.
- Parameters: coefficient - Weight decay regularization coefficient
- See Also: weightDecayBias(double, boolean)
-
weightDecayBias
public FineTuneConfiguration.Builder weightDecayBias(double coefficient, boolean applyLR)
Weight decay for the biases only - see weightDecay(double) for more details.
- Parameters: coefficient - Weight decay regularization coefficient; applyLR - Whether the learning rate should be multiplied in when performing weight decay updates
-
dropout
public FineTuneConfiguration.Builder dropout(IDropout dropout)
Set the dropout.
- Parameters: dropout - Dropout, such as Dropout, GaussianDropout, GaussianNoise, etc.
-
dropOut
public FineTuneConfiguration.Builder dropOut(double inputRetainProbability)
Dropout probability. This is the probability of retaining each input activation value for a layer. dropOut(x) will keep an input activation with probability x, and set it to 0 with probability 1-x.
dropOut(0.0) is a special value / special case: when set to 0.0, dropout is disabled (not applied). Note that a dropout value of 1.0 is functionally equivalent to no dropout: i.e., 100% probability of retaining each input activation.
Note 1: Dropout is applied at training time only, and is automatically not applied at test time (for evaluation, etc.)
Note 2: This sets the probability per-layer. Care should be taken when setting lower values for complex networks (too much information may be lost with aggressive (very low) dropout values).
Note 3: Frequently, dropout is not applied to (or has a higher retain probability for) input (first) layers. Dropout is also often not applied to output layers. This needs to be handled MANUALLY by the user: set .dropout(0) on those layers when using a global dropout setting.
Note 4: Implementation detail (most users can ignore): DL4J uses inverted dropout, as described here: http://cs231n.github.io/neural-networks-2/
- Parameters: inputRetainProbability - Dropout probability (probability of retaining each input activation value for a layer)
- See Also: dropout(IDropout)
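For illustration (values are placeholders): the double overload takes a retain probability, while the IDropout overload accepts implementations such as GaussianDropout:

    import org.deeplearning4j.nn.conf.dropout.GaussianDropout;

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .dropOut(0.8) // retain each activation with probability 0.8 (drop 20%) at training time
            //.dropout(new GaussianDropout(0.2)) // or: an IDropout implementation
            .build();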
-
weightNoise
public FineTuneConfiguration.Builder weightNoise(IWeightNoise weightNoise)
Set the weight noise (such as DropConnect and WeightNoise).
- Parameters: weightNoise - Weight noise instance to use
-
updater
public FineTuneConfiguration.Builder updater(IUpdater updater)
Gradient updater configuration.
- Parameters: updater - Updater to use
-
updater
@Deprecated public FineTuneConfiguration.Builder updater(Updater updater)
Deprecated.
-
biasUpdater
public FineTuneConfiguration.Builder biasUpdater(IUpdater biasUpdater)
Gradient updater configuration, for the biases only. If not set, biases will use the updater as set by updater(IUpdater).
- Parameters: biasUpdater - Updater to use for bias parameters
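A sketch (learning rates illustrative): Adam for the weight parameters and plain SGD for the biases; omitting biasUpdater would make the biases use the Adam updater as well:

    import org.nd4j.linalg.learning.config.Adam;
    import org.nd4j.linalg.learning.config.Sgd;

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .updater(new Adam(1e-4))    // updater for weights
            .biasUpdater(new Sgd(1e-3)) // separate updater for biases
            .build();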
-
miniBatch
public FineTuneConfiguration.Builder miniBatch(boolean miniBatch)
Whether scores and gradients should be divided by the minibatch size.
Most users should leave this as the default value of true.
-
maxNumLineSearchIterations
public FineTuneConfiguration.Builder maxNumLineSearchIterations(int maxNumLineSearchIterations)
-
seed
public FineTuneConfiguration.Builder seed(long seed)
RNG seed for reproducibility.
- Parameters: seed - RNG seed to use
-
seed
public FineTuneConfiguration.Builder seed(int seed)
RNG seed for reproducibility.
- Parameters: seed - RNG seed to use
-
optimizationAlgo
public FineTuneConfiguration.Builder optimizationAlgo(OptimizationAlgorithm optimizationAlgo)
-
stepFunction
public FineTuneConfiguration.Builder stepFunction(StepFunction stepFunction)
-
minimize
public FineTuneConfiguration.Builder minimize(boolean minimize)
-
gradientNormalization
public FineTuneConfiguration.Builder gradientNormalization(GradientNormalization gradientNormalization)
Gradient normalization strategy. Used to specify gradient renormalization, gradient clipping, etc. See GradientNormalization for details.
- Parameters: gradientNormalization - Type of normalization to use. Defaults to None.
- See Also: GradientNormalization
-
gradientNormalizationThreshold
public FineTuneConfiguration.Builder gradientNormalizationThreshold(double gradientNormalizationThreshold)
Threshold for gradient normalization; only used for GradientNormalization.ClipL2PerLayer, GradientNormalization.ClipL2PerParamType, and GradientNormalization.ClipElementWiseAbsoluteValue. Not used otherwise.
This is the L2 threshold for the first two types of clipping, or the absolute value threshold for the last type.
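For example (threshold value illustrative), clipping each layer's gradient to a maximum L2 norm of 1.0:

    import org.deeplearning4j.nn.conf.GradientNormalization;

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .gradientNormalization(GradientNormalization.ClipL2PerLayer)
            .gradientNormalizationThreshold(1.0) // L2 threshold used by the clipping strategy
            .build();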
-
convolutionMode
public FineTuneConfiguration.Builder convolutionMode(ConvolutionMode convolutionMode)
Sets the convolution mode for convolutional layers, which impacts padding and output sizes. See ConvolutionMode for details. Defaults to ConvolutionMode.Truncate.
- Parameters: convolutionMode - Convolution mode to use
-
cudnnAlgoMode
public FineTuneConfiguration.Builder cudnnAlgoMode(ConvolutionLayer.AlgoMode cudnnAlgoMode)
Sets the cuDNN algo mode for convolutional layers, which impacts performance and memory usage of cuDNN. See ConvolutionLayer.AlgoMode for details. Defaults to "PREFER_FASTEST", but "NO_WORKSPACE" uses less memory.
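A sketch combining the two convolution-related settings ("Same" padding mode and the lower-memory cuDNN mode):

    import org.deeplearning4j.nn.conf.ConvolutionMode;
    import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .convolutionMode(ConvolutionMode.Same)                 // pad so output size depends only on input size and stride
            .cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE) // trade speed for lower memory
            .build();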
-
constraints
public FineTuneConfiguration.Builder constraints(List<LayerConstraint> constraints)
Set constraints to be applied to all layers. Default: no constraints.
Constraints can be used to enforce certain conditions (non-negativity of parameters, max-norm regularization, etc.). These constraints are applied at each iteration, after the parameters have been updated.
- Parameters: constraints - Constraints to apply to all parameters of all layers
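For illustration (assuming MaxNormConstraint from org.deeplearning4j.nn.conf.constraint; the max-norm value and dimension are placeholders):

    import java.util.Collections;
    import org.deeplearning4j.nn.api.layers.LayerConstraint;
    import org.deeplearning4j.nn.conf.constraint.MaxNormConstraint;

    // Rescale any parameter whose L2 norm (along dimension 1) exceeds 2.0,
    // applied to all layers after each parameter update:
    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .constraints(Collections.<LayerConstraint>singletonList(new MaxNormConstraint(2.0, 1)))
            .build();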
-
pretrain
public FineTuneConfiguration.Builder pretrain(boolean pretrain)
-
backprop
public FineTuneConfiguration.Builder backprop(boolean backprop)
-
backpropType
public FineTuneConfiguration.Builder backpropType(BackpropType backpropType)
The type of backprop. The default setting is used for most networks (MLP, CNN, etc.), but optionally truncated BPTT can be used for training recurrent neural networks. If using TruncatedBPTT, make sure you set both tbpttFwdLength(int) and tbpttBackLength(int).
- Parameters: backpropType - Type of backprop. Default: BackpropType.Standard
-
tbpttFwdLength
public FineTuneConfiguration.Builder tbpttFwdLength(int tbpttFwdLength)
When doing truncated BPTT: how many steps of the forward pass should be done before each (truncated) backprop?
Only applicable when using backpropType(BackpropType.TruncatedBPTT).
Typically the forward length is the same as the backward length, but it may be larger in some circumstances (though never smaller).
Ideally, your training data time series length should be divisible by this value. This is the k1 parameter on page 23 of http://www.cs.utoronto.ca/~ilya/pubs/ilya_sutskever_phd_thesis.pdf
- Parameters: tbpttFwdLength - Forward length > 0, >= backward length
-
tbpttBackLength
public FineTuneConfiguration.Builder tbpttBackLength(int tbpttBackLength)
When doing truncated BPTT: how many steps backward should be done?
Only applicable when using backpropType(BackpropType.TruncatedBPTT).
This is the k2 parameter on page 23 of http://www.cs.utoronto.ca/~ilya/pubs/ilya_sutskever_phd_thesis.pdf
- Parameters: tbpttBackLength - Backward length, <= forward length
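A sketch for recurrent networks (the segment length of 20 is illustrative); both lengths are set, as the backpropType javadoc advises:

    import org.deeplearning4j.nn.conf.BackpropType;

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .backpropType(BackpropType.TruncatedBPTT)
            .tbpttFwdLength(20)  // k1: forward steps per segment
            .tbpttBackLength(20) // k2: backward steps per segment (<= forward length)
            .build();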
-
trainingWorkspaceMode
public FineTuneConfiguration.Builder trainingWorkspaceMode(WorkspaceMode trainingWorkspaceMode)
This method defines the workspace mode used during training:
NONE: workspace won't be used
ENABLED: workspaces will be used for training (reduced memory and better performance)
- Parameters: trainingWorkspaceMode - Workspace mode for training
- Returns: Builder
-
inferenceWorkspaceMode
public FineTuneConfiguration.Builder inferenceWorkspaceMode(WorkspaceMode inferenceWorkspaceMode)
This method defines the workspace mode used during inference:
NONE: workspace won't be used
ENABLED: workspaces will be used for inference (reduced memory and better performance)
- Parameters: inferenceWorkspaceMode - Workspace mode for inference
- Returns: Builder
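A minimal sketch enabling workspaces for both training and inference:

    import org.deeplearning4j.nn.conf.WorkspaceMode;

    FineTuneConfiguration conf = new FineTuneConfiguration.Builder()
            .trainingWorkspaceMode(WorkspaceMode.ENABLED)  // reduced memory, better performance
            .inferenceWorkspaceMode(WorkspaceMode.ENABLED)
            .build();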
-
build
public FineTuneConfiguration build()
-
-