public class WeightDecay extends Object implements Regularization

WeightDecay regularization: the weight decay term is applied directly to the parameters, separately from the gradient/updater. This differs from L2Regularization, which adds a penalty term to the loss function: L = loss + coeff * 0.5 * sum_i w[i]^2.

For all cases, w -= update.

If applyLR == true, we have:
update = updater(gradient) + lr * coeff * w
where lr is the learning rate for the current iteration/epoch (accounting for LR schedules if present).

If applyLR == false, we have:
update = updater(gradient) + coeff * w
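The update rule above can be sketched in plain Java. This is a simplified illustration, not the ND4J implementation: `updatedGrad` stands in for the updater's output, and plain `double[]` arrays replace `INDArray`; all names here are hypothetical.

```java
import java.util.Arrays;

public class WeightDecaySketch {

    // Sketch of the weight-decay update: w -= updatedGrad + (lr *) coeff * w
    static double[] applyWeightDecay(double[] w, double[] updatedGrad,
                                     double lr, double coeff, boolean applyLR) {
        // If applyLR is true, the decay coefficient is scaled by the learning rate
        double scale = applyLR ? lr * coeff : coeff;
        double[] next = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            next[i] = w[i] - (updatedGrad[i] + scale * w[i]);
        }
        return next;
    }

    public static void main(String[] args) {
        double[] w = {1.0, -2.0};
        double[] g = {0.1, 0.1};   // pretend this is updater(gradient)
        System.out.println(Arrays.toString(applyWeightDecay(w, g, 0.5, 0.01, true)));
    }
}
```

Note that the decay term is computed from the current weights and subtracted after the updater has transformed the gradient, which is the key difference from adding coeff * w to the gradient before the updater runs.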
Nested classes/interfaces inherited from interface Regularization:
Regularization.ApplyStep
Modifier and Type | Field and Description
---|---
protected boolean | applyLR
protected ISchedule | coeff
Constructor and Description |
---|
WeightDecay(double coeff, boolean applyLR) |
WeightDecay(@NonNull ISchedule coeff, boolean applyLR) |
Modifier and Type | Method and Description
---|---
void | apply(INDArray param, INDArray gradView, double lr, int iteration, int epoch) Apply the regularization by modifying the gradient array in-place
Regularization.ApplyStep | applyStep()
Regularization | clone()
double | score(INDArray param, int iteration, int epoch) Calculate the loss function score component for the regularization. For example, in L2 regularization, this would return L = 0.5 * sum_i param[i]^2. For regularization types that don't have a score component, this method can return 0.
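The L2-style score formula mentioned in the summary above can be illustrated with plain arrays. This is a hedged sketch of the math only, not the ND4J implementation; the class and method names are hypothetical.

```java
public class ScoreSketch {

    // Illustrative L2-style loss component: L = coeff * 0.5 * sum_i param[i]^2
    static double l2Score(double[] param, double coeff) {
        double sumSq = 0.0;
        for (double p : param) {
            sumSq += p * p;
        }
        return coeff * 0.5 * sumSq;
    }

    public static void main(String[] args) {
        System.out.println(l2Score(new double[]{3.0, 4.0}, 0.1));
    }
}
```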
protected final ISchedule coeff
protected final boolean applyLR
public WeightDecay(double coeff, boolean applyLR)

Parameters:
coeff - Weight decay regularization coefficient
applyLR - If true, multiply the regularization coefficient by the current learning rate. If false, do not multiply by LR.

public WeightDecay(@NonNull ISchedule coeff, boolean applyLR)

Parameters:
coeff - Weight decay regularization coefficient (schedule)
applyLR - If true, multiply the regularization coefficient by the current learning rate. If false, do not multiply by LR.

public Regularization.ApplyStep applyStep()

Specified by:
applyStep in interface Regularization
Returns:
Regularization.ApplyStep
public void apply(INDArray param, INDArray gradView, double lr, int iteration, int epoch)

Description copied from interface: Regularization
Apply the regularization by modifying the gradient array in-place.

Specified by:
apply in interface Regularization
Parameters:
param - Input array (usually parameters)
gradView - Gradient view array (should be modified/updated). Same shape and type as the input array.
lr - Current learning rate
iteration - Current network training iteration
epoch - Current network training epoch

public double score(INDArray param, int iteration, int epoch)
Description copied from interface: Regularization
Calculate the loss function score component for the regularization. For example, in L2 regularization, this would return L = 0.5 * sum_i param[i]^2. For regularization types that don't have a score component, this method can return 0.

Specified by:
score in interface Regularization
Parameters:
param - Input array (usually parameters)
iteration - Current network training iteration
epoch - Current network training epoch

public Regularization clone()
Specified by:
clone in interface Regularization
Overrides:
clone in class Object
Copyright © 2020. All rights reserved.