Class/Object

org.platanios.tensorflow.api.ops.training.optimizers

YellowFin

Related Docs: object YellowFin | package optimizers


class YellowFin extends GradientDescent

Optimizer that implements the YellowFin algorithm.

Please refer to [Zhang et al., 2017](https://arxiv.org/abs/1706.03471) for details.

Linear Supertypes

GradientDescent, Optimizer, AnyRef, Any

Instance Constructors

  1. new YellowFin(learningRate: Float = 1.0f, decay: Schedule = FixedSchedule, momentum: Float = 0.0f, beta: Float = 0.999f, curvatureWindowWidth: Int = 20, zeroDebias: Boolean = true, sparsityDebias: Boolean = true, useNesterov: Boolean = false, useLocking: Boolean = false, learningRateSummaryTag: String = null, name: String = "YellowFin")


    learningRate

    Learning rate. Must be > 0. If used with decay, then this argument specifies the initial value of the learning rate.

    decay

    Learning rate decay method to use for each update.

    momentum

    Momentum. Must be >= 0.

    beta

    Smoothing parameter for estimations.

    curvatureWindowWidth

    Curvature window width. Must be > 1.

    zeroDebias

    If true, the moving averages will be zero-debiased.

    sparsityDebias

    The gradient norm and curvature are biased towards larger values when computed for sparse gradients. Debiasing them is useful when the model is very sparse, e.g. LSTMs with word embeddings. For non-sparse models, such as CNNs, turning it off can slightly speed up the algorithm.

    useNesterov

    Boolean value indicating whether to use Nesterov acceleration or not. For details, refer to [Sutskever et al., 2013](http://proceedings.mlr.press/v28/sutskever13.pdf).

    useLocking

    If true, the gradient descent updates will be protected by a lock. Otherwise, the behavior is undefined, but may exhibit less contention.

    learningRateSummaryTag

    Optional summary tag name to use for the learning rate value. If null, no summary is created for the learning rate. Otherwise, a scalar summary is created which can be monitored using TensorBoard.

    name

    Name for this optimizer.

    Attributes
    protected
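
A minimal construction sketch follows. It assumes that the companion object YellowFin exposes an apply factory mirroring the constructor parameters above (the constructor itself is protected), that Output and Op come from the org.platanios.tensorflow.api package import, and that a scalar loss has already been built with the usual graph ops; all other names are illustrative.

```scala
import org.platanios.tensorflow.api._
import org.platanios.tensorflow.api.ops.training.optimizers.YellowFin

// Sketch: build a training op with YellowFin. YellowFin tunes the momentum and
// learning-rate scaling online, so the values below only seed the tuner.
def buildTrainOp(loss: Output): Op = {
  val optimizer = YellowFin(
    learningRate = 1.0f,                      // initial value; adapted during training
    momentum = 0.0f,                          // initial momentum; also adapted
    beta = 0.999f,                            // smoothing for the internal estimates
    curvatureWindowWidth = 20,
    learningRateSummaryTag = "LearningRate")  // scalar summary viewable in TensorBoard
  optimizer.minimize(loss, name = "TrainOp")
}
```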

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def applyDense(gradient: Output, variable: variables.Variable, iteration: Option[variables.Variable]): Op

    Applies the updates corresponding to the provided gradient, to the provided variable.

    gradient

    Gradient tensor.

    variable

    Variable.

    iteration

    Option containing current iteration in the optimization loop, if one has been provided.

    returns

    Created op that applies the provided gradient to the provided variable.

    Definition Classes
    YellowFin → GradientDescent → Optimizer
  5. def applyGradients(gradientsAndVariables: Seq[(OutputLike, variables.Variable)], iteration: Option[variables.Variable] = None, name: String = this.name): Op

    Creates an op that applies the provided gradients to the provided variables.

    gradientsAndVariables

    Sequence with gradient-variable pairs.

    iteration

    Optional Variable to increment by one after the variables have been updated.

    name

    Name for the created op.

    returns

    Created op.

    Definition Classes
    YellowFin → Optimizer
  6. def applySparse(gradient: OutputIndexedSlices, variable: variables.Variable, iteration: Option[variables.Variable]): Op

    Applies the updates corresponding to the provided gradient, to the provided variable.

    The OutputIndexedSlices object specified by gradient in this function is by default pre-processed in applySparseDuplicateIndices to remove duplicate indices (refer to that function's documentation for details). Optimizers which can tolerate or have correct special cases for duplicate sparse indices may override applySparseDuplicateIndices instead of this function, avoiding that overhead.

    gradient

    Gradient tensor.

    variable

    Variable.

    iteration

    Option containing current iteration in the optimization loop, if one has been provided.

    returns

    Created op that applies the provided gradient to the provided variable.

    Definition Classes
    YellowFin → GradientDescent → Optimizer
  7. def applySparseDuplicateIndices(gradient: OutputIndexedSlices, variable: variables.Variable, iteration: Option[variables.Variable]): Op

    Applies the updates corresponding to the provided gradient (with potentially duplicate indices), to the provided variable.

    Optimizers which override this method must deal with OutputIndexedSlices objects such as the following: OutputIndexedSlices(indices=[0, 0], values=[1, 1], denseShape=[1]), which contain duplicate indices. The correct interpretation in that case should be: OutputIndexedSlices(values=[2], indices=[0], denseShape=[1]).

    Many optimizers deal incorrectly with repeated indices when updating based on sparse gradients (e.g. summing squares rather than squaring the sum, or applying momentum terms multiple times). Adding first is always the correct behavior, so this is enforced here by reconstructing the OutputIndexedSlices to have only unique indices, and then calling applySparse.

    Optimizers which deal correctly with repeated indices may instead override this method to avoid the induced overhead.

    gradient

    Gradient tensor.

    variable

    Variable.

    iteration

    Option containing current iteration in the optimization loop, if one has been provided.

    returns

    Created op that applies the provided gradient to the provided variable.

    Definition Classes
    GradientDescent → Optimizer
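
The following plain-Scala sketch illustrates the "add first" aggregation rule described in this entry. It is not the library's implementation, only the semantics that applySparseDuplicateIndices enforces before delegating to applySparse.

```scala
// Duplicate indices in a sparse gradient must be summed before the update:
//   indices = [0, 0], values = [1.0, 1.0]  ==>  indices = [0], values = [2.0]
def sumDuplicateIndices(
    indices: Seq[Int],
    values: Seq[Float]): (Seq[Int], Seq[Float]) = {
  val summed = indices.zip(values)
    .groupBy(_._1)                                  // group slices hitting the same row
    .map { case (i, pairs) => i -> pairs.map(_._2).sum }
    .toSeq
    .sortBy(_._1)
  (summed.map(_._1), summed.map(_._2))
}

// sumDuplicateIndices(Seq(0, 0), Seq(1.0f, 1.0f)) == (Seq(0), Seq(2.0f))
```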
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. val beta: Float


    Smoothing parameter for estimations.

  10. var betaTensor: Output

    Attributes
    protected
  11. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  12. def computeGradients(loss: Output, lossGradients: Seq[OutputLike] = null, variables: Set[variables.Variable] = null, gradientsGatingMethod: GatingMethod = Gradients.OpGating, gradientsAggregationMethod: AggregationMethod = Gradients.AddAggregationMethod, colocateGradientsWithOps: Boolean = false): Seq[(OutputLike, variables.Variable)]

    Computes the gradients of loss with respect to the variables in variables, if provided, otherwise with respect to all the trainable variables in the graph where loss is defined.

    loss

    Loss value whose gradients will be computed.

    lossGradients

    Optional gradients to back-propagate for loss.

    variables

    Optional list of variables for which to compute the gradients. Defaults to the set of trainable variables in the graph where loss is defined.

    gradientsGatingMethod

    Gating method for the gradients computation.

    gradientsAggregationMethod

    Aggregation method used to combine gradient terms.

    colocateGradientsWithOps

    Boolean value indicating whether to colocate the gradient ops with the original ops.

    returns

    Sequence of gradient-variable pairs.

    Definition Classes
    Optimizer
  13. def createSlots(variables: Seq[variables.Variable]): Unit

    Creates all slots needed by this optimizer.

    Definition Classes
    YellowFin → GradientDescent → Optimizer
  14. def curvatureRange(gradNormSquaredSum: Output, sparsityAvg: Option[Output]): (Output, Output)

    Attributes
    protected
  15. var curvatureWindow: variables.Variable

    Attributes
    protected
  16. val curvatureWindowWidth: Int

    Curvature window width. Must be > 1.

  17. val decay: Schedule

    Learning rate decay method to use for each update.

    Definition Classes
    YellowFin → GradientDescent
  18. def distanceToOptimum(gradNormSquaredSum: Output, gradNormSquaredAvg: Output, sparsityAvg: Option[Output]): Output

    Attributes
    protected
  19. var doTune: Output

    Attributes
    protected
  20. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  21. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  22. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  23. def finish(updateOps: Set[Op], nameScope: String): Op

    Creates an op that finishes the gradients application. This function is called from within an op creation context that uses as its name scope the name that users have chosen for the application of gradients.

    updateOps

    Set of ops needed to apply the gradients and update the variable values.

    nameScope

    Name scope to use for all the ops created by this function.

    returns

    Created op output.

    Definition Classes
    Optimizer
  24. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  25. def getLearningRate(variable: variables.Variable, iteration: Option[variables.Variable]): Output

    Attributes
    protected
    Definition Classes
    YellowFin → GradientDescent
  26. def getMomentum(variable: variables.Variable): Output

    Attributes
    protected
    Definition Classes
    YellowFin → GradientDescent
  27. final def getNonSlotVariable(name: String, graph: core.Graph = null): variables.Variable

    Gets a non-slot variable that has been added to this optimizer (or throws an error if no such non-slot variable could be found in this optimizer).

    name

    Variable name.

    graph

    Graph in which the variable is defined.

    returns

    Obtained non-slot variable.

    Attributes
    protected
    Definition Classes
    Optimizer
  28. final def getNonSlotVariables: Iterable[variables.Variable]

    Gets all the non-slot variables that have been added to this optimizer.

    Attributes
    protected
    Definition Classes
    Optimizer
  29. final def getOrCreateNonSlotVariable(name: String, initialValue: tensors.Tensor[_ <: types.DataType], colocationOps: Set[Op] = Set.empty, ignoreExisting: Boolean = false): variables.Variable

    Gets or creates (and adds to this optimizer) a non-slot variable.

    name

    Variable name.

    initialValue

    Variable initial value.

    colocationOps

    Set of colocation ops for the non-slot variable.

    returns

    Created non-slot variable.

    Attributes
    protected
    Definition Classes
    Optimizer
  30. final def getSlot(name: String, variable: variables.Variable): variables.Variable

    Gets an existing slot.

    name

    Slot name.

    variable

    Slot primary variable.

    returns

    Requested slot variable, or null if it cannot be found.

    Attributes
    protected
    Definition Classes
    Optimizer
  31. final def getSlot(name: String, variable: variables.Variable, initializer: Initializer, shape: core.Shape, dataType: types.DataType, variableScope: String): variables.Variable

    Gets an existing slot or creates a new one if none exists, for the provided arguments.

    name

    Slot name.

    variable

    Slot primary variable.

    initializer

    Slot variable initializer.

    shape

    Slot variable shape.

    dataType

    Slot variable data type.

    variableScope

    Name to use when scoping the variable that needs to be created for the slot.

    returns

    Requested slot variable.

    Attributes
    protected
    Definition Classes
    Optimizer
  32. def gradientsSparsity(gradients: Seq[OutputLike]): Option[Output]

    Attributes
    protected
  33. def gradientsVariance(gradients: Seq[OutputLike], gradNormSquaredAvg: Output, sparsityAvg: Option[Output]): Output

    Attributes
    protected
  34. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  35. val ignoreDuplicateSparseIndices: Boolean

    Boolean value indicating whether to ignore duplicate indices during sparse updates.

    Definition Classes
    GradientDescent → Optimizer
  36. var incrementStepOp: Op

    Attributes
    protected
  37. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  38. val learningRate: Float

    Learning rate. Must be > 0. If used with decay, then this argument specifies the initial value of the learning rate.

    Definition Classes
    YellowFin → GradientDescent
  39. var learningRateFactorVariable: variables.Variable

    Attributes
    protected
  40. val learningRateSummaryTag: String

    Optional summary tag name to use for the learning rate value. If null, no summary is created for the learning rate. Otherwise, a scalar summary is created which can be monitored using TensorBoard.

    Definition Classes
    YellowFin → GradientDescent
  41. var learningRateTensor: Output

    Attributes
    protected
    Definition Classes
    GradientDescent
  42. var learningRateVariable: variables.Variable

    Attributes
    protected
  43. final def minimize(loss: Output, lossGradients: Seq[OutputLike] = null, variables: Set[variables.Variable] = null, gradientsGatingMethod: GatingMethod = Gradients.OpGating, gradientsAggregationMethod: AggregationMethod = Gradients.AddAggregationMethod, colocateGradientsWithOps: Boolean = false, iteration: Option[variables.Variable] = None, name: String = "Minimize"): Op

    Creates an op that makes a step towards minimizing loss by updating the values of the variables in variables.

    This method simply combines calls to computeGradients and applyGradients. If you want to process the gradients before applying them, call computeGradients and applyGradients explicitly instead of using this method (see the sketch after this entry).

    loss

    Loss value whose gradients will be computed.

    lossGradients

    Optional gradients to back-propagate for loss.

    variables

    Optional list of variables for which to compute the gradients. Defaults to the set of trainable variables in the graph where loss is defined.

    gradientsGatingMethod

    Gating method for the gradients computation.

    gradientsAggregationMethod

    Aggregation method used to combine gradient terms.

    colocateGradientsWithOps

    Boolean value indicating whether to colocate the gradient ops with the original ops.

    iteration

    Optional Variable to increment by one after the variables have been updated.

    name

    Name for the created op.

    returns

    Created op.

    Definition Classes
    Optimizer
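
A sketch of the explicit two-step path mentioned above. It reuses the imports from the earlier construction example and assumes the Output, OutputLike, and Op aliases are provided by the api package import; processGradient is a hypothetical hook standing in for whatever per-gradient transformation (clipping, scaling, adding noise) should run before the update.

```scala
import org.platanios.tensorflow.api._
import org.platanios.tensorflow.api.ops.training.optimizers.YellowFin

// Sketch: compute gradients, transform them, then apply them explicitly,
// instead of letting minimize do both steps in one call.
def minimizeWithProcessing(
    optimizer: YellowFin,
    loss: Output,
    processGradient: OutputLike => OutputLike): Op = {
  val gradientsAndVariables = optimizer.computeGradients(loss)
  val processed = gradientsAndVariables.map {
    case (gradient, variable) => (processGradient(gradient), variable)
  }
  optimizer.applyGradients(processed, name = "TrainOp")
}
```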
  44. val momentum: Float

    Momentum. Must be >= 0.

    Definition Classes
    YellowFin → GradientDescent
  45. var momentumTensor: Output

    Attributes
    protected
    Definition Classes
    GradientDescent
  46. var momentumVariable: variables.Variable

    Attributes
    protected
  47. var movingAverage: ExponentialMovingAverage

    Attributes
    protected
  48. val name: String

    Name for this optimizer.

    Definition Classes
    YellowFin → GradientDescent → Optimizer
  49. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  50. final val nonSlotVariables: Map[(String, Option[core.Graph]), variables.Variable]

    Contains variables used by some optimizers that require no slots to be stored.

    Attributes
    protected
    Definition Classes
    Optimizer
  51. final def notify(): Unit

    Definition Classes
    AnyRef
  52. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  53. def prepare(iteration: Option[variables.Variable]): Unit

    Creates all necessary tensors before applying the gradients. This function is called from within an op creation context that uses as its name scope the name that users have chosen for the application of gradients.

    Definition Classes
    YellowFin → GradientDescent → Optimizer
  54. final def slotNames: Set[String]

    Returns the names of all slots used by this optimizer.

    Attributes
    protected
    Definition Classes
    Optimizer
  55. final val slots: Map[String, Map[variables.Variable, variables.Variable]]

    Some Optimizer subclasses use additional variables. For example, MomentumOptimizer and AdaGradOptimizer use variables to accumulate updates. This map is where these variables are stored.

    Attributes
    protected
    Definition Classes
    Optimizer
  56. val sparsityDebias: Boolean

    The gradient norm and curvature are biased towards larger values when computed for sparse gradients. Debiasing them is useful when the model is very sparse, e.g. LSTMs with word embeddings. For non-sparse models, such as CNNs, turning it off can slightly speed up the algorithm.

  57. var step: variables.Variable

    Attributes
    protected
  58. val supportedDataTypes: Set[types.DataType]

    Supported data types for the loss function, the variables, and the gradients. Subclasses should override this field to allow other float types.

    Definition Classes
    Optimizer
  59. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  60. def toString(): String

    Definition Classes
    AnyRef → Any
  61. val useLocking: Boolean

    If true, the gradient descent updates will be protected by a lock. Otherwise, the behavior is undefined, but may exhibit less contention.

    Definition Classes
    YellowFin → GradientDescent → Optimizer
  62. val useNesterov: Boolean

    Boolean value indicating whether to use Nesterov acceleration or not. For details, refer to [Sutskever et al., 2013](http://proceedings.mlr.press/v28/sutskever13.pdf).

    Definition Classes
    YellowFin → GradientDescent
  63. final def variables: Seq[variables.Variable]

    Returns a sequence of variables which encode the current state of this optimizer. The returned variables include both slot variables and non-slot global variables created by this optimizer, in the current graph.

    Definition Classes
    Optimizer
  64. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  65. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  66. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  67. def yellowFinUpdate(gradientsAndVariables: Seq[(OutputLike, variables.Variable)]): Op

    Attributes
    protected
  68. val zeroDebias: Boolean


    If true, the moving averages will be zero-debiased.
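
As a brief illustration of zero-debiasing (a sketch of the general technique, not the library's internals): an exponential moving average initialized at zero underestimates the quantity it tracks during the first steps, and dividing by (1 - beta^t) at step t removes that startup bias, conceptually as in Adam-style bias correction.

```scala
// Sketch: zero-debiased exponential moving average of a value stream.
def zeroDebiasedAverages(values: Seq[Float], beta: Float = 0.999f): Seq[Float] = {
  var average = 0.0f
  values.zipWithIndex.map { case (value, index) =>
    val t = index + 1
    average = beta * average + (1.0f - beta) * value   // raw moving average
    average / (1.0f - math.pow(beta, t).toFloat)       // zero-debiased estimate
  }
}
```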

  69. final def zerosSlot(name: String, variable: variables.Variable, variableScope: String): variables.Variable

    Gets an existing slot or creates a new one using an initial value of zeros, if none exists.

    name

    Slot name.

    variable

    Slot primary variable.

    variableScope

    Name to use when scoping the variable that needs to be created for the slot.

    returns

    Requested slot variable.

    Attributes
    protected
    Definition Classes
    Optimizer

Inherited from GradientDescent

Inherited from Optimizer

Inherited from AnyRef

Inherited from Any
