org.platanios.tensorflow.api.learn

Configuration

case class Configuration(workingDir: Option[Path] = None, sessionConfig: Option[SessionConfig] = None, checkpointConfig: CheckpointConfig = TimeBasedCheckpoints(600, 5, 10000), randomSeed: Option[Int] = None) extends Product with Serializable

Configuration for models in the learn API, to be used by estimators.

If clusterConfig is not provided, then all distributed training related properties are set based on the TF_CONFIG environment variable, if the pertinent information is present. The TF_CONFIG environment variable is a JSON object with attributes: cluster and task.

cluster is a JSON serialized version of ClusterConfig, mapping task types (usually one of the instances of TaskType) to a list of task addresses.

task has two attributes: type and index, where type can be any of the task types in cluster. When TF_CONFIG contains said information, the following properties are set on this class: clusterConfig, taskType, taskIndex, master, numParameterServers, numWorkers, and isChief.

There is a special node with taskType set as EVALUATOR, which is not part of the (training) clusterConfig. It handles the distributed evaluation job.
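To make the shape of TF_CONFIG concrete, here is a minimal sketch that renders such a JSON value from a cluster description and a task. `TfConfigJson` and `render` are hypothetical helpers, not part of the library, and the sketch assumes that job names and addresses contain no characters that need JSON escaping.

```scala
object TfConfigJson {
  // Wraps a value in double quotes. Assumes the value needs no JSON escaping.
  private def quote(s: String): String = "\"" + s + "\""

  /** Renders a TF_CONFIG-shaped JSON object with the `cluster` and `task` attributes. */
  def render(cluster: Seq[(String, Seq[String])], taskType: String, taskIndex: Int): String = {
    // Each job maps a task type to its list of task addresses.
    val jobs = cluster.map { case (job, addresses) =>
      val list = addresses.map(quote).mkString(", ")
      quote(job) + ": [" + list + "]"
    }.mkString(", ")
    s"""{"cluster": {$jobs}, "task": {"type": ${quote(taskType)}, "index": $taskIndex}}"""
  }
}
```

The resulting string would be exported as the TF_CONFIG environment variable of the process that runs the corresponding task.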

Example for a non-chief node:

// The TF_CONFIG environment variable contains:
// {
//   "cluster": {
//     "chief": ["host0:2222"],
//     "ps": ["host1:2222", "host2:2222"],
//     "worker": ["host3:2222", "host4:2222", "host5:2222"]},
//   "task": {
//     "type": "worker",
//     "index": 1}
// }
val config = Configuration()
assert(config.clusterConfig == Some(ClusterConfig(Map(
  "chief" -> JobConfig.fromAddresses("host0:2222"),
  "ps" -> JobConfig.fromAddresses("host1:2222", "host2:2222"),
  "worker" -> JobConfig.fromAddresses("host3:2222", "host4:2222", "host5:2222")))))
assert(config.taskType == "worker")
assert(config.taskIndex == 1)
assert(config.master == "host4:2222")
assert(config.numParameterServers == 2)
assert(config.numWorkers == 4)  // Three workers plus the chief.
assert(!config.isChief)

Example for a chief node:

// The TF_CONFIG environment variable contains:
// {
//   "cluster": {
//     "chief": ["host0:2222"],
//     "ps": ["host1:2222", "host2:2222"],
//     "worker": ["host3:2222", "host4:2222", "host5:2222"]},
//   "task": {
//     "type": "chief",
//     "index": 0}
// }
val config = Configuration()
assert(config.clusterConfig == Some(ClusterConfig(Map(
  "chief" -> JobConfig.fromAddresses("host0:2222"),
  "ps" -> JobConfig.fromAddresses("host1:2222", "host2:2222"),
  "worker" -> JobConfig.fromAddresses("host3:2222", "host4:2222", "host5:2222")))))
assert(config.taskType == "chief")
assert(config.taskIndex == 0)
assert(config.master == "host0:2222")
assert(config.numParameterServers == 2)
assert(config.numWorkers == 4)  // Three workers plus the chief.
assert(config.isChief)

Example for an evaluator node (an evaluator is not part of the training cluster):

// The TF_CONFIG environment variable contains:
// {
//   "cluster": {
//     "chief": ["host0:2222"],
//     "ps": ["host1:2222", "host2:2222"],
//     "worker": ["host3:2222", "host4:2222", "host5:2222"]},
//   "task": {
//     "type": "evaluator",
//     "index": 0}
// }
val config = Configuration()
assert(config.clusterConfig == None)
assert(config.taskType == "evaluator")
assert(config.taskIndex == 0)
assert(config.master == "")
assert(config.numParameterServers == 0)
assert(config.numWorkers == 0)
assert(!config.isChief)
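The three examples above follow one pattern, which can be modelled in plain Scala. The sketch below is not the library's implementation: `Derived`, `TfConfigSketch`, and `derive` are hypothetical names, and the rules it encodes (the chief counts towards numWorkers, master is the task's own address, an evaluator gets empty values) are inferred only from the values asserted in the examples.

```scala
// Hypothetical stand-in for the values that Configuration exposes.
final case class Derived(
    taskType: String,
    taskIndex: Int,
    master: String,
    numParameterServers: Int,
    numWorkers: Int,
    isChief: Boolean,
    inTrainingCluster: Boolean)

object TfConfigSketch {
  /** Derives the per-task values from a cluster map and the task's type and index. */
  def derive(cluster: Map[String, Seq[String]], taskType: String, taskIndex: Int): Derived = {
    if (taskType == "evaluator") {
      // The evaluator is not part of the training cluster, so nothing is read from it.
      Derived(taskType, taskIndex, master = "", numParameterServers = 0,
        numWorkers = 0, isChief = false, inTrainingCluster = false)
    } else {
      val addresses = cluster.getOrElse(taskType, Seq.empty)
      require(taskIndex >= 0 && taskIndex < addresses.length,
        s"No '$taskType' task with index $taskIndex")
      Derived(
        taskType,
        taskIndex,
        master = addresses(taskIndex), // The address of this task itself.
        numParameterServers = cluster.getOrElse("ps", Seq.empty).length,
        // Assumption taken from the examples: the chief is counted as a worker.
        numWorkers = cluster.getOrElse("worker", Seq.empty).length +
          cluster.getOrElse("chief", Seq.empty).length,
        isChief = taskType == "chief",
        inTrainingCluster = true)
    }
  }
}
```

Calling `TfConfigSketch.derive` with the cluster from the examples and the task `("worker", 1)` yields the same master, counts, and chief flag as the non-chief example.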

NOTE: If a checkpointConfig is set, maxCheckpointsToKeep might need to be adjusted accordingly, especially in distributed training. For example, using TimeBasedCheckpoints(60) without adjusting maxCheckpointsToKeep (which defaults to 5) means that each checkpoint is garbage collected 5 minutes after it is written. In distributed training, the evaluation job starts asynchronously and, because of this race, might fail to find or load a checkpoint that has already been deleted.
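The arithmetic behind this note can be sketched as follows. `CheckpointRetention` and its two methods are hypothetical helpers written for illustration. They assume that checkpoints are saved at a fixed interval and that the oldest one is deleted as soon as the limit is exceeded.

```scala
object CheckpointRetention {
  /** Seconds for which a checkpoint survives before it is garbage collected. */
  def retentionSeconds(saveEverySeconds: Int, maxCheckpointsToKeep: Int): Int =
    saveEverySeconds * maxCheckpointsToKeep

  /** Smallest number of checkpoints to keep so that each survives at least `windowSeconds`. */
  def checkpointsNeeded(saveEverySeconds: Int, windowSeconds: Int): Int =
    (windowSeconds + saveEverySeconds - 1) / saveEverySeconds // Ceiling division.
}
```

With a 60-second interval and 5 checkpoints kept, `retentionSeconds(60, 5)` gives 300 seconds, which is the 5-minute window mentioned in the note. If an evaluation job may lag by up to 30 minutes, `checkpointsNeeded(60, 1800)` suggests keeping 30 checkpoints.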

workingDir

Directory used to save model parameters, graph, etc. It can also be used to load checkpoints for a previously saved model. If None, a temporary directory will be used.

sessionConfig

Configuration to use for the created sessions.

checkpointConfig

Configuration specifying when to save checkpoints.

randomSeed

Random seed value to be used by the TensorFlow initializers. Setting this value allows consistency between re-runs.
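The effect of an optional seed can be illustrated with the standard library alone. This sketch does not touch TensorFlow initializers. `SeedDemo` and `draw` are hypothetical names, and `scala.util.Random` stands in for whatever generator the initializers use.

```scala
import scala.util.Random

object SeedDemo {
  /** Draws `n` pseudo-random values. A fixed seed makes the draw repeatable. */
  def draw(seed: Option[Int], n: Int): Vector[Double] = {
    // With Some(seed) the generator is deterministic; with None it is seeded arbitrarily.
    val rng = seed.map(s => new Random(s.toLong)).getOrElse(new Random())
    Vector.fill(n)(rng.nextGaussian())
  }
}
```

Two calls to `SeedDemo.draw(Some(42), 3)` return identical vectors, whereas calls with `None` generally differ between runs.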

Linear Supertypes

Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new Configuration(workingDir: Option[Path] = None, sessionConfig: Option[SessionConfig] = None, checkpointConfig: CheckpointConfig = TimeBasedCheckpoints(600, 5, 10000), randomSeed: Option[Int] = None)

    workingDir

    Directory used to save model parameters, graph, etc. It can also be used to load checkpoints for a previously saved model. If None, a temporary directory will be used.

    sessionConfig

    Configuration to use for the created sessions.

    checkpointConfig

    Configuration specifying when to save checkpoints.

    randomSeed

    Random seed value to be used by the TensorFlow initializers. Setting this value allows consistency between re-runs.
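The fallback described for workingDir can be sketched with `java.nio.file`. How the library actually creates its temporary directory is not stated in this page, so `WorkingDirSketch`, `resolve`, and the directory prefix are assumptions made for illustration.

```scala
import java.nio.file.{Files, Path}

object WorkingDirSketch {
  /** Returns the given directory, or a freshly created temporary one if none was given. */
  def resolve(workingDir: Option[Path]): Path =
    // The prefix is arbitrary; the JVM appends a unique suffix to it.
    workingDir.getOrElse(Files.createTempDirectory("learn-working-dir-"))
}
```

Passing `Some(path)` returns `path` unchanged, while passing `None` creates a new directory on every call, so checkpoints written there are not found again by a later run unless the path is recorded.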

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. val checkpointConfig: CheckpointConfig

    Configuration specifying when to save checkpoints.

  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. val clusterConfig: Option[ClusterConfig]

  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. val evaluationMaster: String

  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. val isChief: Boolean

  13. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  14. val master: String

  15. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. final def notify(): Unit

    Definition Classes
    AnyRef
  17. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  18. val numParameterServers: Int

  19. val numWorkers: Int

  20. val randomSeed: Option[Int]

    Random seed value to be used by the TensorFlow initializers. Setting this value allows consistency between re-runs.

  21. val sessionConfig: Option[SessionConfig]

    Configuration to use for the created sessions.

  22. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  23. val taskIndex: Int

  24. val taskType: String

  25. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  27. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  28. val workingDir: Option[Path]

    Directory used to save model parameters, graph, etc. It can also be used to load checkpoints for a previously saved model. If None, a temporary directory will be used.
