EpsilonGreedyModel

A model that performs epsilon-greedy exploration. It chooses a random action with probability epsilon; otherwise, with probability 1 - epsilon, it uses the action returned by defaultPolicy. The default policy MUST return a value between 1 and the number of actions (inclusive); otherwise an exception is thrown.

A: model input type
B: model output type
modelId: a model identifier
defaultPolicy: the model to use for exploitation. This MUST be deterministic for the probability to be correct. The model must return a value in the range 1 to classLabels.size (inclusive).
epsilon: the exploration/exploitation tradeoff parameter. epsilon must be in the interval [0, 1]. 0 means an action is never selected randomly; 1 means an action is always selected randomly.
salt: a function that generates a salt for the randomization layer. This salt allows the random choice of which policy to follow to be repeatable.
classLabels: the class labels to output as the final type. The size of this sequence determines the number of actions. If the submodel returns a score < 1 or > classLabels.size (scores are 1-based), a RuntimeException is thrown.
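
The behaviour described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: EpsilonGreedySketch, choose, and the plain function types used for policy and salt are hypothetical stand-ins for Submodel and GenAggFunc, and the sketch assumes that exploration is uniform over the actions and that the default policy is consulted on every call.

```scala
import scala.util.Random

object EpsilonGreedySketch {
  /** Picks a label: explore with probability epsilon, otherwise exploit.
    * `policy` stands in for defaultPolicy and returns a 1-based action index.
    */
  def choose[A, N](
      input: A,
      policy: A => Int,
      epsilon: Float,
      salt: A => Long,
      classLabels: IndexedSeq[N]
  ): N = {
    require(epsilon >= 0f && epsilon <= 1f, s"epsilon must be in [0, 1], got $epsilon")
    val k = classLabels.size

    // Assumption of this sketch: the policy is always evaluated, so an
    // out-of-range value is reported even when the random branch is taken.
    val greedy = policy(input)
    if (greedy < 1 || greedy > k)
      throw new RuntimeException(s"policy returned $greedy, expected a value in 1..$k")

    // Seeding with the salt makes the explore/exploit decision repeatable.
    val rng = new Random(salt(input))
    val action =
      if (rng.nextFloat() < epsilon) rng.nextInt(k) + 1 // explore: uniform in 1..k
      else greedy                                       // exploit
    classLabels(action - 1) // convert the 1-based action to a 0-based index
  }
}
```

With epsilon set to 0 the sketch always returns the label chosen by the policy; with epsilon set to 1 it always returns a randomly drawn label.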

Linear Supertypes

Serializable, Serializable, Product, Equals, SubmodelBase[U, N, A, B], Model[A, B], (A) ⇒ B, Submodel[N, A, B], Closeable, AutoCloseable, Identifiable[ModelIdentity], AnyRef, Any

Instance Constructors

new EpsilonGreedyModel(modelId: ModelIdentity, defaultPolicy: Submodel[Int, A, U], epsilon: Float, salt: GenAggFunc[A, Long], classLabels: IndexedSeq[N], auditor: Auditor[U, N, B])

modelId
a model identifier
defaultPolicy
the model to use for exploitation. This MUST be deterministic for the probability to be correct. The model must return a value in the range 1 to classLabels.size (inclusive).
epsilon
the exploration/exploitation tradeoff parameter. epsilon must be in the interval [0, 1]. 0 means an action is never selected randomly; 1 means an action is always selected randomly.
salt
a function that generates a salt for the randomization layer. This salt allows the random choice of which policy to follow to be repeatable.
classLabels
the class labels to output as the final type. The size of this sequence determines the number of actions. If the submodel returns a score < 1 or > classLabels.size (scores are 1-based), a RuntimeException is thrown.

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def andThen[A](g: (B) ⇒ A): (A) ⇒ A

Definition Classes
Function1
Annotations
@unspecialized()
final def apply(a: A): B

Definition Classes
SubmodelBase → Function1
final def asInstanceOf[T0]: T0

Definition Classes
Any
val auditor: Auditor[U, N, B]

Definition Classes
EpsilonGreedyModel → SubmodelBase
val classLabels: IndexedSeq[N]

the class labels to output as the final type. The size of this sequence determines the number of actions. If the submodel returns a score < 1 or > classLabels.size (scores are 1-based), a RuntimeException is thrown.
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def close(): Unit

Definition Classes
EpsilonGreedyModel → SubmodelBase → Closeable → AutoCloseable
def compose[A](g: (A) ⇒ A): (A) ⇒ B

Definition Classes
Function1
Annotations
@unspecialized()
val defaultPolicy: Submodel[Int, A, U]

the model to use for exploitation. This MUST be deterministic for the probability to be correct. The model must return a value in the range 1 to classLabels.size (inclusive).
val epsilon: Float

the exploration/exploitation tradeoff parameter. epsilon must be in the interval [0, 1]. 0 means an action is never selected randomly; 1 means an action is always selected randomly.
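
The requirement that defaultPolicy be deterministic "for the probability to be correct" can be illustrated with a small calculation. The sketch below assumes uniform exploration over k actions, which this page does not state explicitly; ActionProbabilitySketch and prob are hypothetical names, and Double is used instead of Float only to keep the arithmetic readable.

```scala
object ActionProbabilitySketch {
  /** Probability that `action` (1-based) is selected, assuming uniform
    * exploration over k actions and a deterministic policy that picks `greedy`.
    */
  def prob(action: Int, greedy: Int, epsilon: Double, k: Int): Double = {
    require(k > 0 && action >= 1 && action <= k && greedy >= 1 && greedy <= k)
    require(epsilon >= 0.0 && epsilon <= 1.0)
    val explore = epsilon / k // share each action receives through random exploration
    if (action == greedy) (1 - epsilon) + explore else explore
  }
}
```

If the default policy were itself random, the greedy action would vary between calls on the same input, and a single value computed this way would no longer describe the true selection probability.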
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
lazy val explorer: EpsilonGreedyExplorer[Int]
def failure(errorMsgs: ⇒ Seq[String] = Nil, missingVarNames: ⇒ Set[String] = Set.empty, subvalues: Seq[U] = Nil): Subvalue[B, N]

Attributes
protected[this]
Definition Classes
SubmodelBase
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val modelId: ModelIdentity

a model identifier

Definition Classes
EpsilonGreedyModel → Identifiable
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
val salt: GenAggFunc[A, Long]

a function that generates a salt for the randomization layer. This salt allows the random choice of which policy to follow to be repeatable.
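
As a rough illustration of why a salt makes the choice repeatable: if the random generator is seeded from a value derived from the input, the same input always produces the same explore/exploit decision. SaltSketch, its salt function, and the use of scala.util.Random are assumptions made for this sketch; the actual GenAggFunc[A, Long] and randomization layer are not shown on this page.

```scala
import scala.util.Random

object SaltSketch {
  // Hypothetical salt: derives a Long from the input identifier.
  def salt(userId: String): Long = userId.hashCode.toLong

  /** True when the seeded draw falls below epsilon, i.e. the random policy is followed. */
  def explores(userId: String, epsilon: Float): Boolean =
    new Random(salt(userId)).nextFloat() < epsilon
}
```

Calling explores twice with the same userId and epsilon returns the same answer, because both calls start from the same seed.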
def subvalue(a: A): Subvalue[B, N]

Definition Classes
EpsilonGreedyModel → Submodel
def success(naturalValue: N, errorMsgs: ⇒ Seq[String] = Nil, missingVarNames: ⇒ Set[String] = Set.empty, subvalues: Seq[U] = Nil, prob: ⇒ Option[Float] = None): Subvalue[B, N]

Attributes
protected[this]
Definition Classes
SubmodelBase
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
Function1 → AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Docs: object EpsilonGreedyModel | package exploration

case class EpsilonGreedyModel[U, N, -A, B <: U](modelId: ModelIdentity, defaultPolicy: Submodel[Int, A, U], epsilon: Float, salt: GenAggFunc[A, Long], classLabels: IndexedSeq[N], auditor: Auditor[U, N, B]) extends SubmodelBase[U, N, A, B] with Product with Serializable

