A model for performing bootstrap-style exploration.
A model that performs epsilon-greedy-style exploration.
A model that performs epsilon-greedy-style exploration. It chooses a random action with probability epsilon, or the action returned by the defaultPolicy with probability 1 - epsilon. Note that the default policy MUST return a value between 1 and the number of actions (inclusive); otherwise an exception will be thrown.
model input type
model output type
a model identifier
the model to use for exploitation. This MUST be deterministic for the probability to be correct. The model must return a value in the range 1 to classLabels.size (inclusive).
the exploration/exploitation tradeoff parameter. epsilon must be in the interval [0, 1]. A value of 0 means an action is never selected randomly; a value of 1 means an action is always selected randomly.
a function that generates a salt for the randomization layer. This salt allows the random choice of which policy to follow to be repeatable.
a list of class labels to output for the final type. The size of this list determines the number of actions. If the submodel returns a score < 1 or > classLabels.size (note the 1 offset), a RuntimeException will be thrown.
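The epsilon-greedy behaviour described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the library's implementation: the function name `epsilon_greedy`, its argument order, the use of `random.Random` seeded with the salt, and the probability formula (the standard epsilon-greedy propensity, epsilon / n plus 1 - epsilon when the chosen action matches the default policy) are all assumptions made for illustration.

```python
import random


def epsilon_greedy(default_policy, epsilon, salt, class_labels, x):
    """Hypothetical sketch: return (label, probability) for input x.

    default_policy(x) must return a 1-based action index and be deterministic.
    salt(x) must return an int used to seed the random choice.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must be in the interval [0, 1]")

    n = len(class_labels)
    greedy = default_policy(x)
    if greedy < 1 or greedy > n:
        # Mirrors the documented contract: out-of-range scores are an error.
        raise RuntimeError(f"policy returned {greedy}; expected 1..{n}")

    # Seeding with the salt makes the explore/exploit choice repeatable.
    rng = random.Random(salt(x))
    if rng.random() < epsilon:
        action = rng.randint(1, n)  # explore: uniform over all actions
    else:
        action = greedy             # exploit: follow the default policy

    # Assumed propensity of the chosen action under epsilon-greedy.
    p = epsilon / n + ((1.0 - epsilon) if action == greedy else 0.0)

    # Actions are 1-based, list indices are 0-based (the "1 offset").
    return class_labels[action - 1], p
```

With epsilon set to 0 the sketch always returns the default policy's label with probability 1; with epsilon set to 1 every action has probability 1 / n. The probability is only meaningful if `default_policy` is deterministic, which is why the documentation insists on it.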
A model for performing bootstrap-style exploration. This makes use of a number of policies. The algorithm chooses one policy and then uses the others to calculate the appropriate probability of choosing that action. Note that the models MUST return a value between 1 and the number of actions (inclusive); otherwise an exception will be thrown.
model input type
model output type
a model identifier
a set of models that generate Ints. These models MUST be deterministic for the probability to be correct. Each model must return a value in the range 1 to classLabels.size (inclusive).
a function that generates a salt for the randomization layer. This salt allows the random choice of which policy to follow to be repeatable.
a list of class labels to output for the final type. The size of this list determines the number of actions. If the submodel returns a score < 1 or > classLabels.size (note the 1 offset), a RuntimeException will be thrown.
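A rough sketch of the bootstrap scheme described above follows, again in Python and under stated assumptions rather than as the library's actual code. The name `bootstrap_explore`, the uniform choice among policies, and the probability estimate (the fraction of policies that return the chosen action) are illustrative assumptions; the documentation only says that the remaining policies are used to calculate the probability.

```python
import random


def bootstrap_explore(policies, salt, class_labels, x):
    """Hypothetical sketch: return (label, probability) for input x.

    Each policy(x) must return a 1-based action index and be deterministic.
    salt(x) must return an int used to seed the choice of policy.
    """
    if not policies:
        raise ValueError("at least one policy is required")

    n = len(class_labels)
    actions = [policy(x) for policy in policies]
    for a in actions:
        if a < 1 or a > n:
            # Mirrors the documented contract: out-of-range scores are an error.
            raise RuntimeError(f"policy returned {a}; expected 1..{n}")

    # Seeding with the salt makes the choice of policy repeatable.
    rng = random.Random(salt(x))
    chosen = actions[rng.randrange(len(actions))]

    # Assumed estimate: share of policies that agree with the chosen action.
    p = actions.count(chosen) / len(actions)

    # Actions are 1-based, list indices are 0-based (the "1 offset").
    return class_labels[chosen - 1], p
```

If every policy agrees, the chosen action gets probability 1. If the policies disagree, the probability reflects how many of them would have picked the same action, which is only well defined when each policy is deterministic.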