public class QAgent extends java.lang.Object implements RlAgent
An RlAgent that implements Q or Deep-Q Learning.
Deep-Q Learning estimates the total reward that will be received from a particular state, after taking a particular action, until the environment ends. It is then trained by ensuring that the prediction made before taking the action matches what would be predicted after taking the action, once the reward actually received is accounted for. More information can be found in the paper.
It is one of the earliest successful techniques for reinforcement learning with deep learning, and it is a good introduction to the field. However, better-performing techniques are now in common use.
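The training rule described above can be illustrated with a small tabular sketch. This is the standard textbook form of the Q-learning target, not the code used by QAgent; the class name `QTargetSketch`, the method names, and the `learningRate` parameter are assumptions introduced for illustration only.

```java
/** Illustrative sketch of the Q-learning target; hypothetical names, not the QAgent implementation. */
final class QTargetSketch {

    private QTargetSketch() {}

    /**
     * Computes the value the prediction for (state, action) should match:
     * the reward received plus the discounted best prediction in the next state.
     */
    static float target(float reward, float[] nextQValues, boolean terminal, float rewardDiscount) {
        if (terminal) {
            // No further rewards can be collected once the environment has ended.
            return reward;
        }
        float best = Float.NEGATIVE_INFINITY;
        for (float q : nextQValues) {
            best = Math.max(best, q);
        }
        return reward + rewardDiscount * best;
    }

    /** Moves the old prediction a fraction (learningRate) of the way toward the target. */
    static float update(float oldQ, float target, float learningRate) {
        return oldQ + learningRate * (target - oldQ);
    }

    public static void main(String[] args) {
        float[] nextQ = {1.0f, 3.0f, 2.0f};
        float t = target(1.0f, nextQ, false, 0.5f);
        System.out.println(t);                      // prints 2.5
        System.out.println(update(0.5f, t, 0.5f));  // prints 1.5
    }
}
```

In Deep-Q Learning the table lookup is replaced by a neural network, and the update step becomes a gradient step on the difference between the prediction and the target.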
Constructor and Description |
---|
QAgent(Trainer trainer, float rewardDiscount) Constructs a QAgent. |
QAgent(Trainer trainer, float rewardDiscount, Batchifier batchifier) Constructs a QAgent with a custom Batchifier. |
Modifier and Type | Method and Description |
---|---|
NDList | chooseAction(RlEnv env, boolean training) Chooses the next action to take within the RlEnv. |
void | trainBatch(RlEnv.Step[] batchSteps) Trains this RlAgent on a batch of RlEnv.Steps. |
public QAgent(Trainer trainer, float rewardDiscount)
Constructs a QAgent.
It uses the StackBatchifier as the default batchifier.
Parameters:
trainer - the trainer for the model to learn
rewardDiscount - the reward discount to apply to rewards from future states

public QAgent(Trainer trainer, float rewardDiscount, Batchifier batchifier)
Constructs a QAgent with a custom Batchifier.
Parameters:
trainer - the trainer for the model to learn
rewardDiscount - the reward discount to apply to rewards from future states
batchifier - the batchifier to join inputs with

public NDList chooseAction(RlEnv env, boolean training)
Chooses the next action to take within the RlEnv.
Specified by:
chooseAction in interface RlAgent
Parameters:
env - the current environment
training - true if the agent is currently training

public void trainBatch(RlEnv.Step[] batchSteps)
Trains this RlAgent on a batch of RlEnv.Steps.
Specified by:
trainBatch in interface RlAgent
Parameters:
batchSteps - the steps to train on
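
As a rough illustration of what the rewardDiscount constructor parameter controls, the sketch below computes a discounted sum of rewards over one episode. It is a plain-Java example under stated assumptions, not part of the QAgent API: the class `DiscountSketch` and the method `discountedReturn` are hypothetical names.

```java
/** Illustrative sketch of reward discounting; hypothetical names, not part of the QAgent API. */
final class DiscountSketch {

    private DiscountSketch() {}

    /**
     * Sums the rewards of an episode, weighting the reward received k steps in the
     * future by rewardDiscount raised to the power k.
     */
    static float discountedReturn(float[] rewards, float rewardDiscount) {
        float total = 0f;
        // Work backwards so each step adds its own reward to the discounted future total.
        for (int i = rewards.length - 1; i >= 0; i--) {
            total = rewards[i] + rewardDiscount * total;
        }
        return total;
    }

    public static void main(String[] args) {
        float[] rewards = {1f, 1f, 1f};
        System.out.println(discountedReturn(rewards, 0.5f)); // prints 1.75
        System.out.println(discountedReturn(rewards, 0f));   // prints 1.0
        System.out.println(discountedReturn(rewards, 1f));   // prints 3.0
    }
}
```

A discount close to 0 makes the agent value only immediate rewards, while a discount close to 1 weights rewards from future states almost as heavily as the current one.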