QAgent (Deep Java Library 0.18.0 API specification)

java.lang.Object
- ai.djl.modality.rl.agent.QAgent

All Implemented Interfaces:

RlAgent
```
public class QAgent
extends java.lang.Object
implements RlAgent
```
An RlAgent that implements Q or Deep-Q Learning.
Deep-Q Learning estimates the total reward that will be received, from a particular state until the environment ends, after taking a particular action. It is trained by ensuring that the prediction made before taking the action matches what would be predicted after taking the action. More information can be found in the paper.
It is one of the earliest successful techniques for reinforcement learning with deep learning, and it is a good introduction to the field. However, many better techniques are commonly used now.
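
The training rule described above can be written as a worked equation. This is the textbook Q-learning formulation, offered as a sketch rather than a statement of what this class computes internally: the symbols are assumptions introduced here, with \gamma standing for the `rewardDiscount` constructor argument, and the squared error is only one possible loss (the loss actually applied is presumably the one configured on the `Trainer`).

$$
y_t =
\begin{cases}
r_t & \text{if the environment ends after step } t \\
r_t + \gamma \max_{a'} Q(s_{t+1}, a') & \text{otherwise}
\end{cases}
\qquad
\mathcal{L}_t = \bigl(Q(s_t, a_t) - y_t\bigr)^2
$$

For example, with r_t = 1, \gamma = 0.9 and a best next-state estimate of 2, the target is y_t = 1 + 0.9 \cdot 2 = 2.8, so a prediction Q(s_t, a_t) = 2 gives a squared error of 0.64.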

Constructor Summary

Constructors
Constructor	Description
`QAgent(Trainer trainer, float rewardDiscount)`	Constructs a `QAgent`.
`QAgent(Trainer trainer, float rewardDiscount, Batchifier batchifier)`	Constructs a `QAgent` with a custom `Batchifier`.

Method Summary

Modifier and Type	Method	Description
`NDList`	`chooseAction(RlEnv env, boolean training)`	Chooses the next action to take within the `RlEnv`.
`void`	`trainBatch(RlEnv.Step[] batchSteps)`	Trains this `RlAgent` on a batch of `RlEnv.Step`s.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - QAgent
```
public QAgent(Trainer trainer,
              float rewardDiscount)
```
    Constructs a QAgent.
    It uses the StackBatchifier as the default batchifier.
    
    Parameters:
    
    trainer - the trainer for the model to learn
    
    rewardDiscount - the reward discount to apply to rewards from future states
  - QAgent
```
public QAgent(Trainer trainer,
              float rewardDiscount,
              Batchifier batchifier)
```
    Constructs a QAgent with a custom Batchifier.
    
    Parameters:
    
    trainer - the trainer for the model to learn
    
    rewardDiscount - the reward discount to apply to rewards from future states
    
    batchifier - the batchifier to join inputs with
- Method Detail
  - chooseAction
```
public NDList chooseAction(RlEnv env,
                           boolean training)
```
    Chooses the next action to take within the RlEnv.
    
    Specified by:
    
    chooseAction in interface RlAgent
    
    Parameters:
    
    env - the current environment
    
    training - true if the agent is currently training
    
    Returns:
    
    the action to take
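
    As a rough illustration of what choosing an action from Q-value estimates can look like, the sketch below picks the index of the highest estimate. It assumes a plain greedy choice over a `float[]` of per-action estimates; `GreedyActionSketch` and `chooseGreedy` are hypothetical names that are not part of DJL, and how this class uses the `training` flag is not shown.
```java
/** Hypothetical illustration only; not part of the DJL API. */
final class GreedyActionSketch {

    private GreedyActionSketch() {}

    /**
     * Returns the index of the largest Q-value estimate.
     * Ties are resolved in favor of the lowest index.
     */
    static int chooseGreedy(float[] qValues) {
        if (qValues == null || qValues.length == 0) {
            throw new IllegalArgumentException("at least one action is required");
        }
        int best = 0;
        for (int i = 1; i < qValues.length; i++) {
            // Strict comparison keeps the first of several equal maxima.
            if (qValues[i] > qValues[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumed estimates for three candidate actions.
        float[] estimates = {0.1f, 0.7f, 0.3f};
        System.out.println("chosen action index: " + chooseGreedy(estimates));
    }
}
```
    A purely greedy choice never explores, so in practice some exploration strategy is usually layered on top while training.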
  - trainBatch
```
public void trainBatch(RlEnv.Step[] batchSteps)
```
    Trains this RlAgent on a batch of RlEnv.Steps.
    
    Specified by:
    
    trainBatch in interface RlAgent
    
    Parameters:
    
    batchSteps - the steps to train on
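
    To make the idea of training on a batch of steps concrete, the sketch below computes a mean squared error between the Q-value predicted before each action and a discounted target built from the predictions after it. It is a plain-Java approximation under stated assumptions, not this method's implementation: `QBatchSketch` and its nested `Step` are hypothetical stand-ins (the latter is not `RlEnv.Step`), and squared error is an assumed loss.
```java
/** Hypothetical illustration only; not part of the DJL API. */
final class QBatchSketch {

    private QBatchSketch() {}

    /** Hypothetical stand-in for one recorded step; not RlEnv.Step. */
    static final class Step {
        final float predictedQ; // estimate for the action taken, made before acting
        final float reward;     // reward received for the action
        final float[] nextQ;    // estimates for every action in the following state
        final boolean done;     // true if the environment ended after this step

        Step(float predictedQ, float reward, float[] nextQ, boolean done) {
            this.predictedQ = predictedQ;
            this.reward = reward;
            this.nextQ = nextQ;
            this.done = done;
        }
    }

    /** Target value: the reward plus the discounted best estimate of the next state. */
    static float target(Step step, float rewardDiscount) {
        if (step.done) {
            return step.reward; // no future reward after the environment ends
        }
        if (step.nextQ == null || step.nextQ.length == 0) {
            throw new IllegalArgumentException("non-terminal step needs next-state estimates");
        }
        float best = Float.NEGATIVE_INFINITY;
        for (float q : step.nextQ) {
            best = Math.max(best, q);
        }
        return step.reward + rewardDiscount * best;
    }

    /** Mean squared difference between each prediction and its target. */
    static float meanSquaredError(Step[] batch, float rewardDiscount) {
        if (batch == null || batch.length == 0) {
            throw new IllegalArgumentException("batch must not be empty");
        }
        float sum = 0f;
        for (Step step : batch) {
            float error = step.predictedQ - target(step, rewardDiscount);
            sum += error * error;
        }
        return sum / batch.length;
    }

    public static void main(String[] args) {
        Step[] batch = {
            new Step(2.0f, 1.0f, new float[] {0.5f, 2.0f}, false),
            new Step(1.0f, 1.0f, new float[0], true)
        };
        System.out.println("loss: " + meanSquaredError(batch, 0.9f));
    }
}
```
    A real agent would backpropagate such a loss through the model, which this sketch leaves out.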