Class ReinforcementHyperparameters
-
public final class ReinforcementHyperparameters
The hyperparameters used for the reinforcement fine-tuning job.
-
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
public final class
ReinforcementHyperparameters.Builder
A builder for ReinforcementHyperparameters.
public final class
ReinforcementHyperparameters.BatchSize
Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
public final class
ReinforcementHyperparameters.ComputeMultiplier
Multiplier on the amount of compute used for exploring the search space during training.
public final class
ReinforcementHyperparameters.EvalInterval
The number of training steps between evaluation runs.
public final class
ReinforcementHyperparameters.EvalSamples
Number of evaluation samples to generate per training step.
public final class
ReinforcementHyperparameters.LearningRateMultiplier
Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
public final class
ReinforcementHyperparameters.NEpochs
The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
public final class
ReinforcementHyperparameters.ReasoningEffort
Level of reasoning effort.
-
Method Detail
-
batchSize
final Optional<ReinforcementHyperparameters.BatchSize> batchSize()
Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
-
computeMultiplier
final Optional<ReinforcementHyperparameters.ComputeMultiplier> computeMultiplier()
Multiplier on the amount of compute used for exploring the search space during training.
-
evalInterval
final Optional<ReinforcementHyperparameters.EvalInterval> evalInterval()
The number of training steps between evaluation runs.
-
evalSamples
final Optional<ReinforcementHyperparameters.EvalSamples> evalSamples()
Number of evaluation samples to generate per training step.
-
learningRateMultiplier
final Optional<ReinforcementHyperparameters.LearningRateMultiplier> learningRateMultiplier()
Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
-
nEpochs
final Optional<ReinforcementHyperparameters.NEpochs> nEpochs()
The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
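To make the interplay of batch size, epoch count, and evaluation interval concrete, the sketch below computes how many parameter updates and evaluation runs a job would perform. It assumes one update per batch and one evaluation every evalInterval steps, which may not match how the service schedules reinforcement training. The class StepMath, its method names, and the dataset size are illustrative and not part of the SDK.

```java
public class StepMath {
    // Batches needed to cover the dataset once (ceiling division).
    public static long stepsPerEpoch(long datasetSize, long batchSize) {
        return (datasetSize + batchSize - 1) / batchSize;
    }

    // Total parameter updates, assuming one update per batch.
    public static long totalSteps(long datasetSize, long batchSize, long nEpochs) {
        return stepsPerEpoch(datasetSize, batchSize) * nEpochs;
    }

    // Evaluation runs, assuming one run after every evalInterval steps.
    public static long evalRuns(long totalSteps, long evalInterval) {
        return totalSteps / evalInterval;
    }

    public static void main(String[] args) {
        // Hypothetical job: 1000 examples, batch size 32, 3 epochs.
        long steps = totalSteps(1000, 32, 3);
        System.out.println(steps);               // 96 (32 steps per epoch * 3)
        System.out.println(evalRuns(steps, 10)); // 9
    }
}
```

Doubling the batch size to 64 in this model halves the updates per epoch from 32 to 16, which is the "less frequently" trade-off described for batchSize.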
-
reasoningEffort
final Optional<ReinforcementHyperparameters.ReasoningEffort> reasoningEffort()
Level of reasoning effort.
-
_batchSize
final JsonField<ReinforcementHyperparameters.BatchSize> _batchSize()
Returns the raw JSON value of batchSize.
Unlike batchSize, this method doesn't throw if the JSON field has an unexpected type.
-
_computeMultiplier
final JsonField<ReinforcementHyperparameters.ComputeMultiplier> _computeMultiplier()
Returns the raw JSON value of computeMultiplier.
Unlike computeMultiplier, this method doesn't throw if the JSON field has an unexpected type.
-
_evalInterval
final JsonField<ReinforcementHyperparameters.EvalInterval> _evalInterval()
Returns the raw JSON value of evalInterval.
Unlike evalInterval, this method doesn't throw if the JSON field has an unexpected type.
-
_evalSamples
final JsonField<ReinforcementHyperparameters.EvalSamples> _evalSamples()
Returns the raw JSON value of evalSamples.
Unlike evalSamples, this method doesn't throw if the JSON field has an unexpected type.
-
_learningRateMultiplier
final JsonField<ReinforcementHyperparameters.LearningRateMultiplier> _learningRateMultiplier()
Returns the raw JSON value of learningRateMultiplier.
Unlike learningRateMultiplier, this method doesn't throw if the JSON field has an unexpected type.
-
_nEpochs
final JsonField<ReinforcementHyperparameters.NEpochs> _nEpochs()
Returns the raw JSON value of nEpochs.
Unlike nEpochs, this method doesn't throw if the JSON field has an unexpected type.
-
_reasoningEffort
final JsonField<ReinforcementHyperparameters.ReasoningEffort> _reasoningEffort()
Returns the raw JSON value of reasoningEffort.
Unlike reasoningEffort, this method doesn't throw if the JSON field has an unexpected type.
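The difference between a typed accessor and its underscore-prefixed raw counterpart can be illustrated with a small stand-in model. This is a sketch of the general pattern only: RawField and its methods are hypothetical and are not the SDK's JsonField API, and the exception type is chosen for illustration.

```java
import java.util.Optional;

public class RawVsTyped {
    /** Hypothetical stand-in for a JSON field holding an arbitrary decoded value. */
    public static final class RawField {
        private final Object value; // null means the field was absent

        public RawField(Object value) {
            this.value = value;
        }

        /** Raw view: never throws, whatever the payload contained. */
        public Object raw() {
            return value;
        }

        /** Typed view: empty if absent, throws if the value is not a Long. */
        public Optional<Long> asLong() {
            if (value == null) {
                return Optional.empty();
            }
            if (value instanceof Long) {
                return Optional.of((Long) value);
            }
            throw new IllegalStateException("expected integer, got: " + value);
        }
    }

    public static void main(String[] args) {
        RawField ok = new RawField(8L);
        RawField odd = new RawField("eight");
        System.out.println(ok.asLong().get()); // 8
        System.out.println(odd.raw());         // eight
        try {
            odd.asLong();
        } catch (IllegalStateException e) {
            System.out.println("typed accessor threw");
        }
    }
}
```

The raw view is useful when a response may carry a value the typed model does not recognize and the caller would rather inspect it than fail.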
-
_additionalProperties
final Map<String, JsonValue> _additionalProperties()
-
toBuilder
final ReinforcementHyperparameters.Builder toBuilder()
-
validate
final ReinforcementHyperparameters validate()
-
builder
static final ReinforcementHyperparameters.Builder builder()
Returns a mutable builder for constructing an instance of ReinforcementHyperparameters.
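The builder() and toBuilder() methods follow the common immutable-object-plus-builder pattern, which the self-contained sketch below models. HyperparamsSketch and its setters are hypothetical stand-ins; the real setter names and parameter types are defined on ReinforcementHyperparameters.Builder and are not reproduced here. The sketch also assumes that toBuilder() returns a builder pre-populated with the instance's current values, which this page does not state.

```java
import java.util.Optional;

public final class HyperparamsSketch {
    private final Long nEpochs;                  // null when unset
    private final Double learningRateMultiplier; // null when unset

    private HyperparamsSketch(Long nEpochs, Double learningRateMultiplier) {
        this.nEpochs = nEpochs;
        this.learningRateMultiplier = learningRateMultiplier;
    }

    // Optional accessors: empty when the value was never set.
    public Optional<Long> nEpochs() {
        return Optional.ofNullable(nEpochs);
    }

    public Optional<Double> learningRateMultiplier() {
        return Optional.ofNullable(learningRateMultiplier);
    }

    public static Builder builder() {
        return new Builder();
    }

    // Assumed behavior: copy current values into a fresh mutable builder.
    public Builder toBuilder() {
        return new Builder()
                .nEpochs(nEpochs)
                .learningRateMultiplier(learningRateMultiplier);
    }

    public static final class Builder {
        private Long nEpochs;
        private Double learningRateMultiplier;

        public Builder nEpochs(Long value) {
            this.nEpochs = value;
            return this;
        }

        public Builder learningRateMultiplier(Double value) {
            this.learningRateMultiplier = value;
            return this;
        }

        public HyperparamsSketch build() {
            return new HyperparamsSketch(nEpochs, learningRateMultiplier);
        }
    }

    public static void main(String[] args) {
        HyperparamsSketch base = builder().nEpochs(3L).build();
        HyperparamsSketch tuned = base.toBuilder().learningRateMultiplier(0.5).build();
        System.out.println(tuned.nEpochs().orElse(-1L));               // 3
        System.out.println(base.learningRateMultiplier().isPresent()); // false
    }
}
```

Because the built object is immutable in this model, deriving a variant through toBuilder() leaves the original instance unchanged.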
-