Class ReinforcementHyperparameters
-
public final class ReinforcementHyperparameters
The hyperparameters used for the reinforcement fine-tuning job.
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
public final class | ReinforcementHyperparameters.Builder | A builder for ReinforcementHyperparameters.
public final class | ReinforcementHyperparameters.BatchSize | Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
public final class | ReinforcementHyperparameters.ComputeMultiplier | Multiplier on amount of compute used for exploring search space during training.
public final class | ReinforcementHyperparameters.EvalInterval | The number of training steps between evaluation runs.
public final class | ReinforcementHyperparameters.EvalSamples | Number of evaluation samples to generate per training step.
public final class | ReinforcementHyperparameters.LearningRateMultiplier | Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
public final class | ReinforcementHyperparameters.NEpochs | The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
public final class | ReinforcementHyperparameters.ReasoningEffort | Level of reasoning effort.
-
Method Detail
-
batchSize
final Optional<ReinforcementHyperparameters.BatchSize> batchSize()
Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
-
computeMultiplier
final Optional<ReinforcementHyperparameters.ComputeMultiplier> computeMultiplier()
Multiplier on amount of compute used for exploring search space during training.
-
evalInterval
final Optional<ReinforcementHyperparameters.EvalInterval> evalInterval()
The number of training steps between evaluation runs.
-
evalSamples
final Optional<ReinforcementHyperparameters.EvalSamples> evalSamples()
Number of evaluation samples to generate per training step.
-
learningRateMultiplier
final Optional<ReinforcementHyperparameters.LearningRateMultiplier> learningRateMultiplier()
Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
-
nEpochs
final Optional<ReinforcementHyperparameters.NEpochs> nEpochs()
The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
-
reasoningEffort
final Optional<ReinforcementHyperparameters.ReasoningEffort> reasoningEffort()
Level of reasoning effort.
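Each typed accessor above returns an Optional, which is empty when the field was not set. The following is a minimal, self-contained sketch of how a caller might consume that shape. HyperparametersSketch and describe are hypothetical stand-ins invented for illustration; they are not SDK classes, and the sketch models values as plain Long and String rather than the nested union types listed above.

```java
import java.util.Optional;

// Hypothetical stand-in for illustration only; not the SDK class.
final class HyperparametersSketch {
    private final Long nEpochs;           // null models "field not set"
    private final String reasoningEffort; // null models "field not set"

    HyperparametersSketch(Long nEpochs, String reasoningEffort) {
        this.nEpochs = nEpochs;
        this.reasoningEffort = reasoningEffort;
    }

    // Mirrors the Optional-returning accessor shape documented above.
    Optional<Long> nEpochs() {
        return Optional.ofNullable(nEpochs);
    }

    Optional<String> reasoningEffort() {
        return Optional.ofNullable(reasoningEffort);
    }

    // Builds a summary string, falling back to a label for unset fields.
    static String describe(HyperparametersSketch h) {
        String epochs = h.nEpochs().map(n -> n + " epochs").orElse("default epochs");
        String effort = h.reasoningEffort().orElse("default effort");
        return epochs + ", " + effort;
    }
}
```

Handling the empty case with map and orElse avoids calling get() on an Optional that may be empty.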
-
_batchSize
final JsonField<ReinforcementHyperparameters.BatchSize> _batchSize()
Returns the raw JSON value of batchSize.
Unlike batchSize, this method doesn't throw if the JSON field has an unexpected type.
-
_computeMultiplier
final JsonField<ReinforcementHyperparameters.ComputeMultiplier> _computeMultiplier()
Returns the raw JSON value of computeMultiplier.
Unlike computeMultiplier, this method doesn't throw if the JSON field has an unexpected type.
-
_evalInterval
final JsonField<ReinforcementHyperparameters.EvalInterval> _evalInterval()
Returns the raw JSON value of evalInterval.
Unlike evalInterval, this method doesn't throw if the JSON field has an unexpected type.
-
_evalSamples
final JsonField<ReinforcementHyperparameters.EvalSamples> _evalSamples()
Returns the raw JSON value of evalSamples.
Unlike evalSamples, this method doesn't throw if the JSON field has an unexpected type.
-
_learningRateMultiplier
final JsonField<ReinforcementHyperparameters.LearningRateMultiplier> _learningRateMultiplier()
Returns the raw JSON value of learningRateMultiplier.
Unlike learningRateMultiplier, this method doesn't throw if the JSON field has an unexpected type.
-
_nEpochs
final JsonField<ReinforcementHyperparameters.NEpochs> _nEpochs()
Returns the raw JSON value of nEpochs.
Unlike nEpochs, this method doesn't throw if the JSON field has an unexpected type.
-
_reasoningEffort
final JsonField<ReinforcementHyperparameters.ReasoningEffort> _reasoningEffort()
Returns the raw JSON value of reasoningEffort.
Unlike reasoningEffort, this method doesn't throw if the JSON field has an unexpected type.
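The difference between the typed accessors and the underscore-prefixed raw accessors can be sketched as follows. RawFieldSketch is a hypothetical stand-in, not the SDK's JsonField, and IllegalStateException is used here only as a placeholder for whatever exception the SDK's typed accessors throw.

```java
import java.util.Optional;

// Hypothetical stand-in for a JSON-backed field; not the SDK's JsonField.
final class RawFieldSketch {
    private final Object raw; // value as decoded from JSON; null when absent

    RawFieldSketch(Object raw) {
        this.raw = raw;
    }

    // Raw accessor: returns the decoded value without checking its type.
    Object rawValue() {
        return raw;
    }

    // Typed accessor: empty when absent, throws when the type is unexpected.
    Optional<Long> asLong() {
        if (raw == null) {
            return Optional.empty();
        }
        if (raw instanceof Number) {
            return Optional.of(((Number) raw).longValue());
        }
        throw new IllegalStateException("expected a number but got: " + raw);
    }

    // Returns true when the typed accessor rejects the stored value.
    static boolean typedAccessThrows(RawFieldSketch f) {
        try {
            f.asLong();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }
}
```

In this sketch, a field holding the string "sixteen" is rejected by asLong() but is still readable through rawValue(), which is the situation the raw accessors are meant for.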
-
_additionalProperties
final Map<String, JsonValue> _additionalProperties()
-
toBuilder
final ReinforcementHyperparameters.Builder toBuilder()
-
validate
final ReinforcementHyperparameters validate()
-
builder
final static ReinforcementHyperparameters.Builder builder()
Returns a mutable builder for constructing an instance of ReinforcementHyperparameters.
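The builder() and toBuilder() pair follows the usual immutable-object-with-mutable-builder shape. The sketch below illustrates that shape under stated assumptions: BuiltHyperparametersSketch and its setter names are hypothetical and written for this example, and the sketch does not reproduce the SDK's actual builder methods or overloads.

```java
// Hypothetical stand-in showing the builder/toBuilder shape; not the SDK class.
final class BuiltHyperparametersSketch {
    private final Long evalInterval; // null models "not set"
    private final Long evalSamples;  // null models "not set"

    private BuiltHyperparametersSketch(Builder b) {
        this.evalInterval = b.evalInterval;
        this.evalSamples = b.evalSamples;
    }

    Long evalInterval() {
        return evalInterval;
    }

    Long evalSamples() {
        return evalSamples;
    }

    // Entry point: returns a fresh mutable builder with nothing set.
    static Builder builder() {
        return new Builder();
    }

    // Copies the current values into a new mutable builder.
    Builder toBuilder() {
        Builder b = new Builder();
        b.evalInterval = evalInterval;
        b.evalSamples = evalSamples;
        return b;
    }

    static final class Builder {
        private Long evalInterval;
        private Long evalSamples;

        Builder evalInterval(long value) {
            this.evalInterval = value;
            return this;
        }

        Builder evalSamples(long value) {
            this.evalSamples = value;
            return this;
        }

        // Produces an immutable snapshot of the builder's current state.
        BuiltHyperparametersSketch build() {
            return new BuiltHyperparametersSketch(this);
        }
    }
}
```

With this shape, calling toBuilder() on an existing instance, changing one field, and calling build() yields a modified copy while the original instance stays unchanged.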