Package com.judgmentlabs.judgeval
Class JudgmentClient
java.lang.Object
com.judgmentlabs.judgeval.JudgmentClient
Main client for running evaluations with Judgment Labs.
The JudgmentClient provides functionality to:
- Run evaluations with multiple examples and scorers
- Validate inputs and scorer configurations
- Poll for evaluation results
- Assert test results for automated testing
Basic Usage
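import java.util.Arrays;
import java.util.List;

JudgmentClient client = new JudgmentClient(apiKey, organizationId);

// One example: the input, the output actually produced, and the expected output
List<Example> examples = Arrays.asList(
    Example.builder()
        .input("What is 2+2?")
        .actualOutput("4")
        .expectedOutput("4")
        .build());

// Scorers applied to every example
List<BaseScorer> scorers = Arrays.asList(
    AnswerCorrectnessScorer.create(0.8));

// Arguments: examples, scorers, projectName, evalRunName, model, assertTest
List<ScoringResult> results = client.runEvaluation(
    examples, scorers, "my-project", "test-run", "gpt-4", false);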
Test Mode
// Enable test assertions by passing assertTest = true.
// The call throws JudgmentTestError if any test fails.
List<ScoringResult> results = client.runEvaluation(
    examples, scorers, "my-project", "test-run", "gpt-4", true);
Constructor Summary
Constructors
Method Summary
Modifier and Type      Method / Description
List<ScoringResult>    runEvaluation(Example example, BaseScorer scorer, String projectName, String evalRunName)
List<ScoringResult>    runEvaluation(Example example, BaseScorer scorer, String projectName, String evalRunName, String model)
List<ScoringResult>    runEvaluation(List<Example> examples, List<BaseScorer> scorers, String projectName, String evalRunName)
List<ScoringResult>    runEvaluation(List<Example> examples, List<BaseScorer> scorers, String projectName, String evalRunName, String model, boolean assertTest)
                       Runs an evaluation with the specified examples and scorers.
Constructor Details
JudgmentClient
Method Details
runEvaluation
public List<ScoringResult> runEvaluation(List<Example> examples, List<BaseScorer> scorers, String projectName, String evalRunName, String model, boolean assertTest)
Runs an evaluation with the specified examples and scorers.
The method performs the following validations:
- All examples must have the same field keys
- Examples must contain required parameters for all scorers
- Cannot mix local and Judgment API scorers
- All input parameters must be valid
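Two of the validations listed above can be illustrated with a minimal, self-contained sketch. It does not use the real Example or BaseScorer classes: examples are modelled as plain maps and scorers as boolean "is local" flags, and the class name, method names, and empty-list check are all hypothetical. It shows the kind of check being described, not how JudgmentClient implements it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the documented validations; not the library's code.
final class EvaluationPreflightSketch {

    /** Throws if the examples do not all expose the same set of field keys. */
    static void requireSameFieldKeys(List<Map<String, Object>> examples) {
        // Assumption: a null or empty list counts as an invalid input.
        if (examples == null || examples.isEmpty()) {
            throw new IllegalArgumentException("examples must not be null or empty");
        }
        Set<String> expected = examples.get(0).keySet();
        for (int i = 1; i < examples.size(); i++) {
            Set<String> actual = examples.get(i).keySet();
            if (!expected.equals(actual)) {
                throw new IllegalArgumentException(
                    "example " + i + " has keys " + actual + " but expected " + expected);
            }
        }
    }

    /** Throws if the flags contain both local (true) and API-hosted (false) scorers. */
    static void requireSingleScorerKind(List<Boolean> scorerIsLocal) {
        boolean anyLocal = scorerIsLocal.contains(Boolean.TRUE);
        boolean anyApi = scorerIsLocal.contains(Boolean.FALSE);
        if (anyLocal && anyApi) {
            throw new IllegalArgumentException("cannot mix local and Judgment API scorers");
        }
    }

    public static void main(String[] args) {
        Map<String, Object> first = new HashMap<>();
        first.put("input", "What is 2+2?");
        first.put("actualOutput", "4");
        Map<String, Object> second = new HashMap<>();
        second.put("input", "What is 3+3?");
        second.put("actualOutput", "6");

        requireSameFieldKeys(Arrays.asList(first, second)); // identical key sets
        requireSingleScorerKind(Arrays.asList(true, true)); // all local
        System.out.println("preflight checks passed");
    }
}
```

Failing fast with IllegalArgumentException matches the documented behaviour for invalid inputs; everything else in the sketch is illustrative.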
- Parameters:
examples - the examples to evaluate
scorers - the scorers to use for evaluation
projectName - the project name
evalRunName - the evaluation run name
model - the model used for generation (can be null, will use default)
assertTest - whether to assert test results and throw exceptions on failures
- Returns:
- a list of scoring results for each example
- Throws:
IllegalArgumentException - if inputs are invalid
JudgmentRuntimeError - if evaluation fails
JudgmentTestError - if assertTest is true and any tests fail
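The effect of the assertTest flag can be sketched without the library. In the example below, TestFailure is a hypothetical stand-in for JudgmentTestError (assumed here to be unchecked, which this page does not state), and each scoring result is reduced to a pass/fail boolean. The behaviour when assertTest is false, returning the results untouched, is also an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the assertTest contract; not the library's code.
final class AssertTestSketch {

    // Stand-in for JudgmentTestError; assumed to be an unchecked exception.
    static final class TestFailure extends RuntimeException {
        TestFailure(String message) {
            super(message);
        }
    }

    /**
     * Returns the outcomes unchanged, but raises TestFailure first when
     * assertTest is true and at least one outcome is a failure.
     */
    static List<Boolean> finish(List<Boolean> passed, boolean assertTest) {
        if (assertTest) {
            long failures = passed.stream().filter(p -> !p).count();
            if (failures > 0) {
                throw new TestFailure(failures + " of " + passed.size() + " examples failed");
            }
        }
        return passed;
    }

    public static void main(String[] args) {
        List<Boolean> outcomes = Arrays.asList(true, false, true);

        // Without assertions the caller inspects the results directly.
        System.out.println("results returned: " + finish(outcomes, false).size());

        // With assertions a failing example surfaces as an exception.
        try {
            finish(outcomes, true);
        } catch (TestFailure e) {
            System.out.println("assertion mode raised: " + e.getMessage());
        }
    }
}
```

In an automated test suite, letting the exception propagate is usually enough to fail the test, so a try/catch like the one in main is needed only when the caller wants to report the failure itself.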
runEvaluation
public List<ScoringResult> runEvaluation(List<Example> examples, List<BaseScorer> scorers, String projectName, String evalRunName)
runEvaluation
public List<ScoringResult> runEvaluation(Example example, BaseScorer scorer, String projectName, String evalRunName, String model)
runEvaluation
public List<ScoringResult> runEvaluation(Example example, BaseScorer scorer, String projectName, String evalRunName)
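The shorter overloads carry no description on this page. A common Java pattern for such convenience overloads is to delegate to the fullest signature, and the sketch below shows that pattern with strings standing in for Example and BaseScorer. The defaults it uses (a null model and assertTest set to false) are assumptions; only the null model default is hinted at by the parameter documentation above, and the real class may behave differently.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of overload delegation; not the library's code.
final class OverloadDelegationSketch {

    // Full form: every shorter overload below funnels into this one.
    static String run(List<String> examples, List<String> scorers,
                      String projectName, String evalRunName,
                      String model, boolean assertTest) {
        return examples.size() + " example(s), " + scorers.size() + " scorer(s), project="
            + projectName + ", run=" + evalRunName
            + ", model=" + model + ", assertTest=" + assertTest;
    }

    // List form without model or assertTest (assumed defaults: null and false).
    static String run(List<String> examples, List<String> scorers,
                      String projectName, String evalRunName) {
        return run(examples, scorers, projectName, evalRunName, null, false);
    }

    // Single-item form with a model: wraps each argument in a one-element list.
    static String run(String example, String scorer,
                      String projectName, String evalRunName, String model) {
        return run(Collections.singletonList(example), Collections.singletonList(scorer),
            projectName, evalRunName, model, false);
    }

    // Single-item form without a model (assumed default: null).
    static String run(String example, String scorer,
                      String projectName, String evalRunName) {
        return run(example, scorer, projectName, evalRunName, null);
    }

    public static void main(String[] args) {
        System.out.println(run("What is 2+2?", "correctness", "my-project", "test-run"));
    }
}
```

Keeping all logic in the six-argument form means validation and error handling live in one place, whichever overload the caller picks.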