Package dev.braintrust.eval
Interface Scorer<INPUT,OUTPUT>
- Type Parameters:
INPUT - type of the input data
OUTPUT - type of the output data
- All Known Subinterfaces:
TracedScorer<INPUT,OUTPUT>
- All Known Implementing Classes:
ScorerBrainstoreImpl
public interface Scorer<INPUT,OUTPUT>
A scorer evaluates the result of a task against a dataset case, producing a score between 0
(inclusive) and 1 (inclusive).
Method Summary
Modifier and Type | Method | Description
static <INPUT,OUTPUT> Scorer<INPUT,OUTPUT> | fetchFromBraintrust(BraintrustApiClient apiClient, String projectName, String scorerSlug, String version) | Deprecated.
static <INPUT,OUTPUT> Scorer<INPUT,OUTPUT> | fetchFromBraintrust(BraintrustOpenApiClient apiClient, String projectName, String scorerSlug, String version) | Fetch a scorer from Braintrust by project name and slug.
String | getName() |
static <INPUT,OUTPUT> Scorer<INPUT,OUTPUT> | of(String scorerName, BiFunction<OUTPUT, OUTPUT, Double> scorerFn) |
static <INPUT,OUTPUT> Scorer<INPUT,OUTPUT> | of(String scorerName, Function<TaskResult<INPUT, OUTPUT>, Double> scorerFn) |
 | score(TaskResult<INPUT, OUTPUT> taskResult) | Scores the result of a successful task execution.
default List<Score> | scoreForScorerException(Exception scorerException, TaskResult<INPUT, OUTPUT> taskResult) | Provides fallback scores when this scorer's score(dev.braintrust.eval.TaskResult<INPUT, OUTPUT>) method threw an exception.
default List<Score> | scoreForTaskException(Exception taskException, DatasetCase<INPUT, OUTPUT> datasetCase) | Provides fallback scores when the task function threw an exception.
Method Details
getName
String getName()
score
Scores the result of a successful task execution.
- Parameters:
taskResult - the task output and originating dataset case
- Returns:
one or more scores, each with a value between 0 and 1 inclusive
If this method throws, the error is recorded on the span and scoreForScorerException(java.lang.Exception, dev.braintrust.eval.TaskResult<INPUT, OUTPUT>) is called as a fallback.
scoreForTaskException
default List<Score> scoreForTaskException(Exception taskException, DatasetCase<INPUT, OUTPUT> datasetCase)
Provides fallback scores when the task function threw an exception. Called instead of score(dev.braintrust.eval.TaskResult<INPUT, OUTPUT>) for each scorer.
- Parameters:
taskException - the exception thrown by the task
datasetCase - the dataset case that was being evaluated
- Returns:
fallback scores, or an empty list to skip scoring for this case
scoreForScorerException
default List<Score> scoreForScorerException(Exception scorerException, TaskResult<INPUT, OUTPUT> taskResult)
Provides fallback scores when this scorer's score(dev.braintrust.eval.TaskResult<INPUT, OUTPUT>) method threw an exception.
- Parameters:
scorerException - the exception thrown by score(dev.braintrust.eval.TaskResult<INPUT, OUTPUT>)
taskResult - the task result that was being scored
- Returns:
fallback scores, or an empty list to skip scoring for this case
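The fallback rules described for score, scoreForTaskException, and scoreForScorerException can be sketched as a small dispatch routine. This sketch does not use the Braintrust SDK: Score, DatasetCase, TaskResult, MiniScorer, and runCase are simplified stand-ins invented for illustration, and the choice that the default fallbacks return an empty list is an assumption. The real evaluation loop also records errors on a span, which is omitted here.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-ins for illustration; NOT the real dev.braintrust.eval types.
final class FallbackSketch {
    record Score(String name, double value) {}
    record DatasetCase<I, O>(I input, O expected) {}
    record TaskResult<I, O>(O result, DatasetCase<I, O> datasetCase) {}

    interface MiniScorer<I, O> {
        String getName();

        List<Score> score(TaskResult<I, O> taskResult);

        // Assumed default: an empty list, meaning "skip scoring for this case".
        default List<Score> scoreForTaskException(Exception e, DatasetCase<I, O> c) {
            return List.of();
        }

        // Assumed default: an empty list, meaning "skip scoring for this case".
        default List<Score> scoreForScorerException(Exception e, TaskResult<I, O> r) {
            return List.of();
        }
    }

    // Runs one dataset case and picks the scoring path described above.
    static <I, O> List<Score> runCase(Function<I, O> task, MiniScorer<I, O> scorer,
                                      DatasetCase<I, O> c) {
        O output;
        try {
            output = task.apply(c.input());
        } catch (Exception taskException) {
            // The task failed: score(...) is skipped in favor of the task fallback.
            return scorer.scoreForTaskException(taskException, c);
        }
        TaskResult<I, O> result = new TaskResult<>(output, c);
        try {
            return scorer.score(result);
        } catch (Exception scorerException) {
            // The scorer itself failed: use the scorer fallback.
            return scorer.scoreForScorerException(scorerException, result);
        }
    }

    // Exact-match scorer that reports 0.0 when the task throws.
    static final MiniScorer<String, String> EXACT = new MiniScorer<String, String>() {
        @Override public String getName() { return "exact_match"; }

        @Override public List<Score> score(TaskResult<String, String> r) {
            // Throws NullPointerException if the case has no expected value.
            double v = r.datasetCase().expected().equals(r.result()) ? 1.0 : 0.0;
            return List.of(new Score(getName(), v));
        }

        @Override public List<Score> scoreForTaskException(Exception e,
                                                           DatasetCase<String, String> c) {
            return List.of(new Score(getName(), 0.0));
        }
    };

    public static void main(String[] args) {
        DatasetCase<String, String> ok = new DatasetCase<>("a", "A");
        DatasetCase<String, String> noExpected = new DatasetCase<>("a", null);
        System.out.println(runCase(s -> s.toUpperCase(), EXACT, ok));
        System.out.println(runCase(s -> { throw new IllegalStateException("boom"); }, EXACT, ok));
        System.out.println(runCase(s -> s.toUpperCase(), EXACT, noExpected));
    }
}
```

In this sketch a failing task yields an explicit 0.0, while a failing scorer yields no score at all. Whether a failure should count against the average or be left out of it is a per-scorer design choice that the two fallback methods leave open.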
of
static <INPUT,OUTPUT> Scorer<INPUT,OUTPUT> of(String scorerName, Function<TaskResult<INPUT, OUTPUT>, Double> scorerFn)
of
static <INPUT,OUTPUT> Scorer<INPUT,OUTPUT> of(String scorerName, BiFunction<OUTPUT, OUTPUT, Double> scorerFn)
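The two of overloads carry no description here, so the following is only a sketch of how such factories might behave. It does not use the Braintrust SDK: MiniScorer, Score, and TaskResult are stand-ins invented for illustration. Two points are assumptions and are not confirmed by this page: that the BiFunction receives the expected value first and the actual task output second, and that the returned Double is wrapped in a single Score named after the scorer.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical stand-ins for illustration; NOT the real dev.braintrust.eval types.
final class FactorySketch {
    record Score(String name, double value) {}
    record TaskResult<I, O>(I input, O expected, O result) {}

    interface MiniScorer<I, O> {
        String getName();

        List<Score> score(TaskResult<I, O> taskResult);

        // Builds a scorer from a function over the whole task result.
        static <I, O> MiniScorer<I, O> of(String scorerName,
                                          Function<TaskResult<I, O>, Double> scorerFn) {
            return new MiniScorer<I, O>() {
                @Override public String getName() { return scorerName; }

                @Override public List<Score> score(TaskResult<I, O> taskResult) {
                    // Assumption: the single Double becomes one named score.
                    return List.of(new Score(scorerName, scorerFn.apply(taskResult)));
                }
            };
        }

        // Builds a scorer from a two-argument comparison.
        // Assumption: the arguments are (expected, actual), in that order.
        static <I, O> MiniScorer<I, O> of(String scorerName,
                                          BiFunction<O, O, Double> scorerFn) {
            Function<TaskResult<I, O>, Double> adapted =
                    r -> scorerFn.apply(r.expected(), r.result());
            return of(scorerName, adapted);
        }
    }

    public static void main(String[] args) {
        MiniScorer<String, String> exact =
                MiniScorer.of("exact_match", (expected, actual) -> expected.equals(actual) ? 1.0 : 0.0);
        MiniScorer<String, String> nonEmpty =
                MiniScorer.of("non_empty", r -> r.result().isEmpty() ? 0.0 : 1.0);

        TaskResult<String, String> result = new TaskResult<>("2+2", "4", "5");
        System.out.println(exact.score(result));
        System.out.println(nonEmpty.score(result));
    }
}
```

The two-argument form suits plain comparisons between an expected and an actual value. The TaskResult form is the more general one, since the function can also inspect the input or other fields of the case.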
fetchFromBraintrust
@Deprecated static <INPUT,OUTPUT> Scorer<INPUT,OUTPUT> fetchFromBraintrust(BraintrustApiClient apiClient, String projectName, String scorerSlug, @Nullable String version)
Deprecated.
fetchFromBraintrust
static <INPUT,OUTPUT> Scorer<INPUT,OUTPUT> fetchFromBraintrust(BraintrustOpenApiClient apiClient, String projectName, String scorerSlug, @Nullable String version)
Fetch a scorer from Braintrust by project name and slug.
- Parameters:
apiClient - the API client to use
projectName - the name of the project containing the scorer
scorerSlug - the unique slug identifier for the scorer
version - optional version of the scorer to fetch
- Returns:
a Scorer that invokes the remote function
- Throws:
RuntimeException - if the scorer is not found