Provides a string representation of common errors from a classifier.
Provides precision, recall and F-score for labellings.
Implements randomization-based statistical significance testing for comparing the outputs of two systems.
Provides utilities for descriptive statistics, like the mean and variance.
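
As a rough illustration of two of the items above, the following minimal sketch computes precision, recall and F-score for one label, and runs a paired approximate randomization test on per-item scores from two systems. The function names, parameters and the choice of the mean score difference as the test statistic are assumptions made for this example and are not taken from the modules described here.

```python
import random
from statistics import mean


def precision_recall_f1(gold, predicted, label):
    """Precision, recall and F1 for one label (hypothetical helper)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    # Fall back to 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def approximate_randomization(scores_a, scores_b, trials=1000, seed=0):
    """Two-sided paired randomization test (hypothetical helper).

    Assumes scores_a[i] and scores_b[i] are the two systems' scores on the
    same item. Returns an estimated p-value for the observed absolute
    difference in mean score.
    """
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    at_least_as_extreme = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        for a, b in zip(scores_a, scores_b):
            # Swap the paired outputs with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        if abs(mean(shuffled_a) - mean(shuffled_b)) >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)
```

A call such as `approximate_randomization(system_a_scores, system_b_scores)` returns a value in (0, 1]; small values suggest the difference between the two systems is unlikely to be due to chance under this test.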