Interface CategoricalColumn<T>

  • All Superinterfaces:
    Column<T>, Comparator<T>, Iterable<T>
    All Known Implementing Classes:
    BooleanColumn, DateColumn, DateTimeColumn, InstantColumn, IntColumn, LongColumn, ShortColumn, StringColumn, TimeColumn

    public interface CategoricalColumn<T>
    extends Column<T>
    A column type that can be summarized or used as a grouping variable in cross tabs or other aggregation operations.

    The column data is generally discrete; however, NumberColumn implements CategoricalColumn so that it can be summarized when it contains ints. If you use it to summarize over a large range of floating-point numbers, you will likely run out of memory, because nearly every distinct value becomes its own category.

    Supporting subtypes include StringColumn, BooleanColumn, DateColumn, etc.

    DateTimeColumn is not included. TimeColumn can be converted to ints without loss of data, so it does implement this interface.

    • Method Detail

      • countByCategory

        default Table countByCategory()
        Returns a Table containing the number of elements in each category (i.e., the number of repetitions of each value).

        TODO: This needs to be well tested, especially for IntColumn.
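
        The counting semantics described above can be illustrated with a minimal plain-Java sketch. It does not use or reproduce the library's implementation: the class CategoryCountSketch and its static helper countByCategory(List) are hypothetical names introduced only for this example, and the result is a Map rather than a Table.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the library.
final class CategoryCountSketch {

    // Counts how many times each distinct value occurs.
    // A LinkedHashMap keeps categories in first-seen order.
    static <T> Map<T, Integer> countByCategory(List<T> values) {
        Map<T, Integer> counts = new LinkedHashMap<>();
        for (T value : values) {
            // Insert 1 for a new category, otherwise add 1 to the existing count.
            counts.merge(value, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> colors = List.of("red", "blue", "red", "green", "red");
        System.out.println(countByCategory(colors)); // prints {red=3, blue=1, green=1}
    }
}
```

        The sketch keeps one entry per distinct value, so the size of the result grows with the number of categories. This is why summarizing a column of mostly unique floating-point values is expensive.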