Class Count


  • public class Count
    extends java.lang.Object
    PTransforms to count the elements in a PCollection.

    Use perElement() to count the occurrences of each distinct element in a PCollection, perKey() to count the values associated with each key, and globally() to count the total number of elements in a PCollection.
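
    The three counting modes can be modelled on an ordinary in-memory list. The following is a minimal plain-Java sketch of what each mode computes, not Beam code: the class CountModes and its methods are hypothetical stand-ins that only borrow the names of the transforms, and Map.Entry stands in for KV.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the three counting modes (not part of Beam).
final class CountModes {

  // Models globally(): the total number of elements.
  static <T> long globally(List<T> elements) {
    return elements.size();
  }

  // Models perElement(): occurrences of each distinct element.
  static <T> Map<T, Long> perElement(List<T> elements) {
    Map<T, Long> counts = new LinkedHashMap<>();
    for (T element : elements) {
      counts.merge(element, 1L, Long::sum);
    }
    return counts;
  }

  // Models perKey(): number of values seen for each key; the values themselves are ignored.
  static <K, V> Map<K, Long> perKey(List<Map.Entry<K, V>> pairs) {
    Map<K, Long> counts = new LinkedHashMap<>();
    for (Map.Entry<K, V> pair : pairs) {
      counts.merge(pair.getKey(), 1L, Long::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    List<String> words = Arrays.asList("a", "b", "a");
    System.out.println(globally(words));   // prints 3
    System.out.println(perElement(words)); // prints {a=2, b=1}

    List<Map.Entry<String, Integer>> pairs = Arrays.asList(
        new SimpleEntry<String, Integer>("k1", 10),
        new SimpleEntry<String, Integer>("k1", 20),
        new SimpleEntry<String, Integer>("k2", 30));
    System.out.println(perKey(pairs));     // prints {k1=2, k2=1}
  }
}
```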

    combineFn() can also be used manually, in combination with state and with the Combine transform.
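
    To illustrate why counting fits the combine model, here is a rough plain-Java sketch of a counting combiner. It is an assumption-laden model rather than Beam's implementation: the class CountingCombinerSketch is hypothetical, a plain long is assumed as the accumulator, and only the method names are borrowed from the CombineFn lifecycle (createAccumulator, addInput, mergeAccumulators, extractOutput).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a counting combine function (not Beam's own class).
final class CountingCombinerSketch {

  // An empty accumulator represents "zero elements seen".
  static long createAccumulator() {
    return 0L;
  }

  // Each input adds one to the count; the input's value is irrelevant.
  static <T> long addInput(long accumulator, T input) {
    return accumulator + 1;
  }

  // Partial counts from independent shards are summed.
  static long mergeAccumulators(List<Long> accumulators) {
    long total = 0L;
    for (long partial : accumulators) {
      total += partial;
    }
    return total;
  }

  // The final output is the accumulated count itself.
  static long extractOutput(long accumulator) {
    return accumulator;
  }

  // Counts one shard by folding its elements into a fresh accumulator.
  static <T> long countShard(List<T> shard) {
    long accumulator = createAccumulator();
    for (T element : shard) {
      accumulator = addInput(accumulator, element);
    }
    return accumulator;
  }

  public static void main(String[] args) {
    List<String> shard1 = Arrays.asList("a", "b");
    List<String> shard2 = Arrays.asList("c");
    long merged = mergeAccumulators(Arrays.asList(countShard(shard1), countShard(shard2)));
    System.out.println(extractOutput(merged)); // prints 3
  }
}
```

    Because partial counts can be merged in any order, each shard of the input can be counted independently before the results are summed.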

    • Method Detail

      • globally

        public static <T> PTransform<PCollection<T>, PCollection<java.lang.Long>> globally()
        Returns a PTransform that counts the number of elements in its input PCollection.

        Note: if the input collection uses a windowing strategy other than GlobalWindows, use Combine.globally(Count.<T>combineFn()).withoutDefaults() instead.

      • perElement

        public static <T> PTransform<PCollection<T>, PCollection<KV<T, java.lang.Long>>> perElement()
        Returns a PTransform that counts the number of occurrences of each element in its input PCollection.

        The returned PTransform takes a PCollection<T> and returns a PCollection<KV<T, Long>> representing a map from each distinct element of the input PCollection to the number of times that element occurs in the input. Each key in the output PCollection is unique.

        The returned transform compares two values of type T by first encoding each element using the input PCollection's Coder, then comparing the encoded bytes. The input coder must therefore be deterministic: equal elements must always encode to the same bytes. (See Coder.verifyDeterministic() for more detail.) Comparing encoded bytes in this way allows efficient parallel evaluation.
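
        The effect of comparing encoded bytes can be sketched in plain Java. This is an illustrative model only, not Beam's implementation: EncodedCountSketch and both encoder methods are hypothetical, and a simple comma-joined UTF-8 encoding stands in for a Coder (it assumes the strings themselves contain no commas).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical model of counting by encoded bytes (not part of Beam).
final class EncodedCountSketch {

  // Groups elements by their encoded bytes instead of by equals().
  static <T> Map<ByteBuffer, Long> countByEncodedBytes(
      List<T> elements, Function<T, byte[]> encoder) {
    Map<ByteBuffer, Long> counts = new LinkedHashMap<>();
    for (T element : elements) {
      // ByteBuffer.wrap provides content-based equals() and hashCode() over the bytes.
      counts.merge(ByteBuffer.wrap(encoder.apply(element)), 1L, Long::sum);
    }
    return counts;
  }

  // Deterministic: equal sets always yield the same bytes because members are sorted first.
  static byte[] encodeSorted(Set<String> set) {
    return String.join(",", new TreeSet<>(set)).getBytes(StandardCharsets.UTF_8);
  }

  // Not deterministic: the bytes depend on iteration order, which equal sets need not share.
  static byte[] encodeInIterationOrder(Set<String> set) {
    return String.join(",", set).getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Two sets that are equal under equals() but iterate in different orders.
    Set<String> ab = new LinkedHashSet<>(Arrays.asList("a", "b"));
    Set<String> ba = new LinkedHashSet<>(Arrays.asList("b", "a"));
    List<Set<String>> input = Arrays.asList(ab, ba);

    // Deterministic encoding: both sets fall under one key with count 2.
    System.out.println(
        countByEncodedBytes(input, EncodedCountSketch::encodeSorted).size()); // prints 1

    // Order-dependent encoding: the equal sets are counted as two distinct elements.
    System.out.println(
        countByEncodedBytes(input, EncodedCountSketch::encodeInIterationOrder).size()); // prints 2
  }
}
```

        With the order-dependent encoder, two equal values are split across two output keys, which is the kind of miscount that a deterministic coder rules out.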

        By default, the Coder of the keys of the output PCollection is the same as the Coder of the elements of the input PCollection.

        Example of use:

        
         PCollection<String> words = ...;
         PCollection<KV<String, Long>> wordCounts =
             words.apply(Count.<String>perElement());