The 'pair' method is a simplified version of .sliding(2), returning just pairs of (T, T) values for every consecutive pair of T values in the input RDD.
For example, calling .pair() on a (sorted) RDD of 1, 2, 3, 4
should return the following pairs (1, 2), (2, 3), (3, 4)
an RDD[(T, T)] of all consecutive pairs of values
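The pairing semantics described above can be sketched on a plain Scala collection. This is only a local illustration, assuming an in-memory Seq in place of the RDD; `PairSketch` and its `pair` helper are hypothetical names and say nothing about how the RDD version handles partition boundaries.

```scala
// Local sketch on plain Scala collections, NOT the Spark implementation.
// `PairSketch.pair` is a hypothetical helper used only for illustration.
object PairSketch {
  // Build every consecutive (T, T) pair by taking windows of width 2
  // and keeping only the complete ones.
  def pair[T](values: Seq[T]): Seq[(T, T)] =
    values.sliding(2).collect { case Seq(a, b) => (a, b) }.toList
}
```

In this sketch, an input with fewer than two elements yields no pairs, because the single short window produced by `sliding(2)` does not match the two-element pattern.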
The 'pairWithEnds' method is a variation on 'pair' that returns two _extra_ pairs (relative to 'pair'), one for the first and one for the last element of the original RDD. Every (t1, t2) from .pair() becomes a (Some(t1), Some(t2)) with .pairWithEnds(). The first element of the output is (None, Some(t0)) and the last element is (Some(tN), None).
For example, calling .pairWithEnds() on a (sorted) RDD of 1, 2, 3
should return the following pairs (None, Some(1)), (Some(1), Some(2)), (Some(2), Some(3)), (Some(3), None)
(This is immediately useful as a helper method inside the Coverage class, but may also be useful to other applications that rely on a total ordering of the elements within a single RDD.)
an RDD[(Option[T], Option[T])] of all consecutive pairs of values, plus one leading and one trailing pair for the ends
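The Option-wrapped output can be sketched the same way on an in-memory Seq. `PairWithEndsSketch` is a hypothetical name, and returning nothing for an empty input is an assumption of this sketch; the text above does not say what the RDD version does in that case.

```scala
// Local sketch on plain Scala collections, NOT the Spark implementation.
// `PairWithEndsSketch.pairWithEnds` is a hypothetical helper.
object PairWithEndsSketch {
  def pairWithEnds[T](values: Seq[T]): Seq[(Option[T], Option[T])] =
    if (values.isEmpty) {
      // Assumption: an empty input produces no pairs at all.
      Seq.empty
    } else {
      // Wrap every value in Some, then pad both ends with None.
      val wrapped: Seq[Option[T]] = values.map(v => Some(v))
      val padded: Seq[Option[T]] = (None +: wrapped) :+ None
      // Zipping the padded sequence with its own tail gives consecutive pairs,
      // including (None, Some(first)) and (Some(last), None).
      padded.zip(padded.drop(1))
    }
}
```

Padding with None on both sides means the end pairs fall out of the same zip as the interior pairs, so no special-casing of the first or last element is needed.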
Replicates the Seq.sliding(int) method, where we turn an RDD[T] into an RDD[Seq[T]], where each internal Seq contains exactly 'width' values taken (in order) from the original RDD, and where all such windows are presented 'in order' in the output set.
E.g. the result of 'sliding(3)' on an RDD of the elements 1, 2, 3, 4, 5
should be an RDD of Seq(1, 2, 3), Seq(2, 3, 4), Seq(3, 4, 5)
The 'width' of the sliding window to calculate
An RDD of the sliding window values
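For comparison, the Seq.sliding behaviour that this method replicates can be shown on a local collection. `SlidingSketch` is a hypothetical name, and dropping windows shorter than 'width' is this sketch's reading of "exactly 'width' values"; it is not taken from the RDD implementation.

```scala
// Local sketch on plain Scala collections, NOT the Spark implementation.
// `SlidingSketch.sliding` is a hypothetical helper.
object SlidingSketch {
  def sliding[T](values: Seq[T], width: Int): Seq[Seq[T]] =
    // Seq.sliding emits one short window when the input has fewer than
    // `width` elements; filter it out so every window has exactly `width` values.
    values.sliding(width).filter(_.size == width).toList
}
```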
PairingRDD provides some simple helper methods, allowing us to take an RDD (presumably an RDD whose values are in some reasonable or intelligible order within and across partitions) and get paired or windowed views on that list of items.
The type of the values in the RDD