Class

com.spotify.scio.values

PairHashSCollectionFunctions

class PairHashSCollectionFunctions[K, V] extends AnyRef

Extra functions for hash-based joins, available on SCollections of (key, value) pairs through an implicit conversion.

Linear Supertypes
  AnyRef, Any

Instance Constructors

  1. new PairHashSCollectionFunctions(self: SCollection[(K, V)])

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  10. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  11. def hashFullOuterJoin[W](sideInput: SideInput[Map[K, Iterable[W]]])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (Option[V], Option[W]))]

    Perform a full outer join with a SideInput[Map[K, Iterable[W]]].

    Example:

    val si = pairSCollRight.asMultiMapSingletonSideInput
    val joined1 = pairSColl1Left.hashFullOuterJoin(si)
    val joined2 = pairSColl2Left.hashFullOuterJoin(si)
  12. def hashFullOuterJoin[W](rhs: SCollection[(K, W)])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (Option[V], Option[W]))]

    Perform a full outer join by replicating rhs to all workers. The right side should be tiny and fit in memory.
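
    The result shape of a full outer join can be modelled with plain Scala collections. The sketch below is an illustration only, not Scio's implementation: the object and method names are hypothetical, Seq stands in for SCollection, and the output order is an artifact of the in-memory model rather than something a distributed run guarantees.

```scala
object HashFullOuterJoinSketch {
  // Hypothetical in-memory model; Seq stands in for SCollection.
  def hashFullOuterJoin[K, V, W](
      lhs: Seq[(K, V)],
      rhs: Seq[(K, W)]
  ): Seq[(K, (Option[V], Option[W]))] = {
    // The small right side, grouped by key (models the replicated lookup table).
    val side: Map[K, Seq[W]] =
      rhs.groupBy(_._1).map { case (k, kws) => k -> kws.map(_._2) }
    val leftKeys: Set[K] = lhs.map(_._1).toSet

    // Every left pair is emitted: once per matching right value, or once with None.
    val fromLeft = lhs.flatMap { case (k, v) =>
      side.get(k) match {
        case Some(ws) => ws.map(w => (k, (Some(v): Option[V], Some(w): Option[W])))
        case None     => Seq((k, (Some(v): Option[V], Option.empty[W])))
      }
    }
    // Right pairs whose key never appears on the left are emitted with None on the left.
    val rightOnly = rhs.collect {
      case (k, w) if !leftKeys(k) => (k, (Option.empty[V], Some(w): Option[W]))
    }
    fromLeft ++ rightOnly
  }
}
```

    In this model, a key present on only one side yields None on the other side, and a key present on both sides yields one output pair per combination of left and right values.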

  13. def hashIntersectByKey(sideInput: SideInput[Set[K]])(implicit koder: Coder[K], voder: Coder[V]): SCollection[(K, V)]

    Return an SCollection with the pairs from this whose keys are in the SideInput[Set[K]] sideInput.

    Unlike SCollection.intersection, this preserves duplicates in this.

  14. def hashIntersectByKey(rhs: SCollection[K])(implicit koder: Coder[K], voder: Coder[V]): SCollection[(K, V)]

    Return an SCollection with the pairs from this whose keys are in rhs, given that rhs is small enough to fit in memory.

    Unlike SCollection.intersection, this preserves duplicates in this.
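
    The duplicate-preserving behaviour described above can be illustrated with plain Scala collections. This is a minimal sketch under stated assumptions: the names are hypothetical, Seq stands in for SCollection, and the code models the documented semantics rather than reproducing Scio's implementation.

```scala
object HashIntersectByKeySketch {
  // Hypothetical in-memory model; Seq stands in for SCollection.
  def hashIntersectByKey[K, V](lhs: Seq[(K, V)], rhs: Seq[K]): Seq[(K, V)] = {
    // The small key collection becomes an in-memory set (models the replicated side).
    val keys: Set[K] = rhs.toSet
    // A plain filter keeps every matching left pair, including duplicates.
    lhs.filter { case (k, _) => keys.contains(k) }
  }
}
```

    Because the left side is only filtered, a pair that occurs twice in this and whose key is in rhs occurs twice in the result.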

  15. def hashJoin[W](sideInput: SideInput[Map[K, Iterable[W]]])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (V, W))]

    Perform an inner join with a MultiMap SideInput[Map[K, Iterable[W]]].

    The right side should be tiny and fit in memory. The SideInput can be reused for multiple joins.

    Example:

    val si = pairSCollRight.asMultiMapSingletonSideInput
    val joined1 = pairSColl1Left.hashJoin(si)
    val joined2 = pairSColl2Left.hashJoin(si)
  16. def hashJoin[W](rhs: SCollection[(K, W)])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (V, W))]

    Perform an inner join by replicating rhs to all workers. The right side should be tiny and fit in memory.
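
    Conceptually, a hash join builds an in-memory multimap from the small right side and looks up each left pair in it. The following sketch models that idea with plain Scala collections; the object and method names are hypothetical, Seq stands in for SCollection, and it should not be read as Scio's actual implementation.

```scala
object HashJoinSketch {
  // Group the small right side by key, keeping every value for that key.
  def toMultiMap[K, W](rhs: Seq[(K, W)]): Map[K, Seq[W]] =
    rhs.groupBy(_._1).map { case (k, kws) => k -> kws.map(_._2) }

  // Hypothetical in-memory model of an inner hash join.
  def hashJoin[K, V, W](lhs: Seq[(K, V)], rhs: Seq[(K, W)]): Seq[(K, (V, W))] = {
    val side = toMultiMap(rhs)
    for {
      (k, v) <- lhs                          // stream over the left side
      w      <- side.getOrElse(k, Seq.empty) // unmatched keys contribute nothing
    } yield (k, (v, w))
  }
}
```

    Left pairs without a matching key are dropped, and a key with several right values produces one output pair per value, which is why the right side needs to fit in memory.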

  17. def hashLeftOuterJoin[W](sideInput: SideInput[Map[K, Iterable[W]]])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (V, Option[W]))]

    Perform a left outer join with a MultiMap SideInput[Map[K, Iterable[W]]].

    Example:

    val si = pairSCollRight.asMultiMapSingletonSideInput
    val joined1 = pairSColl1Left.hashLeftOuterJoin(si)
    val joined2 = pairSColl2Left.hashLeftOuterJoin(si)
  18. def hashLeftOuterJoin[W](rhs: SCollection[(K, W)])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (V, Option[W]))]

    Perform a left outer join by replicating rhs to all workers. The right side should be tiny and fit in memory.

    rhs

    The tiny SCollection[(K, W)] treated as the right side of the join.

    Example:

    // pairSCollRight should be tiny
    val joined = pairSColl1Left.hashLeftOuterJoin(pairSCollRight)
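
    To make the Option[W] in the result type concrete, here is a minimal sketch of left outer join semantics using plain Scala collections. The names are hypothetical and Seq stands in for SCollection; this models the documented behaviour and is not Scio's implementation.

```scala
object HashLeftOuterJoinSketch {
  // Hypothetical in-memory model; Seq stands in for SCollection.
  def hashLeftOuterJoin[K, V, W](
      lhs: Seq[(K, V)],
      rhs: Seq[(K, W)]
  ): Seq[(K, (V, Option[W]))] = {
    // The tiny right side, grouped by key.
    val side: Map[K, Seq[W]] =
      rhs.groupBy(_._1).map { case (k, kws) => k -> kws.map(_._2) }
    lhs.flatMap { case (k, v) =>
      side.get(k) match {
        // One output pair per matching right value.
        case Some(ws) => ws.map(w => (k, (v, Some(w): Option[W])))
        // Unmatched left pairs are kept, paired with None.
        case None     => Seq((k, (v, Option.empty[W])))
      }
    }
  }
}
```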
  19. def hashSubtractByKey(rhs: SCollection[K])(implicit koder: Coder[K], voder: Coder[V]): SCollection[(K, V)]

    Return an SCollection with the pairs from this whose keys are not in the SCollection[K] rhs.

    rhs must be small enough to fit in memory.

  20. def hashSubtractByKey(sideInput: SideInput[Set[K]])(implicit koder: Coder[K], voder: Coder[V]): SCollection[(K, V)]

    Return an SCollection with the pairs from this whose keys are not in the SideInput[Set[K]] sideInput.
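
    Subtraction by key is the complement of hashIntersectByKey. The sketch below models it with plain Scala collections under the same assumptions as before: hypothetical names, Seq in place of SCollection, a Set in place of the side input, and no claim to match Scio's implementation.

```scala
object HashSubtractByKeySketch {
  // Hypothetical in-memory model; Seq stands in for SCollection,
  // Set stands in for the materialized SideInput[Set[K]].
  def hashSubtractByKey[K, V](lhs: Seq[(K, V)], keys: Set[K]): Seq[(K, V)] =
    // Keep only the left pairs whose key is absent from the set.
    lhs.filterNot { case (k, _) => keys.contains(k) }
}
```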

  21. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  22. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  23. final def notify(): Unit

    Definition Classes
    AnyRef
  24. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  25. val self: SCollection[(K, V)]

  26. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  27. def toString(): String

    Definition Classes
    AnyRef → Any
  28. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Deprecated Value Members

  1. def hashFullOuterJoin[W](sideMap: SideMap[K, W])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (Option[V], Option[W]))]

    Perform a full outer join with a SideMap.

    SideMaps are deprecated in favor of SideInput[Map[K, Iterable[W]]]. Example replacement:

    val si = pairSCollRight.asMultiMapSingletonSideInput
    val joined1 = pairSColl1Left.hashFullOuterJoin(si)
    val joined2 = pairSColl2Left.hashFullOuterJoin(si)

    Annotations
    @deprecated
    Deprecated

    (Since version 0.8.0) Use SCollection[(K, V)]#hashFullOuterJoin(rhs) or SCollection[(K, V)]#hashFullOuterJoin(rhs.asMultiMapSingletonSideInput) instead.

  2. def hashIntersectByKey(sideSet: SideSet[K])(implicit koder: Coder[K], voder: Coder[V]): SCollection[(K, V)]

    Return an SCollection with the pairs from this whose keys are in the SideSet sideSet.

    Unlike SCollection.intersection, this preserves duplicates in this.

    Annotations
    @deprecated
    Deprecated

    (Since version 0.8.0) Use SCollection[(K, V)]#hashIntersectByKey(rhs.asSetSingletonSideInput) instead.

  3. def hashJoin[W](sideMap: SideMap[K, W])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (V, W))]

    Perform an inner join with a SideMap.

    SideMaps are deprecated in favor of SideInput[Map[K, Iterable[W]]]. Example replacement:

    val si = pairSCollRight.asMultiMapSingletonSideInput
    val joined1 = pairSColl1Left.hashJoin(si)
    val joined2 = pairSColl2Left.hashJoin(si)

    Annotations
    @deprecated
    Deprecated

    (Since version 0.8.0) Use SCollection[(K, V)]#hashJoin(rhs) or SCollection[(K, V)]#hashJoin(rhs.asMultiMapSingletonSideInput) instead.

  4. def hashLeftJoin[W](sideMap: SideMap[K, W])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (V, Option[W]))]

    Perform a left outer join with a SideMap.

    SideMaps are deprecated in favor of SideInput[Map[K, Iterable[W]]]. Example replacement:

    val si = pairSCollRight.asMultiMapSingletonSideInput
    val joined1 = pairSColl1Left.hashLeftOuterJoin(si)
    val joined2 = pairSColl2Left.hashLeftOuterJoin(si)

    Annotations
    @deprecated
    Deprecated

    (Since version 0.8.0) Use SCollection[(K, V)]#hashLeftOuterJoin(pairSColl) or SCollection[(K, V)]#hashLeftOuterJoin(pairSColl.asMultiMapSingletonSideInput) instead.

  5. def hashLeftJoin[W](rhs: SCollection[(K, W)])(implicit arg0: Coder[W], koder: Coder[K], voder: Coder[V]): SCollection[(K, (V, Option[W]))]

    Perform a left outer join by replicating rhs to all workers. The right side should be tiny and fit in memory.

    Example replacement:

    // pairSCollRight should be tiny
    val joined = pairSColl1Left.hashLeftOuterJoin(pairSCollRight)

    Annotations
    @deprecated
    Deprecated

    (Since version 0.8.0) Use SCollection[(K, V)]#hashLeftOuterJoin(pairSColl) instead.

  6. def toSideMap(implicit koder: Coder[K], voder: Coder[V]): SideMap[K, V]

    Annotations
    @deprecated
    Deprecated

    (Since version 0.8.0) Use SCollection[(K, V)]#asMultiMapSingletonSideInput instead.
