Class Join


  • @Experimental(SCHEMAS)
    public class Join
    extends java.lang.Object
    A transform that performs equijoins across two schema PCollections.

    This transform joins two input PCollections; the caller only specifies the fields to join on. The resulting PCollection<Row> has two fields, "lhs" and "rhs", each with the schema of the corresponding input PCollection.

    For example, the following performs an inner join of two PCollections on the "user" and "country" fields, which have the same names in both the left-hand and the right-hand PCollection:

     PCollection<Row> joined = pCollection1.apply(Join.innerJoin(pCollection2).using("user", "country"));
     

    If the join fields have different names in the right-hand PCollection, specify the fields for each side as follows:

    PCollection<Row> joined = pCollection1.apply(Join.innerJoin(pCollection2)
           .on(FieldsEqual.left("user", "country").right("otherUser", "otherCountry")));
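To make the shape of the joined rows concrete, here is a minimal plain-Java model of the inner equijoin described above. It is not Beam code and says nothing about how Beam implements the transform: rows are modelled as maps, and every name in it (EquiJoinSketch, row, key, innerJoin) is hypothetical. It only illustrates that rows are paired when all join fields are equal, and that each output row carries the two inputs under "lhs" and "rhs".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of inner equijoin semantics. This is NOT Beam code;
// all names here (EquiJoinSketch, row, key, innerJoin) are hypothetical.
class EquiJoinSketch {

    // Builds a row from alternating field names and values.
    static Map<String, Object> row(Object... kv) {
        Map<String, Object> r = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            r.put((String) kv[i], kv[i + 1]);
        }
        return r;
    }

    // The join key is the list of values of the named fields, in order.
    static List<Object> key(Map<String, Object> r, List<String> fields) {
        List<Object> k = new ArrayList<>();
        for (String f : fields) {
            k.add(r.get(f));
        }
        return k;
    }

    // Emits one {"lhs": l, "rhs": r} row per pair of rows with equal keys.
    static List<Map<String, Map<String, Object>>> innerJoin(
            List<Map<String, Object>> left, List<String> leftFields,
            List<Map<String, Object>> right, List<String> rightFields) {
        List<Map<String, Map<String, Object>>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            for (Map<String, Object> r : right) {
                if (key(l, leftFields).equals(key(r, rightFields))) {
                    Map<String, Map<String, Object>> joined = new LinkedHashMap<>();
                    joined.put("lhs", l);
                    joined.put("rhs", r);
                    out.add(joined);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> left = List.of(
                row("user", "ann", "country", "US", "clicks", 3),
                row("user", "bob", "country", "DE", "clicks", 5));
        List<Map<String, Object>> right = List.of(
                row("otherUser", "ann", "otherCountry", "US", "plan", "pro"),
                row("otherUser", "bob", "otherCountry", "FR", "plan", "free"));

        // Only ann/US matches on both fields; bob differs on country.
        System.out.println(innerJoin(
                left, List.of("user", "country"),
                right, List.of("otherUser", "otherCountry")));
    }
}
```

Passing the same field list for both sides corresponds to using(...); passing two different lists corresponds to on(FieldsEqual.left(...).right(...)).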
     

    Full outer joins, left outer joins, and right outer joins are also supported.
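The difference between the outer variants and the inner join can be sketched in plain Java as follows. This is a model of the usual left outer join semantics, not Beam code; it assumes that an unmatched left row is emitted with a null "rhs", which the text above does not state explicitly. The names OuterJoinSketch, row, pair, and leftOuterJoin are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Plain-Java model of left outer join semantics. This is NOT Beam code;
// the null-for-missing-side convention is an assumption of this sketch.
class OuterJoinSketch {

    // Builds a row from alternating field names and values.
    static Map<String, Object> row(Object... kv) {
        Map<String, Object> r = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            r.put((String) kv[i], kv[i + 1]);
        }
        return r;
    }

    // Wraps the two sides into one output row; either side may be null.
    static Map<String, Map<String, Object>> pair(
            Map<String, Object> l, Map<String, Object> r) {
        Map<String, Map<String, Object>> joined = new LinkedHashMap<>();
        joined.put("lhs", l);
        joined.put("rhs", r);
        return joined;
    }

    // Every left row appears at least once; unmatched ones get rhs == null.
    static List<Map<String, Map<String, Object>>> leftOuterJoin(
            List<Map<String, Object>> left,
            List<Map<String, Object>> right,
            String field) {
        List<Map<String, Map<String, Object>>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            boolean matched = false;
            for (Map<String, Object> r : right) {
                if (Objects.equals(l.get(field), r.get(field))) {
                    out.add(pair(l, r));
                    matched = true;
                }
            }
            if (!matched) {
                out.add(pair(l, null));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> left = List.of(
                row("user", "ann"), row("user", "bob"));
        List<Map<String, Object>> right = List.of(
                row("user", "ann", "plan", "pro"));
        // bob has no match on the right and is kept with a null rhs.
        System.out.println(leftOuterJoin(left, right, "user"));
    }
}
```

A right outer join is the mirror image (unmatched right rows are kept with a null "lhs"), and a full outer join keeps unmatched rows from both sides. In Beam itself the call follows the same pattern as the inner join example, for instance Join.leftOuterJoin(pCollection2).using("user", "country").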

    • Constructor Detail

      • Join

        public Join()
    • Method Detail

      • innerJoin

        public static <LhsT, RhsT> Join.Impl<LhsT, RhsT> innerJoin(PCollection<RhsT> rhs)
        Perform an inner join.
      • fullOuterJoin

        public static <LhsT, RhsT> Join.Impl<LhsT, RhsT> fullOuterJoin(PCollection<RhsT> rhs)
        Perform a full outer join.
      • leftOuterJoin

        public static <LhsT, RhsT> Join.Impl<LhsT, RhsT> leftOuterJoin(PCollection<RhsT> rhs)
        Perform a left outer join.
      • rightOuterJoin

        public static <LhsT, RhsT> Join.Impl<LhsT, RhsT> rightOuterJoin(PCollection<RhsT> rhs)
        Perform a right outer join.
      • innerBroadcastJoin

        public static <LhsT, RhsT> Join.Impl<LhsT, RhsT> innerBroadcastJoin(PCollection<RhsT> rhs)
        Perform an inner join, broadcasting the right side.
      • leftOuterBroadcastJoin

        public static <LhsT, RhsT> Join.Impl<LhsT, RhsT> leftOuterBroadcastJoin(PCollection<RhsT> rhs)
        Perform a left outer join, broadcasting the right side.
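The broadcast variants are described above only as "broadcasting the right side". The sketch below assumes the usual meaning of a broadcast join: the right-hand input is small enough to be held in memory in full, so it is indexed once by join key and each left row is looked up against that index. It is a plain-Java model, not Beam's implementation, and the names BroadcastJoinSketch, row, index, and innerBroadcastJoin are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of an inner broadcast (hash) join. This is NOT Beam
// code; it assumes the right side fits in memory.
class BroadcastJoinSketch {

    // Builds a row from alternating field names and values.
    static Map<String, Object> row(Object... kv) {
        Map<String, Object> r = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            r.put((String) kv[i], kv[i + 1]);
        }
        return r;
    }

    // Groups the (small) right side by the value of the join field.
    static Map<Object, List<Map<String, Object>>> index(
            List<Map<String, Object>> right, String field) {
        Map<Object, List<Map<String, Object>>> idx = new HashMap<>();
        for (Map<String, Object> r : right) {
            idx.computeIfAbsent(r.get(field), k -> new ArrayList<>()).add(r);
        }
        return idx;
    }

    // Streams left rows past the in-memory index of the right side.
    static List<Map<String, Map<String, Object>>> innerBroadcastJoin(
            List<Map<String, Object>> left,
            List<Map<String, Object>> right,
            String field) {
        Map<Object, List<Map<String, Object>>> idx = index(right, field);
        List<Map<String, Map<String, Object>>> out = new ArrayList<>();
        for (Map<String, Object> l : left) {
            for (Map<String, Object> r : idx.getOrDefault(l.get(field), List.of())) {
                Map<String, Map<String, Object>> joined = new LinkedHashMap<>();
                joined.put("lhs", l);
                joined.put("rhs", r);
                out.add(joined);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> left = List.of(
                row("user", "ann", "clicks", 3),
                row("user", "bob", "clicks", 5),
                row("user", "ann", "clicks", 7));
        List<Map<String, Object>> right = List.of(
                row("user", "ann", "plan", "pro"));
        System.out.println(innerBroadcastJoin(left, right, "user"));
    }
}
```

The trade-off in this model is that the left side is never grouped or reshuffled, at the cost of keeping the whole right side in memory. That may explain why only inner and left outer broadcast variants are listed: both can be computed one left row at a time, whereas right and full outer joins would also need to track which right rows were never matched.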