Class Join
- java.lang.Object
-
- org.apache.beam.sdk.schemas.transforms.Join
-
@Experimental(SCHEMAS) public class Join extends java.lang.Object
A transform that performs equijoins across two schemaPCollection
s.This transform allows joins between two input PCollections simply by specifying the fields to join on. The resulting
PCollection<Row>
will have two fields named "lhs" and "rhs" respectively, each with the schema of the corresponding input PCollection.For example, the following demonstrates joining two PCollections using a natural join on the "user" and "country" fields, where both the left-hand and the right-hand PCollections have fields with these names.
PCollection<Row> joined = pCollection1.apply(Join.innerJoin(pCollection2).using("user", "country"));
If the right-hand PCollection contains fields with different names to join against, you can specify them as follows:
PCollection<Row> joined = pCollection1.apply(Join.innerJoin(pCollection2) .on(FieldsEqual.left("user", "country").right("otherUser", "otherCountry")));
Full outer joins, left outer joins, and right outer joins are also supported.
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
Join.FieldsEqual
Predicate object to specify fields to compare when doing an equi-join.static class
Join.Impl<LhsT,RhsT>
Implementation class .
-
Constructor Summary
Constructors Constructor Description Join()
-
Method Summary
All Methods Static Methods Concrete Methods Modifier and Type Method Description static <LhsT,RhsT>
Join.Impl<LhsT,RhsT>fullOuterJoin(PCollection<RhsT> rhs)
Perform a full outer join.static <LhsT,RhsT>
Join.Impl<LhsT,RhsT>innerBroadcastJoin(PCollection<RhsT> rhs)
Perform an inner join, broadcasting the right side.static <LhsT,RhsT>
Join.Impl<LhsT,RhsT>innerJoin(PCollection<RhsT> rhs)
Perform an inner join.static <LhsT,RhsT>
Join.Impl<LhsT,RhsT>leftOuterBroadcastJoin(PCollection<RhsT> rhs)
Perform a left outer join, broadcasting the right side.static <LhsT,RhsT>
Join.Impl<LhsT,RhsT>leftOuterJoin(PCollection<RhsT> rhs)
Perform a left outer join.static <LhsT,RhsT>
Join.Impl<LhsT,RhsT>rightOuterJoin(PCollection<RhsT> rhs)
Perform a right outer join.
-
-
-
Field Detail
-
LHS_TAG
public static final java.lang.String LHS_TAG
- See Also:
- Constant Field Values
-
RHS_TAG
public static final java.lang.String RHS_TAG
- See Also:
- Constant Field Values
-
-
Method Detail
-
innerJoin
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> innerJoin(PCollection<RhsT> rhs)
Perform an inner join.
-
fullOuterJoin
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> fullOuterJoin(PCollection<RhsT> rhs)
Perform a full outer join.
-
leftOuterJoin
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> leftOuterJoin(PCollection<RhsT> rhs)
Perform a left outer join.
-
rightOuterJoin
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> rightOuterJoin(PCollection<RhsT> rhs)
Perform a right outer join.
-
innerBroadcastJoin
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> innerBroadcastJoin(PCollection<RhsT> rhs)
Perform an inner join, broadcasting the right side.
-
leftOuterBroadcastJoin
public static <LhsT,RhsT> Join.Impl<LhsT,RhsT> leftOuterBroadcastJoin(PCollection<RhsT> rhs)
Perform a left outer join, broadcasting the right side.
-
-