Class DynamicDestinations<T,DestinationT>
- java.lang.Object
-
- org.apache.beam.sdk.io.gcp.bigquery.DynamicDestinations<T,DestinationT>
-
- All Implemented Interfaces:
java.io.Serializable
- Direct Known Subclasses:
StorageApiDynamicDestinationsTableRow
public abstract class DynamicDestinations<T,DestinationT> extends java.lang.Object implements java.io.Serializable
This class provides the most general way of specifying dynamic BigQuery table destinations. Destinations can be extracted from the input element and stored as a custom type. Mappings are provided to convert the destination into a BigQuery table reference and a BigQuery schema. The class can read side inputs while performing these mappings.

For example, consider a PCollection of events, each containing a user-id field. You want to write each user's events to a separate table with a separate schema per user. Since the user-id field is a string, you will represent the destination as a string.
events.apply(BigQueryIO.<UserEvent>write()
    .to(new DynamicDestinations<UserEvent, String>() {
      public String getDestination(ValueInSingleWindow<UserEvent> element) {
        return element.getValue().getUserId();
      }
      public TableDestination getTable(String user) {
        return new TableDestination(tableForUser(user), "Table for user " + user);
      }
      public TableSchema getSchema(String user) {
        return tableSchemaForUser(user);
      }
    })
    .withFormatFunction(new SerializableFunction<UserEvent, TableRow>() {
      public TableRow apply(UserEvent event) {
        return convertUserEventToTableRow(event);
      }
    }));
An instance of DynamicDestinations can also use side inputs using sideInput(PCollectionView). The side inputs must be present in getSideInputs(). Side inputs are accessed in the global window, so they must be globally windowed.

DestinationT is expected to provide proper hash and equality members. Ideally it will be a compact type with an efficient coder, as these objects may be used as a key in a GroupByKey.
- See Also:
- Serialized Form
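The requirement that DestinationT provide proper hash and equality members can be illustrated with a small value class. This is only a sketch in plain Java: the class name UserRegion and its two fields are hypothetical and not part of Beam, and a real pipeline would still need a coder for the type.

```java
import java.util.Objects;

/** Hypothetical destination key: one table per (userId, region) pair. */
final class UserRegion {
  private final String userId;
  private final String region;

  UserRegion(String userId, String region) {
    this.userId = Objects.requireNonNull(userId);
    this.region = Objects.requireNonNull(region);
  }

  String userId() {
    return userId;
  }

  String region() {
    return region;
  }

  // Two keys are equal exactly when both fields are equal.
  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof UserRegion)) {
      return false;
    }
    UserRegion other = (UserRegion) o;
    return userId.equals(other.userId) && region.equals(other.region);
  }

  // Must be consistent with equals: equal keys produce equal hash codes.
  @Override
  public int hashCode() {
    return Objects.hash(userId, region);
  }

  @Override
  public String toString() {
    return userId + "@" + region;
  }
}
```

Without the equals and hashCode overrides, two keys built from the same user and region would be treated as different destinations by any hash-based grouping.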
-
-
Constructor Summary
Constructor and Description
- DynamicDestinations()
-
Method Summary
Modifier and Type, Method, and Description
- abstract DestinationT getDestination(@Nullable org.apache.beam.sdk.values.ValueInSingleWindow<T> element)
  Returns an object that represents at a high level which table is being written to.
- @Nullable org.apache.beam.sdk.coders.Coder<DestinationT> getDestinationCoder()
  Returns the coder for DestinationT.
- abstract @Nullable com.google.api.services.bigquery.model.TableSchema getSchema(DestinationT destination)
  Returns the table schema for the destination.
- java.util.List<org.apache.beam.sdk.values.PCollectionView<?>> getSideInputs()
  Specifies that this object needs access to one or more side inputs.
- abstract TableDestination getTable(DestinationT destination)
  Returns a TableDestination object for the destination.
- protected <SideInputT> SideInputT sideInput(org.apache.beam.sdk.values.PCollectionView<SideInputT> view)
  Returns the value of a given side input.
-
-
-
Method Detail
-
getSideInputs
public java.util.List<org.apache.beam.sdk.values.PCollectionView<?>> getSideInputs()
Specifies that this object needs access to one or more side inputs. These side inputs must be globally windowed, as they will be accessed from the global window.
-
sideInput
protected final <SideInputT> SideInputT sideInput(org.apache.beam.sdk.values.PCollectionView<SideInputT> view)
Returns the value of a given side input. The view must be present in getSideInputs().
-
getDestination
public abstract DestinationT getDestination(@Nullable org.apache.beam.sdk.values.ValueInSingleWindow<T> element)
Returns an object that represents at a high level which table is being written to. May not return null.
-
getDestinationCoder
public @Nullable org.apache.beam.sdk.coders.Coder<DestinationT> getDestinationCoder()
Returns the coder for DestinationT. If this is not overridden, then BigQueryIO will look in the coder registry for a suitable coder. This must be a deterministic coder, as DestinationT will be used as a key type in a GroupByKey.
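A deterministic coder is generally understood to be one where equal values always encode to the same bytes, which matters when the encoded form is used as a grouping key. The sketch below shows that property in plain Java for a hypothetical map-valued destination. The class DeterministicEncoding and its encode method are illustrative assumptions, not Beam API, and wrapping this logic in an actual Coder subclass is left out.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper, not part of Beam. */
final class DeterministicEncoding {
  private DeterministicEncoding() {}

  /**
   * Encodes the entries in sorted key order, so two equal maps yield the same
   * bytes regardless of the order in which their entries were inserted.
   */
  static byte[] encode(Map<String, String> labels) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      Map<String, String> sorted = new TreeMap<>(labels);
      out.writeInt(sorted.size());
      for (Map.Entry<String, String> entry : sorted.entrySet()) {
        out.writeUTF(entry.getKey());
        out.writeUTF(entry.getValue());
      }
    } catch (IOException e) {
      // writeUTF can fail for very long strings; surface it unchecked.
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }
}
```

Writing the entries in the map's own iteration order instead could produce different bytes for equal maps, which is the kind of behavior a deterministic coder has to rule out.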
-
getTable
public abstract TableDestination getTable(DestinationT destination)
Returns a TableDestination object for the destination. May not return null. The return value needs to be unique to each destination: the same TableDestination may not be returned for different destinations.
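The uniqueness requirement is easy to violate when destinations are turned into table names by simple sanitizing, for example replacing every unsupported character with an underscore, which would map "a-b" and "a.b" to the same table. A minimal sketch of a collision-free mapping follows. The class TableNames, the method tableIdForUser, and the "events_" prefix are hypothetical, and the sketch assumes table IDs are limited to letters, digits, and underscores and are compared case-sensitively.

```java
/** Hypothetical helper, not part of Beam. */
final class TableNames {
  private TableNames() {}

  /**
   * Maps a user id to a table id. ASCII letters and digits are kept as-is;
   * every other character, including '_' itself, is written as '_' followed
   * by four hex digits, so distinct user ids never produce the same table id.
   */
  static String tableIdForUser(String userId) {
    StringBuilder sb = new StringBuilder("events_");
    for (int i = 0; i < userId.length(); i++) {
      char c = userId.charAt(i);
      boolean plain =
          (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
      if (plain) {
        sb.append(c);
      } else {
        sb.append(String.format("_%04x", (int) c));
      }
    }
    return sb.toString();
  }
}
```

A getTable implementation could pass the result of such a function to the TableDestination constructor, so that each destination value resolves to its own table.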
-
getSchema
public abstract @Nullable com.google.api.services.bigquery.model.TableSchema getSchema(DestinationT destination)
Returns the table schema for the destination.
-
-