Class DynamicDestinations<T,​DestinationT>

  • All Implemented Interfaces:
    java.io.Serializable
    Direct Known Subclasses:
    StorageApiDynamicDestinationsTableRow

    public abstract class DynamicDestinations<T,​DestinationT>
    extends java.lang.Object
    implements java.io.Serializable
    This class provides the most general way of specifying dynamic BigQuery table destinations. Destinations can be extracted from the input element, and stored as a custom type. Mappings are provided to convert the destination into a BigQuery table reference and a BigQuery schema. The class can read side inputs while performing these mappings.

    For example, consider a PCollection of events, each containing a user-id field. You want to write each user's events to a separate table with a separate schema per user. Since the user-id field is a string, you will represent the destination as a string.

    
     events.apply(BigQueryIO.<UserEvent>write()
      .to(new DynamicDestinations<UserEvent, String>() {
            public String getDestination(ValueInSingleWindow<UserEvent> element) {
              return element.getValue().getUserId();
            }
            public TableDestination getTable(String user) {
              return new TableDestination(tableForUser(user), "Table for user " + user);
            }
            public TableSchema getSchema(String user) {
              return tableSchemaForUser(user);
            }
          })
      .withFormatFunction(new SerializableFunction<UserEvent, TableRow>() {
         public TableRow apply(UserEvent event) {
           return convertUserEventToTableRow(event);
         }
       }));
     

    An instance of DynamicDestinations can also use side inputs using sideInput(PCollectionView). The side inputs must be present in getSideInputs(). Side inputs are accessed in the global window, so they must be globally windowed.

    DestinationT is expected to provide proper hash and equality members. Ideally it will be a compact type with an efficient coder, as these objects may be used as a key in a GroupByKey.

    See Also:
    Serialized Form
    • Method Summary

      All Methods Instance Methods Abstract Methods Concrete Methods 
      Modifier and Type Method Description
      abstract DestinationT getDestination​(@Nullable org.apache.beam.sdk.values.ValueInSingleWindow<T> element)
      Returns an object that represents at a high level which table is being written to.
      @Nullable org.apache.beam.sdk.coders.Coder<DestinationT> getDestinationCoder()
      Returns the coder for DynamicDestinations.
      abstract @Nullable com.google.api.services.bigquery.model.TableSchema getSchema​(DestinationT destination)
      Returns the table schema for the destination.
      java.util.List<org.apache.beam.sdk.values.PCollectionView<?>> getSideInputs()
      Specifies that this object needs access to one or more side inputs.
      abstract TableDestination getTable​(DestinationT destination)
      Returns a TableDestination object for the destination.
      protected <SideInputT>
      SideInputT
      sideInput​(org.apache.beam.sdk.values.PCollectionView<SideInputT> view)
      Returns the value of a given side input.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • DynamicDestinations

        public DynamicDestinations()
    • Method Detail

      • getSideInputs

        public java.util.List<org.apache.beam.sdk.values.PCollectionView<?>> getSideInputs()
        Specifies that this object needs access to one or more side inputs. This side inputs must be globally windowed, as they will be accessed from the global window.
      • sideInput

        protected final <SideInputT> SideInputT sideInput​(org.apache.beam.sdk.values.PCollectionView<SideInputT> view)
        Returns the value of a given side input. The view must be present in getSideInputs().
      • getDestination

        public abstract DestinationT getDestination​(@Nullable org.apache.beam.sdk.values.ValueInSingleWindow<T> element)
        Returns an object that represents at a high level which table is being written to. May not return null.
      • getDestinationCoder

        public @Nullable org.apache.beam.sdk.coders.Coder<DestinationT> getDestinationCoder()
        Returns the coder for DynamicDestinations. If this is not overridden, then BigQueryIO will look in the coder registry for a suitable coder. This must be a deterministic coder, as DynamicDestinations will be used as a key type in a GroupByKey.
      • getSchema

        public abstract @Nullable com.google.api.services.bigquery.model.TableSchema getSchema​(DestinationT destination)
        Returns the table schema for the destination.