Class BigQueryUtils


  • public class BigQueryUtils
    extends java.lang.Object
    Utility methods for BigQuery related operations.
    • Constructor Summary

      Constructors 
      Constructor Description
      BigQueryUtils()  
    • Method Summary

      Modifier and Type Method Description
      static java.lang.Object convertAvroFormat​(org.apache.beam.sdk.schemas.Schema.FieldType beamFieldType, java.lang.Object avroValue, BigQueryUtils.ConversionOptions options)
      Tries to convert an Avro decoded value to a Beam field value based on the target type of the Beam field.
      static com.google.api.services.bigquery.model.TableRow convertGenericRecordToTableRow​(org.apache.avro.generic.GenericRecord record, com.google.api.services.bigquery.model.TableSchema tableSchema)  
      static org.apache.beam.sdk.schemas.Schema fromTableSchema​(com.google.api.services.bigquery.model.TableSchema tableSchema)
      Convert a BigQuery TableSchema to a Beam Schema.
      static org.apache.beam.sdk.schemas.Schema fromTableSchema​(com.google.api.services.bigquery.model.TableSchema tableSchema, BigQueryUtils.SchemaConversionOptions options)
      Convert a BigQuery TableSchema to a Beam Schema.
      static long hashSchemaDescriptorDeterministic​(com.google.protobuf.Descriptors.Descriptor descriptor)
      Hashes a schema descriptor using a deterministic hash function.
      static @Nullable org.apache.beam.runners.core.metrics.ServiceCallMetric readCallMetric​(@Nullable com.google.api.services.bigquery.model.TableReference tableReference)  
      static org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.FromBeamRowFunction<com.google.api.services.bigquery.model.TableRow> tableRowFromBeamRow()  
      static org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.ToBeamRowFunction<com.google.api.services.bigquery.model.TableRow> tableRowToBeamRow()  
      static org.apache.beam.sdk.values.Row toBeamRow​(org.apache.avro.generic.GenericRecord record, org.apache.beam.sdk.schemas.Schema schema, BigQueryUtils.ConversionOptions options)  
      static org.apache.beam.sdk.values.Row toBeamRow​(org.apache.beam.sdk.schemas.Schema rowSchema, com.google.api.services.bigquery.model.TableRow jsonBqRow)
      Tries to convert a JSON TableRow from BigQuery into a Beam Row.
      static org.apache.beam.sdk.values.Row toBeamRow​(org.apache.beam.sdk.schemas.Schema rowSchema, com.google.api.services.bigquery.model.TableSchema bqSchema, com.google.api.services.bigquery.model.TableRow jsonBqRow)
      Tries to parse the JSON TableRow from BigQuery.
      static org.apache.avro.Schema toGenericAvroSchema​(java.lang.String schemaName, java.util.List<com.google.api.services.bigquery.model.TableFieldSchema> fieldSchemas)
      Convert a list of BigQuery TableFieldSchema to Avro Schema.
      static @Nullable com.google.api.services.bigquery.model.TableReference toTableReference​(java.lang.String fullTableId)  
      static org.apache.beam.sdk.transforms.SerializableFunction<org.apache.beam.sdk.values.Row,​com.google.api.services.bigquery.model.TableRow> toTableRow()
      Convert a Beam Row to a BigQuery TableRow.
      static <T> org.apache.beam.sdk.transforms.SerializableFunction<T,​com.google.api.services.bigquery.model.TableRow> toTableRow​(org.apache.beam.sdk.transforms.SerializableFunction<T,​org.apache.beam.sdk.values.Row> toRow)
      Convert a Beam schema type to a BigQuery TableRow.
      static com.google.api.services.bigquery.model.TableRow toTableRow​(org.apache.beam.sdk.values.Row row)
      Convert a Beam Row to a BigQuery TableRow.
      static com.google.api.services.bigquery.model.TableSchema toTableSchema​(org.apache.beam.sdk.schemas.Schema schema)
      Convert a Beam Schema to a BigQuery TableSchema.
      static org.apache.beam.runners.core.metrics.ServiceCallMetric writeCallMetric​(com.google.api.services.bigquery.model.TableReference tableReference)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • BigQueryUtils

        public BigQueryUtils()
    • Method Detail

      • toTableSchema

        @Experimental(SCHEMAS)
        public static com.google.api.services.bigquery.model.TableSchema toTableSchema​(org.apache.beam.sdk.schemas.Schema schema)
        Convert a Beam Schema to a BigQuery TableSchema.
      • fromTableSchema

        @Experimental(SCHEMAS)
        public static org.apache.beam.sdk.schemas.Schema fromTableSchema​(com.google.api.services.bigquery.model.TableSchema tableSchema)
        Convert a BigQuery TableSchema to a Beam Schema.
      • fromTableSchema

        @Experimental(SCHEMAS)
        public static org.apache.beam.sdk.schemas.Schema fromTableSchema​(com.google.api.services.bigquery.model.TableSchema tableSchema,
                                                                         BigQueryUtils.SchemaConversionOptions options)
        Convert a BigQuery TableSchema to a Beam Schema.
      • toGenericAvroSchema

        @Experimental(SCHEMAS)
        public static org.apache.avro.Schema toGenericAvroSchema​(java.lang.String schemaName,
                                                                 java.util.List<com.google.api.services.bigquery.model.TableFieldSchema> fieldSchemas)
        Convert a list of BigQuery TableFieldSchema to Avro Schema.
      • tableRowToBeamRow

        public static final org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.ToBeamRowFunction<com.google.api.services.bigquery.model.TableRow> tableRowToBeamRow()
      • tableRowFromBeamRow

        public static final org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.FromBeamRowFunction<com.google.api.services.bigquery.model.TableRow> tableRowFromBeamRow()
      • toTableRow

        public static org.apache.beam.sdk.transforms.SerializableFunction<org.apache.beam.sdk.values.Row,​com.google.api.services.bigquery.model.TableRow> toTableRow()
        Convert a Beam Row to a BigQuery TableRow.
      • toTableRow

        public static <T> org.apache.beam.sdk.transforms.SerializableFunction<T,​com.google.api.services.bigquery.model.TableRow> toTableRow​(org.apache.beam.sdk.transforms.SerializableFunction<T,​org.apache.beam.sdk.values.Row> toRow)
        Convert a Beam schema type to a BigQuery TableRow.
      • toBeamRow

        @Experimental(SCHEMAS)
        public static org.apache.beam.sdk.values.Row toBeamRow​(org.apache.avro.generic.GenericRecord record,
                                                               org.apache.beam.sdk.schemas.Schema schema,
                                                               BigQueryUtils.ConversionOptions options)
      • convertGenericRecordToTableRow

        public static com.google.api.services.bigquery.model.TableRow convertGenericRecordToTableRow​(org.apache.avro.generic.GenericRecord record,
                                                                                                     com.google.api.services.bigquery.model.TableSchema tableSchema)
      • toTableRow

        public static com.google.api.services.bigquery.model.TableRow toTableRow​(org.apache.beam.sdk.values.Row row)
        Convert a Beam Row to a BigQuery TableRow.
      • toBeamRow

        @Experimental(SCHEMAS)
        public static org.apache.beam.sdk.values.Row toBeamRow​(org.apache.beam.sdk.schemas.Schema rowSchema,
                                                               com.google.api.services.bigquery.model.TableRow jsonBqRow)
        Tries to convert a JSON TableRow from BigQuery into a Beam Row.

        Only supports basic types and arrays. Doesn't support date types or structs.

      • toBeamRow

        @Experimental(SCHEMAS)
        public static org.apache.beam.sdk.values.Row toBeamRow​(org.apache.beam.sdk.schemas.Schema rowSchema,
                                                               com.google.api.services.bigquery.model.TableSchema bqSchema,
                                                               com.google.api.services.bigquery.model.TableRow jsonBqRow)
        Tries to parse the JSON TableRow from BigQuery.

        Only supports basic types and arrays. Doesn't support date types.

      • convertAvroFormat

        public static java.lang.Object convertAvroFormat​(org.apache.beam.sdk.schemas.Schema.FieldType beamFieldType,
                                                         java.lang.Object avroValue,
                                                         BigQueryUtils.ConversionOptions options)
        Tries to convert an Avro decoded value to a Beam field value based on the target type of the Beam field.

        For the Avro formats of BigQuery types, see https://cloud.google.com/bigquery/docs/exporting-data#avro_export_details and https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro#avro_conversions
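
        As an illustration of the kind of decoding involved, the sketch below turns two Avro-encoded values into java.time objects. It assumes a TIMESTAMP arrives as a long of microseconds since the Unix epoch (Avro timestamp-micros) and a DATE arrives as an int of days since the epoch (which assumes Avro logical types are enabled). The class and method names are hypothetical, and this is not how convertAvroFormat itself is implemented.

```java
import java.time.Instant;
import java.time.LocalDate;

// Illustrative sketch only; names are hypothetical and not part of Beam.
final class AvroTimeSketch {

  /** Decodes microseconds since the Unix epoch into an Instant. */
  static Instant timestampFromMicros(long micros) {
    // floorDiv/floorMod keep the nanosecond part non-negative for pre-1970 values.
    long seconds = Math.floorDiv(micros, 1_000_000L);
    long nanos = Math.floorMod(micros, 1_000_000L) * 1_000L;
    return Instant.ofEpochSecond(seconds, nanos);
  }

  /** Decodes days since the Unix epoch into a LocalDate. */
  static LocalDate dateFromEpochDays(int days) {
    return LocalDate.ofEpochDay(days);
  }

  public static void main(String[] args) {
    System.out.println(timestampFromMicros(1_500_000L));
    System.out.println(dateFromEpochDays(1));
  }
}
```

        Using floorDiv and floorMod rather than / and % matters only for negative inputs, where plain division would produce a negative nanosecond adjustment.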

      • toTableReference

        public static @Nullable com.google.api.services.bigquery.model.TableReference toTableReference​(java.lang.String fullTableId)
        Parameters:
        fullTableId - One of the three forms commonly used to refer to BigQuery tables in the Beam codebase:
        • projects/{project_id}/datasets/{dataset_id}/tables/{table_id}
        • myproject:mydataset.mytable
        • myproject.mydataset.mytable
        Returns:
        a TableReference obtained by parsing the fullTableId. If it cannot be parsed properly, null is returned.
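
        The accepted forms and the null-on-failure contract can be sketched with two regular expressions. This is a simplified stand-in, not Beam's implementation: the Parts holder and the patterns are hypothetical, and the sketch assumes no component contains ':', '.' or '/'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; not Beam's implementation of toTableReference.
final class TableIdSketch {

  /** Hypothetical value holder standing in for TableReference. */
  static final class Parts {
    final String project;
    final String dataset;
    final String table;

    Parts(String project, String dataset, String table) {
      this.project = project;
      this.dataset = dataset;
      this.table = table;
    }
  }

  // projects/{project_id}/datasets/{dataset_id}/tables/{table_id}
  private static final Pattern RESOURCE_FORM =
      Pattern.compile("^projects/([^/]+)/datasets/([^/]+)/tables/([^/]+)$");

  // myproject:mydataset.mytable or myproject.mydataset.mytable
  // Assumption: no component contains ':', '.' or '/'.
  private static final Pattern SHORT_FORM =
      Pattern.compile("^([^:./]+)[:.]([^:./]+)\\.([^:./]+)$");

  /** Returns the parsed parts, or null when the input matches none of the forms. */
  static Parts parse(String fullTableId) {
    for (Pattern pattern : new Pattern[] {RESOURCE_FORM, SHORT_FORM}) {
      Matcher matcher = pattern.matcher(fullTableId);
      if (matcher.matches()) {
        return new Parts(matcher.group(1), matcher.group(2), matcher.group(3));
      }
    }
    return null;
  }

  public static void main(String[] args) {
    Parts parts = parse("myproject:mydataset.mytable");
    System.out.println(parts.project + " / " + parts.dataset + " / " + parts.table);
  }
}
```

        Because the real method is annotated @Nullable, callers should check the result for null before using it, as they would with parse above.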
      • readCallMetric

        public static @Nullable org.apache.beam.runners.core.metrics.ServiceCallMetric readCallMetric​(@Nullable com.google.api.services.bigquery.model.TableReference tableReference)
        Parameters:
        tableReference - The table being read from. Can be a temporary BQ table used to read from a SQL query.
        Returns:
        a ServiceCallMetric for recording the statuses of all BQ API responses related to reading elements directly from BigQuery in a process-wide metric, for example calls to readRows, splitReadStream, and createReadSession.
      • writeCallMetric

        public static org.apache.beam.runners.core.metrics.ServiceCallMetric writeCallMetric​(com.google.api.services.bigquery.model.TableReference tableReference)
        Parameters:
        tableReference - The table being written to.
        Returns:
        a ServiceCallMetric for recording the statuses of all BQ responses related to writing elements directly to BigQuery in a process-wide metric, for example calls to insertAll.
      • hashSchemaDescriptorDeterministic

        public static long hashSchemaDescriptorDeterministic​(com.google.protobuf.Descriptors.Descriptor descriptor)
        Hashes a schema descriptor using a deterministic hash function.

        Warning! These hashes are encoded into messages, so changing this function will cause pipelines to get stuck on update!
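
        To illustrate what "deterministic" means here, the sketch below hashes a canonical textual form of a schema with a fixed, fully specified function, so the result is the same in every process and on every run (unlike the identity-based Object.hashCode(), which can differ between JVM runs). FNV-1a is used only as an example; the source does not say which function this method uses. The field-list representation and all names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; the hash function Beam actually uses is not stated here.
final class DeterministicHashSketch {
  private static final long FNV_OFFSET_BASIS = 0xcbf29ce484222325L;
  private static final long FNV_PRIME = 0x100000001b3L;

  /** 64-bit FNV-1a over raw bytes: XOR each byte in, then multiply by the prime. */
  static long fnv1a64(byte[] data) {
    long hash = FNV_OFFSET_BASIS;
    for (byte b : data) {
      hash ^= (b & 0xffL);
      hash *= FNV_PRIME;
    }
    return hash;
  }

  /** Hashes a hypothetical canonical form: one "name:type" entry per line, in order. */
  static long hashFields(List<String> fields) {
    String canonical = String.join("\n", fields);
    return fnv1a64(canonical.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) {
    List<String> fields = Arrays.asList("id:INT64", "name:STRING");
    System.out.println(Long.toHexString(hashFields(fields)));
  }
}
```

        Any change to the canonical form or to the constants would change every hash, which is the failure mode the warning above describes.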