Class BigQuerySchemaIOProvider
- java.lang.Object
-
- org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider
-
- All Implemented Interfaces:
org.apache.beam.sdk.schemas.io.SchemaIOProvider
@Internal @Experimental @AutoService(org.apache.beam.sdk.schemas.io.SchemaIOProvider.class) public class BigQuerySchemaIOProvider extends java.lang.Object implements org.apache.beam.sdk.schemas.io.SchemaIOProvider
An implementation of SchemaIOProvider for reading from and writing to BigQuery with BigQueryIO. For a description of configuration options and other defaults, see configurationSchema().
This transform is still experimental and is subject to breaking changes.
-
-
Constructor Summary
Constructor and Description
- BigQuerySchemaIOProvider()
-
Method Summary
Modifier and Type, Method, and Description
- org.apache.beam.sdk.schemas.Schema configurationSchema(): Returns the expected schema of the configuration object.
- org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider.BigQuerySchemaIO from(java.lang.String location, org.apache.beam.sdk.values.Row configuration, @Nullable org.apache.beam.sdk.schemas.Schema dataSchema): Produces a SchemaIO given a String representing the data's location, the schema of the data that resides there, and some IO-specific configuration object.
- java.lang.String identifier(): Returns an id that uniquely represents this IO.
- org.apache.beam.sdk.values.PCollection.IsBounded isBounded(): Indicates whether the PCollections produced by this transform will contain a bounded or unbounded number of elements.
- boolean requiresDataSchema(): Indicates whether this transform requires a specified data schema.
-
-
-
Method Detail
-
identifier
public java.lang.String identifier()
Returns an id that uniquely represents this IO.
- Specified by: identifier in interface org.apache.beam.sdk.schemas.io.SchemaIOProvider
-
configurationSchema
public org.apache.beam.sdk.schemas.Schema configurationSchema()
Returns the expected schema of the configuration object. Note this is distinct from the schema of the data source itself. The fields are as follows:
- table: Nullable String - Used for reads and writes. Specifies a table to read from or write to, in the format described in BigQueryHelpers.parseTableSpec(java.lang.String). Used as an input to BigQueryIO.TypedRead.from(String) or BigQueryIO.Write.to(String).
- query: Nullable String - Used for reads. Specifies a query to read results from using the BigQuery Standard SQL dialect. Used as an input to BigQueryIO.TypedRead.fromQuery(String).
- queryLocation: Nullable String - Used for reads. Specifies a BigQuery geographic location where the query job will be executed. Used as an input to BigQueryIO.TypedRead.withQueryLocation(String).
- createDisposition: Nullable String - Used for writes. Specifies whether a table should be created if it does not exist. Valid inputs are "Never" and "IfNeeded", corresponding to values of BigQueryIO.Write.CreateDisposition. Used as an input to BigQueryIO.Write.withCreateDisposition(BigQueryIO.Write.CreateDisposition).
- ReadMethod - The input to BigQueryIO.TypedRead.withMethod(BigQueryIO.TypedRead.Method). Defaults to EXPORT, since that is the only method that currently offers Beam Schema support.
- WriteMethod - The input to BigQueryIO.Write.withMethod(BigQueryIO.Write.Method). Currently defaults to STORAGE_WRITE_API.
- Specified by: configurationSchema in interface org.apache.beam.sdk.schemas.io.SchemaIOProvider
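The nullable String fields listed above can be modeled without a Beam dependency. The following is a minimal sketch, not the provider's implementation: the class BigQueryConfigSketch, its nested enum, and both helper methods are hypothetical names introduced for illustration. It assumes that "Never" and "IfNeeded" correspond to the CREATE_NEVER and CREATE_IF_NEEDED values of BigQueryIO.Write.CreateDisposition and that matching is case-sensitive; the table spec in main is a placeholder.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BigQueryConfigSketch {

    /** Hypothetical stand-in for BigQueryIO.Write.CreateDisposition; not the Beam type. */
    public enum CreateDisposition { CREATE_NEVER, CREATE_IF_NEEDED }

    /**
     * Maps the documented inputs "Never" and "IfNeeded" to the stand-in enum.
     * The exact correspondence and the case-sensitive match are assumptions.
     */
    public static CreateDisposition parseCreateDisposition(String value) {
        if (value == null) {
            return null; // The field is nullable, so an unset value is allowed.
        }
        switch (value) {
            case "Never":
                return CreateDisposition.CREATE_NEVER;
            case "IfNeeded":
                return CreateDisposition.CREATE_IF_NEEDED;
            default:
                throw new IllegalArgumentException("Unsupported createDisposition: " + value);
        }
    }

    /** Builds a write-oriented configuration; the read-only fields stay null. */
    public static Map<String, String> writeConfig(String table, String createDisposition) {
        parseCreateDisposition(createDisposition); // Reject undocumented values early.
        Map<String, String> config = new LinkedHashMap<>();
        config.put("table", table);                         // Used for reads and writes.
        config.put("query", null);                          // Used for reads only.
        config.put("queryLocation", null);                  // Used for reads only.
        config.put("createDisposition", createDisposition); // Used for writes only.
        return config;
    }

    public static void main(String[] args) {
        // Placeholder table spec in the project:dataset.table form.
        Map<String, String> config = writeConfig("my-project:my_dataset.my_table", "IfNeeded");
        System.out.println(config);
        System.out.println(parseCreateDisposition(config.get("createDisposition")));
    }
}
```

In real pipeline code the configuration would be an org.apache.beam.sdk.values.Row conforming to the Schema returned by configurationSchema(); a plain Map is used here only to keep the sketch self-contained.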
-
from
public org.apache.beam.sdk.io.gcp.bigquery.BigQuerySchemaIOProvider.BigQuerySchemaIO from(java.lang.String location, org.apache.beam.sdk.values.Row configuration, @Nullable org.apache.beam.sdk.schemas.Schema dataSchema)
Produces a SchemaIO given a String representing the data's location, the schema of the data that resides there, and some IO-specific configuration object.
For BigQuery IO, only the configuration object is used. Location and data schema have no effect. Specifying a table and dataset is done through appropriate fields in the configuration object, and the data schema is automatically generated from either the input PCollection or the schema of the BigQuery table.
- Specified by: from in interface org.apache.beam.sdk.schemas.io.SchemaIOProvider
-
requiresDataSchema
public boolean requiresDataSchema()
Indicates whether this transform requires a specified data schema.
- Specified by: requiresDataSchema in interface org.apache.beam.sdk.schemas.io.SchemaIOProvider
- Returns: false
-
isBounded
public org.apache.beam.sdk.values.PCollection.IsBounded isBounded()
Indicates whether the PCollections produced by this transform will contain a bounded or unbounded number of elements.
- Specified by: isBounded in interface org.apache.beam.sdk.schemas.io.SchemaIOProvider
- Returns: Bounded
-
-