T - the type of the elements of the resulting PCollection
public class Create<T> extends PTransform<PInput,PCollection<T>>
Create<T> takes a collection of elements of type T
known when the pipeline is constructed and returns a
PCollection<T> containing the elements.
Example of use:
Pipeline p = ...;
PCollection<Integer> pc = p.apply(Create.of(3, 4, 5)).setCoder(BigEndianIntegerCoder.of());
Map<String, Integer> map = ...;
PCollection<KV<String, Integer>> pt =
    p.apply(Create.of(map))
     .setCoder(KvCoder.of(StringUtf8Coder.of(),
                          BigEndianIntegerCoder.of()));
Note that PCollection.setCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>) must be called
explicitly to set the encoding of the resulting
PCollection whenever Create cannot infer the
encoding.
A good use for Create is when a PCollection
needs to be created without dependencies on files or other external
entities. This is especially useful during testing.
Caveat: Create only supports small in-memory datasets,
particularly when submitting jobs to the Google Cloud Dataflow
service.
Create can automatically determine the Coder to use
if all elements are the same type, and a default exists for that type.
See CoderRegistry for details
on how defaults are determined.
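The inference rule above (one shared element type, plus a registered default for that type) can be sketched in plain Java. This is only an illustration under stated assumptions: DefaultCoderSketch, inferCoderName, and the placeholder coder names are hypothetical and are not part of the SDK; the actual defaults are defined by CoderRegistry.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in only: this is NOT the SDK's CoderRegistry.
final class DefaultCoderSketch {
    // Hypothetical table mapping an element class to a placeholder coder name.
    private static final Map<Class<?>, String> DEFAULTS = new HashMap<>();
    static {
        DEFAULTS.put(Integer.class, "integer-coder");
        DEFAULTS.put(String.class, "string-coder");
    }

    // Returns the placeholder coder name, or null if the list is empty,
    // contains a null, mixes types, or has a type with no registered default.
    static String inferCoderName(List<?> elems) {
        Class<?> common = null;
        for (Object e : elems) {
            if (e == null) {
                return null;
            }
            if (common == null) {
                common = e.getClass();
            } else if (!common.equals(e.getClass())) {
                return null; // mixed types: nothing can be inferred
            }
        }
        return common == null ? null : DEFAULTS.get(common);
    }

    public static void main(String[] args) {
        System.out.println(inferCoderName(Arrays.asList(3, 4, 5)));  // prints integer-coder
        System.out.println(inferCoderName(Arrays.asList(3, "four"))); // prints null
    }
}
```

When a sketch like this returns null, the corresponding situation in a pipeline is the one where the coder has to be set explicitly.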
| Modifier and Type | Method and Description |
|---|---|
| PCollection<T> | apply(PInput input) Applies this PTransform on the given Input, and returns its Output. |
| protected Coder<?> | getDefaultOutputCoder() Returns the default Coder to use for the output of this single-output PTransform, or null if none can be inferred. |
| java.lang.Iterable<T> | getElements() |
| static <T> Create<T> | of(java.lang.Iterable<T> elems) Returns a new Create root transform that produces a PCollection containing the specified elements. |
| static <K,V> Create<KV<K,V>> | of(java.util.Map<K,V> elems) Returns a new Create root transform that produces a PCollection of KVs corresponding to the keys and values of the specified Map. |
| static <T> Create<T> | of(T... elems) Returns a new Create root transform that produces a PCollection containing the specified elements. |
| static <T> com.google.cloud.dataflow.sdk.transforms.Create.CreateTimestamped<T> | timestamped(java.lang.Iterable<T> values, java.lang.Iterable<java.lang.Long> timestamps) Returns a new root transform that produces a PCollection containing the specified elements with the specified timestamps. |
| static <T> com.google.cloud.dataflow.sdk.transforms.Create.CreateTimestamped<T> | timestamped(java.lang.Iterable<TimestampedValue<T>> elems) Returns a new root transform that produces a PCollection containing the specified elements with the specified timestamps. |
| static <T> com.google.cloud.dataflow.sdk.transforms.Create.CreateTimestamped<T> | timestamped(TimestampedValue<T>... elems) Returns a new root transform that produces a PCollection containing the specified elements with the specified timestamps. |
Methods inherited from class PTransform: finishSpecifying, getCoderRegistry, getDefaultName, getDefaultOutputCoder, getInput, getKindString, getName, getOutput, getPipeline, setName, setPipeline, toString, withName
public static <T> Create<T> of(java.lang.Iterable<T> elems)
Returns a new Create root transform that produces a
PCollection containing the specified elements.
The argument should not be modified after this is called.
The elements will have a timestamp of negative infinity; see
timestamped(java.lang.Iterable<com.google.cloud.dataflow.sdk.values.TimestampedValue<T>>) for a way of creating a PCollection
with timestamped elements.
The result of applying this transform should have its
Coder specified explicitly, via a call to
PCollection.setCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>).
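The warning against modifying the argument can be illustrated in plain Java. The source does not say whether Create copies its input, so the sketch below assumes a holder that keeps the caller's reference; ElementHolderSketch and snapshot are hypothetical names, not SDK classes. It shows how a later modification leaks into a holder that shares the list, and how passing an unmodifiable copy avoids that.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a transform that stores the caller's Iterable
// without copying it. This is an assumption, not the SDK's implementation.
final class ElementHolderSketch<T> {
    private final Iterable<T> elems;

    ElementHolderSketch(Iterable<T> elems) {
        this.elems = elems;
    }

    // Counts the elements currently visible through the stored reference.
    int count() {
        int n = 0;
        for (T ignored : elems) {
            n++;
        }
        return n;
    }

    // Copies the input into an unmodifiable list so that later changes to
    // the caller's collection cannot affect what the holder sees.
    static <T> List<T> snapshot(List<T> input) {
        return Collections.unmodifiableList(new ArrayList<>(input));
    }

    public static void main(String[] args) {
        List<Integer> source = new ArrayList<>();
        source.add(3);
        source.add(4);

        ElementHolderSketch<Integer> shared = new ElementHolderSketch<>(source);
        ElementHolderSketch<Integer> isolated = new ElementHolderSketch<>(snapshot(source));

        source.add(5); // modification after construction

        System.out.println(shared.count());   // prints 3
        System.out.println(isolated.count()); // prints 2
    }
}
```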
@SafeVarargs public static <T> Create<T> of(T... elems)
Returns a new Create root transform that produces a
PCollection containing the specified elements.
The elements will have a timestamp of negative infinity; see
timestamped(java.lang.Iterable<com.google.cloud.dataflow.sdk.values.TimestampedValue<T>>) for a way of creating a PCollection
with timestamped elements.
The argument should not be modified after this is called.
The result of applying this transform should have its
Coder specified explicitly, via a call to
PCollection.setCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>).
public static <K,V> Create<KV<K,V>> of(java.util.Map<K,V> elems)
Returns a new Create root transform that produces a
PCollection of KVs corresponding to the keys and
values of the specified Map.
The elements will have a timestamp of negative infinity; see
timestamped(java.lang.Iterable<com.google.cloud.dataflow.sdk.values.TimestampedValue<T>>) for a way of creating a PCollection
with timestamped elements.
The result of applying this transform should have its
Coder specified explicitly, via a call to
PCollection.setCoder(com.google.cloud.dataflow.sdk.coders.Coder<T>).
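The correspondence between a Map and the key/value elements it yields can be sketched without the SDK. In this illustration java.util.Map.Entry stands in for the SDK's KV type, and MapToPairsSketch and toPairs are hypothetical names; the sketch only shows the one-pair-per-entry shape of the result, not how Create itself builds it.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MapToPairsSketch {
    // Produces one key/value pair per map entry, in the map's iteration order.
    // Map.Entry stands in for the SDK's KV type here.
    static <K, V> List<Map.Entry<K, V>> toPairs(Map<K, V> map) {
        List<Map.Entry<K, V>> pairs = new ArrayList<>(map.size());
        for (Map.Entry<K, V> e : map.entrySet()) {
            pairs.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new LinkedHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        System.out.println(toPairs(map)); // prints [a=1, b=2]
    }
}
```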
public static <T> com.google.cloud.dataflow.sdk.transforms.Create.CreateTimestamped<T> timestamped(java.lang.Iterable<TimestampedValue<T>> elems)
Returns a new root transform that produces a PCollection containing
the specified elements with the specified timestamps.
The argument should not be modified after this is called.
public static <T> com.google.cloud.dataflow.sdk.transforms.Create.CreateTimestamped<T> timestamped(TimestampedValue<T>... elems)
Returns a new root transform that produces a PCollection containing
the specified elements with the specified timestamps.
The argument should not be modified after this is called.
public static <T> com.google.cloud.dataflow.sdk.transforms.Create.CreateTimestamped<T> timestamped(java.lang.Iterable<T> values,
java.lang.Iterable<java.lang.Long> timestamps)
Returns a new root transform that produces a PCollection containing
the specified elements with the specified timestamps.
The arguments should not be modified after this is called.
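The pairing of values with timestamps, and the IllegalArgumentException raised for mismatched lengths, can be sketched in plain Java. This is a minimal illustration, not the SDK's code: TimestampPairingSketch, Stamped, and pair are hypothetical names, with Stamped standing in for TimestampedValue<T>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class TimestampPairingSketch {
    // Hypothetical value/timestamp holder; stands in for TimestampedValue<T>.
    static final class Stamped<T> {
        final T value;
        final long timestamp;

        Stamped(T value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Pairs the i-th value with the i-th timestamp.
    // Throws IllegalArgumentException if the two inputs differ in length.
    static <T> List<Stamped<T>> pair(Iterable<T> values, Iterable<Long> timestamps) {
        List<Stamped<T>> out = new ArrayList<>();
        Iterator<T> v = values.iterator();
        Iterator<Long> t = timestamps.iterator();
        while (v.hasNext() && t.hasNext()) {
            out.add(new Stamped<>(v.next(), t.next()));
        }
        if (v.hasNext() || t.hasNext()) {
            throw new IllegalArgumentException(
                "Expected the same number of values and timestamps");
        }
        return out;
    }

    public static void main(String[] args) {
        List<Stamped<String>> ok = pair(Arrays.asList("a", "b"), Arrays.asList(10L, 20L));
        System.out.println(ok.size()); // prints 2
    }
}
```

Checking after the loop, rather than comparing sizes up front, lets the sketch accept any Iterable, since an Iterable does not expose its length.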
Throws: java.lang.IllegalArgumentException - if the number of values differs from
the number of timestamps
public PCollection<T> apply(PInput input)
Applies this PTransform on the given Input, and returns its
Output.
Composite transforms, which are defined in terms of other transforms, should return the output of one of the composed transforms. Non-composite transforms, which do not apply any transforms internally, should return a new unbound output and register evaluators (via backend-specific registration methods).
The default implementation throws an exception. A derived class must
either implement apply, or else each runner must supply a custom
implementation via
PipelineRunner.apply(com.google.cloud.dataflow.sdk.transforms.PTransform<Input, Output>, Input).
Overrides: apply in class PTransform<PInput,PCollection<T>>
public java.lang.Iterable<T> getElements()
protected Coder<?> getDefaultOutputCoder()
Returns the default Coder to use for the output of this
single-output PTransform, or null if
none can be inferred.
By default, returns null.
Overrides: getDefaultOutputCoder in class PTransform<PInput,PCollection<T>>