T - the type of elements handled by this coder
public class AvroCoder<T> extends StandardCoder<T>
The Avro schema is generated using reflection on the element type, using Avro's org.apache.avro.reflect.ReflectData, and encoded as part of the Coder instance.
For complete details about schema generation and how it can be controlled, please see the org.apache.avro.reflect package. Only concrete classes with a no-argument constructor can be mapped to Avro records. All inherited fields that are not static or transient are used. Fields are not permitted to be null unless annotated by org.apache.avro.reflect.Nullable or an org.apache.avro.reflect.Union containing null.
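The field-selection rule described above can be illustrated with plain Java reflection. The following is a minimal sketch, not Avro's actual implementation: the class FieldSelectionSketch, its nested Base and Event classes, and the method serializedFieldNames are hypothetical names introduced only for this illustration. It walks the class hierarchy and keeps every field that is neither static nor transient.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; this is NOT how Avro's ReflectData is implemented.
final class FieldSelectionSketch {

    // Hypothetical base class with one static and one instance field.
    static class Base {
        static int instances; // static: skipped
        long id;              // inherited instance field: kept
    }

    // Hypothetical element class extending Base.
    static class Event extends Base {
        String name;            // plain instance field: kept
        transient String cache; // transient: skipped
    }

    /**
     * Collects the names of all declared and inherited fields that are
     * neither static nor transient. Names are sorted because
     * getDeclaredFields() makes no ordering guarantee.
     */
    static Set<String> serializedFieldNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                int mods = field.getModifiers();
                if (Modifier.isStatic(mods) || Modifier.isTransient(mods) || field.isSynthetic()) {
                    continue;
                }
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(serializedFieldNames(Event.class));
    }
}
```

In this sketch, serializedFieldNames(Event.class) contains id and name: the inherited id is kept, while instances (static) and cache (transient) are skipped.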
To use, specify the Coder type on a PCollection:
PCollection<MyCustomElement> records =
    input.apply(...)
         .setCoder(AvroCoder.of(MyCustomElement.class));
or annotate the element class using @DefaultCoder.
@DefaultCoder(AvroCoder.class)
public class MyCustomElement {
...
}
The implementation attempts to determine if the Avro encoding of the given type will satisfy
the criteria of Coder.verifyDeterministic() by inspecting both the type and the
Schema provided or generated by Avro. Only coders that are deterministic can be used in
GroupByKey operations.
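To see why determinism matters, consider an encoder that writes map entries in iteration order. The sketch below uses only the JDK and is not how AvroCoder encodes values; DeterminismSketch and encodeInIterationOrder are hypothetical names. It shows that two maps that are equal as Java objects can still produce different bytes, which is a problem wherever equality is decided on the encoded form. That this is the concern behind the GroupByKey restriction is an assumption of this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; AvroCoder does not encode values this way.
final class DeterminismSketch {

    /** Writes the entry count, then each key and value in the map's iteration order. */
    static byte[] encodeInIterationOrder(Map<String, Integer> map) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(map.size());
            for (Map.Entry<String, Integer> entry : map.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeInt(entry.getValue());
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Two maps with the same entries, inserted in different orders.
        Map<String, Integer> a = new LinkedHashMap<>();
        a.put("x", 1);
        a.put("y", 2);
        Map<String, Integer> b = new LinkedHashMap<>();
        b.put("y", 2);
        b.put("x", 1);

        // Equal as maps, yet encoded differently because iteration order differs.
        System.out.println(a.equals(b));
        System.out.println(Arrays.equals(encodeInIterationOrder(a), encodeInIterationOrder(b)));

        // Copying into sorted maps fixes the iteration order, so the bytes agree.
        System.out.println(Arrays.equals(
            encodeInIterationOrder(new TreeMap<>(a)),
            encodeInIterationOrder(new TreeMap<>(b))));
    }
}
```

The sorted-map variant is one way to make such an encoding order-independent; it is shown only to contrast with the insertion-ordered case.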
| Modifier and Type | Class and Description |
|---|---|
| protected static class | AvroCoder.AvroDeterminismChecker: Helper class encapsulating the various pieces of state maintained by the recursive walk used for checking if the encoding will be deterministic. |

Inherited nested classes: Coder.Context, Coder.NonDeterministicException

| Modifier | Constructor and Description |
|---|---|
| protected | AvroCoder(java.lang.Class<T> type, org.apache.avro.Schema schema) |

| Modifier and Type | Method and Description |
|---|---|
| com.google.cloud.dataflow.sdk.util.CloudObject | asCloudObject(): Returns the CloudObject that represents this Coder. |
| org.apache.avro.io.DatumReader<T> | createDatumReader(): Returns a new DatumReader that can be used to read from an Avro file directly. |
| org.apache.avro.io.DatumWriter<T> | createDatumWriter(): Returns a new DatumWriter that can be used to write to an Avro file directly. |
| T | decode(java.io.InputStream inStream, Coder.Context context): Decodes a value of type T from the given input stream in the given context. |
| void | encode(T value, java.io.OutputStream outStream, Coder.Context context): Encodes the given value of type T onto the given output stream in the given context. |
| java.util.List<? extends Coder<?>> | getCoderArguments(): If this is a Coder for a parameterized type, returns the list of Coders being used for each of the parameters, or returns null if this cannot be done or this is not a parameterized type. |
| org.apache.avro.Schema | getSchema(): Returns the schema used by this coder. |
| boolean | isDeterministic(): Deprecated. |
| static <T> AvroCoder<T> | of(java.lang.Class<T> type): Returns an AvroCoder instance for the provided element type. |
| static <T> AvroCoder<T> | of(java.lang.Class<T> type, org.apache.avro.Schema schema): Returns an AvroCoder instance for the provided element type using the provided Avro schema. |
| static AvroCoder<org.apache.avro.generic.GenericRecord> | of(org.apache.avro.Schema schema): Returns an AvroCoder instance for the Avro schema. |
| static AvroCoder<?> | of(java.lang.String classType, java.lang.String schema) |
| void | verifyDeterministic(): Raises an exception describing reasons why the type may not be deterministically encoded using the given Schema, the directBinaryEncoder, and the ReflectDatumWriter or GenericDatumWriter. |

Inherited methods: equals, getComponents, getEncodedElementByteSize, hashCode, isRegisterByteSizeObserverCheap, registerByteSizeObserver, toString, verifyDeterministic, verifyDeterministic

protected AvroCoder(java.lang.Class<T> type, org.apache.avro.Schema schema)

public static <T> AvroCoder<T> of(java.lang.Class<T> type)
Returns an AvroCoder instance for the provided element type.
Type Parameters: T - the element type

public static AvroCoder<org.apache.avro.generic.GenericRecord> of(org.apache.avro.Schema schema)
Returns an AvroCoder instance for the Avro schema. The implicit type is GenericRecord.

public static <T> AvroCoder<T> of(java.lang.Class<T> type, org.apache.avro.Schema schema)
Returns an AvroCoder instance for the provided element type using the provided Avro schema.
If the type argument is GenericRecord, the schema may be arbitrary. Otherwise, the schema must correspond to the type provided.
Type Parameters: T - the element type

public static AvroCoder<?> of(java.lang.String classType, java.lang.String schema) throws java.lang.ClassNotFoundException
Throws: java.lang.ClassNotFoundException

public void encode(T value, java.io.OutputStream outStream, Coder.Context context) throws java.io.IOException
Description copied from interface: Coder
Encodes the given value of type T onto the given output stream in the given context.
Throws:
java.io.IOException - if writing to the OutputStream fails for some reason
CoderException - if the value could not be encoded for some reason

public T decode(java.io.InputStream inStream, Coder.Context context) throws java.io.IOException
Description copied from interface: Coder
Decodes a value of type T from the given input stream in the given context. Returns the decoded value.
Throws:
java.io.IOException - if reading from the InputStream fails for some reason
CoderException - if the value could not be decoded for some reason

public java.util.List<? extends Coder<?>> getCoderArguments()
Description copied from interface: Coder
If this is a Coder for a parameterized type, returns the list of Coders being used for each of the parameters, or returns null if this cannot be done or this is not a parameterized type.

public com.google.cloud.dataflow.sdk.util.CloudObject asCloudObject()
Description copied from interface: Coder
Returns the CloudObject that represents this Coder.
Specified by: asCloudObject in interface Coder<T>
Overrides: asCloudObject in class StandardCoder<T>

@Deprecated
public boolean isDeterministic()

public void verifyDeterministic() throws Coder.NonDeterministicException
Specified by: verifyDeterministic in interface Coder<T>
Overrides: verifyDeterministic in class StandardCoder<T>
Throws: Coder.NonDeterministicException - if this coder is not deterministic.

public org.apache.avro.io.DatumReader<T> createDatumReader()

public org.apache.avro.io.DatumWriter<T> createDatumWriter()

public org.apache.avro.Schema getSchema()