Class AvroCoder<T>

  • Type Parameters:
    T - the type of elements handled by this coder
    All Implemented Interfaces:
    java.io.Serializable
    Direct Known Subclasses:
    AvroGenericCoder

    public class AvroCoder<T>
    extends CustomCoder<T>
    A Coder using Avro binary format.

    Each instance of AvroCoder<T> encapsulates an Avro schema for objects of type T.

    The Avro schema may be provided explicitly via of(Class, Schema) or omitted via of(Class), in which case it will be inferred using Avro's ReflectData.

    For complete details about schema generation and how it can be controlled please see the org.apache.avro.reflect package. Only concrete classes with a no-argument constructor can be mapped to Avro records. All inherited fields that are not static or transient are included. Fields are not permitted to be null unless annotated by Nullable or a Union schema containing "null".

    To use, specify the Coder type on a PCollection:

    
     PCollection<MyCustomElement> records =
         input.apply(...)
              .setCoder(AvroCoder.of(MyCustomElement.class));
     

    or annotate the element class using @DefaultCoder.

    @DefaultCoder(AvroCoder.class)
     public class MyCustomElement {
         ...
     }
     

    The implementation attempts to determine if the Avro encoding of the given type will satisfy the criteria of Coder.verifyDeterministic() by inspecting both the type and the Schema provided or generated by Avro. Only coders that are deterministic can be used in GroupByKey operations.

    See Also:
    Serialized Form
    • Constructor Detail

      • AvroCoder

        protected AvroCoder​(java.lang.Class<T> type,
                            org.apache.avro.Schema schema)
      • AvroCoder

        protected AvroCoder​(java.lang.Class<T> type,
                            org.apache.avro.Schema schema,
                            boolean useReflectApi)
    • Method Detail

      • of

        public static <T> AvroCoder<T> of​(TypeDescriptor<T> type)
        Returns an AvroCoder instance for the provided element type.
        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(TypeDescriptor<T> type,
                                          boolean useReflectApi)
        Returns an AvroCoder instance for the provided element type, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.
        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(java.lang.Class<T> clazz)
        Returns an AvroCoder instance for the provided element class.
        Type Parameters:
        T - the element type
      • of

        public static AvroGenericCoder of​(org.apache.avro.Schema schema)
        Returns an AvroGenericCoder instance for the Avro schema. The implicit type is GenericRecord.
      • of

        public static <T> AvroCoder<T> of​(java.lang.Class<T> type,
                                          boolean useReflectApi)
        Returns an AvroCoder instance for the given class, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.
        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(java.lang.Class<T> type,
                                          org.apache.avro.Schema schema)
        Returns an AvroCoder instance for the provided element type using the provided Avro schema.

        The schema must correspond to the type provided.

        Type Parameters:
        T - the element type
      • of

        public static <T> AvroCoder<T> of​(java.lang.Class<T> type,
                                          org.apache.avro.Schema schema,
                                          boolean useReflectApi)
        Returns an AvroCoder instance for the given class and schema, respecting whether to use Avro's Reflect* or Specific* suite for encoding and decoding.
        Type Parameters:
        T - the element type
      • getCoderProvider

        public static CoderProvider getCoderProvider()
        Returns a CoderProvider which uses the AvroCoder if possible for all types.

        It is unsafe to register this as a CoderProvider because Avro will reflectively accept dangerous types such as Object.

        This method is invoked reflectively from DefaultCoder.

      • getType

        public java.lang.Class<T> getType()
        Returns the type this coder encodes/decodes.
      • useReflectApi

        public boolean useReflectApi()
      • encode

        public void encode​(T value,
                           java.io.OutputStream outStream)
                    throws java.io.IOException
        Description copied from class: Coder
        Encodes the given value of type T onto the given output stream.
        Specified by:
        encode in class Coder<T>
        Throws:
        java.io.IOException - if writing to the OutputStream fails for some reason
        CoderException - if the value could not be encoded for some reason
      • decode

        public T decode​(java.io.InputStream inStream)
                 throws java.io.IOException
        Description copied from class: Coder
        Decodes a value of type T from the given input stream in the given context. Returns the decoded value.
        Specified by:
        decode in class Coder<T>
        Throws:
        java.io.IOException - if reading from the InputStream fails for some reason
        CoderException - if the value could not be decoded for some reason
      • verifyDeterministic

        public void verifyDeterministic()
                                 throws Coder.NonDeterministicException
        Description copied from class: CustomCoder
        Throw Coder.NonDeterministicException if the coding is not deterministic.

        In order for a Coder to be considered deterministic, the following must be true:

        • two values that compare as equal (via Object.equals() or Comparable.compareTo(), if supported) have the same encoding.
        • the Coder always produces a canonical encoding, which is the same for an instance of an object even if produced on different computers at different times.
        Overrides:
        verifyDeterministic in class CustomCoder<T>
        Throws:
        Coder.NonDeterministicException - when the type may not be deterministically encoded using the given Schema, the directBinaryEncoder, and the ReflectDatumWriter or GenericDatumWriter.
      • getSchema

        public org.apache.avro.Schema getSchema()
        Returns the schema used by this coder.
      • equals

        public boolean equals​(@Nullable java.lang.Object other)
        Overrides:
        equals in class java.lang.Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object