Class AvroUtils


  • @Experimental(SCHEMAS)
    public class AvroUtils
    extends java.lang.Object
    Utils to convert AVRO records to Beam rows. Imposes a mapping between common avro types and Beam portable schemas (https://s.apache.org/beam-schemas):
       Avro                Beam Field Type
       INT         <-----> INT32
       LONG        <-----> INT64
       FLOAT       <-----> FLOAT
       DOUBLE      <-----> DOUBLE
       BOOLEAN     <-----> BOOLEAN
       STRING      <-----> STRING
       BYTES       <-----> BYTES
                   <------ LogicalType(urn="beam:logical_type:var_bytes:v1")
       FIXED       <-----> LogicalType(urn="beam:logical_type:fixed_bytes:v1")
       ARRAY       <-----> ARRAY
       ENUM        <-----> LogicalType(EnumerationType)
       MAP         <-----> MAP
       RECORD      <-----> ROW
       UNION       <-----> LogicalType(OneOfType)
       LogicalTypes.Date              <-----> LogicalType(DATE)
       LogicalTypes.TimestampMillis   <-----> DATETIME
       LogicalTypes.Decimal           <-----> DECIMAL
     
    For SQL CHAR/VARCHAR types, an Avro schema
       LogicalType({"type":"string","logicalType":"char","maxLength":MAX_LENGTH}) or
       LogicalType({"type":"string","logicalType":"varchar","maxLength":MAX_LENGTH})
     
    is used.
    • Method Detail

      • toBeamField

        public static Schema.Field toBeamField​(org.apache.avro.Schema.Field field)
        Get Beam Field from avro Field.
      • toAvroField

        public static org.apache.avro.Schema.Field toAvroField​(Schema.Field field,
                                                               java.lang.String namespace)
        Get Avro Field from Beam Field.
      • toBeamSchema

        public static Schema toBeamSchema​(org.apache.avro.Schema schema)
        Converts AVRO schema to Beam row schema.
        Parameters:
        schema - schema of type RECORD
      • toAvroSchema

        public static org.apache.avro.Schema toAvroSchema​(Schema beamSchema,
                                                          @Nullable java.lang.String name,
                                                          @Nullable java.lang.String namespace)
        Converts a Beam Schema into an AVRO schema.
      • toAvroSchema

        public static org.apache.avro.Schema toAvroSchema​(Schema beamSchema)
      • toBeamRowStrict

        public static Row toBeamRowStrict​(org.apache.avro.generic.GenericRecord record,
                                          @Nullable Schema schema)
        Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion. If Schema is not provided, one is inferred from the AVRO schema.
      • toGenericRecord

        public static org.apache.avro.generic.GenericRecord toGenericRecord​(Row row)
        Convert from a Beam Row to an AVRO GenericRecord. The Avro Schema is inferred from the Beam schema on the row.
      • toGenericRecord

        public static org.apache.avro.generic.GenericRecord toGenericRecord​(Row row,
                                                                            @Nullable org.apache.avro.Schema avroSchema)
        Convert from a Beam Row to an AVRO GenericRecord. If a Schema is not provided, one is inferred from the Beam schema on the row.
      • getToRowFunction

        public static <T> SerializableFunction<T,​Row> getToRowFunction​(java.lang.Class<T> clazz,
                                                                             @Nullable org.apache.avro.Schema schema)
      • getFromRowFunction

        public static <T> SerializableFunction<Row,​T> getFromRowFunction​(java.lang.Class<T> clazz)
      • getSchema

        public static <T> @Nullable Schema getSchema​(java.lang.Class<T> clazz,
                                                     @Nullable org.apache.avro.Schema schema)
      • getAvroBytesToRowFunction

        public static SimpleFunction<byte[],​Row> getAvroBytesToRowFunction​(Schema beamSchema)
        Returns a function mapping encoded AVRO GenericRecords to Beam Rows.
      • getRowToAvroBytesFunction

        public static SimpleFunction<Row,​byte[]> getRowToAvroBytesFunction​(Schema beamSchema)
        Returns a function mapping Beam Rows to encoded AVRO GenericRecords.
      • schemaCoder

        public static <T> SchemaCoder<T> schemaCoder​(TypeDescriptor<T> type)
        Returns an SchemaCoder instance for the provided element type.
        Type Parameters:
        T - the element type
      • schemaCoder

        public static <T> SchemaCoder<T> schemaCoder​(java.lang.Class<T> clazz)
        Returns an SchemaCoder instance for the provided element class.
        Type Parameters:
        T - the element type
      • schemaCoder

        public static SchemaCoder<org.apache.avro.generic.GenericRecord> schemaCoder​(org.apache.avro.Schema schema)
        Returns an SchemaCoder instance for the Avro schema. The implicit type is GenericRecord.
      • schemaCoder

        public static <T> SchemaCoder<T> schemaCoder​(java.lang.Class<T> clazz,
                                                     org.apache.avro.Schema schema)
        Returns an SchemaCoder instance for the provided element type using the provided Avro schema.

        If the type argument is GenericRecord, the schema may be arbitrary. Otherwise, the schema must correspond to the type provided.

        Type Parameters:
        T - the element type
      • schemaCoder

        public static <T> SchemaCoder<T> schemaCoder​(AvroCoder<T> avroCoder)
        Returns an SchemaCoder instance based on the provided AvroCoder for the element type.
        Type Parameters:
        T - the element type
      • getFieldTypes

        public static <T> java.util.List<FieldValueTypeInformation> getFieldTypes​(java.lang.Class<T> clazz,
                                                                                  Schema schema)
        Get field types for an AVRO-generated SpecificRecord or a POJO.
      • getGetters

        public static <T> java.util.List<FieldValueGetter> getGetters​(java.lang.Class<T> clazz,
                                                                      Schema schema)
        Get generated getters for an AVRO-generated SpecificRecord or a POJO.
      • getCreator

        public static <T> SchemaUserTypeCreator getCreator​(java.lang.Class<T> clazz,
                                                           Schema schema)
        Get an object creator for an AVRO-generated SpecificRecord.
      • convertAvroFieldStrict

        public static @Nullable java.lang.Object convertAvroFieldStrict​(@Nullable java.lang.Object value,
                                                                        @Nonnull
                                                                        org.apache.avro.Schema avroSchema,
                                                                        @Nonnull
                                                                        Schema.FieldType fieldType)
        Strict conversion from AVRO to Beam, strict because it doesn't do widening or narrowing during conversion.
        Parameters:
        value - GenericRecord or any nested value
        avroSchema - schema for value
        fieldType - target beam field type
        Returns:
        value converted for Row