Class JsonToRow
- java.lang.Object
-
- org.apache.beam.sdk.transforms.JsonToRow
-
@Experimental(SCHEMAS) public class JsonToRow extends java.lang.Object
ExperimentalCreates a
PTransform
to convert input JSON objects toRows
with givenSchema
.Currently supported
Schema
field types are:Schema.TypeName.BYTE
Schema.TypeName.INT16
Schema.TypeName.INT32
Schema.TypeName.INT64
Schema.TypeName.FLOAT
Schema.TypeName.DOUBLE
Schema.TypeName.BOOLEAN
Schema.TypeName.STRING
For specifics of JSON deserialization see
RowJson.RowJsonDeserializer
.Conversion is strict, with minimal type coercion:
Booleans are only parsed from
true
orfalse
literals, not from"true"
or"false"
strings or any other values (exception is thrown in these cases).If a JSON number doesn't fit into the corresponding schema field type, an exception is be thrown. Strings are not auto-converted to numbers. Floating point numbers are not auto-converted to integral numbers. Precision loss also causes exceptions.
Only JSON string values can be parsed into
Schema.TypeName.STRING
. Numbers, booleans are not automatically converted, exceptions are thrown in these cases.If a schema field is missing from the JSON value, by default the field will be assumed to have a null value, and will be converted into a null in the row if the schema has this field being nullable. This behavior can be changed by setting the
RowJson.RowJsonDeserializer.NullBehavior
using thewithSchemaAndNullBehavior(org.apache.beam.sdk.schemas.Schema, org.apache.beam.sdk.util.RowJson.RowJsonDeserializer.NullBehavior)
. For example, setting it withRowJson.RowJsonDeserializer.NullBehavior.REQUIRE_NULL
means that JSON values must be null to be parsed as null, otherwise an error will be thrown, as with previous versions of Beam.
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
JsonToRow.JsonToRowWithErrFn
static class
JsonToRow.ParseResult
The result of awithExceptionReporting(Schema)
transform.
-
Constructor Summary
Constructors Constructor Description JsonToRow()
-
Method Summary
All Methods Static Methods Concrete Methods Modifier and Type Method Description static JsonToRow.JsonToRowWithErrFn
withExceptionReporting(Schema rowSchema)
Enable Exception Reporting support.static PTransform<PCollection<java.lang.String>,PCollection<Row>>
withSchema(Schema rowSchema)
static PTransform<PCollection<java.lang.String>,PCollection<Row>>
withSchemaAndNullBehavior(Schema rowSchema, RowJson.RowJsonDeserializer.NullBehavior nullBehavior)
-
-
-
Method Detail
-
withSchema
public static PTransform<PCollection<java.lang.String>,PCollection<Row>> withSchema(Schema rowSchema)
-
withSchemaAndNullBehavior
public static PTransform<PCollection<java.lang.String>,PCollection<Row>> withSchemaAndNullBehavior(Schema rowSchema, RowJson.RowJsonDeserializer.NullBehavior nullBehavior)
-
withExceptionReporting
@Experimental(SCHEMAS) public static JsonToRow.JsonToRowWithErrFn withExceptionReporting(Schema rowSchema)
Enable Exception Reporting support. If this value is set errors in the parsing layer are returned as Row objects within aJsonToRow.ParseResult
You can access the results by using
withExceptionReporting(Schema)
:ParseResult results = jsonPersons.apply(JsonToRow.withExceptionReporting(PERSON_SCHEMA));
Then access the parsed results via,
JsonToRow.ParseResult.getResults()
And access the failed to parse results via,
JsonToRow.ParseResult.getFailedToParseLines()
This will produce a Row with Schema
JsonToRow.JsonToRowWithErrFn.ERROR_ROW_SCHEMA
To access the reason for the failure you will need to first enable extended error reporting.
JsonToRow.JsonToRowWithErrFn.withExtendedErrorInfo()
This will provide access to the reason for the Parse failure. The call to
JsonToRow.ParseResult.getFailedToParseLines()
will produce a Row with SchemaJsonToRow.JsonToRowWithErrFn.ERROR_ROW_WITH_ERR_MSG_SCHEMA
- Returns:
JsonToRow.JsonToRowWithErrFn
-
-