Class Select
java.lang.Object
    org.apache.beam.sdk.schemas.transforms.Select
@Experimental(SCHEMAS) public class Select extends java.lang.Object
A PTransform for selecting a subset of fields from a schema type.

This transform projects out a subset of fields from a schema type. The output of this transform is of type Row, though it can be converted into any other type with a matching schema using the Convert transform.

For example, consider the following POJO types:

@DefaultSchema(JavaFieldSchema.class)
public class UserEvent {
  public String userId;
  public String eventId;
  public int eventType;
  public Location location;
}

@DefaultSchema(JavaFieldSchema.class)
public class Location {
  public double latitude;
  public double longitude;
}

To select just the userId and eventId fields from each element, you would write the following:

PCollection<UserEvent> events = readUserEvents();
PCollection<Row> rows = events.apply(Select.fieldNames("userId", "eventId"));

It's possible to select a nested field as well. For example, if you want just the location information from each element:

PCollection<UserEvent> events = readUserEvents();
PCollection<Location> rows = events.apply(Select.fieldNames("location"))
    .apply(Convert.to(Location.class));
-
Nested Class Summary
Nested Classes
Modifier and Type    Class                  Description
static class         Select.Fields<T>
static class         Select.Flattened<T>    A PTransform representing a flattened schema.
-
Constructor Summary
Constructors
Constructor    Description
Select()
-
Method Summary
Modifier and Type                 Method                                                      Description
static <T> Select.Fields<T>       create()
static <T> Select.Fields<T>       fieldAccess(FieldAccessDescriptor fieldAccessDescriptor)    Select a set of fields described in a FieldAccessDescriptor.
static <T> Select.Fields<T>       fieldIds(java.lang.Integer... ids)                          Select a set of top-level field ids from the row.
static <T> Select.Fields<T>       fieldNames(java.lang.String... names)                       Select a set of top-level field names from the row.
static <T> Select.Flattened<T>    flattenedSchema()                                           Selects every leaf-level field.
-
Method Detail
-
create
public static <T> Select.Fields<T> create()
-
fieldIds
public static <T> Select.Fields<T> fieldIds(java.lang.Integer... ids)
Select a set of top-level field ids from the row.
-
fieldNames
public static <T> Select.Fields<T> fieldNames(java.lang.String... names)
Select a set of top-level field names from the row.
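The difference between fieldIds and fieldNames can be pictured with a small plain-Java model. This is only a sketch and not Beam's implementation: the class ProjectionSketch, its methods, and the use of an ordered map as a stand-in for a Row are all hypothetical. It also assumes that field ids are zero-based positions in the schema and that the output keeps the order in which fields were requested.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a schema'd row: an ordered map of field name to value. */
final class ProjectionSketch {

  /** Keeps only the named top-level fields, in the order requested (an assumption). */
  static Map<String, Object> selectByNames(Map<String, Object> row, String... names) {
    Map<String, Object> out = new LinkedHashMap<>();
    for (String name : names) {
      if (!row.containsKey(name)) {
        throw new IllegalArgumentException("Unknown field: " + name);
      }
      out.put(name, row.get(name));
    }
    return out;
  }

  /** Keeps only the top-level fields at the given zero-based positions (an assumption). */
  static Map<String, Object> selectByIds(Map<String, Object> row, Integer... ids) {
    List<String> fieldNames = new ArrayList<>(row.keySet());
    Map<String, Object> out = new LinkedHashMap<>();
    for (Integer id : ids) {
      String name = fieldNames.get(id); // IndexOutOfBoundsException on an invalid id
      out.put(name, row.get(name));
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, Object> event = new LinkedHashMap<>();
    event.put("userId", "u1");
    event.put("eventId", "e9");
    event.put("eventType", 3);

    System.out.println(selectByNames(event, "userId", "eventId")); // prints {userId=u1, eventId=e9}
    System.out.println(selectByIds(event, 0, 1));                  // prints {userId=u1, eventId=e9}
  }
}
```

In this model both calls address the same two fields, once by name and once by position.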
-
fieldAccess
public static <T> Select.Fields<T> fieldAccess(FieldAccessDescriptor fieldAccessDescriptor)
Select a set of fields described in a FieldAccessDescriptor. This allows for nested fields to be selected as well.
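To illustrate what addressing a nested field means, the following plain-Java sketch walks a path through nested maps. It is a model only, not Beam code: NestedPathSketch and resolve are hypothetical names, nested rows are represented as nested maps, and the dotted path notation ("location.latitude") is an assumption of this sketch rather than a statement about how FieldAccessDescriptor is constructed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model: nested rows are nested maps, and a field is addressed by a dotted path. */
final class NestedPathSketch {

  /** Walks a dotted path such as "location.latitude" and returns the value it points at. */
  static Object resolve(Map<String, Object> row, String path) {
    Object current = row;
    for (String part : path.split("\\.")) {
      if (!(current instanceof Map)) {
        throw new IllegalArgumentException("'" + part + "' is not inside a nested row");
      }
      Map<?, ?> level = (Map<?, ?>) current;
      if (!level.containsKey(part)) {
        throw new IllegalArgumentException("Unknown field: " + part);
      }
      current = level.get(part);
    }
    return current;
  }

  public static void main(String[] args) {
    Map<String, Object> location = new LinkedHashMap<>();
    location.put("latitude", 37.5);
    location.put("longitude", -122.25);

    Map<String, Object> event = new LinkedHashMap<>();
    event.put("userId", "u1");
    event.put("location", location);

    System.out.println(resolve(event, "location.latitude")); // prints 37.5
  }
}
```

A top-level selection is the special case of a one-element path, which is why a descriptor that supports nesting covers everything fieldNames can express.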
-
flattenedSchema
public static <T> Select.Flattened<T> flattenedSchema()
Selects every leaf-level field. This results in a nested schema being flattened into a single top-level schema. By default, nested field names are concatenated with _ characters, though this can be overridden using Select.Flattened.keepMostNestedFieldName() and Select.Flattened.withFieldNameAs(java.lang.String, java.lang.String).
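The default naming rule described above (nested names joined with _) can be sketched in plain Java. This models only the naming convention and is not Beam's implementation: FlattenSketch is a hypothetical class, nested rows are represented as nested maps, and array, iterable, and map-typed fields are not modeled at all.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of flattening: nested maps stand in for nested rows. */
final class FlattenSketch {

  /** Returns a single-level map whose keys are the nested names joined with '_'. */
  static Map<String, Object> flatten(Map<String, Object> row) {
    Map<String, Object> out = new LinkedHashMap<>();
    flattenInto("", row, out);
    return out;
  }

  private static void flattenInto(
      String prefix, Map<String, Object> row, Map<String, Object> out) {
    for (Map.Entry<String, Object> field : row.entrySet()) {
      String name = prefix.isEmpty() ? field.getKey() : prefix + "_" + field.getKey();
      Object value = field.getValue();
      if (value instanceof Map) {
        // A nested row: recurse, carrying the accumulated name as the new prefix.
        @SuppressWarnings("unchecked")
        Map<String, Object> nested = (Map<String, Object>) value;
        flattenInto(name, nested, out);
      } else {
        // A leaf-level field: emit it under its concatenated name.
        out.put(name, value);
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Object> location = new LinkedHashMap<>();
    location.put("latitude", 1.5);
    location.put("longitude", 2.5);

    Map<String, Object> event = new LinkedHashMap<>();
    event.put("userId", "u1");
    event.put("location", location);

    // prints {userId=u1, location_latitude=1.5, location_longitude=2.5}
    System.out.println(flatten(event));
  }
}
```

Concatenating the full path keeps leaf names unique when two nested rows share a field name, which is one reason to prefer it as a default over keeping only the innermost name.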