
CSVSchema

info.fingo.spata.schema.CSVSchema

See the CSVSchema companion object

final class CSVSchema[T <: Tuple]

CSV schema definition and validation utility.

A schema declares the fields expected in a CSV stream: their names and types. Fields with optional values have to be defined as Options:

val schema = CSVSchema()
 .add[String]("name")
 .add[Option[LocalDate]]("birthday")

Optional fields still have to exist in the source data; only their values may be empty. Not all fields have to be declared by the schema; any subset of them is sufficient.
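The difference between a missing field and an empty value can be illustrated with a minimal, library-independent sketch. The `Row` alias and the `optionalInt` helper are hypothetical names introduced only for this illustration; they model the rule described above and are not part of spata's API.

```scala
// Simplified stand-in for a CSV record: header name -> raw string value.
// This is an assumption for illustration, not spata's Record type.
type Row = Map[String, String]

// Hypothetical helper: reads a field declared as Option[Int].
// A missing column is an error; an empty value is a valid None.
def optionalInt(row: Row, key: String): Either[String, Option[Int]] =
  row.get(key) match {
    case None                          => Left(s"missing field: $key")
    case Some(raw) if raw.trim.isEmpty => Right(None)
    case Some(raw) =>
      raw.trim.toIntOption.toRight(s"wrong type: $key").map(Option(_))
  }
```

In this model a row with an `age` column holding an empty string yields `Right(None)`, while a row without any `age` column yields a `Left`.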

When headers are mapped through CSVConfig.mapHeader, the names provided in the schema are the final ones, after mapping.

Additionally, it is possible to specify per-field validators, posing additional requirements on CSV data:

val schema = CSVSchema()
 .add[String]("code", RegexValidator("[A-Z][A-Z0-9]+"))
 .add[BigDecimal]("price", MinValidator(0.01))

For more information on the available built-in validators, or on how to create additional ones, see Validator.

A CSV schema is verified through its validate method. It may yield an InvalidRecord, containing the validation errors together with the original Record data, or a TypedRecord, containing the selected, strongly typed data. In both cases the result is wrapped in cats.data.Validated.

Type parameters

T: tuple encoding the schema

Value parameters

columns: the tuple containing typed columns with optional validators

Attributes

Companion: object
Supertypes: class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

Adds field definition to schema.

A field definition consists of a field name and its type. A set of field definitions constitutes a schema definition. A collection of additional Validators may be added to a field. When a schema is validated, validators are checked after field type verification and receive the already parsed value of the type declared for the field.

To get a value of the proper type from a field, an implicit StringParser is required. Parsers for basic types and formats are available through the StringParser object. Additional ones may be provided by implementing the StringParser trait.

Optional values should be denoted by providing Option[A] as the field type. Note that even optional fields have to be present in the source data; only their values may be missing (empty).

The same validators that are used to validate plain values may be used to verify optional values. A missing value (None) is assumed to be correct in such a case.
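The rule that a missing value passes validation can be sketched as lifting a check on `A` to a check on `Option[A]`. The `Check` alias, `positive` and `forOptional` are hypothetical names used only here; this is a simplified model and does not reproduce spata's Validator trait or how spata applies validators internally.

```scala
// Hypothetical, simplified validator: None means the value is valid,
// Some(message) means it is invalid. Not spata's Validator trait.
type Check[A] = A => Option[String]

// Example check for plain values.
val positive: Check[Int] =
  n => if (n > 0) None else Some("value must be positive")

// Lifts a check on A to a check on Option[A].
// A missing value (None) never reaches the check, so it is always valid.
def forOptional[A](check: Check[A]): Check[Option[A]] =
  opt => opt.flatMap(check)
```

With this model, `forOptional(positive)` accepts both `None` and `Some(5)` and rejects `Some(-1)` with the same message the plain check would produce.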

This is a chaining method which allows starting with an empty schema and extending it through subsequent calls to add:

val schema = CSVSchema()
 .add[Double]("latitude", RangeValidator(-90.0, 90.0))
 .add[Double]("longitude", RangeValidator(-180.0, 180.0))

Type parameters

V: field value type

Value parameters

ev: evidence that the key is unique - it is not present in the schema yet
key: unique field name - a singleton string
validators: optional validators to check that field values comply with additional rules

Attributes

Returns: new schema definition with column (field definition) added to it

Adds field definition to schema. Does not support attaching additional validators.

Type parameters

V: field value type

Value parameters

ev: evidence that the key is unique - it is not present in the schema yet
key: unique field name - a singleton string

Attributes

Returns: new schema definition with column (field definition) added to it
See also: add[V](key: Key, validators: Validator[V]*)

Gets string representation of schema.

Attributes

Returns: short schema description
Definition Classes: Any

Validates CSV stream against schema.

For each input record the validation process:

- parses all fields defined by schema to the declared type and, if successful,
- runs provided validators with parsed values.

The process is successful and creates a TypedRecord if values of all fields defined in schema are correctly parsed and positively validated. If any of these operations fails, an InvalidRecord is yielded.

If several validators are defined for a single field, validation of that field stops at the first invalid result. Validation is nonetheless executed for all fields and collects errors from all of them.

CSV values that are not declared in the schema are omitted. In the extreme case, an empty schema always proves valid, although it yields empty typed records.
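The process described above (parse, then validate; stop at the first failing validator within a field; collect errors across fields; ignore undeclared values) can be modelled with a short, library-independent sketch. All names here (`Row`, `Check`, `Field`, `validateRow`) are hypothetical, and `Either` stands in for cats.data.Validated; this is an illustration of the rules, not spata's implementation.

```scala
// Simplified stand-ins, not spata's types.
type Row = Map[String, String]
type Check[A] = A => Option[String] // None = valid, Some(message) = invalid

// Hypothetical field definition: name, parser and validators.
final case class Field[A](
    name: String,
    parse: String => Option[A],
    checks: List[Check[A]]
) {
  // Parse first; then run validators lazily, stopping at the first failure.
  def validate(row: Row): Either[String, (String, Any)] =
    row.get(name).flatMap(parse) match {
      case None => Left(s"$name: missing or wrong type")
      case Some(value) =>
        checks.iterator.map(c => c(value)).collectFirst { case Some(m) => m } match {
          case Some(msg) => Left(s"$name: $msg")
          case None      => Right(name -> value)
        }
    }
}

// Validates every declared field and collects the errors from all of them.
// Values in the row that no field declares are simply left out of the result.
def validateRow(fields: List[Field[?]], row: Row): Either[List[String], Map[String, Any]] = {
  val results = fields.map(_.validate(row))
  val errors = results.collect { case Left(e) => e }
  if (errors.nonEmpty) Left(errors)
  else Right(results.collect { case Right(kv) => kv }.toMap)
}

// Example checks and schema for the model above.
val nonNegative: Check[Int] = n => if (n >= 0) None else Some("must not be negative")
val even: Check[Int] = n => if (n % 2 == 0) None else Some("must be even")
val nonEmpty: Check[String] = s => if (s.nonEmpty) None else Some("must not be empty")

val fields: List[Field[?]] = List(
  Field[Int]("age", _.toIntOption, List(nonNegative, even)),
  Field[String]("code", s => Some(s), List(nonEmpty))
)
```

In this model, a row with `age` of `-5` and an empty `code` reports one error per field: `-5` fails both `nonNegative` and `even`, but only the first failure is reported for `age`, while the `code` error is still collected. Calling `validateRow` with an empty field list succeeds with an empty result for any row, mirroring the empty-schema case.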

Type parameters

F: the effect type, with a type class providing support for logging (provided internally by spata)

Value parameters

enforcer: given value to recursively do the validation, provided by spata

Attributes

Returns: a pipe to validate Records and turn them into ValidatedRecords
