package tabular
Type Members
- class BufferedDataSource extends TabularDataSource
- class ColumnDoesNotExistException extends RuntimeException
- case class Csv (header: Row, dataRows: Iterator[Row]) extends Iterable[Row] with Product with Serializable
Csv encapsulates a tabular data structure, as found in a CSV file or spreadsheet. It allows client code to:
- Read CSV data from a tabular data source (String, File, InputStream, or anything else)
- Change the column structure (add, remove, rename and reorder columns)
- Map data values according to a function (e.g. a lookup, a data-conversion, a default value)
- Apply fail-fast or reporting-only validations to each row
- Export the results
This class may be useful for:
- ETL of externally-generated data
- Pre-formatting data before diffing it during a reconciliation (see Diff)
Csv is designed to be lazy. The datasource is traversed ONCE, when accessed via the Csv.iterator method. All other operations are declarative (restructuring, data transformations, validations, etc). Thus, the time cost of traversing the data and applying these functions is only paid once, when the client code materialises the Csv.
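The laziness described above can be illustrated with plain Scala iterators. This is a minimal sketch, not planet7's implementation: `LazyPipelineSketch`, `Cells`, `source`, `pipeline` and the `linesRead` counter are all hypothetical names introduced here, and only the standard library is used.

```scala
object LazyPipelineSketch {
  // Hypothetical stand-in for planet7's Row; only the cell data is modelled.
  type Cells = Vector[String]

  // Counts how many source lines have actually been read, to make laziness visible.
  var linesRead = 0

  // Wraps the input lines in an Iterator; no line is parsed until someone asks for it.
  def source(lines: Seq[String]): Iterator[Cells] =
    lines.iterator.map { line =>
      linesRead += 1
      line.split(",").map(_.trim).toVector
    }

  // Declarative steps: each call only wraps the iterator, nothing is traversed yet.
  def pipeline(rows: Iterator[Cells]): Iterator[Cells] =
    rows
      .map(cells => cells.updated(1, cells(1).toUpperCase)) // map a data value
      .filter(cells => cells(0).nonEmpty)                   // drop rows without a key
}
```

Building the pipeline leaves `linesRead` at zero; the single traversal happens only when the result is materialised, for example with `toList`.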
TODO - CAS - 07/08/2014:
- A Y-shaped pipeline (spits out two CSVs)
- Aggregator 1 - merge columns using a Row => Row (e.g. (firstName, surname) -> s"$firstName $surname")
- Aggregator 2 - merge rows - provide a predicate for row grouping/inclusion/exclusion
- class CsvPrinter extends AnyRef
- trait DataSinkImplicits extends AnyRef
- trait DataSourceImplicits extends AnyRef
- implicit class AppendableDataSource extends AnyRef
Definition Classes: DataSourceImplicits
- case class DefaultParser (delimiter: Char) extends Parser with Product with Serializable
- trait Differentiator [U] extends AnyRef
- class LineReader extends Iterator[Row]
- trait MappingBuilders extends AnyRef
- case class Conv [T](f: (String) ⇒ T) extends Product with Serializable
Definition Classes: MappingBuilders
Annotations: @implicitNotFound( ... )
- class NoDataInSourceException extends RuntimeException
- trait Parser extends AnyRef
- trait PrettyPrinters extends AnyRef
- class RegexTwoPassParser extends Parser
Simple parser with common defaults:
- Quoted strings are preserved, even with embedded delimiters, e.g.: foo, "one, monkey, two", bar ===> Array("foo", "one, monkey, two", "bar")
- Spaces around elements are ignored (unless quoted), e.g.: foo , bar , " monkey one ", baz ===> Array("foo", "bar", " monkey one ", "baz")
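The two defaults listed above can be reproduced with a small two-step sketch: first split on delimiters that sit outside quotes, then trim and unquote each field. The class name hints at a regex and two passes, but what the real passes do is not documented here, so `QuoteAwareSplitSketch`, `clean` and `parse` are hypothetical and only approximate the described behaviour (escaped quotes inside a field are not handled).

```scala
object QuoteAwareSplitSketch {
  // Pass 1: split on commas that are followed by an even number of double quotes,
  // i.e. commas that are not inside a quoted string.
  private val OutsideQuotes = """,(?=(?:[^"]*"[^"]*")*[^"]*$)"""

  // Pass 2: trim padding, then strip one pair of enclosing quotes if present.
  private def clean(raw: String): String = {
    val t = raw.trim
    if (t.length >= 2 && t.startsWith("\"") && t.endsWith("\"")) t.substring(1, t.length - 1)
    else t
  }

  // The -1 limit keeps trailing empty fields instead of silently dropping them.
  def parse(line: String): Array[String] =
    line.split(OutsideQuotes, -1).map(clean)
}
```

Padding inside the quotes survives because trimming happens before the quotes are removed.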
- case class Row (data: Array[String], validationFailures: Seq[String] = Nil) extends Iterable[String] with Product with Serializable
- case class RowDiffer (header: Row, fieldComps: (String, Comparator[String])*) extends Differentiator[Row] with Product with Serializable
- class RowPrinter extends AnyRef
- class ScannerDataSource extends TabularDataSource
- trait TabularDataSource extends Closeable
A way of providing data to an instance of the Csv class. Most needs should be met by BufferedDataSource, which has a number of implicit adapters declared in the planet7 package.
- class ValidationFailedException extends RuntimeException
Value Members
- def by[K](f: (String) ⇒ K)(implicit arg0: Ordering[K]): Ordering[String]
Used in sort(input, "Age" -> by(_.toInt))
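A function with this signature could plausibly be a thin wrapper around the standard library's `Ordering.by`. The sketch below assumes exactly that; the body is a guess, not the library's source, and `BySketch` is a hypothetical container. Because `scala.math.Ordering` extends `java.util.Comparator`, the result also fits the `(String, Comparator[String])` pairs that `sort` accepts.

```scala
object BySketch {
  // Same shape as the documented signature; the body is an assumption.
  // Strings are compared by the key that f extracts from them.
  def by[K](f: String => K)(implicit ord: Ordering[K]): Ordering[String] =
    Ordering.by(f)
}
```

With `by(_.toInt)`, the strings "9", "10" and "100" sort numerically rather than lexicographically.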
- def combine(datasources: TabularDataSource*): TabularDataSource
Definition Classes: DataSourceImplicits
- def defaultTo(other: String): (String) ⇒ String
- def experimentalFromMemoryMappedFile(f: File): TabularDataSource
Definition Classes: DataSourceImplicits
- def experimentalFromScanner(f: File): TabularDataSource
Definition Classes: DataSourceImplicits
- def experimentalFromWholeFile(f: File): TabularDataSource
Definition Classes: DataSourceImplicits
- def export(csv: Csv, parser: Parser = Parsers.basic): String
- implicit def fromColumnStructure(s: Seq[(String, String)]): Array[String]
Converts Seq("foo" -> "foo", "bar" -> "bar") to Array("foo", "bar"), to make building header rows easier
- def fromFile(f: File, parser: Parser): TabularDataSource
Definition Classes: DataSourceImplicits
- implicit def fromFile(f: File): TabularDataSource
Definition Classes: DataSourceImplicits
- def fromInputStream(is: InputStream, parser: Parser): TabularDataSource
Definition Classes: DataSourceImplicits
- implicit def fromInputStream(is: InputStream): TabularDataSource
Definition Classes: DataSourceImplicits
- implicit def fromIterable(it: Iterable[String], parser: Parser): TabularDataSource
Definition Classes: DataSourceImplicits
- implicit def fromIterable(it: Iterable[String]): TabularDataSource
Definition Classes: DataSourceImplicits
- def fromString(s: String, parser: Parser): TabularDataSource
Definition Classes: DataSourceImplicits
- implicit def fromString(s: String): TabularDataSource
Definition Classes: DataSourceImplicits
- def given[T1, T2, T3](colMod1: String, colMod2: String, colMod3: String): Nothing
Definition Classes: MappingBuilders
- def given[T1, T2](col1: String, col2: String)(pf: PartialFunction[(T1, T2), String])(implicit t1Conv: Conv[T1], t2Conv: Conv[T2]): (String) ⇒ (Row) ⇒ (Row) ⇒ Row
Definition Classes: MappingBuilders
- def given[T1](col1: String): Nothing
Definition Classes: MappingBuilders
- def ignore(columnNames: String*): (Array[String]) ⇒ Array[String]
- def showDiffs(left: Row, right: Row): String
Definition Classes: PrettyPrinters
- def sort(csv: Csv, differ: RowDiffer): Csv
- def sort(csv: Csv, fieldComps: (String, Comparator[String])*): Csv
- implicit val toBD: Conv[BigDecimal]
Definition Classes: MappingBuilders
- implicit def toColumnStructure(s: Seq[String]): Seq[(String, String)]
Converts Seq("foo", "bar") to Seq("foo" -> "foo", "bar" -> "bar") to make operations on column names easier
- implicit def toColumnStructure(s: String): (String, String)
Converts Csv(data).columnStructure("Name") to Csv(data).columnStructure("Name" -> "Name")
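The effect of these two conversions can be sketched with ordinary implicit defs. The example assumes a pair reads as (source column -> output column), which the documentation implies but does not state. `ColumnStructureSketch` and `describe` are hypothetical, and the Seq variant is named `toColumnStructures` here only to keep the two conversions visually distinct.

```scala
import scala.language.implicitConversions

object ColumnStructureSketch {
  // A bare column name stands for "keep this column under the same name".
  implicit def toColumnStructure(s: String): (String, String) = s -> s

  // The same idea applied to a whole list of names.
  implicit def toColumnStructures(names: Seq[String]): Seq[(String, String)] =
    names.map(n => n -> n)

  // Hypothetical consumer: accepts pairs, so bare names are converted at the call site.
  def describe(columns: (String, String)*): Seq[String] =
    columns.map { case (from, to) => if (from == to) from else s"$from -> $to" }
}
```

With the conversions imported, a call such as `describe("Name", "Surname" -> "Last Name")` mixes bare names and explicit renames in one argument list.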
- implicit val toInt: Conv[Int]
Definition Classes: MappingBuilders
- implicit def toRowTransformer(mapping: (String, (String) ⇒ (Row) ⇒ (Row) ⇒ Row)): (Row) ⇒ (Row) ⇒ Row
Definition Classes: MappingBuilders
- implicit def toStringCompare(s: String): (String, Comparator[String])
Converts sort(input, "Surname") into sort(input, "Surname" -> Comparator[String])
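Which comparator the conversion supplies is not stated. The sketch below assumes natural String ordering and uses `Ordering.String` from the standard library, which is already a `java.util.Comparator[String]`; `StringCompareSketch` is a hypothetical container.

```scala
import java.util.Comparator
import scala.language.implicitConversions

object StringCompareSketch {
  // Pairs a column name with a comparator; natural ordering is an assumption here.
  implicit def toStringCompare(column: String): (String, Comparator[String]) =
    column -> Ordering.String
}
```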
- implicit def toValidation(columnAssertion: (String, (String) ⇒ Boolean)): (Row) ⇒ (Row) ⇒ Row
Used in Csv.assertAndAbort() and Csv.assertAndReport(). Converts a (column name, predicate) pair into a row validation, to build simple validation checks
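One way to read the `(Row) ⇒ (Row) ⇒ Row` result is "given the header, then a data row, return the row with any failure recorded". The sketch below follows that reading in the reporting-only style. It is an assumption, not the library's code: `ValidationSketch` and `SketchRow` are hypothetical simplifications of `Row`, and the failure message format is invented for illustration.

```scala
object ValidationSketch {
  // Simplified stand-in for planet7's Row (the real class also extends Iterable[String]).
  final case class SketchRow(data: Vector[String], validationFailures: Seq[String] = Nil)

  // Assumed reading of Row => Row => Row: the first Row is the header, the second a data row.
  def toValidation(columnAssertion: (String, String => Boolean)): SketchRow => SketchRow => SketchRow = {
    val (column, predicate) = columnAssertion
    header => {
      // Resolve the column position once per header, not once per row.
      val index = header.data.indexOf(column)
      require(index >= 0, s"Column '$column' does not exist")
      row =>
        if (predicate(row.data(index))) row
        else row.copy(validationFailures =
          row.validationFailures :+ s"$column: '${row.data(index)}' failed validation")
    }
  }
}
```

A fail-fast variant would throw at the point where this sketch appends to `validationFailures`.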
- def top5(csv: Csv): String
Definition Classes: PrettyPrinters
- def write(csv: Csv, path: String, parser: Parser = Parsers.basic): Unit
Definition Classes: DataSinkImplicits
- object Csv extends Serializable
- object EmptyRow extends Row
- object FieldDiffer extends Differentiator[(String, String)] with Product with Serializable
- object X
Definition Classes: MappingBuilders
- object NaiveRowDiffer extends Differentiator[Row]
- object NaiveRowOrdering extends Ordering[Row]
- object Parsers
- object RegexTwoPassParser
- object Row extends Serializable
- object StringDiffer extends Differentiator[String] with Product with Serializable
- object Validations