io.smartdatalake.workflow.dataobject

ExcelOptions

case class ExcelOptions(sheetName: Option[String] = None, numLinesToSkip: Option[Int] = None, startColumn: Option[String] = None, endColumn: Option[String] = None, rowLimit: Option[Int] = None, useHeader: Boolean = true, treatEmptyValuesAsNulls: Option[Boolean] = Some(true), inferSchema: Option[Boolean] = Some(true), timestampFormat: Option[String] = Some("dd-MM-yyyy HH:mm:ss"), dateFormat: Option[String] = None, maxRowsInMemory: Option[Int] = None, excerptSize: Option[Int] = None) extends Product with Serializable

Options passed to org.apache.spark.sql.DataFrameReader and org.apache.spark.sql.DataFrameWriter for reading and writing Microsoft Excel files. Excel support is provided by the spark-excel project (see link below).

sheetName

Optional name of the Excel sheet to read from or write to.

numLinesToSkip

Optional number of rows in the Excel spreadsheet to skip before any data is read. This option must not be set for writing.

startColumn

Optional first column in the specified Excel sheet to read from (as string, e.g. B). This option must not be set for writing.

endColumn

Optional last column in the specified Excel Sheet to read from (as string, e.g. F).

rowLimit

Optional limit on the number of rows returned on read. This is applied after numLinesToSkip.

useHeader

If true, the first row of the Excel sheet specifies the column names (default: true).

treatEmptyValuesAsNulls

If true, empty cells are parsed as null values (default: true).

inferSchema

If true, the schema of the Excel sheet is inferred automatically (default: true).

timestampFormat

A format string specifying the format to use when writing timestamps (default: dd-MM-yyyy HH:mm:ss).

dateFormat

A format string specifying the format to use when writing dates.

maxRowsInMemory

The number of rows that are kept in memory. If set, a streaming reader is used, which can help with large files.

excerptSize

Sample size for schema inference.
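
The parameters above can be combined as in the following minimal sketch. To stay self-contained it declares a local stand-in that copies the documented constructor signature; in a real project the class would come from io.smartdatalake.workflow.dataobject. The sheet name "Sales", the chosen values, and the helper `window` are hypothetical. `window` only illustrates the documented ordering of numLinesToSkip and rowLimit on a plain Seq and is not how the library reads rows.

```scala
// Local stand-in mirroring the documented constructor signature (assumption:
// used here only so the example runs without the library on the classpath).
final case class ExcelOptions(
  sheetName: Option[String] = None,
  numLinesToSkip: Option[Int] = None,
  startColumn: Option[String] = None,
  endColumn: Option[String] = None,
  rowLimit: Option[Int] = None,
  useHeader: Boolean = true,
  treatEmptyValuesAsNulls: Option[Boolean] = Some(true),
  inferSchema: Option[Boolean] = Some(true),
  timestampFormat: Option[String] = Some("dd-MM-yyyy HH:mm:ss"),
  dateFormat: Option[String] = None,
  maxRowsInMemory: Option[Int] = None,
  excerptSize: Option[Int] = None
)

// Read options: columns B to F of a hypothetical sheet, skipping two leading
// rows and returning at most 100 rows. All other parameters keep their defaults.
val readOptions = ExcelOptions(
  sheetName = Some("Sales"),
  numLinesToSkip = Some(2),
  startColumn = Some("B"),
  endColumn = Some("F"),
  rowLimit = Some(100)
)

// Illustration only: skip first, then limit, applied to an in-memory Seq.
def window[A](rows: Seq[A], opts: ExcelOptions): Seq[A] = {
  val skipped = rows.drop(opts.numLinesToSkip.getOrElse(0))
  opts.rowLimit.fold(skipped)(n => skipped.take(n))
}
```

Named arguments keep the call readable because most parameters are optional. For writing, numLinesToSkip and startColumn would be left at None, as the parameter descriptions require.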

See also

https://github.com/crealytics/spark-excel

Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any

Instance Constructors

  1. new ExcelOptions(sheetName: Option[String] = None, numLinesToSkip: Option[Int] = None, startColumn: Option[String] = None, endColumn: Option[String] = None, rowLimit: Option[Int] = None, useHeader: Boolean = true, treatEmptyValuesAsNulls: Option[Boolean] = Some(true), inferSchema: Option[Boolean] = Some(true), timestampFormat: Option[String] = Some("dd-MM-yyyy HH:mm:ss"), dateFormat: Option[String] = None, maxRowsInMemory: Option[Int] = None, excerptSize: Option[Int] = None)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. val dateFormat: Option[String]

    A format string specifying the format to use when writing dates.

  7. val endColumn: Option[String]

    Optional last column in the specified Excel Sheet to read from (as string, e.g. F).

  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. val excerptSize: Option[Int]

    Sample size for schema inference.

  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def getDataAddress: Option[String]

  13. val inferSchema: Option[Boolean]

    Infer the schema of the excel sheet automatically (default: true).

  14. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  15. val maxRowsInMemory: Option[Int]

    The number of rows that are stored in memory. If set, a streaming reader is used which can help with big files.

  16. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  17. final def notify(): Unit

    Definition Classes
    AnyRef
  18. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  19. val numLinesToSkip: Option[Int]

    Optional number of rows in the excel spreadsheet to skip before any data is read. This option must not be set for writing.

  20. val rowLimit: Option[Int]

    Optional limit of the number of rows being returned on read. This is applied after numLinesToSkip.

  21. val sheetName: Option[String]

    Optional name of the Excel Sheet to read from/write to.

  22. val startColumn: Option[String]

    Optional first column in the specified Excel Sheet to read from (as string, e.g. B). This option must not be set for writing.

  23. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  24. val timestampFormat: Option[String]

    A format string specifying the format to use when writing timestamps (default: dd-MM-yyyy HH:mm:ss).

  25. def toMap(schema: Option[StructType]): Map[String, Option[Any]]

  26. val treatEmptyValuesAsNulls: Option[Boolean]

    Empty cells are parsed as null values (default: true).

  27. val useHeader: Boolean

    If true, the first row of the excel sheet specifies the column names (default: true).

  28. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
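
Among the members listed above, toMap is undocumented on this page; its signature only says that it returns a Map[String, Option[Any]]. The sketch below shows one plausible way such a map could be reduced to the Map[String, String] that org.apache.spark.sql.DataFrameReader.options accepts, by dropping entries whose value is None. This is an assumption about usage, not the library's implementation, and the keys, values, and the helper `definedOptions` are hypothetical.

```scala
// Hypothetical option map in the shape returned by toMap. The keys are
// placeholders chosen for illustration, not the names the library emits.
val raw: Map[String, Option[Any]] = Map(
  "sheetName" -> Some("Sales"),
  "rowLimit" -> None,
  "useHeader" -> Some(true)
)

// Keep only the entries that carry a value and render each value as a string.
def definedOptions(m: Map[String, Option[Any]]): Map[String, String] =
  m.collect { case (key, Some(value)) => key -> value.toString }

val flat: Map[String, String] = definedOptions(raw)
```

Dropping the None entries means that unset options fall back to whatever defaults the underlying reader or writer applies.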
