Class

com.jcdecaux.setl.storage.connector

CSVConnector

Related Doc: package connector

class CSVConnector extends FileConnector

Connector that loads CSV files and returns the result as a DataFrame.

CSV-specific options such as inferSchema, delimiter and header control how CSV files are parsed. They can be passed through the constructors listed below.

Annotations
@Evolving()
Linear Supertypes
FileConnector, HasSparkSession, Connector, Logging, AnyRef, Any

Instance Constructors

  1. new CSVConnector(path: String, inferSchema: String, delimiter: String, header: String, saveMode: SaveMode)

  2. new CSVConnector(config: Conf)

  3. new CSVConnector(config: Config)

  4. new CSVConnector(options: Map[String, String])

  5. new CSVConnector(options: FileConnectorConf)

  6. new CSVConnector(spark: SparkSession, path: String, inferSchema: String, delimiter: String, header: String, saveMode: SaveMode)

    Annotations
    @deprecated
    Deprecated (since version 0.3.4): use the constructor without a SparkSession parameter

  7. new CSVConnector(spark: SparkSession, conf: Conf)

    Annotations
    @deprecated
    Deprecated (since version 0.3.4): use the constructor without a SparkSession parameter

  8. new CSVConnector(spark: SparkSession, config: Config)

    Annotations
    @deprecated
    Deprecated (since version 0.3.4): use the constructor without a SparkSession parameter

  9. new CSVConnector(spark: SparkSession, options: Map[String, String])

    Annotations
    @deprecated
    Deprecated (since version 0.3.4): use the constructor without a SparkSession parameter

  10. new CSVConnector(spark: SparkSession, options: FileConnectorConf)

    Annotations
    @deprecated
    Deprecated (since version 0.3.4): use the constructor without a SparkSession parameter

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. lazy val basePath: Path

    Get the basePath of the current path. If path is a file path, its basePath is its parent's path; otherwise it is the current path itself.

    Definition Classes
    FileConnector
  6. def canWrite: Boolean

    Definition Classes
    FileConnector
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. def delete(): Unit

    Delete the current file or directory.

    Definition Classes
    FileConnector
  9. def dropUserDefinedSuffix: Boolean

    Get the boolean value of dropUserDefinedSuffix.

    returns

    true if the user-defined suffix column will be dropped, false otherwise

    Definition Classes
    FileConnector
  10. def dropUserDefinedSuffix(boo: Boolean): CSVConnector.this.type

    Set to true to drop the column containing the user-defined suffix (default name: _user_defined_suffix).

    boo

    true to drop, false to keep

    Definition Classes
    FileConnector
  11. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  12. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  13. def filesToLoad(detailed: Boolean): Array[Path]

    List the files to be loaded.

    If the connector has a non-empty filename pattern, return the file paths that match the pattern.

    If no filename pattern is set and the connector's absolute path is a directory, return only the directory path when detailed is false; otherwise return the paths of the files in the directory.

    detailed

    true to return a list of file paths if the current absolute path is a directory

    Definition Classes
    FileConnector
  14. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  15. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  16. def getFileSystem: FileSystem

    Get the current filesystem based on the path URI.

    Definition Classes
    FileConnector
  17. def getSize: Long

    Get the total size of the files.

    returns

    size in bytes

    Definition Classes
    FileConnector
  18. def getUserDefinedSuffixKey: String

    Get the name of the user-defined suffix column.

    Definition Classes
    FileConnector
  19. def getWriteCount: Long

    Definition Classes
    FileConnector
  20. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  21. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  22. def listFiles(): Array[String]

    List all the file paths (as strings) under the connector's current path.

    Definition Classes
    FileConnector
  23. def listFilesToLoad(detailed: Boolean = true): Array[String]

    List all the file paths (as strings) to be loaded.

    If the connector has a non-empty filename pattern, a list of the file paths that match the pattern is always returned.

    If no filename pattern is set and the connector's absolute path is a directory, return only the directory path when detailed is false; otherwise return the paths of the files in the directory.

    detailed

    true to list all file paths when the absolute path points to a directory; false to return only the directory path

    Definition Classes
    FileConnector
  24. def listPaths(): Array[Path]

    List all the file paths under the connector's current path.

    Definition Classes
    FileConnector
  25. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  26. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  27. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  28. final def notify(): Unit

    Definition Classes
    AnyRef
  29. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  30. val options: FileConnectorConf

    Definition Classes
    CSVConnector → FileConnector
  31. def partitionBy(columns: String*): CSVConnector.this.type

    Definition Classes
    FileConnector
  32. def read(): DataFrame

    Read a DataFrame from a file with the path defined during the instantiation.

    Definition Classes
    FileConnector → Connector
    Annotations
    @throws( s"$absolutePath doesn't exist" )
  33. lazy val reader: DataFrameReader

    DataFrame reader for the current path of the connector.

    Definition Classes
    FileConnector → Connector
  34. def resetSuffix(force: Boolean = false): CSVConnector.this.type

    Reset suffix to None.

    force

    set to true to ignore the validity check of suffix value

    Definition Classes
    FileConnector
  35. val schema: Option[StructType]

    Definition Classes
    FileConnector
  36. def setSuffix(suffix: Option[String]): CSVConnector.this.type

    The current version of FileConnector doesn't support mixing suffix and non-suffix writes when the DataFrame is partitioned.

    For a partitioned table, this method detects whether the user tries to use both suffix and non-suffix writes.

    suffix

    an option of suffix in string format

    Definition Classes
    FileConnector
  37. def setUserDefinedSuffixKey(key: String): CSVConnector.this.type

    Set the name of the user-defined suffix column (default: _user_defined_suffix).

    key

    name of the new key

    Definition Classes
    FileConnector
  38. val spark: SparkSession

    Definition Classes
    HasSparkSession
  39. val storage: Storage

    Definition Classes
    CSVConnector → Connector
  40. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  41. def toString(): String

    Definition Classes
    AnyRef → Any
  42. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  43. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  44. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  45. def write(t: DataFrame): Unit

    Definition Classes
    FileConnector → Connector
  46. def write(df: DataFrame, suffix: Option[String]): Unit

    Write a DataFrame into a file.

    df

    DataFrame to be written

    suffix

    optional String; when defined, df is written into a sub-directory of the defined path

    Definition Classes
    FileConnector → Connector
  47. def writeToPath(df: DataFrame, filepath: String): Unit

    Write a DataFrame into the given path with the given save mode.

    Definition Classes
    FileConnector
  48. val writer: (DataFrame) ⇒ DataFrameWriter[Row]

    Initialize a DataFrame writer. A new writer is initialized only if the hash code of the input DataFrame differs from that of the last written DataFrame.

    Definition Classes
    FileConnector → Connector
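
The path rules described for basePath, filesToLoad and listFilesToLoad can be illustrated without Spark. The following is a minimal sketch, not the library's implementation: it uses java.io.File instead of the Hadoop FileSystem that FileConnector works with, and the names PathRulesSketch and basePathOf, the explicit pattern argument, the regular-expression matching, and the handling of a plain file path are all assumptions made for illustration.

```scala
import java.io.File
import java.nio.file.Files

// Hypothetical standalone model of the documented path rules (not SETL code).
object PathRulesSketch {

  // basePath rule: a file path resolves to its parent, anything else to itself.
  def basePathOf(path: File): File =
    if (path.isFile) path.getAbsoluteFile.getParentFile else path.getAbsoluteFile

  // filesToLoad rule. `pattern` stands in for the connector's filename pattern
  // and is treated as a regular expression on the file name (an assumption).
  def filesToLoad(path: File, pattern: Option[String], detailed: Boolean): Array[File] = {
    val children: Array[File] =
      Option(basePathOf(path).listFiles())
        .getOrElse(Array.empty[File])
        .filter(_.isFile)
        .sortBy(_.getName)

    pattern match {
      // With a pattern, matching files are always listed individually.
      case Some(regex) => children.filter(_.getName.matches(regex))
      // Without a pattern, a directory is returned as-is unless detailed is true.
      case None if path.isDirectory && !detailed => Array(path)
      case None if path.isDirectory => children
      // Plain file path: returned unchanged (assumed, not documented).
      case None => Array(path)
    }
  }

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("csv_sketch").toFile
    Seq("a.csv", "b.csv", "notes.txt").foreach(n => new File(dir, n).createNewFile())

    println(filesToLoad(dir, None, detailed = false).mkString(", "))
    println(filesToLoad(dir, None, detailed = true).map(_.getName).mkString(", "))
    println(filesToLoad(dir, Some(""".*\.csv"""), detailed = false).map(_.getName).mkString(", "))
  }
}
```

In this model, detailed only matters when no pattern is set and the path is a directory, which mirrors the wording of the two method descriptions.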

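The restriction described for setSuffix and resetSuffix (no mix of suffix and non-suffix writes on a partitioned table) can be modelled as a small state check. This is a sketch under stated assumptions rather than SETL's actual logic: the class SuffixState, its partitioned flag, and the choice to signal a detected mix with an IllegalArgumentException are hypothetical.

```scala
// Hypothetical model of the suffix validity check (not SETL code).
object SuffixCheckSketch {

  final class SuffixState(partitioned: Boolean) {
    // None: no mode chosen yet; Some(true): suffix writes; Some(false): non-suffix writes.
    private var suffixMode: Option[Boolean] = None
    private var current: Option[String] = None

    def suffix: Option[String] = current

    // Rejects a switch between suffix and non-suffix mode when partitioned.
    def setSuffix(suffix: Option[String]): this.type = {
      val usesSuffix = suffix.isDefined
      if (partitioned && suffixMode.exists(_ != usesSuffix))
        throw new IllegalArgumentException(
          "Cannot mix suffix and non-suffix writes on a partitioned table")
      suffixMode = Some(usesSuffix)
      current = suffix
      this
    }

    // force = true skips the validity check and clears the remembered mode.
    def resetSuffix(force: Boolean = false): this.type =
      if (force) {
        suffixMode = None
        current = None
        this
      } else setSuffix(None)
  }

  def main(args: Array[String]): Unit = {
    val state = new SuffixState(partitioned = true)
    state.setSuffix(Some("2020"))
    println(scala.util.Try(state.setSuffix(None)).isFailure)
    state.resetSuffix(force = true)
    println(state.suffix)
  }
}
```

Returning this.type keeps the calls chainable, matching the CSVConnector.this.type return type shown in the signatures above.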