Package

com.coxautodata.waimak.rdbm

ingestion

package ingestion

Type Members

  1. case class IncorrectUserPKException(userPKs: Seq[String], dbPKs: Seq[String]) extends Exception with Product with Serializable

  2. case class PostgresConnectionDetails(server: String, port: Int, databaseName: String, user: String, password: String, sslFactory: Option[String]) extends RDBMConnectionDetails with Product with Serializable

  3. class PostgresExtractor extends RDBMExtractor

    Created by Vicky Avison on 27/04/18.

  4. case class PostgresMetadata(schemaName: String, tableName: String, pkCols: String) extends Product with Serializable

  5. trait RDBMConnectionDetails extends AnyRef

    Created by Vicky Avison on 30/04/18.

  6. case class RDBMExtractionTableConfig(tableName: String, pkCols: Option[Seq[String]] = None, lastUpdatedColumn: Option[String] = None, maxRowsPerPartition: Option[Int] = None, forceRetainStorageHistory: Option[Boolean] = None) extends Product with Serializable

    Table configuration used for RDBM extraction

    tableName

    The name of the table

    pkCols

    Optionally, the primary key columns for this table (not needed if the RDBMExtractor implementation can retrieve this information itself)

    lastUpdatedColumn

    Optionally, the last updated column for this table (not needed if the RDBMExtractor implementation can retrieve this information itself)

    maxRowsPerPartition

    Optionally, the maximum number of rows to be read per Dataset partition for this table. This number is used to generate the predicates passed to org.apache.spark.sql.SparkSession.read.jdbc. If it is not set, the DataFrame will have only one partition, which could cause memory issues when extracting large tables. Be careful not to create too many partitions in parallel on a large cluster, otherwise Spark might crash your external database systems. You can also control the maximum number of JDBC connections opened by limiting the number of executors for your application.

    forceRetainStorageHistory

    Optionally specify whether to retain history for this table in the storage layer. Setting this to anything other than None will override the default behaviour which is:

    • if there is a lastUpdated column (either specified here or found by the RDBMExtractor) retain all history for this table
    • if there is no lastUpdated column, don't retain history for this table (history is removed when the table is compacted). This default was chosen because, without a lastUpdatedColumn, the table is extracted in full every time extraction is performed, causing the size of the data in storage to grow uncontrollably
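
    The parameter rules above can be sketched as follows. This is a minimal illustration, not Waimak's implementation: the case class is a local stand-in that mirrors the documented signature so the sketch runs without Waimak on the classpath, and the TableConfigSketch object with its retainHistory and partitionCount helpers is hypothetical. In particular, the ceiling division in partitionCount is an assumption about how maxRowsPerPartition might translate into a partition count.

```scala
// Local stand-in mirroring the documented signature (assumption: used here only
// so the example is self-contained; the real class lives in
// com.coxautodata.waimak.rdbm.ingestion).
case class RDBMExtractionTableConfig(tableName: String,
                                     pkCols: Option[Seq[String]] = None,
                                     lastUpdatedColumn: Option[String] = None,
                                     maxRowsPerPartition: Option[Int] = None,
                                     forceRetainStorageHistory: Option[Boolean] = None)

object TableConfigSketch {

  // Hypothetical helper. An explicit forceRetainStorageHistory always wins;
  // otherwise history is retained only when a last updated column is known,
  // either from the config or discovered by the extractor.
  def retainHistory(config: RDBMExtractionTableConfig,
                    discoveredLastUpdated: Option[String] = None): Boolean =
    config.forceRetainStorageHistory.getOrElse(
      config.lastUpdatedColumn.orElse(discoveredLastUpdated).isDefined
    )

  // Hypothetical helper. With no maxRowsPerPartition the read uses a single
  // partition; otherwise the row count is split by ceiling division.
  def partitionCount(config: RDBMExtractionTableConfig, rowCount: Long): Long =
    config.maxRowsPerPartition match {
      case Some(max) if max > 0 => math.max(1L, (rowCount + max - 1) / max)
      case _                    => 1L
    }
}
```

    For example, RDBMExtractionTableConfig("orders", lastUpdatedColumn = Some("updated_at"), maxRowsPerPartition = Some(1000000)) retains history under the default rule, while RDBMExtractionTableConfig("lookup") does not unless forceRetainStorageHistory = Some(true) is set. The table and column names here are invented for illustration.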
  7. trait RDBMExtractor extends AnyRef

    Waimak RDBM connection mechanism

  8. abstract class SQLServerBaseExtractor extends RDBMExtractor

    A mechanism for generating Waimak actions to extract data from a SQL Server instance

  9. case class SQLServerConnectionDetails(server: String, port: Int, databaseName: String, user: String, password: String) extends RDBMConnectionDetails with Product with Serializable

    Connection details for a SQL Server instance

    server

    the SQL Server instance

    port

    the port

    databaseName

    the database name

    user

    the user name

    password

    the password
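
    As a rough illustration of how these fields fit together, the sketch below builds a JDBC URL from them. The case class is a local stand-in mirroring the documented signature (the real one also extends RDBMConnectionDetails), and ConnectionSketch.jdbcUrl is a hypothetical helper. The URL follows the general shape accepted by the Microsoft JDBC driver; how Waimak itself constructs its URL is not shown in this page, so treat this as an assumption.

```scala
// Local stand-in mirroring the documented signature (assumption: used only so
// the example is self-contained).
case class SQLServerConnectionDetails(server: String,
                                      port: Int,
                                      databaseName: String,
                                      user: String,
                                      password: String)

object ConnectionSketch {

  // Hypothetical helper: server and port form the authority part of the URL,
  // and the database is passed as the databaseName connection property.
  // Credentials are deliberately left out of the URL.
  def jdbcUrl(details: SQLServerConnectionDetails): String =
    s"jdbc:sqlserver://${details.server}:${details.port};databaseName=${details.databaseName}"
}
```

    With the invented values SQLServerConnectionDetails("db.example.com", 1433, "sales", "reader", "secret"), the helper yields jdbc:sqlserver://db.example.com:1433;databaseName=sales. In practice the password would come from a secret store or environment variable rather than a literal.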

  10. class SQLServerExtractor extends SQLServerBaseExtractor with Logging

    A mechanism for generating Waimak actions to extract data from a SQL Server instance where the database version does not allow the STRING_AGG function to be used. Primarily this relates to pre SQL 2016 databases or databases where STRING_AGG functionality is switched off.

  11. case class SQLServerTableMetadata(schemaName: String, tableName: String, primaryKeys: String) extends Product with Serializable

  12. class SQLServerTemporalExtractor extends SQLServerBaseExtractor with Logging

    A mechanism for generating Waimak actions to extract data from a SQL Server instance containing temporal tables. Tables can be a mixture of temporal and non-temporal; both will be handled appropriately.

  13. case class SQLServerTemporalTableMetadata(schemaName: String, tableName: String, historyTableSchema: Option[String] = None, historyTableName: Option[String] = None, startColName: Option[String] = None, endColName: Option[String] = None, primaryKeys: String) extends Product with Serializable

  14. class SQLServerViewExtractor extends SQLServerBaseExtractor

    Created by Vicky Avison on 12/04/18.

  15. case class TableExtractionMetadata(schemaName: String, tableName: String, primaryKeys: Seq[String], lastUpdatedColumn: Option[String] = None) extends Product with Serializable

Value Members

  1. object PKsNotFoundOrProvidedException extends Exception

  2. object RDBMIngestionActions

    Created by Vicky Avison on 08/05/18.

  3. object RDBMIngestionUtils

    Created by Vicky Avison on 04/04/18.
