Package

com.coxautodata.waimak.dataflow.spark.dataquality.deequ

prefabchecks

package prefabchecks

Type Members

  1. class CompletenessCheck extends DeequPrefabCheck[CompletenessCheckConfig]

    Checks the completeness of columns of a dataset against warning and critical thresholds.

  2. case class CompletenessCheckConfig(columns: List[String], warningThreshold: Option[Double] = None, criticalThreshold: Option[Double] = None) extends Product with Serializable

  3. class GenericSQLCheck extends DeequPrefabCheck[GenericSQLCheckConfig]

    Allows generic SQL checks to be configured, i.e. given a condition such as my_column > 5, it will assert that the condition holds for all rows in the Dataset and alert otherwise.

  4. case class GenericSQLCheckConfig(warningChecks: Seq[String] = Nil, criticalChecks: Seq[String] = Nil) extends Product with Serializable

  5. class RecentTimestampCheck extends DeequPrefabCheck[RecentTimestampCheckConfig]

    Checks that the most recent value in a timestamp column is within a configured number of hours of now (default is 6 hours). The purpose of this is to flag when the data is unexpectedly stale.

  6. case class RecentTimestampCheckConfig(column: String, hoursToLookBack: Int = 6, alertLevel: String = "warning", nowOverride: Option[String] = None) extends Product with Serializable

  7. class UniquenessCheck extends DeequPrefabCheck[UniquenessCheckConfig]

    Checks the uniqueness of a combination of columns of a dataset against warning and critical thresholds. If thresholds are not configured, by default it will generate a warning alert if the combination of columns is not fully unique. N.B. this does not seem to work quite as expected when nulls form part of the column combination.

  8. case class UniquenessCheckConfig(columns: List[String], warningThreshold: Option[Double] = Some(1.0), criticalThreshold: Option[Double] = None) extends Product with Serializable
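
Example

The following is a minimal sketch of how the configuration classes listed above and their defaults might be read. It does not use Waimak or Deequ: the four case classes are local stand-ins that only mirror the signatures shown on this page, and `PrefabCheckSketch`, `alertFor` and `isRecent` are hypothetical helpers, not part of the library. The threshold rule (an alert fires when the measured ratio falls below the threshold, with critical taking precedence over warning) is an assumption inferred from the descriptions above and may differ from the real implementation.

```scala
import java.time.LocalDateTime

// Local stand-ins mirroring the signatures listed on this page, so the
// sketch compiles without Waimak on the classpath.
final case class CompletenessCheckConfig(
    columns: List[String],
    warningThreshold: Option[Double] = None,
    criticalThreshold: Option[Double] = None)

final case class UniquenessCheckConfig(
    columns: List[String],
    warningThreshold: Option[Double] = Some(1.0),
    criticalThreshold: Option[Double] = None)

final case class RecentTimestampCheckConfig(
    column: String,
    hoursToLookBack: Int = 6,
    alertLevel: String = "warning",
    nowOverride: Option[String] = None)

final case class GenericSQLCheckConfig(
    warningChecks: Seq[String] = Nil,
    criticalChecks: Seq[String] = Nil)

object PrefabCheckSketch {

  // Hypothetical helper: classify a ratio in [0, 1] against optional
  // thresholds. ASSUMPTION: an alert fires when the ratio is strictly
  // below the threshold, and critical wins over warning.
  def alertFor(
      ratio: Double,
      warning: Option[Double],
      critical: Option[Double]): Option[String] =
    if (critical.exists(ratio < _)) Some("critical")
    else if (warning.exists(ratio < _)) Some("warning")
    else None

  // Hypothetical helper: true when the newest timestamp is no older than
  // `hoursToLookBack` hours before `now`.
  def isRecent(
      latest: LocalDateTime,
      now: LocalDateTime,
      hoursToLookBack: Int): Boolean =
    !latest.isBefore(now.minusHours(hoursToLookBack.toLong))

  // Example configurations; column names and conditions are illustrative.
  val completeness: CompletenessCheckConfig =
    CompletenessCheckConfig(List("id", "email"), Some(0.9), Some(0.6))
  val uniqueness: UniquenessCheckConfig =
    UniquenessCheckConfig(List("id", "region")) // default warning at 1.0
  val recent: RecentTimestampCheckConfig =
    RecentTimestampCheckConfig("updated_at") // default 6 hours, "warning"
  val sql: GenericSQLCheckConfig =
    GenericSQLCheckConfig(warningChecks = Seq("my_column > 5"))
}
```

Under these assumptions, a uniqueness ratio of 0.95 with the default `UniquenessCheckConfig` would be classified as a warning by `alertFor`, while a fully unique combination (ratio 1.0) would raise nothing. The `nowOverride` field is left uninterpreted here because its expected string format is not stated on this page.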
