package libs
Type Members
- type Aggregate = Dataset[Row] (Definition Classes: Component)
- class CDC extends AnyRef
- class CLIConf extends ScallopConf
- trait Component extends AnyRef
- case class UsesDataset(id: String, version: Int = -1) extends Annotation with StaticAnnotation with Product with Serializable (Definition Classes: Component)
- case class UsesRuleset(id: String) extends Annotation with StaticAnnotation with Product with Serializable (Definition Classes: Component)
- case class Visual(id: String = "ID", label: String = "Label", x: Long = 0, y: Long = 0, phase: Int = 0, mode: String = "batch", interimMode: String = "full", detailedStats: Boolean = false) extends Annotation with StaticAnnotation with Product with Serializable (Definition Classes: Component)
- trait ConfigBase extends AnyRef
- abstract class ConfigurationFactory[C <: ConfigBase] extends AnyRef
- type CreateData = Dataset[Row] (Definition Classes: Component)
- type DataFrame1 = Dataset[Row] (Definition Classes: Component)
- type DataFrame10 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame11 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame12 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame13 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame14 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame15 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame16 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame17 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame18 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame19 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame2 = (DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame20 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame21 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame22 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame3 = (DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame4 = (DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame5 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame6 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame7 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame8 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type DataFrame9 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
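These arity-N aliases let a multi-output component declare its return type concisely. A minimal sketch of a two-output component, assuming the aliases are in scope through the libs package; the dataframe and column names are purely illustrative:

    import org.apache.spark.sql.DataFrame

    // DataFrame2 is the alias for (DataFrame, DataFrame) listed above.
    def splitByRegion(in: DataFrame): DataFrame2 = {
      val domestic = in.filter(in("country") === "US")   // hypothetical column
      val overseas = in.filter(in("country") =!= "US")
      (domestic, overseas)
    }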
- trait DataHelpers extends LazyLogging
  Helper utilities for reading/writing data from/to different data sources.
- type DataQualityTest = Dataset[Row] (Definition Classes: Component)
- type DatabaseInput = Dataset[Row] (Definition Classes: Component)
- type Deduplicate = Dataset[Row] (Definition Classes: Component)
- case class Description(comment: String) extends Annotation with StaticAnnotation with Product with Serializable
- implicit class ExtendedDataFrameGlobal extends ExtendedDataFrame
- implicit class ExtendedStreamingTargetGlobal extends ExtendedStreamingTarget
- trait FFAST extends Positional
- case class FFCompoundSchemaRow(compound: FFCompoundType, rows: Seq[FFSchemaRow]) extends FFSchemaRow with Product with Serializable
- sealed trait FFCompoundType extends FFAST
- case class FFConditionalSchemaRow(condition: String, schemaRow: FFSchemaRow) extends FFSchemaRow with Product with Serializable
- sealed trait FFDataFormat extends FFAST
- case class FFDateFormat(name: FFTypeName, format: Option[String], miscProperties: Map[String, Any] = Map()) extends FFDataFormat with Product with Serializable
- case class FFDateTimeFormat(name: FFTypeName, format: Option[String], miscProperties: Map[String, Any] = Map()) extends FFDataFormat with Product with Serializable
- sealed trait FFDefaultVal extends FFAST
- case class FFDoubleDefaultVal(value: Double) extends FFDefaultVal with Product with Serializable
- case class FFExpressionDefaultVal(value: CustomExpression) extends FFDefaultVal with Product with Serializable
- case class FFIncludeFileRow(filePath: String) extends FFSchemaRow with Product with Serializable
- case class FFIntDefaultVal(value: Int) extends FFDefaultVal with Product with Serializable
- case class FFNoDefaultVal() extends FFDefaultVal with Product with Serializable
- case class FFNullDefaultVal(value: Option[Any] = None) extends FFDefaultVal with Product with Serializable
- case class FFNumberArrayFormat(name: FFTypeName, precision: Option[Int], scale: Option[Int], arraySizeInfo: Option[String], miscProperties: Map[String, Any] = ...) extends FFDataFormat with Product with Serializable
- case class FFNumberFormat(name: FFTypeName, precision: Option[Int], scale: Option[Int], miscProperties: Map[String, Any] = ...) extends FFDataFormat with Product with Serializable
- case class FFRecordType(startType: String) extends FFAST with Product with Serializable
- case class FFSchemaRecord(recordType: String, rows: Seq[FFSchemaRow]) extends FFAST with Product with Serializable
- sealed trait FFSchemaRow extends FFAST
- case class FFSimpleSchemaList(rows: Seq[FFSimpleSchemaRow]) extends FFSchemaRow with Product with Serializable
- case class FFSimpleSchemaRow(name: String, format: FFDataFormat, value: FFDefaultVal) extends FFSchemaRow with Product with Serializable
- case class FFStringArrayFormat(name: FFTypeName, precision: Option[Int], arraySizeInfo: Option[String]) extends FFDataFormat with Product with Serializable
- case class FFStringDefaultVal(value: String) extends FFDefaultVal with Product with Serializable
- case class FFStringFormat(name: FFTypeName, precision: Option[Int], props: Option[Map[String, String]] = None) extends FFDataFormat with Product with Serializable
- case class FFStructArrayType(name1: String, arraySizeInfo: Option[String]) extends FFCompoundType with Product with Serializable
- case class FFStructFormat(name: FFTypeName, precision: Option[Int]) extends FFDataFormat with Product with Serializable
- case class FFStructType(name1: String) extends FFCompoundType with Product with Serializable
- case class FFTypeName(name: String, delimiter: Option[String]) extends FFAST with Product with Serializable
- case class FFTypeNameWithProperties(name: String, delimiter: Option[String], miscProperties: Map[String, Any] = Map("packed" → false)) extends FFAST with Product with Serializable
- case class FFUnionType(name: Option[String] = None) extends FFCompoundType with Product with Serializable
- case class FFUnknownFormat(name: FFTypeName, arraySizeInfo: Option[String]) extends FFDataFormat with Product with Serializable
- case class FFVoidFormat(name: FFTypeName, size: Option[Int]) extends FFDataFormat with Product with Serializable
- type FileInput = Dataset[Row] (Definition Classes: Component)
- type FileIntermediate = Dataset[Row] (Definition Classes: Component)
- type FileOutput = Unit (Definition Classes: Component)
- type Filter = Dataset[Row] (Definition Classes: Component)
- class FixedFileFormat extends FileFormat with DataSourceRegister with Serializable
- implicit class FixedFileFormatDataFrameGlobal extends FixedFileFormatDataFrame
- trait FixedFileFormatImplicits extends AnyRef
- implicit class FixedFileFormatDataFrame extends AnyRef (Definition Classes: FixedFileFormatImplicits)
- implicit class FixedFileFormatSpark extends AnyRef (Definition Classes: FixedFileFormatImplicits)
- type FixedFileOutput = Unit (Definition Classes: Component)
- class FixedFormatOutputWriter extends OutputWriter
- type FlattenSchema = Dataset[Row] (Definition Classes: Component)
- type Generate = Dataset[Row] (Definition Classes: Component)
- type HashPartition = Dataset[Row] (Definition Classes: Component)
- type Join = Dataset[Row] (Definition Classes: Component)
- type Limit = Dataset[Row] (Definition Classes: Component)
- type Lookup = UserDefinedFunction (Definition Classes: Component)
- case class LookupDataset(datasetId: String, columnName: String) extends Annotation with StaticAnnotation with Product with Serializable
- type LookupFileInput = UserDefinedFunction (Definition Classes: Component)
- type LookupUnit = Unit (Definition Classes: Component)
- trait LookupUtils extends AnyRef
- class MDumpReader extends AnyRef
- type MultiFileRead = Dataset[Row] (Definition Classes: Component)
- type MultiFileWrite = Unit (Definition Classes: Component)
- type MultiFileWriteUnit = Unit (Definition Classes: Component)
- type MultiJoin = Dataset[Row] (Definition Classes: Component)
- type Normalize = Dataset[Row] (Definition Classes: Component)
- type OrderBy = Dataset[Row] (Definition Classes: Component)
- type OrderByPartition = Dataset[Row] (Definition Classes: Component)
- type Prepare = Dataset[Row] (Definition Classes: Component)
- type ReadSV = Dataset[Row] (Definition Classes: Component)
- type Reformat = Dataset[Row] (Definition Classes: Component)
- type Repartition = Dataset[Row] (Definition Classes: Component)
- trait RestAPIUtils extends AnyRef
  Spark utilities for handling REST API connections.
- type RoundRobinPartition = Dataset[Row] (Definition Classes: Component)
- type RowDistributor = Dataset[Row] (Definition Classes: Component)
- type RowDistributor1 = Dataset[Row] (Definition Classes: Component)
- type RowDistributor10 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor11 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor12 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor13 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor14 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor15 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor16 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor17 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor18 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor19 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor2 = (DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor20 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor21 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor22 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor3 = (DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor4 = (DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor5 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor6 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor7 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor8 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type RowDistributor9 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Ruleset = Dataset[Row] (Definition Classes: Component)
- type SQLStatement = Dataset[Row] (Definition Classes: Component)
- type SQLStatement1 = Dataset[Row] (Definition Classes: Component)
- type SQLStatement10 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement11 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement12 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement13 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement14 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement15 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement16 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement17 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement18 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement19 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement2 = (DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement20 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement21 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement22 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement3 = (DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement4 = (DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement5 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement6 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement7 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement8 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatement9 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SQLStatementUnit = Unit (Definition Classes: Component)
- type Scan = Dataset[Row] (Definition Classes: Component)
- type SchemaTransformer = Dataset[Row] (Definition Classes: Component)
- type Script = Dataset[Row] (Definition Classes: Component)
- type Script1 = Dataset[Row] (Definition Classes: Component)
- type Script10 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script11 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script12 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script13 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script14 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script15 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script16 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script17 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script18 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script19 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script2 = (DataFrame, DataFrame) (Definition Classes: Component)
- type Script20 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script21 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script22 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script3 = (DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script4 = (DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script5 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script6 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script7 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script8 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Script9 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type ScriptUnit = Unit (Definition Classes: Component)
- type Select = Dataset[Row] (Definition Classes: Component)
- type Sequence = Dataset[Row] (Definition Classes: Component)
- type SetOperation = Dataset[Row] (Definition Classes: Component)
- type Source = Dataset[Row] (Definition Classes: Component)
- trait SparkFunctions extends AnyRef
  Library of Spark functions implementing the various Ab Initio functions used in Ab Initio workflows.
- class StringAsStream extends Serializable (Definition Classes: SparkFunctions)
- type StreamingTarget = StreamingQuery (Definition Classes: Component)
- type SubGraph = Dataset[Row] (Definition Classes: Component)
- type SubGraph1 = Dataset[Row] (Definition Classes: Component)
- type SubGraph10 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph11 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph12 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph13 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph14 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph15 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph16 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph17 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph18 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph19 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph2 = (DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph20 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph21 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph22 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph3 = (DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph4 = (DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph5 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph6 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph7 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph8 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraph9 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubGraphUnit = Unit (Definition Classes: Component)
- type Subgraph = Dataset[Row] (Definition Classes: Component)
- type Subgraph1 = Dataset[Row] (Definition Classes: Component)
- type Subgraph10 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph11 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph12 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph13 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph14 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph15 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph16 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph17 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph18 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph19 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph2 = (DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph20 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph21 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph22 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph3 = (DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph4 = (DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph5 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph6 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph7 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph8 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type Subgraph9 = (DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame, DataFrame) (Definition Classes: Component)
- type SubgraphUnit = Unit (Definition Classes: Component)
- type Target = Unit (Definition Classes: Component)
- trait UDFUtils extends RestAPIUtils with Serializable with LazyLogging
  Utility trait with assorted UDFs that take care of miscellaneous tasks.
- type UnionAll = Dataset[Row] (Definition Classes: Component)
- type Visualize = Unit (Definition Classes: Component)
- type WindowFunction = Dataset[Row] (Definition Classes: Component)
- implicit class ExtendedDataFrame extends AnyRef (Definition Classes: ProphecyDataFrame)
- implicit class ExtendedStreamingTarget extends AnyRef (Definition Classes: ProphecyDataFrame)
- implicit class ProphecyDataFrameReader extends AnyRef (Definition Classes: ProphecyDataFrame)
- implicit class ProphecyDataFrameWriter[T] extends AnyRef (Definition Classes: ProphecyDataFrame)
Abstract Value Members
- abstract def getClass(): Class[_] (Definition Classes: Any)
Concrete Value Members
- final def !=(arg0: Any): Boolean (Definition Classes: Any)
- final def ##(): Int (Definition Classes: Any)
- final def ==(arg0: Any): Boolean (Definition Classes: Any)
- val InterimState: InterimStore.type (Definition Classes: ProphecyDataFrame)
- def YJJJ_to_YYYYJJJ(in_date: Column, ref_date: Column): Column
  Converts a 1-digit Julian year to a 4-digit Julian year.
  - in_date: date in Julian "YJJJ" format
  - ref_date: date in "yyyyMMdd" format
  - returns: a date in "YYYYJJJ" format
  (Definition Classes: SparkFunctions)
- def appendTrailer(pathInputData: String, pathInputTrailer: String, pathOutputConcatenated: String, configuration: Configuration): Unit
  Appends trailer data to every file in the data directory. A single trailer file in the pathInputTrailer directory should correspond to a single data file in the pathInputData directory. If a trailer for a given file does not exist, the file is moved as-is to the output directory.
  - pathInputData: input data files directory
  - pathInputTrailer: input trailer files directory
  - pathOutputConcatenated: output concatenated files directory
  - configuration: Hadoop configuration (preferably sparkSession.sparkContext.hadoopConfiguration)
  (Definition Classes: DataHelpers)
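A minimal usage sketch, assuming DataHelpers has no abstract members so it can be instantiated anonymously; all paths are hypothetical:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().getOrCreate()
    val helpers = new DataHelpers {}   // assumption: trait is fully concrete

    // Each file under /staging/data is concatenated with its same-named
    // trailer (if any) from /staging/trailers into /staging/concatenated.
    helpers.appendTrailer(
      pathInputData = "/staging/data",
      pathInputTrailer = "/staging/trailers",
      pathOutputConcatenated = "/staging/concatenated",
      configuration = spark.sparkContext.hadoopConfiguration
    )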
- def arrayColumn(value: String, values: String*): Column
  Takes a variable number of values and creates an array column out of them.
  - value: input value
  - values: variable number of further input values
  - returns: an array column
  (Definition Classes: UDFUtils)
- val array_value: UserDefinedFunction
  UDF that finds and returns the element of the arr sequence at the passed index; if no element is found, null is returned.
  (Definition Classes: UDFUtils)
- final def asInstanceOf[T0]: T0 (Definition Classes: Any)
- val bigDecimalToPackedBytes: UserDefinedFunction (Definition Classes: SparkFunctions)
- val call_rest_api: UserDefinedFunction
  Spark UDF that makes a single blocking REST API call to a given URL. The result of this UDF is always produced, contains a proper error if the call failed at any stage, and never interrupts job execution (unless called with an invalid signature). The default timeout can be configured through the spark.network.timeout Spark configuration option.
  Parameters:
  - method: any supported HTTP/1.1 method type, e.g. POST, GET. Complete list: httpMethods.
  - url: valid URL to which the request is made
  - headers: an array of "key: value" headers that are passed with the request
  - content: any content (by default, the supported REST API content type is application/json)
  Response: a struct with the following fields:
  - isSuccess: boolean, whether a successful response was received
  - status: nullable integer, status code (e.g. 404, 200)
  - headers: an array of "name: value" response headers (e.g. [Server: akka-http/10.1.10, Date: Tue, 07 Sep 2021 18:11:47 GMT])
  - content: nullable string, the response body
  - error: nullable string; if the parameters passed are invalid or the system failed to make the call, this field contains an error message
  (Definition Classes: RestAPIUtils)
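A sketch of applying the UDF column-wise, assuming the member is in scope and a hypothetical requests dataframe with a url column; the four-argument order follows the parameter list above:

    import org.apache.spark.sql.functions.{array, lit}

    // One blocking GET per row; the returned struct carries
    // isSuccess / status / headers / content / error.
    val answered = requests.withColumn(
      "response",
      call_rest_api(
        lit("GET"),
        requests("url"),
        array(lit("Accept: application/json")),
        lit(null)                     // no request body for GET
      )
    )
    answered.select("response.status", "response.content")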
- val canonical_representation: UserDefinedFunction (Definition Classes: SparkFunctions)
- def castDataType(sparkSession: SparkSession, df: DataFrame, column: Column, dataType: String, replaceColumn: String): DataFrame
  Adds a new typecast column to the input dataframe. The newly added column is a typecast version of the passed column. The typecast operation is supported for string, boolean, byte, short, int, long, float, double, decimal, date, and timestamp types.
  - sparkSession: spark session
  - df: input dataframe
  - column: input column to be typecast
  - dataType: datatype to cast the column to
  - replaceColumn: name of the column to be added to the dataframe
  - returns: new dataframe with the new typecast column
  (Definition Classes: UDFUtils)
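For instance, a sketch with hypothetical dataframe and column names:

    import org.apache.spark.sql.functions.col

    // Adds "amount_dec", a decimal(18,2) typecast of the string column "amount".
    val typed = castDataType(spark, ordersDf, col("amount"), "decimal(18,2)", "amount_dec")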
- def concatenate(sources: Seq[String], destination: String, compressToGZip: Boolean = false): Unit
  Gets data from multiple source paths and combines it into a single destination path.
  - sources: multiple source paths from which to merge the data
  - destination: destination path to combine all data to
  - compressToGZip: flag to compress the final output file into gzip format
  (Definition Classes: DataHelpers)
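For example, a sketch with hypothetical paths:

    // Merge two part files into a single gzip-compressed output file.
    concatenate(
      sources = Seq("/out/part-00000", "/out/part-00001"),
      destination = "/out/final.csv.gz",
      compressToGZip = true
    )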
- def convertInputBytesToStructType(input: Any, typeInfo: Seq[String], startByte: Int = 0): Row
  Used by Ab Initio's reinterpret_as function to read the necessary bytes from the byte array of the input data and convert them into struct format as specified by the typeInfo sequence. typeInfo can have multiple entries, each of either decimal or string type. Depending on the argument passed within decimal or string, bytes are read from the input byte array: if the argument is an integer, that many bytes are read; if it is a string delimiter, bytes are read from the current position until the delimiter is found.
  (Definition Classes: SparkFunctions)
- def createDataFrameFromData(inputData: String, delimiter: String, columnName: String, columnType: String, sparkSession: SparkSession): DataFrame
  Reads the values in inputData, split by delimiter, and creates a dataframe with column name columnName and column type columnType.
  (Definition Classes: SparkFunctions)
- def createLookup(name: String, df: DataFrame, spark: SparkSession, keyCols: List[String], rowCols: String*): UserDefinedFunction
  Registers 4 different UDFs with the Spark registry: lookup, lookup_count, lookup_match, and lookup_row. This function stores the data of the input dataframe in a broadcast variable, then uses this broadcast variable in the different lookup functions.
  - lookup: returns the first matching row for the given input keys
  - lookup_count: returns the count of all matching rows for the given input keys
  - lookup_match: returns 0 if there is no matching row and 1 if there are matching rows for the given input keys
  - lookup_row: returns all the matching rows for the given input keys
  Registration supports up to 10 matching keys as input to these lookup functions.
  - name: UDF name
  - df: input dataframe
  - spark: spark session
  - keyCols: columns to be used as keys in the lookup functions
  - rowCols: schema of the entire row stored for each matching key
  - returns: registered UDF definitions for the lookup functions; the returned results depend on the particular lookup function
  (Definition Classes: UDFUtils)
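A sketch of registering and then using the lookup, with hypothetical dataframes countriesDf and ordersDf keyed by a code column; the registered UDF name is the name argument, per the description above:

    import org.apache.spark.sql.functions.expr

    // Register "countries" keyed on "code", storing (code, name) per row.
    createLookup("countries", countriesDf, spark, List("code"), "code", "name")

    // First matching row for each order's country code.
    val enriched = ordersDf.withColumn("country", expr("countries(country_code)"))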
- def createRangeLookup(name: String, df: DataFrame, spark: SparkSession, minColumn: String, maxColumn: String, valueColumns: String*): UserDefinedFunction
  Creates a UDF that looks up a passed double in the input dataframe. This function first loads the data of the dataframe into a broadcast variable and then defines a UDF that searches for the input double value in that data. If the input double lies between the minColumn and maxColumn values of a row, the corresponding row is added to the returned result; otherwise null is returned for that row.
  - name: created UDF name
  - df: input dataframe
  - spark: spark session
  - minColumn: column whose value is treated as the minimum of the comparison range
  - maxColumn: column whose value is treated as the maximum of the comparison range
  - valueColumns: remaining column names to be part of the result
  - returns: a registered UDF that returns the rows corresponding to each row of the dataframe on which the range UDF is called
  (Definition Classes: UDFUtils)
- val cross_join_index_range: UserDefinedFunction (Definition Classes: SparkFunctions)
- def date_add_months(inputDate: Column, months: Int): Column
  Returns the internal representation of a date resulting from adding (or subtracting) a number of months to the specified date.
  - inputDate: date in yyyy-MM-dd format
  (Definition Classes: SparkFunctions)
- def date_difference_days(laterDate: Column, earlierDate: Column): Column
  Computes the number of days between two specified dates in "yyyyMMdd" format.
  - laterDate: input date
  - earlierDate: input date
  - returns: number of days between laterDate and earlierDate, or null if either one is null
  (Definition Classes: SparkFunctions)
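For example:

    import org.apache.spark.sql.functions.lit

    // 2024-03-01 minus 2024-02-01 in "yyyyMMdd" form: 29 days (leap year).
    val days = date_difference_days(lit("20240301"), lit("20240201"))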
- val date_month_end: UserDefinedFunction (Definition Classes: SparkFunctions)
- val datetime_add: UserDefinedFunction (Definition Classes: SparkFunctions)
- def datetime_add_months(input: Column, months: Int): Column
  Returns the internal representation of a timestamp resulting from adding (or subtracting) a number of months to the specified timestamp.
  - input: timestamp in yyyy-MM-dd HH:mm:ss.SSSS format
  (Definition Classes: SparkFunctions)
- val datetime_difference: UserDefinedFunction (Definition Classes: SparkFunctions)
- def datetime_difference_hours(end: Column, start: Column): Column
  Returns the number of hours between two specified timestamps in the standard format yyyy-MM-dd HH:mm:ss.SSSS.
  (Definition Classes: SparkFunctions)
- def datetime_from_unixtime(seconds: Column): Column (Definition Classes: SparkFunctions)
- def decimal_lpad(input: Column, len: Int, char_to_pad_with: String = "0", decimal_point_char: String = "."): Column
  Uses a Java regex to extract a decimal number from the input string. The decimal number can be either a simple integral number (e.g. 013334848) or a decimal number with an explicit decimal point (e.g. 123456.90), the latter identified by a combination of the regexes [0-9]+(\$decimal_point_char)[0-9]+ and (0\$decimal_point_char)[0-9]+. After extracting the decimal number, the code checks whether its length exceeds the len parameter. If so, the extracted decimal number is returned as-is. Otherwise it is first left-padded with char_to_pad_with to make its length equal to len, and the minus sign (-) is then adjusted to the leftmost position of the decimal number.
  - input: input string
  - len: length of characters
  - char_to_pad_with: character to left pad with; default value is "0"
  - decimal_point_char: a string that specifies the character that represents the decimal point
  - returns: a decimal string of the specified length or longer, left-padded with the specified character as needed and trimmed of leading zeros
  (Definition Classes: SparkFunctions)
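A sketch of the padding behavior; the expected values follow the description above and are not a verified run:

    import org.apache.spark.sql.functions.lit

    decimal_lpad(lit("123.4"), 8)    // "000123.4"
    decimal_lpad(lit("-123.4"), 8)   // "-00123.4" (minus sign moved leftmost)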
- def decimal_lrepad(input: Column, len: Int, char_to_pad_with: String = "0", decimal_point_char: String = "."): Column
  Uses a Java regex to extract a decimal number from the input string. The decimal number can be either a simple integral number (e.g. 013334848), identified by a combination of the regexes [1-9][0-9]*[0-9] and [1-9]+, or a decimal number with an explicit decimal point (e.g. 123456.90), identified by a combination of the regexes [1-9][0-9]*(\$decimal_point_char)[0-9]+ and (0\$decimal_point_char)[0-9]*[0-9]. After extracting the decimal number, the code checks whether its length exceeds the len parameter. If so, the extracted decimal number is returned as-is. Otherwise it is first left-padded with char_to_pad_with to make its length equal to len, and the minus sign (-) is then adjusted to the leftmost position of the decimal number.
  - input: input string
  - len: length of characters
  - char_to_pad_with: character to left pad with; default value is "0"
  - decimal_point_char: a string that specifies the character that represents the decimal point
  - returns: a decimal string of the specified length or longer, left-padded with the specified character as needed and trimmed of leading zeros
  (Definition Classes: SparkFunctions)
- def decimal_round(input: Column, places: Int): Column (Definition Classes: SparkFunctions)
- def decimal_round_down(input: Column, right_digits: Int): Column
  Returns a value rounded down to right_digits number of digits to the right of the decimal point.
  (Definition Classes: SparkFunctions)
- def decimal_round_up(input: Column, places: Int): Column
  Returns a number rounded up to a specified number of places to the right of the decimal point.
  (Definition Classes: SparkFunctions)
- def decimal_strip(input: Column, decimal_point_char: String = "."): Column
  Uses a Java regex to extract a decimal number from the input string. The decimal number can be either a simple integral number (e.g. 013334848), identified by a combination of the regexes [1-9][0-9 ]*[0-9] and [1-9]+, or a decimal number with an explicit decimal point (e.g. 123456.90), identified by a combination of the regexes [1-9][0-9]*(\$decimal_point_char)[0-9 ]+ and (0\$decimal_point_char)[0-9 ]*[0-9]. After extracting the decimal number, the code looks for a minus sign before the extracted number in the input and prepends it to the decimal number if found. In the end it replaces all whitespace with the empty string in the final resultant decimal number.
  - input: input string
  - decimal_point_char: a string that specifies the character that represents the decimal point
  - returns: a decimal from a string that has been trimmed of leading zeros and non-numeric characters
  (Definition Classes: SparkFunctions)
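For example; the expected values follow the description above and are not a verified run:

    import org.apache.spark.sql.functions.lit

    decimal_strip(lit("00123.45"))     // "123.45" (leading zeros trimmed)
    decimal_strip(lit("USD -12.5"))    // "-12.5"  (minus sign before the number kept)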
- def decimal_truncate(input: Column, number_of_places: Column): Column (Definition Classes: SparkFunctions)
- val decodeBytes: UserDefinedFunction (Definition Classes: SparkFunctions)
- val decodeString: UserDefinedFunction (Definition Classes: SparkFunctions)
- val decode_datetime: UserDefinedFunction
  UDF that returns a record of type decode_datetime_type, with all fields populated from the corresponding parts of the input date/timestamp. The returned record has the following schema:
  integer(8) year; integer(8) month; integer(8) day; integer(8) hour; integer(8) minute; integer(8) second; integer(8) microsecond;
  Note: supported input times are in yyyy-MM-dd HH:mm:ss.SSSSSS, yyyy-MM-dd HH:mm:ss, or yyyy-MM-dd formats only. Additional handling is done to support timestamps retrieved from now() function calls.
  (Definition Classes: SparkFunctions)
- val decode_datetime_as_local: UserDefinedFunction (Definition Classes: SparkFunctions)
- def directory_listing(path: String, filePrefix: String): Column (Definition Classes: SparkFunctions)
- def dropColumns(sparkSession: SparkSession, df: DataFrame, columns: Column*): DataFrame
  Drops the passed columns from the input dataframe.
  - sparkSession: spark session
  - df: input dataframe
  - columns: list of columns to be dropped from the dataframe
  - returns: new dataframe with the columns dropped
  (Definition Classes: UDFUtils)
- val encodeBytes: UserDefinedFunction (Definition Classes: SparkFunctions)
- val encodeString: UserDefinedFunction (Definition Classes: SparkFunctions)
- val encode_date: UserDefinedFunction
  Returns the internal representation of a date given the year, month, and day: an integer value specifying the number of days relative to January 1, 1900. For example, encode_date(1998, 5, 18) = 35931 is the internal representation of the date specified by the year 1998, the month 5, and the day 18.
  (Definition Classes: SparkFunctions)
- def ends_with(input: Column, suffix: String): Column
  Returns true if the string column ends with the given suffix.
  (Definition Classes: SparkFunctions)
- def equals(arg0: Any): Boolean (Definition Classes: Any)
- val eval: UserDefinedFunction
  Returns the result of evaluating a string expression in the context of a specified input column. The input column can be a struct-type record, a simple column, an array type, etc. The expression can be a reference to a nested column inside the input column or any expression that requires values from the input column for its evaluation.
  Note: the current implementation only supports the scenario where the input column is of struct type and the expression is a simple dot-separated column reference into the input struct.
  (Definition Classes: SparkFunctions)
- def executeNonSelectSQLQueries(sqlList: Seq[String], dbConnection: Connection): Unit (Definition Classes: DataHelpers)
- val file_information: UserDefinedFunction
  UDF that returns file information for the passed input file path.
  (Definition Classes: SparkFunctions)
- def findFirstElement(input: Column, default: Column = lit(null)): Column (Definition Classes: SparkFunctions)
- def findFirstNonBlankElement(input: Column, default: Column): Column (Definition Classes: SparkFunctions)
- def findLastElement(input: Column, default: Column = lit(null)): Column (Definition Classes: SparkFunctions)
- def first_defined(expr1: Column, expr2: Column): Column
  Identifies and returns the first non-null expression.
  (Definition Classes: SparkFunctions)
- val first_defined_for_double_Udf: UserDefinedFunction (Definition Classes: SparkFunctions)
- def flattenStructSchema(schema: StructType, prefix: String = null): Array[Column] (Definition Classes: SparkFunctions)
- val force_error: UserDefinedFunction (Definition Classes: SparkFunctions)
- def from_sv(input: Column, separator: String, schema: StructType): Column (Definition Classes: SparkFunctions)
- def from_xml(content: Column, schema: StructType): Column (Definition Classes: SparkFunctions)
- def ftpTo(remoteHost: String, userName: String, password: String, sourceFile: String, destFile: String, retryFailures: Boolean, retryCount: Int, retryPauseSecs: Int, mode: String, psCmd: String): (Boolean, Boolean, String, String) (Definition Classes: DataHelpers)
- def generateDataFrameWithSequenceColumn(start: Int, end: Int, columnName: String, sparkSession: SparkSession): DataFrame
  Creates a dataframe with a single column containing an increasing sequence of ids from start to end.
  (Definition Classes: SparkFunctions)
- def generate_sequence(start: Int, end: Int, step: Int = 1): Column
  Creates an array column holding the sequence of integers between the two passed numbers.
  - start: starting point of the generated sequence
  - end: terminating point of the generated sequence
  - returns: column containing the sequence of integers
  (Definition Classes: SparkFunctions)
- val generate_sequence: UserDefinedFunction
  UDF that generates a column with the sequence of integers between the passed start and end columns.
  (Definition Classes: SparkFunctions)
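For example, using the def overload:

    // Array column holding Seq(0, 2, 4, 6, 8, 10).
    val evens = generate_sequence(0, 10, step = 2)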
- val getByteFromByteArray: UserDefinedFunction
  UDF that returns the last byte from the byte array of the input data.
  (Definition Classes: SparkFunctions)
- def getColumnInSecondArrayByFirstNonBlankPositionInFirstArray(nonBlankEntryExpr: Column, firstArray: Column, secondArray: Column): Column (Definition Classes: SparkFunctions)
- def getContentAsStream(content: String): StringAsStream (Definition Classes: SparkFunctions)
- def getEmptyLogDataFrame(sparkSession: SparkSession): DataFrame
  Returns an empty dataframe with the following Ab Initio log schema:
  record string("|") node, timestamp, component, subcomponent, event_type; string("|\n") event_text; end
  (Definition Classes: DataHelpers)
- def getFebruaryDay(year: Column): Column
  Computes the number of days in February for a given year.
  - year: year whose number of days in February needs to be calculated
  - returns: number of days
  (Definition Classes: SparkFunctions)
- def getFieldFromStructByPosition(column: Column, position: Int): Column
  Returns the field at the given position from a struct column.
  (Definition Classes: SparkFunctions)
- val getIntFromByteArray: UserDefinedFunction
  UDF that returns the integer comprising the last 4 bytes from the byte array of the input data.
  (Definition Classes: SparkFunctions)
- val getLongArrayFromByteArray: UserDefinedFunction
  UDF that returns the long comprising the last 8 bytes from the byte array of the input data.
  (Definition Classes: SparkFunctions)
- val getLongFromByteArray: UserDefinedFunction
  UDF that returns the long comprising the last 8 bytes from the byte array of the input data.
  (Definition Classes: SparkFunctions)
- def getMTimeDataframe(filepath: String, format: String, spark: SparkSession): DataFrame (Definition Classes: SparkFunctions)
- val getShortFromByteArray: UserDefinedFunction
  UDF that returns the short comprising the last 2 bytes from the byte array of the input data.
  (Definition Classes: SparkFunctions)
-
def
hashCode(): Int
- Definition Classes
- Any
-
val
hash_MD5: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
instr_udf: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isNullOrEmpty(input: Column): Column
Method to check whether the current column is null or has an empty value.
- Definition Classes
- SparkFunctions
-
def
is_ascii(input: Column): Column
Checks whether a string is ASCII.
- input
column to be checked
- returns
true if the input string is ASCII, otherwise false
- Definition Classes
- SparkFunctions
-
def
is_blank(input: Column): Column
Method to identify whether the input string is blank.
- input
input string.
- returns
returns 1 if the given string contains only blank characters or is a zero-length string, otherwise returns 0
- Definition Classes
- SparkFunctions
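Example (a minimal sketch; assumes a DataFrame df with a hypothetical string column "name"):
  import org.apache.spark.sql.functions._
  // 1 for values such as "" or "   ", 0 otherwise.
  val flagged = df.withColumn("nameIsBlank", is_blank(col("name")))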
-
val
is_bzero: UserDefinedFunction
Tests whether an object is composed of all binary zero bytes. This function returns 1 if obj contains only binary zero bytes or is a zero-length string, 0 if obj contains any non-zero bytes, and NULL if obj is NULL.
- Definition Classes
- SparkFunctions
-
def
is_numeric_ascii(input: Column): Column
Checks whether an input string contains only ASCII code and numbers.
- input
string to be checked
- returns
true if the input string contains only ASCII code and numbers, or null if the input is null
- Definition Classes
- SparkFunctions
-
def
is_valid(input: Column, isNullable: Boolean, formatInfo: Option[Any], len: Option[Seq[Int]]): Column
Method to identify whether the passed input column is a valid expression after typecasting to the passed dataType. While typecasting, if len is present, this function also ensures that the max length of the input column after the typecast operation is not greater than len.
- input
input column expression to be identified if is valid.
- formatInfo
datatype to which the input column expression must be typecast. If the datatype is a string, it is treated as a timestamp format. If it is a list of strings, it is treated as the current timestamp format and the new timestamp format to which the input column needs to be typecast.
- len
max length of input column after typecasting it to dataType.
- returns
0 if input column is not valid after typecasting or 1 if it is valid.
- Definition Classes
- SparkFunctions
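Example (a minimal sketch; the column name is hypothetical, and the string formatInfo is treated as a timestamp format per the description above):
  import org.apache.spark.sql.functions._
  // 1 if "eventTs" typecasts cleanly as a timestamp of at most 19 characters, 0 otherwise.
  val checked = df.withColumn("tsOk",
    is_valid(col("eventTs"), true, Some("yyyy-MM-dd HH:mm:ss"), Some(Seq(19))))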
-
def
is_valid(input: Column, isNullable: Boolean, formatInfo: Option[Any]): Column
- Definition Classes
- SparkFunctions
-
def
is_valid(input: Column, formatInfo: Option[Any], len: Option[Seq[Int]]): Column
- Definition Classes
- SparkFunctions
-
def
is_valid(input: Column, formatInfo: Option[Any]): Column
- Definition Classes
- SparkFunctions
-
def
is_valid(input: Column, isNullable: Boolean): Column
- Definition Classes
- SparkFunctions
-
def
is_valid(input: Column): Column
- Definition Classes
- SparkFunctions
-
def
is_valid_date(dateFormat: String, inDate: Column): Column
Validates a date against an input format.
- dateFormat
A pattern such as yyyy-MM-dd, yyyy-MM-dd HH:mm:ss.SSSS, or dd.MM.yyyy
- inDate
Input date to be validated
- returns
true if the input date is valid otherwise false
- Definition Classes
- SparkFunctions
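Example (a minimal sketch; assumes a DataFrame df with a hypothetical string column "eventDate"):
  import org.apache.spark.sql.functions._
  // true when "eventDate" matches the yyyy-MM-dd pattern.
  val validated = df.withColumn("dateOk", is_valid_date("yyyy-MM-dd", col("eventDate")))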
-
def
loadBinaryFileAsBinaryDataFrame(filePath: String, lineDelimiter: String = "\n", minPartition: Int = 1, rowName: String = "line", spark: SparkSession): DataFrame
- Definition Classes
- DataHelpers
-
def
loadBinaryFileAsStringDataFrame(filePath: String, lineDelimiter: String = "\n", charSetEncoding: String = "Cp1047", minPartition: Int = 1, rowName: String = "line", spark: SparkSession): DataFrame
- Definition Classes
- DataHelpers
-
def
loadFixedWindowBinaryFileAsDataFrame(filePath: String, lineLength: Int, minPartition: Int = 1, rowName: String = "line", spark: SparkSession): DataFrame
- Definition Classes
- DataHelpers
-
lazy val
logger: Logger
- Attributes
- protected
- Definition Classes
- LazyLogging
- Annotations
- @transient()
-
def
lookup(lookupName: String, cols: Column*): Column
By default returns only the first matching record
- Definition Classes
- UDFUtils
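Example (a minimal sketch; assumes a lookup named "countryLookup" was registered beforehand, keyed on a country code; the registration mechanism is not shown here):
  import org.apache.spark.sql.functions._
  val enriched = df.withColumn("country", lookup("countryLookup", col("countryCode")))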
-
def
lookup_count(lookupName: String, cols: Column*): Column
- Definition Classes
- UDFUtils
-
def
lookup_last(lookupName: String, cols: Column*): Column
Returns the last matching record
- Definition Classes
- UDFUtils
-
def
lookup_match(lookupName: String, cols: Column*): Column
- returns
Boolean Column
- Definition Classes
- UDFUtils
-
def
lookup_nth(lookupName: String, cols: Column*): Column
- Definition Classes
- UDFUtils
-
def
lookup_range(lookupName: String, input: Column): Column
- Definition Classes
- UDFUtils
-
def
lookup_row(lookupName: String, cols: Column*): Column
- Definition Classes
- UDFUtils
-
def
lookup_row_reverse(lookupName: String, cols: Column*): Column
- Definition Classes
- UDFUtils
-
val
make_byte_flags: UserDefinedFunction
UDF to return a flag for each character indicating whether it is present in the input String.
- Definition Classes
- SparkFunctions
-
def
make_constant_vector(size: Int, seedVal: Int): Array[Int]
Method to create an array of size "size" containing seedVal as each entry.
- Definition Classes
- SparkFunctions
-
def
make_constant_vector(size: Int, seedVal: Column): Column
Method to create an array of size "size" containing seedVal as each entry.
- Definition Classes
- SparkFunctions
-
def
measure[T](fn: ⇒ T)(caller: String = findCaller()): T
- Definition Classes
- UDFUtils
-
val
multifile_information: UserDefinedFunction
UDF to get multifile information for the passed input file path.
- Definition Classes
- SparkFunctions
-
val
murmur: UserDefinedFunction
UDF for murmur hash generation for any column type
- Definition Classes
- SparkFunctions
-
def
now(): Column
Method to get current timestamp.
- returns
current timestamp in YYYYMMddHHmmssSSSSSS format.
- Definition Classes
- SparkFunctions
-
def
numberOfPartitions(in: DataFrame): Column
- Definition Classes
- SparkFunctions
-
val
number_grouping: UserDefinedFunction
UDF to group an input decimal into multiple groups separated by separator
- Definition Classes
- SparkFunctions
-
val
packedBytesStringToDecimal: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
packedBytesToDecimal: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
re_get_match: UserDefinedFunction
Returns the first string in a target string that matches a regular expression.
- Definition Classes
- SparkFunctions
-
val
re_get_match_with_index: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
re_index: UserDefinedFunction
UDF wrapper over the re_index function.
- Definition Classes
- SparkFunctions
-
val
re_index_with_offset: UserDefinedFunction
Returns the first string in a target string that matches a regular expression.
- Definition Classes
- SparkFunctions
-
def
re_replace(target: Column, pattern: String, replacement: String, offset: Int = 0): Column
Replaces all substrings in a target string that match a specified regular expression.
- target
A string that the function searches for a substring that matches pattern_expr.
- pattern
regular expression
- replacement
replacement string
- offset
Number of characters, from the beginning of str, to skip before searching.
- returns
a string in which all substrings that match the specified regular expression are replaced.
- Definition Classes
- SparkFunctions
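Example (a minimal sketch; the column name is hypothetical):
  import org.apache.spark.sql.functions._
  // Strips every non-digit character from "phone".
  val digitsOnly = df.withColumn("phoneDigits", re_replace(col("phone"), "[^0-9]", ""))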
-
def
re_replace_first(target: Column, pattern: String, replacement: String, offset: Column = lit(0)): Column
Replaces only the first regex matching occurrence in the target string.
- target
A string that the function searches for a substring that matches pattern_expr.
- pattern
regular expression
- replacement
replacement string
- returns
a string in which the first substring that matches the specified regular expression is replaced.
- Definition Classes
- SparkFunctions
-
val
re_split_no_empty: UserDefinedFunction
UDF to split the input string via a pattern string and remove all empty substrings.
- Definition Classes
- SparkFunctions
-
val
readBytesIntoInteger: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
readBytesIntoLong: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
readBytesStringIntoInteger: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
readBytesStringIntoLong: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
def
readHiveTable(spark: SparkSession, database: String, table: String, partition: String = ""): DataFrame
Method to read data from a Hive table.
- spark
spark session
- database
hive database
- table
hive table.
- partition
hive table partition to read data specifically from if provided.
- returns
dataframe with data read from Hive Table.
- Definition Classes
- DataHelpers
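Example (a minimal sketch; the database, table, and partition names are hypothetical):
  // Reads only the ds=2023-01-01 partition of analytics.sales.
  val sales = readHiveTable(spark, "analytics", "sales", "ds=2023-01-01")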
-
def
readHiveTableInChunks(spark: SparkSession, database: String, table: String, partitionKey: String, partitionValue: String): DataFrame
Reads a full Hive table partition by reading every subpartition separately and performing a union on all the resulting DataFrames.
This function is meant to temporarily solve the problem with Hive metastore crashing when querying too many partitions at the same time.
- spark
spark session
- database
hive database name
- table
hive table name
- partitionKey
top-level partition's key
- partitionValue
top-level partition's value
- returns
A complete DataFrame with the selected hive table partition
- Definition Classes
- DataHelpers
-
val
read_file: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
record_info: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
record_info_with_includes: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
def
registerAllUDFs(spark: SparkSession): Unit
- Definition Classes
- SparkFunctions
-
def
registerProphecyUdfs(spark: SparkSession): Unit
- Definition Classes
- UDFUtils
-
def
register_output_schema(portName: String, schema: StructType): Unit
- Definition Classes
- Component
-
def
remove_non_digit(input: Column): Column
Method to remove any non-digit characters from the specified string column.
- input
input String Column
- returns
Cleaned string column or null
- Definition Classes
- SparkFunctions
-
def
replaceBlankColumnWithNull(input: Column): Column
Method to replace empty values in String Columns with null.
- Definition Classes
- SparkFunctions
-
def
replaceString(sparkSession: SparkSession, df: DataFrame, outputCol: String, inputCol: String, replaceWith: String, value: String, values: String*): DataFrame
Function to add a new column to the passed dataframe. The newly added column's value is decided by the presence of the inputCol value in the array comprised of value and values. If the inputCol value is found, the value of replaceWith is added in the new column; otherwise the inputCol value is added.
- sparkSession
spark session.
- df
input dataframe.
- outputCol
name of new column to be added.
- inputCol
column name whose value is searched.
- replaceWith
value with which to replace searched value if found.
- value
element to be combined in array column
- values
all values to be combined in array column for searching purpose.
- returns
dataframe with new column with column name outputCol
- Definition Classes
- UDFUtils
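Example (a minimal sketch; column names and values are hypothetical):
  // Writes "unknown" to "statusClean" wherever "status" is "N/A" or "na";
  // otherwise the original "status" value is carried over.
  val cleaned = replaceString(spark, df, "statusClean", "status", "unknown", "N/A", "na")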
-
def
replaceStringNull(sparkSession: SparkSession, df: DataFrame, outputCol: String, inputCol: String, replaceWith: String, value: String, values: String*): DataFrame
Function to add a new column to the passed dataframe. The newly added column's value is decided by the presence of the inputCol value in the array comprised of value, values, and null. If the inputCol value is found, the value of replaceWith is added in the new column; otherwise the inputCol value is added.
- sparkSession
spark session.
- df
input dataframe.
- outputCol
name of new column to be added.
- inputCol
column name whose value is searched.
- replaceWith
value with which to replace searched value if found.
- value
element to be combined in array column
- values
all values to be combined in array column for searching purpose.
- returns
dataframe with new column with column name outputCol
- Definition Classes
- UDFUtils
-
def
replaceStringWithNull(sparkSession: SparkSession, df: DataFrame, outputCol: String, inputCol: String, value: String, values: String*): DataFrame
Function to add a new column to the passed dataframe. The newly added column's value is decided by the presence of the inputCol value in the array comprised of value, values, and null. If the inputCol value is found, null is added in the new column; otherwise the inputCol value is added.
- sparkSession
spark session.
- df
input dataframe.
- outputCol
name of new Column to be added.
- inputCol
column name whose value is searched.
- value
element to be combined in array column.
- values
all values to be combined in array column for searching purpose.
- returns
dataframe with new column with column name outputCol
- Definition Classes
- UDFUtils
-
def
replace_null_with_blank(input: Column): Column
- Definition Classes
- SparkFunctions
-
val
replace_string: UserDefinedFunction
UDF to find str in the input sequence toBeReplaced and return replace if found; otherwise str is returned.
- Definition Classes
- UDFUtils
-
val
replace_string_with_null: UserDefinedFunction
UDF to find str in the input sequence toBeReplaced and return null if found; otherwise str is returned.
- Definition Classes
- UDFUtils
-
def
scanf_double(format: Column, value: Column): Column
- Definition Classes
- SparkFunctions
-
def
scanf_long(format: Column, value: Column): Column
- Definition Classes
- SparkFunctions
-
def
schemaRowCompareResult(row1: StructType, row2: StructType): Column
- Definition Classes
- SparkFunctions
-
def
sign_explicit(c: Column): Column
Adds an explicit sign to the number. E.g. 2 -> +2; -004 -> -004; 0 -> +0
- Definition Classes
- SparkFunctions
-
val
sign_explicit_Udf: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
def
sign_reserved(c: Column): Column
- Definition Classes
- SparkFunctions
-
val
sign_reserved_Udf: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
def
splitIntoMultipleColumns(sparkSession: SparkSession, df: DataFrame, colName: String, pattern: String, prefix: String = null): DataFrame
Function to split the column colName in the input dataframe into multiple columns using the split pattern. If a prefix is provided, each newly generated column is named with the prefix followed by the column number; otherwise the original column name is used.
- sparkSession
spark session.
- df
input dataframe.
- colName
column in dataframe which needs to be split into multiple columns.
- pattern
regex with which column in input dataframe will be split into multiple columns.
- prefix
column prefix to be used with all newly generated columns.
- returns
new dataframe with new columns where new column values are generated after splitting original column colName.
- Definition Classes
- UDFUtils
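Example (a minimal sketch; assumes a hypothetical column "fullName" holding whitespace-separated tokens; per the description above, the prefix "name" is expected to yield columns name1, name2, ...):
  val wide = splitIntoMultipleColumns(spark, df, "fullName", "\\s+", "name")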
-
val
splitIntoMultipleColumnsUdf: UserDefinedFunction
UDF to break an input string into multiple strings via a delimiter. The number of strings after the split is adjusted to the passed width parameter: if there are fewer strings, empty strings are appended; if there are more, the first width entries are kept and the rest are discarded.
- Definition Classes
- SparkFunctions
-
def
starts_with(input: Column, prefix: String): Column
Returns true if the string column starts with the given prefix
- Definition Classes
- SparkFunctions
-
def
string_char(inputStr: Column, index: Int): Column
Method to return the character code of the character at position index in the inputStr string.
- inputStr
input string
- index
location of character to get code.
- returns
integer column.
- Definition Classes
- SparkFunctions
-
val
string_cleanse: UserDefinedFunction
This implementation is incorrect.
- Definition Classes
- SparkFunctions
-
def
string_compare(input1: Column, input2: Column): Column
- Definition Classes
- SparkFunctions
-
val
string_concat_in_loop: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
string_convert_explicit: UserDefinedFunction
Converts a string from one character set to another, replacing inconvertible characters with a specified string.
- Definition Classes
- SparkFunctions
-
val
string_filter: UserDefinedFunction
Method which returns the characters present in both input strings, in the same order as they appear in the first string.
- Definition Classes
- SparkFunctions
-
val
string_filter_out: UserDefinedFunction
Compares two input strings, then returns characters that appear in one string but not in the other.
- Definition Classes
- SparkFunctions
-
val
string_index: UserDefinedFunction
UDF to find the index of seekStr in inputStr. The returned index is 1-based.
- Definition Classes
- SparkFunctions
-
val
string_index_with_offset: UserDefinedFunction
UDF to find the index of seekStr in inputStr, starting from the offset index onwards. The returned position is 1-based.
- Definition Classes
- SparkFunctions
-
def
string_is_alphabetic(input: Column): Column
Method which returns true if the input string contains only alphabetic characters, or false otherwise.
- Definition Classes
- SparkFunctions
-
def
string_is_numeric(input: Column): Column
Method which returns true if the input string contains only numeric characters, or false otherwise.
- Definition Classes
- SparkFunctions
-
def
string_join(column: Column, delimiter: String): Column
Concatenates the elements of column using the delimiter.
- Definition Classes
- SparkFunctions
-
def
string_length(input: Column): Column
- Definition Classes
- SparkFunctions
-
val
string_like: UserDefinedFunction
Method to test whether a string matches a specified pattern. This function returns 1 if the input string matches the pattern, and 0 if it does not. As in Abinitio, the % character in the pattern matches zero or more characters and the _ character matches a single character.
- Definition Classes
- SparkFunctions
-
def
string_lpad(input: Column, len: Int, pad_char: String = " "): Column
Left-pads the input string column with pad_char to a length of len. If the length of the input column is more than len, the input column is returned unmodified.
- Definition Classes
- SparkFunctions
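Example (a minimal sketch; the column name is hypothetical):
  import org.apache.spark.sql.functions._
  // "123" becomes "0000000123"; values already 10 or more characters long pass through.
  val padded = df.withColumn("acctPadded", string_lpad(col("acct"), 10, "0"))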
-
def
string_lrepad(input: Column, len: Int, char_to_pad_with: String = " "): Column
Function that trims the string and then pads it with the given character up to the given length. If the length of the trimmed string is equal to or greater than the given length, the input string is returned.
- input
input string
- len
length in number of characters.
- char_to_pad_with
A character used to pad input string to length len.
- returns
string of a specified length, trimmed of leading and trailing blanks and left-padded with a given character.
- Definition Classes
- SparkFunctions
-
def
string_pad(input: Column, len: Int, char_to_pad_with: String = " "): Column
Function that pads input on the right with the character char_to_pad_with to make the string length len. If str is already len or more characters long, the function returns input unmodified.
- Definition Classes
- SparkFunctions
-
val
string_pad: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
string_pad_with_char: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
def
string_prefix(input: Column, length: Column): Column
- Definition Classes
- SparkFunctions
-
def
string_repad(input: Column, len: Int, char_to_pad_with: String = " "): Column
Function that trims the string and then pads it on the right side with the given character up to the given length. If the length of the trimmed string is equal to or greater than the given length, the input string is returned.
- input
input string
- len
length in number of characters.
- char_to_pad_with
A character used to pad input string to length len.
- returns
string of a specified length, trimmed of leading and trailing blanks and right-padded with a given character.
- Definition Classes
- SparkFunctions
-
def
string_replace(input: Column, seekStr: Column, newStr: Column, offset: Column = lit(0)): Column
Function to replace occurrences of seekStr with newStr in the input string, after skipping offset characters from the first character.
- input
input string on which to perform replace operation.
- seekStr
string to be replaced in input string.
- newStr
string to be used instead of seekStr in input string.
- offset
number of characters to skip from the beginning of the input string before performing the string_replace operation.
- returns
modified string where seekStr is replaced with newStr in input string.
- Definition Classes
- SparkFunctions
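Example (a minimal sketch; the column name is hypothetical):
  import org.apache.spark.sql.functions._
  // Replaces backslashes with forward slashes across the whole string (default offset 0).
  val normalized = df.withColumn("pathFixed", string_replace(col("path"), lit("\\"), lit("/")))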
-
val
string_replace_first: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
string_replace_first_in_loop: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
string_replace_in_loop: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
string_representation: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
string_rindex: UserDefinedFunction
Returns the index of the first character of the last occurrence of a seek string within another input string. The returned index is 1-based.
- Definition Classes
- SparkFunctions
-
val
string_rindex_with_offset: UserDefinedFunction
UDF to find the index of seekStr in inputStr from the end of inputStr, skipping offset characters from the end. The offset is the number of characters, from the end of str, to skip before searching. The returned position is 1-based.
- Definition Classes
- SparkFunctions
-
val
string_split: UserDefinedFunction
UDF to split the input string via a delimiter string.
- Definition Classes
- SparkFunctions
-
val
string_split_no_empty: UserDefinedFunction
UDF to split the input string via a delimiter string and remove all empty substrings.
- Definition Classes
- SparkFunctions
-
def
string_substring(input: Column, start_position: Column, length: Column): Column
Method to find a substring of the input string.
- input
string on which to find substring.
- start_position
1-based starting position to find the substring from.
- length
total length of substring to be found.
- returns
substring of input string
- Definition Classes
- SparkFunctions
-
def
string_suffix(input: Column, len: Int): Column
- Definition Classes
- SparkFunctions
-
val
take_last_nth: UserDefinedFunction
UDF to return the nth element from the end of the passed array of elements. If the input sequence has fewer than n elements, the first element is returned.
- Definition Classes
- UDFUtils
-
val
take_nth: UserDefinedFunction
UDF to take the Nth element from the beginning. If the input sequence has fewer than N elements, an exception is thrown.
- Definition Classes
- UDFUtils
-
val
test_characters_all: UserDefinedFunction
UDF to count the number of characters in inputStr that are present in charFlag
- Definition Classes
- SparkFunctions
-
def
timezone_to_utc(timezone: String, time: Column): Column
Method to convert a time in the given timezone to UTC.
- Definition Classes
- SparkFunctions
-
def
toString(): String
- Definition Classes
- Any
-
def
today(): Column
Method to return an integer value representing the number of days from “1-1-1990” to today.
- returns
integer value
- Definition Classes
- SparkFunctions
-
val
translate_bytes: UserDefinedFunction
UDF to return a string in the native character set made up of bytes from the given map. Each byte of the result is the value of map indexed by the character code of the corresponding byte of the input string str. The function returns NULL if any argument is NULL.
- Definition Classes
- SparkFunctions
-
val
truncateMicroSeconds: UserDefinedFunction
UDF to truncate the microseconds part of a timestamp. This is needed because Abinitio and Spark have some incompatibility in the microseconds part of the timestamp format.
- Definition Classes
- SparkFunctions
-
val
type_info: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
type_info_with_includes: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
def
unionAll(df: DataFrame*): DataFrame
Method to take the union of all passed dataframes.
- df
list of dataframes to take the union of.
- returns
union of all passed input dataframes.
- Definition Classes
- DataHelpers
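Example (a minimal sketch; assumes the three hypothetical DataFrames share the same schema):
  val allMonths = unionAll(janDf, febDf, marDf)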
-
val
unique_identifier: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
url_encode_escapes: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
windowSpec: WindowSpec
- Definition Classes
- SparkFunctions
- def withSparkProperty[T](key: String, value: String, spark: SparkSession)(body: ⇒ T): T
- def withSubgraphName[T](value: String, spark: SparkSession)(body: ⇒ T): T
- def withTargetId[T](value: String, spark: SparkSession)(body: ⇒ T): T
-
def
writeDataFrame(df: DataFrame, path: String, spark: SparkSession, props: Map[String, String], format: String, partitionColumns: List[String] = Nil, bucketColumns: List[String] = Nil, numBuckets: Option[Int] = None, sortColumns: List[String] = Nil, tableName: Option[String] = None, databaseName: Option[String] = None): Unit
Method to write the data passed in the dataframe in a specific file format.
- df
dataframe containing data.
- path
path to write data to.
- spark
spark session.
- props
underlying data source specific properties.
- format
file format in which to persist data. Supported file formats are csv, text, json, parquet, orc
- partitionColumns
columns to be used for partitioning.
- bucketColumns
used to bucket the output by the given columns. If specified, the output is laid out on the file-system similar to Hive's bucketing scheme.
- numBuckets
number of buckets to be used.
- sortColumns
columns on which to order data while persisting.
- tableName
table name for persisting data.
- databaseName
database name for persisting data.
- Definition Classes
- DataHelpers
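Example (a minimal sketch; the path, property map, and partition column are hypothetical):
  // Persists df as snappy-compressed parquet, partitioned by "ds".
  writeDataFrame(df, "/tmp/out/sales", spark, Map("compression" -> "snappy"), "parquet",
    partitionColumns = List("ds"))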
-
val
writeIntegerToBytes: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
val
writeLongToBytes: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
lazy val
write_to_log: UserDefinedFunction
UDF to write logging parameters to log port.
- Definition Classes
- DataHelpers
-
val
xmlToJSON: UserDefinedFunction
- Definition Classes
- SparkFunctions
-
def
yyyyMMdd_to_YYYYJJJ(in_date: Column): Column
Converts yyyyMMdd to YYYYJJJ
- in_date
date in yyyyMMdd format
- returns
a date converted to YYYYJJJ
- Definition Classes
- SparkFunctions
-
def
zip_eventInfo_arrays(column1: Column, column2: Column): Column
Method to zip two arrays, with the first containing event_type and the second containing event_text
- Definition Classes
- SparkFunctions
- object AbinitioDMLs
- object CDC
- object Component
- object DataFrameValidator
- object DataHelpers
- object FixedFileFormatImplicits
- object FixedFormatHelper
- object FixedFormatSchemaImplicits
- object RestAPIUtils
- object SchemaUtils
- object SparkFunctions
-
object
LongSequence
- Definition Classes
- SparkFunctions
-
object
LongWrappedArray
- Definition Classes
- SparkFunctions
- object SparkTestingUtils