io.smartdatalake.workflow.action.customlogic
Optional class name implementing trait CustomDfTransformer.
Optional file from which the Scala code for the transformation is loaded. The Scala code in the file needs to be a function of type fnTransformType.
Optional Scala code for the transformation. The Scala code needs to be a function of type fnTransformType.
Optional SQL code for the transformation. Use tokens %{<key>} to be replaced with runtimeOptions values in the SQL code. Example: "select * from test where run = %{runId}"
Optional Python file to use for the Python transformation. The Python code can use the variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.
Optional Python code to use for the Python transformation. The Python code can use the variables inputDf, dataObjectId and options. The transformed DataFrame has to be set with setOutputDf.
Options to pass to the transformation.
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
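As a minimal sketch, the SQL variant with runtimeOptions token substitution might be configured as follows in an action's transformer block; the action name, DataObject ids and table name are illustrative placeholders, not taken from the source:

```
# hypothetical action configuration (HOCON); all ids are placeholders
copy-test {
  type = CopyAction
  inputId = stg-test
  outputId = int-test
  transformer {
    # %{runId} is replaced with the evaluated runtimeOptions entry "runId"
    sqlCode = "select * from test where run = %{runId}"
    runtimeOptions {
      # Spark SQL expression, evaluated against an instance of DefaultExpressionData
      runId = "runId"
    }
  }
}
```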
Configuration of a custom Spark-DataFrame transformation between one input and one output (1:1). Define a transform function which receives a DataObjectId, a DataFrame and a map of options and has to return a DataFrame; see also CustomDfTransformer.
Note about the Python transformation: an environment with Python and PySpark is needed. The PySpark session is initialized and available under the variables sc, session and sqlContext. Other variables available are:
inputDf: input DataFrame
options: transformation options as Map[String,String]
dataObjectId: id of the input DataObject as String
The output DataFrame must be set with setOutputDf(df).