Global configuration options

classes to register for Spark Kryo serialization
Spark options to set on the Spark session
enable Hive support for the Spark session
enable periodic memory usage logging; see MemoryLogTimerConfig for the detailed configuration
enable the shutdown hook logger to trace the shutdown cause
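
As an illustration, these session-level options could be set in the global section of the HOCON configuration roughly as follows. The key names mirror the descriptions above but should be verified against the GlobalConfig reference of your SDLB version; class names and values are placeholders.

    global {
      # classes to register for Kryo serialization (placeholder class name)
      kryoClasses = ["com.example.MyCustomClass"]
      # options set on the Spark session (example values only)
      sparkOptions {
        "spark.sql.shuffle.partitions" = 8
      }
      # enable Hive support
      enableHive = true
      # periodic memory usage logging; field names follow MemoryLogTimerConfig and are assumed
      memoryLogTimer {
        intervalSec = 60
      }
      # log the cause of a JVM shutdown via a shutdown hook
      shutdownHookLogger = true
    }
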
Define state listeners to be registered for receiving events about the execution of a SmartDataLake job
Define UDFs to be registered in the Spark session. The registered UDFs are available in Spark SQL transformations and in expression evaluation, e.g. in the configuration of ExecutionModes.
Define UDFs in Python to be registered in the Spark session. The registered UDFs are available in Spark SQL transformations, but not for expression evaluation.
Define SecretProviders to be registered.
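
A sketch of how these registrations might look, assuming a map/list structure with className and options entries; all class names are hypothetical and the exact field names (className, options, pythonCode) should be checked against the SDLB reference.

    global {
      # state listeners receive events about the execution of SmartDataLake jobs (hypothetical class)
      stateListeners = [{
        className = "com.example.MyStateListener"
      }]
      # Scala/Java UDFs, available in Spark SQL transformations and expression evaluation
      sparkUDFs {
        myUdf {
          className = "com.example.MyUdfCreator"   # hypothetical UDF creator class
        }
      }
      # Python UDFs, available in Spark SQL transformations but not for expression evaluation
      pythonUDFs {
        myPythonUdf {
          pythonCode = "..."                        # field name assumed
        }
      }
      # secret providers, e.g. for reading secrets from an external store (hypothetical class)
      secretProviders {
        MYSECRETS {
          className = "com.example.MySecretProvider"
          options = { scope = "test" }
        }
      }
    }
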
Configure a list of exceptions for partitioned DataObject ids that are allowed to overwrite all partitions of a table if no partition values are set. This is used to override/avoid the protective error raised when using SDLSaveMode.OverwriteOptimized or OverwritePreserveDirectories. Define it as a list of DataObject ids.
Number of executions to keep runtime data for in streaming mode (default = 10). Must be greater than 1.
Trigger interval for synchronous actions in streaming mode, in seconds (default = 60 seconds). The synchronous actions of the DAG are executed with this interval if possible. Note that asynchronous actions have separate settings, e.g. SparkStreamingMode.triggerInterval.
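
Similarly, a sketch for the partition-overwrite exceptions and the streaming-related settings, again with assumed key names and placeholder DataObject ids:

    global {
      # DataObjects allowed to overwrite all partitions when no partition values are set (placeholder ids)
      allowOverwriteAllPartitionsWithoutPartitionValues = [dataobject1, dataobject2]
      # number of executions to keep runtime data for in streaming mode
      runtimeDataNumberOfExecutionsToKeep = 10
      # trigger interval for synchronous actions in streaming mode, in seconds
      synchronousStreamingTriggerIntervalSec = 60
    }
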
Create a Spark session using the settings from this global config