io.smartdatalake.workflow

FileSubFeed

case class FileSubFeed(fileRefs: Option[Seq[FileRef]], dataObjectId: DataObjectId, partitionValues: Seq[PartitionValues], isDAGStart: Boolean = false, isSkipped: Boolean = false, processedInputFileRefs: Option[Seq[FileRef]] = None) extends SubFeed with Product with Serializable

A FileSubFeed is used to transport references to files between Actions.

fileRefs

references (paths) to the files to be processed

dataObjectId

id of the DataObject this SubFeed corresponds to

partitionValues

values of the partitions transported by this SubFeed

isDAGStart

true if this SubFeed is a start node of the DAG

isSkipped

true if this SubFeed is the result of a skipped Action

processedInputFileRefs

used to remember processed input FileRefs for post-processing (e.g. delete after read)
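
As a rough illustration of how these parameters fit together, the sketch below builds a simplified stand-in for FileSubFeed. The DataObjectId, PartitionValues and FileRef types, their fields, and the sample paths and ids are hypothetical placeholders written for this example, not the library's real classes; only the field names and default values of FileSubFeed are taken from the signature above.

```scala
// Hypothetical stand-ins for the library types; the real classes carry more logic.
final case class DataObjectId(id: String)
final case class PartitionValues(elements: Map[String, Any])
final case class FileRef(fullPath: String, fileName: String, partitionValues: PartitionValues)

// Mirrors the documented constructor signature (fields and defaults only).
final case class FileSubFeed(
  fileRefs: Option[Seq[FileRef]],
  dataObjectId: DataObjectId,
  partitionValues: Seq[PartitionValues],
  isDAGStart: Boolean = false,
  isSkipped: Boolean = false,
  processedInputFileRefs: Option[Seq[FileRef]] = None
)

object FileSubFeedExample {
  // Sample partition and file reference (made-up values).
  val day: PartitionValues = PartitionValues(Map("dt" -> "20240101"))
  val ref: FileRef = FileRef("/data/in/dt=20240101/a.csv", "a.csv", day)

  // A subfeed carrying one file reference for one partition; flags keep their defaults.
  val subFeed: FileSubFeed = FileSubFeed(Some(Seq(ref)), DataObjectId("stg-files"), Seq(day))

  // Case-class copy: remember which input files were processed, e.g. for later cleanup.
  val processed: FileSubFeed = subFeed.copy(processedInputFileRefs = subFeed.fileRefs)

  def main(args: Array[String]): Unit = {
    println(subFeed.isDAGStart)
    println(processed.processedInputFileRefs.map(_.size))
  }
}
```

Because FileSubFeed is a case class, the flags and the processed-file list can be changed with copy without mutating the original instance.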

Linear Supertypes
Serializable, Serializable, Product, Equals, SubFeed, SmartDataLakeLogger, DAGResult, AnyRef, Any

Instance Constructors

  1. new FileSubFeed(fileRefs: Option[Seq[FileRef]], dataObjectId: DataObjectId, partitionValues: Seq[PartitionValues], isDAGStart: Boolean = false, isSkipped: Boolean = false, processedInputFileRefs: Option[Seq[FileRef]] = None)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def breakLineage(implicit session: SparkSession, context: ActionPipelineContext): FileSubFeed

    Breaks the lineage: discards an existing DataFrame or list of FileRefs so that it is requested again from the DataObject. On the one hand, this can be used to break long DataFrame lineages that span multiple Actions and instead reread the data from an intermediate table. On the other hand, it is needed when partition values or filter conditions change.

    Definition Classes
    FileSubFeed → SubFeed
  6. def checkPartitionValuesColsExisting(partitions: Set[String]): Boolean
  7. def clearDAGStart(): FileSubFeed
    Definition Classes
    FileSubFeed → SubFeed
  8. def clearPartitionValues(breakLineageOnChange: Boolean = true)(implicit session: SparkSession, context: ActionPipelineContext): FileSubFeed
    Definition Classes
    FileSubFeed → SubFeed
  9. def clearSkipped(): FileSubFeed
    Definition Classes
    FileSubFeed → SubFeed
  10. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  11. val dataObjectId: DataObjectId

    id of the DataObject this SubFeed corresponds to

    Definition Classes
    FileSubFeed → SubFeed
  12. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  13. val fileRefs: Option[Seq[FileRef]]

    references (paths) to the files to be processed

  14. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  15. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
  16. val isDAGStart: Boolean

    true if this SubFeed is a start node of the DAG

    Definition Classes
    FileSubFeed → SubFeed
  17. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  18. val isSkipped: Boolean

    true if this SubFeed is the result of a skipped Action

    Definition Classes
    FileSubFeed → SubFeed
  19. lazy val logger: Logger
    Attributes
    protected
    Definition Classes
    SmartDataLakeLogger
  20. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  21. final def notify(): Unit
    Definition Classes
    AnyRef
  22. final def notifyAll(): Unit
    Definition Classes
    AnyRef
  23. val partitionValues: Seq[PartitionValues]

    values of the partitions transported by this SubFeed

    Definition Classes
    FileSubFeed → SubFeed
  24. val processedInputFileRefs: Option[Seq[FileRef]]

    used to remember processed input FileRefs for post-processing (e.g. delete after read)

  25. def resultId: String
    Definition Classes
    SubFeed → DAGResult
  26. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  27. def toOutput(dataObjectId: DataObjectId): FileSubFeed
    Definition Classes
    FileSubFeed → SubFeed
  28. def union(other: SubFeed)(implicit session: SparkSession, context: ActionPipelineContext): SubFeed
    Definition Classes
    FileSubFeed → SubFeed
  29. def unionPartitionValues(otherPartitionValues: Seq[PartitionValues]): Seq[PartitionValues]
    Definition Classes
    SubFeed
  30. def updatePartitionValues(partitions: Seq[String], breakLineageOnChange: Boolean = true, newPartitionValues: Option[Seq[PartitionValues]] = None)(implicit session: SparkSession, context: ActionPipelineContext): FileSubFeed
    Definition Classes
    FileSubFeed → SubFeed
  31. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  33. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
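
The description of breakLineage in the member list says that breaking the lineage discards the list of FileRefs so that it is requested again from the DataObject, and that this is needed when partition values change. The sketch below shows one plausible reading of that behavior on a simplified stand-in class. SketchFileSubFeed and its helper types are hypothetical, the method bodies are assumptions rather than the library's implementation, and the implicit SparkSession and ActionPipelineContext parameters of the real methods are left out to keep the example self-contained.

```scala
object BreakLineageSketch {
  // Hypothetical, simplified stand-ins; not the library's real classes.
  final case class FileRef(fullPath: String)
  final case class PartitionValues(elements: Map[String, Any])

  final case class SketchFileSubFeed(
    fileRefs: Option[Seq[FileRef]],
    dataObjectId: String,
    partitionValues: Seq[PartitionValues],
    isDAGStart: Boolean = false,
    isSkipped: Boolean = false
  ) {
    // Assumed behavior: drop the cached file references so they must be listed again.
    def breakLineage: SketchFileSubFeed = copy(fileRefs = None)

    // Assumed behavior: reset one flag and leave everything else unchanged.
    def clearDAGStart(): SketchFileSubFeed = copy(isDAGStart = false)
    def clearSkipped(): SketchFileSubFeed = copy(isSkipped = false)

    // Assumed behavior: removing partition values invalidates cached file references,
    // unless the caller opts out via breakLineageOnChange = false.
    def clearPartitionValues(breakLineageOnChange: Boolean = true): SketchFileSubFeed = {
      val cleared = copy(partitionValues = Seq.empty)
      if (breakLineageOnChange && partitionValues.nonEmpty) cleared.breakLineage else cleared
    }
  }

  def main(args: Array[String]): Unit = {
    val feed = SketchFileSubFeed(
      fileRefs = Some(Seq(FileRef("/data/in/a.csv"))),
      dataObjectId = "stg-files",
      partitionValues = Seq(PartitionValues(Map("dt" -> "20240101"))),
      isDAGStart = true
    )
    println(feed.breakLineage.fileRefs) // prints None
    println(feed.clearPartitionValues().fileRefs.isDefined)
    println(feed.clearPartitionValues(breakLineageOnChange = false).fileRefs.isDefined)
    println(feed.clearDAGStart().isDAGStart)
  }
}
```

Every method returns a new instance via copy, which matches the documented return type FileSubFeed on these members and leaves the original subfeed untouched.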
