trait Source extends SparkDataStream
A source of continually arriving data for a streaming query. A Source must have a monotonically increasing notion of progress that can be represented as an Offset. Spark will regularly query each Source to see if any more data is available.
Note that we extend SparkDataStream here to make the v1 streaming source API compatible with data source v2.
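For orientation, here is a minimal sketch of what an implementation can look like, assuming an in-memory buffer and LongOffset-based progress; the class name, the addData helper, and the storage are illustrative, not part of the API:

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.execution.streaming.{LongOffset, Offset, Source}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Minimal illustration only (hypothetical class): a Source backed by an
// in-memory buffer, using the row count as the offset so that progress is
// monotonically increasing.
class BufferedSource(sqlContext: SQLContext) extends Source {
  private val rows = scala.collection.mutable.ArrayBuffer.empty[String]

  def addData(data: String): Unit = synchronized { rows += data }

  override def schema: StructType = StructType(StructField("value", StringType) :: Nil)

  // None until the first record arrives, then the highest offset seen so far.
  override def getOffset: Option[Offset] = synchronized {
    if (rows.isEmpty) None else Some(LongOffset(rows.size))
  }

  // Data in the half-open interval (start, end]. Parsing the offset's JSON
  // form handles both LongOffset and a SerializedOffset recovered from the log.
  override def getBatch(start: Option[Offset], end: Offset): DataFrame = {
    val from = start.map(_.json.toInt).getOrElse(0)
    val to = end.json.toInt
    import sqlContext.implicits._
    sqlContext.sparkContext.parallelize(rows.slice(from, to).toSeq).toDF("value")
  }

  override def stop(): Unit = ()
}
```

Spark's built-in v1 sources construct the DataFrame they return through internal APIs so that its logical plan is flagged as streaming; the sketch above illustrates the contract rather than a drop-in implementation.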
Linear Supertypes: SparkDataStream, AnyRef, Any
Abstract Value Members
- abstract def getBatch(start: Option[Offset], end: Offset): DataFrame
  Returns the data that is between the offsets (start, end]. When start is None, the batch should begin with the first record. This method must always return the same data for a particular start and end pair, even after the Source has been restarted on a different node.
  Higher layers will always call this method with a value of start greater than or equal to the last value passed to commit, and a value of end less than or equal to the last value returned by getOffset.
  It is possible for the Offset type to be a SerializedOffset when it was obtained from the log. Moreover, StreamExecution only compares the Offset JSON representations to determine whether two offsets are equal. This could have ramifications when upgrading Offset JSON formats, i.e., two equivalent Offset objects could differ between versions. Consequently, StreamExecution may call this method with two such equivalent Offset objects; in that case, the Source should return an empty DataFrame.
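The interval contract can be made concrete with a small, hypothetical call sequence (the replay helper and the LongOffset values are illustrative):

```scala
import org.apache.spark.sql.execution.streaming.{LongOffset, Source}

// Hypothetical replay of the calls the engine makes against a source whose
// offsets are LongOffset(1), LongOffset(2), ... per record.
def replay(src: Source): Unit = {
  val first = src.getBatch(None, LongOffset(5))                // all records up to offset 5
  val next  = src.getBatch(Some(LongOffset(5)), LongOffset(8)) // records after 5, up to and including 8
  val again = src.getBatch(Some(LongOffset(5)), LongOffset(8)) // must hold the same data as `next`,
                                                               // even after a restart on another node
}
```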
- abstract def getOffset: Option[Offset]
  Returns the maximum available offset for this source, or None if this source has never received any data.
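The None contract matters for the engine's polling loop. Roughly, with a hypothetical pollOnce helper that approximates (but is not) Spark's actual code:

```scala
import org.apache.spark.sql.execution.streaming.{Offset, Source}

// Hypothetical helper approximating the engine's polling step: a new batch
// is planned only when getOffset reports an offset we have not seen yet.
def pollOnce(src: Source, lastSeen: Option[Offset]): Option[Offset] =
  src.getOffset match {
    case Some(available) if !lastSeen.contains(available) =>
      Some(available) // new data arrived: plan a batch ending at `available`
    case _ =>
      lastSeen        // nothing new (or no data yet); poll again later
  }
```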
- abstract def schema: StructType
  Returns the schema of the data from this source.
- abstract def stop(): Unit
  - Definition Classes: SparkDataStream
Concrete Value Members
- final def !=(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def ##(): Int
  - Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  - Definition Classes: Any
- def clone(): AnyRef
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()
- def commit(end: connector.read.streaming.Offset): Unit
  - Definition Classes: Source → SparkDataStream
- def commit(end: Offset): Unit
  Informs the source that Spark has completed processing all data for offsets less than or equal to end and will only request offsets greater than end in the future.
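Building on the hypothetical BufferedSource sketch above, a source might use this callback to reclaim data it no longer needs:

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.execution.streaming.Offset

// Hypothetical continuation of the BufferedSource sketch above: once Spark
// commits offset `end`, offsets at or below it will never be requested
// again, so the data backing them may be reclaimed.
class ReclaimingSource(sqlContext: SQLContext) extends BufferedSource(sqlContext) {
  @volatile private var committed = 0L

  override def commit(end: Offset): Unit = {
    committed = end.json.toLong // same JSON form for LongOffset and SerializedOffset
    // e.g. trim an internal buffer, delete consumed files, or advance a cursor here
  }
}
```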
- def deserializeOffset(json: String): connector.read.streaming.Offset
  - Definition Classes: Source → SparkDataStream
- final def eq(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- def finalize(): Unit
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def hashCode(): Int
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def initialOffset(): connector.read.streaming.Offset
  - Definition Classes: Source → SparkDataStream
- final def isInstanceOf[T0]: Boolean
  - Definition Classes: Any
- final def ne(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- final def notify(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def notifyAll(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def synchronized[T0](arg0: ⇒ T0): T0
  - Definition Classes: AnyRef
- def toString(): String
  - Definition Classes: AnyRef → Any
- final def wait(): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()