org.apache.spark.sql.execution.streaming
Since: 2.0.0
The string that represents the format that this data source provider uses. This is overridden by children to provide a nice alias for the data source. For example:
override def shortName(): String = "parquet"
Since: 1.5.0
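As a complement to the parquet example above, here is a minimal sketch of how a data source registers such an alias via Spark's DataSourceRegister trait (the class name is illustrative, not part of Spark):

```scala
import org.apache.spark.sql.sources.DataSourceRegister

// Illustrative provider: the alias returned by shortName() is what users
// pass to .format(...) instead of the provider's fully qualified class name.
class MyRateRegister extends DataSourceRegister {
  override def shortName(): String = "rate"
}
```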
Returns the name and schema of the source that can be used to continually read data.
Since: 2.0.0
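To make the contract concrete, here is a minimal sketch (not Spark's actual implementation; the class name is illustrative) of a StreamSourceProvider that reports a fixed name and the two-column schema described below:

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.execution.streaming.Source
import org.apache.spark.sql.sources.StreamSourceProvider
import org.apache.spark.sql.types._

// Illustrative provider: always reports the same source name and schema,
// regardless of any user-supplied schema or options.
class RateLikeProvider extends StreamSourceProvider {
  override def sourceSchema(
      sqlContext: SQLContext,
      schema: Option[StructType],
      providerName: String,
      parameters: Map[String, String]): (String, StructType) = {
    // A rate-like source produces (timestamp, value) rows.
    ("rate", StructType(
      StructField("timestamp", TimestampType) ::
      StructField("value", LongType) :: Nil))
  }

  // Source creation is out of scope for this sketch.
  override def createSource(
      sqlContext: SQLContext,
      metadataPath: String,
      schema: Option[StructType],
      providerName: String,
      parameters: Map[String, String]): Source =
    throw new UnsupportedOperationException("sketch only")
}
```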
A source that generates incrementing long values with timestamps. Each generated row has two columns: a timestamp column for the generation time and an auto-incrementing long column starting at 0L.
This source supports the following options:
- rowsPerSecond (e.g. 100, default: 1): How many rows should be generated per second.
- rampUpTime (e.g. 5s, default: 0s): How long to ramp up before the generating speed becomes rowsPerSecond. Granularities finer than seconds are truncated to integer seconds.
- numPartitions (e.g. 10, default: Spark's default parallelism): The number of partitions for the generated rows. The source will try its best to reach rowsPerSecond, but the query may be resource constrained; numPartitions can be tweaked to help reach the desired speed.
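The options above can be exercised end to end with a short usage sketch (the local SparkSession setup here is for illustration only):

```scala
import org.apache.spark.sql.SparkSession

// Local session for demonstration purposes.
val spark = SparkSession.builder()
  .master("local[2]")
  .appName("rate-source-demo")
  .getOrCreate()

val df = spark.readStream
  .format("rate")                 // the provider's short name
  .option("rowsPerSecond", 100)   // target generation rate
  .option("rampUpTime", "5s")     // ramp up to rowsPerSecond over 5 seconds
  .option("numPartitions", 10)    // partitions for the generated rows
  .load()

// The fixed schema: a timestamp column and an incrementing long column.
df.printSchema()
```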