class MongoPaginateBySizePartitioner extends Logging with MongoPartitioner with MongoPaginationPartitioner
The pagination by size partitioner.
Paginates the collection into partitions based on their size. Uses the collStats command and the average document size to
estimate the partition boundaries.
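To make the estimate concrete, here is a minimal sketch of the arithmetic, not the connector's actual code. It assumes the document count and average object size have already been read from a collStats result. The object PartitionEstimate and its estimate method are hypothetical names used only for illustration.

```scala
// Hypothetical helper for illustration; not part of the connector API.
object PartitionEstimate {

  /** Returns (documents per partition, number of partitions).
    *
    * count           - number of documents in the collection (assumed to come from collStats)
    * avgObjSizeBytes - average document size in bytes (assumed to come from collStats)
    * partitionSizeMB - target partition size in MB, 64 by default as documented
    */
  def estimate(count: Long, avgObjSizeBytes: Long, partitionSizeMB: Int = 64): (Long, Long) = {
    require(count >= 0, "count must not be negative")
    require(avgObjSizeBytes > 0, "avgObjSizeBytes must be positive")
    require(partitionSizeMB > 0, "partitionSizeMB must be positive")

    val partitionSizeBytes = partitionSizeMB.toLong * 1024 * 1024
    // At least one document per partition, even when documents exceed the target size.
    val docsPerPartition = math.max(1L, partitionSizeBytes / avgObjSizeBytes)
    // Ceiling division: a partially filled last partition still counts.
    val partitions = (count + docsPerPartition - 1) / docsPerPartition
    (docsPerPartition, partitions)
  }
}
```

Under these assumptions, a collection of 1,000,000 documents averaging 1 KB each splits into 16 partitions of up to 65,536 documents at the default size. Each partition boundary then requires one cursor.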
Configuration Properties
The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:
partitionKey, the field to partition the collection by. The field should be indexed and contain unique values.
Defaults to _id.
partitionSizeMB, the size (in MB) for each partition. Defaults to 64.
Note: This can be an expensive operation, as it creates one cursor for every estimated partitionSizeMB's worth of documents.
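As a hedged illustration of how the prefixed property names are composed, the sketch below builds the settings as plain key/value pairs. The object PartitionerOptions and its settings method are hypothetical. The dot separator after the prefix and the spark.mongodb.input.partitioner key used to select this partitioner are assumptions not stated on this page.

```scala
// Hypothetical helper for illustration; not part of the connector API.
object PartitionerOptions {
  // Prefix from the documentation above; the trailing dot is an assumption.
  val Prefix = "spark.mongodb.input.partitionerOptions."

  /** Builds the configuration entries, using the documented defaults. */
  def settings(partitionKey: String = "_id", partitionSizeMB: Int = 64): Map[String, String] =
    Map(
      // Assumed key for selecting the partitioner by name.
      "spark.mongodb.input.partitioner" -> "MongoPaginateBySizePartitioner",
      Prefix + "partitionKey" -> partitionKey,
      Prefix + "partitionSizeMB" -> partitionSizeMB.toString
    )
}
```

The resulting pairs could then be applied to a SparkConf, for example with setAll. A smaller partitionSizeMB yields more partitions and therefore more cursors.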
Since
1.0
Linear Supertypes
MongoPaginationPartitioner, MongoPartitioner, Serializable, Serializable, Logging, LoggingTrait, AnyRef, Any