com.mongodb.spark.rdd.partitioner
The partition key property
The partition size MB property
Calculate the Partitions
the MongoConnector
the pipeline to apply if any. Note this pipeline may have been appended to during optimization.
the partitions
The number of samples for each partition
The Sample Partitioner.
Uses the average document size and random sampling of the collection to determine suitable partitions for the collection.
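The arithmetic behind this description can be sketched roughly as follows. This is an illustrative sketch only, not the connector's implementation: the object `SamplePartitionSketch` and its methods are hypothetical names, the partition key is assumed to be an integer, and the rule of taking every n-th sorted sample as a boundary is an assumption about how samples might map to partitions.

```scala
// Hypothetical helper, not part of the connector's API.
object SamplePartitionSketch {

  /** Approximate number of documents that fit into one partition. */
  def docsPerPartition(partitionSizeMB: Int, avgDocSizeBytes: Long): Long = {
    require(avgDocSizeBytes > 0, "average document size must be positive")
    (partitionSizeMB.toLong * 1024 * 1024) / avgDocSizeBytes
  }

  /** Total number of random samples to request from the collection. */
  def sampleCount(docCount: Long, docsPerPartition: Long, samplesPerPartition: Int): Long = {
    require(docsPerPartition > 0, "documents per partition must be positive")
    samplesPerPartition * docCount / docsPerPartition
  }

  /** Keeps every `samplesPerPartition`-th sorted sample key as a partition boundary (assumed rule). */
  def boundaries(sortedSampleKeys: Seq[Int], samplesPerPartition: Int): Seq[Int] =
    sortedSampleKeys.zipWithIndex.collect {
      case (key, i) if i % samplesPerPartition == 0 => key
    }
}
```

Under these assumptions, a collection of 655360 documents averaging 1024 bytes each, with a 64 MB partition size and 10 samples per partition, would need 100 samples and yield 10 boundaries.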
Configuration Properties
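The prefix when using sparkConf is spark.mongodb.input.partitionerOptions, followed by the property name:

 - partitionKey: the partition key property. Defaults to _id.
 - partitionSizeMB: the partition size MB property. Defaults to 64.
 - samplesPerPartition: the number of samples for each partition. Defaults to 10.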
*Note:* Requires MongoDB 3.2+.
*Note:* Does not support views. Use MongoPaginateByCountPartitioner or create a custom partitioner.
Since: 1.0
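As an illustration of the sparkConf prefix described under Configuration Properties, the sketch below builds the full property keys with their default values. It assumes the property names are partitionKey, partitionSizeMB, and samplesPerPartition, and that the partitioner itself is selected with the key spark.mongodb.input.partitioner; the object `SamplePartitionerConfig` is a hypothetical name used only for this example.

```scala
// Hypothetical holder for the settings; not part of the connector's API.
object SamplePartitionerConfig {
  // Prefix stated in the documentation above.
  val Prefix = "spark.mongodb.input.partitionerOptions"

  // Assumed property names, paired with the documented default values.
  val options: Map[String, String] = Map(
    s"$Prefix.partitionKey"        -> "_id",
    s"$Prefix.partitionSizeMB"     -> "64",
    s"$Prefix.samplesPerPartition" -> "10"
  )

  // Assumed key for choosing the partitioner by name.
  val settings: Map[String, String] =
    options + ("spark.mongodb.input.partitioner" -> "MongoSamplePartitioner")
}
```

With Spark on the classpath, these pairs could then be applied in one call, for example `new SparkConf().setAll(SamplePartitionerConfig.settings)`.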