:: DeveloperApi ::
The default collection partitioner implementation
Wraps the MongoSamplePartitioner and provides in-depth information for users of older versions of MongoDB.
1.0
The pagination by count partitioner.
Paginates the collection into a maximum number of partitions.
The prefix when using sparkConf is spark.mongodb.input.partitionerOptions, followed by the property name:
partitionKey (default: _id)
numberOfPartitions (default: 64)
Note: This can be an expensive operation, as it creates one cursor for every partition.
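The pagination described above can be pictured as splitting a document count into at most a fixed number of pages. The following is a minimal sketch of that arithmetic only; `PaginateByCountSketch` and `partitionOffsets` are hypothetical names and are not part of the connector's API, and the connector itself may compute its boundaries differently.

```scala
// Hypothetical helper for illustration; not the connector's implementation.
object PaginateByCountSketch {

  /** Returns the skip offset at which each page (partition) would start. */
  def partitionOffsets(documentCount: Long, numberOfPartitions: Int): Seq[Long] = {
    require(numberOfPartitions > 0, "numberOfPartitions must be positive")
    if (documentCount <= 0L) Seq.empty
    else {
      // Round up so that no more than numberOfPartitions pages are produced.
      val docsPerPartition = (documentCount + numberOfPartitions - 1) / numberOfPartitions
      (0L until documentCount by docsPerPartition).toList
    }
  }
}
```

Each offset corresponds to one boundary lookup, which is why the cost grows with the number of partitions.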
1.0
The pagination by size partitioner.
Paginates the collection into partitions based on their size. Uses the collStats command and the average document size to estimate the partition boundaries.
The prefix when using sparkConf is spark.mongodb.input.partitionerOptions, followed by the property name:
partitionKey (default: _id)
partitionSizeMB (default: 64)
*Note:* This can be an expensive operation, as it creates one cursor for every estimated partitionSizeMB's worth of documents.
*Note:* Does not support views. Use MongoPaginateByCountPartitioner or create a custom partitioner.
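To make the size-based estimate concrete, the sketch below derives a documents-per-partition figure from an average document size and a target partition size in megabytes. It assumes the inputs come from the `count` and `avgObjSize` fields reported by collStats; `PaginateBySizeSketch` and its methods are hypothetical names, and the connector's own calculation may differ.

```scala
// Hypothetical helper for illustration; not the connector's implementation.
object PaginateBySizeSketch {

  /** Estimated number of documents that fit into one partition of partitionSizeMB megabytes. */
  def docsPerPartition(avgObjSizeBytes: Double, partitionSizeMB: Int): Long = {
    require(avgObjSizeBytes > 0, "avgObjSizeBytes must be positive")
    require(partitionSizeMB > 0, "partitionSizeMB must be positive")
    val partitionSizeBytes = partitionSizeMB.toLong * 1024L * 1024L
    // Always place at least one document in a partition.
    math.max(1L, math.floor(partitionSizeBytes / avgObjSizeBytes).toLong)
  }

  /** Estimated number of partitions (and therefore boundary lookups) for a collection. */
  def estimatedPartitions(documentCount: Long, avgObjSizeBytes: Double, partitionSizeMB: Int): Long = {
    val perPartition = docsPerPartition(avgObjSizeBytes, partitionSizeMB)
    (documentCount + perPartition - 1) / perPartition
  }
}
```

For example, with 1 KiB documents and a 64 MB target, each partition would hold an estimated 65536 documents.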
1.0
An identifier for a partition in a MongoRDD.
The partition's index within its parent RDD
The query bounds for the data within this partition
The preferred locations (hostnames) for the data
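As an illustration of how an index, query bounds, and preferred locations fit together, the sketch below turns a sorted list of boundary values into a sequence of partition descriptors. `PartitionSketch` is a hypothetical stand-in with simplified types (numeric bounds rather than a query document); it is not the connector's MongoPartition class.

```scala
// Hypothetical model for illustration; the real MongoPartition uses its own types.
object MongoPartitionSketch {

  /** A partition covering keys in [lowerBound, upperBound); None means unbounded on that side. */
  final case class PartitionSketch(
      index: Int,
      lowerBound: Option[Long],
      upperBound: Option[Long],
      locations: Seq[String])

  /** Builds one partition per gap between consecutive boundary values. */
  def fromBoundaries(boundaries: Seq[Long], locations: Seq[String]): Seq[PartitionSketch] = {
    val bounds: Seq[Option[Long]] = boundaries.map(b => Option(b))
    val lowers = None +: bounds
    val uppers = bounds :+ None
    lowers.zip(uppers).zipWithIndex.map { case ((lo, hi), i) =>
      PartitionSketch(i, lo, hi, locations)
    }
  }
}
```

Two boundaries yield three partitions: one open below, one bounded on both sides, and one open above.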
1.0
The MongoPartitioner provides the partitions of a collection.
The Sample Partitioner.
Uses the average document size and random sampling of the collection to determine suitable partitions for the collection.
The prefix when using sparkConf is spark.mongodb.input.partitionerOptions, followed by the property name:
partitionKey (default: _id)
partitionSizeMB (default: 64)
samplesPerPartition (default: 10)
*Note:* Requires MongoDB 3.2+
*Note:* Does not support views. Use MongoPaginateByCountPartitioner or create a custom partitioner.
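The sampling idea can be sketched without a database: draw a fixed number of random keys per intended partition, sort them, and keep every n-th sample as a split point. The code below is a rough sketch under the assumption that the number of partitions has already been estimated from the average document size; `SamplePartitionerSketch` and `splitKeys` are hypothetical names, and an in-memory shuffle stands in for server-side sampling.

```scala
import scala.util.Random

// Hypothetical helper for illustration; not the connector's implementation.
object SamplePartitionerSketch {

  /** Picks split keys by sampling samplesPerPartition keys for each intended partition. */
  def splitKeys(
      keys: IndexedSeq[Long],
      numberOfPartitions: Int,
      samplesPerPartition: Int,
      rng: Random): Seq[Long] = {
    require(numberOfPartitions > 0 && samplesPerPartition > 0, "arguments must be positive")
    val numberOfSamples = math.min(keys.size, numberOfPartitions * samplesPerPartition)
    // Random sample of the partition key values, sorted by key.
    val samples = rng.shuffle(keys).take(numberOfSamples).sorted
    // Keep every samplesPerPartition-th sample as a boundary between partitions.
    samples.zipWithIndex.collect {
      case (key, i) if i > 0 && i % samplesPerPartition == 0 => key
    }
  }
}
```

Taking more samples per partition makes the split points track the key distribution more closely, at the cost of a larger sample.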
1.0
The Sharded Partitioner
Partitions collections by shard and chunk.
The prefix when using sparkConf is spark.mongodb.input.partitionerOptions, followed by the property name:
shardkey (default: _id)
1.0
The Single Partitioner.
Creates a single partition for the whole collection.
Note: Using this partitioner loses any parallelism and therefore is not generally recommended.
1.0
The SplitVector Partitioner.
Uses the SplitVector command on the primary node to generate partitions for a collection. Requires ClusterManager privilege.
The prefix when using sparkConf is spark.mongodb.input.partitionerOptions, followed by the property name:
partitionKey (default: _id)
partitionSizeMB (default: 64)
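The prefix rule above amounts to simple key construction. The sketch below builds fully qualified configuration keys from bare property names, assuming the prefix and the property name are joined with a dot; `PartitionerOptionsSketch` and `toSparkConfEntries` are hypothetical names, and the property names in the usage line are taken as assumptions about this partitioner's options.

```scala
// Hypothetical helper for illustration; it only builds strings and does not touch Spark.
object PartitionerOptionsSketch {

  val Prefix = "spark.mongodb.input.partitionerOptions"

  /** Prepends the partitioner options prefix to each property name. */
  def toSparkConfEntries(options: Map[String, String]): Map[String, String] =
    options.map { case (name, value) => s"$Prefix.$name" -> value }
}
```

For instance, `PartitionerOptionsSketch.toSparkConfEntries(Map("partitionSizeMB" -> "64"))` yields the key `spark.mongodb.input.partitionerOptions.partitionSizeMB`, which could then be set on a SparkConf with `set(key, value)`.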
1.0
The MongoPartition companion object
1.0
:: DeveloperApi ::
Helper methods for partitioner implementations
1.0
:: DeveloperApi ::
Ordering implementation for BsonValues
1.0