partitioner

Type Members

trait BsonValueOrdering extends Ordering[BsonValue]

:: DeveloperApi ::
Ordering implementation for BsonValues

Annotations
@DeveloperApi()
Since
1.0
class DefaultMongoPartitioner extends Logging with MongoPartitioner

The default collection partitioner implementation.
Wraps the MongoSamplePartitioner and provides in-depth information for users of older MongoDB versions.

Since
1.0
class MongoPaginateByCountPartitioner extends Logging with MongoPartitioner with MongoPaginationPartitioner

The pagination by count partitioner.
Paginates the collection into a maximum number of partitions.
Configuration Properties
The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:
- partitionKey, the field to partition the collection by. The field should be indexed and contain unique values. Defaults to _id.
- numberOfPartitions, the maximum number of partitions to create. Defaults to 64.
Note: This can be an expensive operation, as it creates one cursor for every partition.
Since
1.0
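
As a minimal sketch, the properties above could be spelled out as full configuration keys like this. The prefix is the one quoted above; the key `spark.mongodb.input.partitioner` used to select the partitioner and the helper `partitionerOptions` are assumptions for illustration and are not stated on this page. A plain Map is used so the example does not depend on Spark.

```scala
object PaginateByCountConf {
  // Prefix quoted in the documentation above.
  val OptionsPrefix = "spark.mongodb.input.partitionerOptions"

  // Hypothetical helper: prepends the prefix to each property name.
  def partitionerOptions(options: Map[String, String]): Map[String, String] =
    options.map { case (name, value) => s"$OptionsPrefix.$name" -> value }

  val settings: Map[String, String] =
    // Assumed key for selecting the partitioner; not stated on this page.
    Map("spark.mongodb.input.partitioner" -> "MongoPaginateByCountPartitioner") ++
      partitionerOptions(Map("partitionKey" -> "_id", "numberOfPartitions" -> "32"))
}
```

Each entry of `settings` could then be applied to a configuration object, for example with `SparkConf.set(key, value)`.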
class MongoPaginateBySizePartitioner extends Logging with MongoPartitioner with MongoPaginationPartitioner

The pagination by size partitioner.
Paginates the collection into partitions based on their size. Uses the collStats command and the average document size to estimate the partition boundaries.
Configuration Properties
The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:
- partitionKey, the field to partition the collection by. The field should be indexed and contain unique values. Defaults to _id.
- partitionSizeMB, the size (in MB) for each partition. Defaults to 64.
Note: This can be an expensive operation, as it creates one cursor for every estimated partitionSizeMB's worth of documents.
Note: Does not support views. Use MongoPaginateByCountPartitioner or create a custom partitioner.
Since
1.0
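
The size-based estimate described above can be sketched as simple arithmetic. This assumes the partitioner divides the partition size in bytes by the average document size reported by collStats; the actual implementation may round or clamp differently, and the names below are hypothetical.

```scala
object PaginateBySizeEstimate {
  // Documents that fit into one partition of `partitionSizeMB`,
  // given the average document size in bytes.
  def docsPerPartition(partitionSizeMB: Int, avgObjSizeBytes: Long): Long = {
    require(avgObjSizeBytes > 0, "average document size must be positive")
    val partitionBytes = partitionSizeMB.toLong * 1024 * 1024
    math.max(1L, partitionBytes / avgObjSizeBytes)
  }

  // Estimated number of partitions (and hence cursors) for `count` documents.
  def estimatedPartitions(count: Long, partitionSizeMB: Int, avgObjSizeBytes: Long): Long = {
    val perPartition = docsPerPartition(partitionSizeMB, avgObjSizeBytes)
    (count + perPartition - 1) / perPartition // ceiling division
  }
}
```

With the default of 64 MB and 1 KB documents, one million documents would come out at roughly 16 partitions under these assumptions, which is also the number of cursors the note above warns about.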
case class MongoPartition(index: Int, queryBounds: BsonDocument, locations: Seq[String]) extends Partition with Product with Serializable

An identifier for a partition in a MongoRDD.
index
The partition's index within its parent RDD
queryBounds
The query bounds for the data within this partition
locations
The preferred locations (hostnames) for the data

Since
1.0
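
To illustrate how an index and query bounds relate, the sketch below turns a sorted list of split keys into half-open ranges. `SketchPartition` is a hypothetical stand-in that keeps the bounds as optional Ints rather than a BsonDocument; the assumption that bounds are half-open (lower inclusive, upper exclusive) is not stated on this page.

```scala
object PartitionBoundsSketch {
  // Stand-in for MongoPartition: bounds are optional Ints, not a BsonDocument.
  final case class SketchPartition(index: Int, lower: Option[Int], upper: Option[Int])

  // Turns sorted split keys into ranges [lower, upper); the first range has
  // no lower bound and the last range has no upper bound.
  def fromSplitKeys(splitKeys: Seq[Int]): Seq[SketchPartition] = {
    val keys: Seq[Option[Int]] = splitKeys.map(k => Some(k))
    val lowers: Seq[Option[Int]] = None +: keys
    val uppers: Seq[Option[Int]] = keys :+ None
    lowers.zip(uppers).zipWithIndex.map { case ((lo, hi), i) => SketchPartition(i, lo, hi) }
  }
}
```

Two split keys therefore yield three partitions, and an empty list of split keys yields a single unbounded partition.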
trait MongoPartitioner extends Logging with Serializable

The MongoPartitioner provides the partitions of a collection
class MongoSamplePartitioner extends Logging with MongoPartitioner

The Sample Partitioner.
Uses the average document size and random sampling of the collection to determine suitable partitions for the collection.
Configuration Properties
The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:
- partitionKey, the field to partition the collection by. The field should be indexed and contain unique values. Defaults to _id.
- partitionSizeMB, the size (in MB) for each partition. Defaults to 64.
- samplesPerPartition, the number of samples for each partition. Defaults to 10.
Note: Requires MongoDB 3.2+.
Note: Does not support views. Use MongoPaginateByCountPartitioner or create a custom partitioner.
Since
1.0
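
One way to picture the role of samplesPerPartition is the sketch below: sample a batch of keys per estimated partition, sort them, and keep every samplesPerPartition-th key as a split point. This is an assumption about the general technique, not a description of the connector's code, and the function names are hypothetical.

```scala
object SampleBoundaries {
  // Number of keys to sample, assuming one batch per estimated partition.
  def numberOfSamples(partitions: Int, samplesPerPartition: Int): Int =
    partitions * samplesPerPartition

  // Keeps every `samplesPerPartition`-th key of the sorted sample as a split point.
  def splitKeys[A: Ordering](samples: Seq[A], samplesPerPartition: Int): Seq[A] = {
    require(samplesPerPartition > 0, "samplesPerPartition must be positive")
    samples.sorted.zipWithIndex.collect {
      case (key, i) if i > 0 && i % samplesPerPartition == 0 => key
    }
  }
}
```

More samples per partition give a finer picture of the key distribution, so the chosen split points should track skewed data more closely, at the cost of a larger sample.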
class MongoShardedPartitioner extends Logging with MongoPartitioner

The Sharded Partitioner
Partitions collections by shard and chunk.
Configuration Properties
The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:
- shardKey, the shardKey for the collection. Defaults to _id.
Since
1.0
class MongoSinglePartitioner extends Logging with MongoPartitioner

The Single Partitioner.
Creates a single partition for the whole collection.
Note: Using this partitioner loses any parallelism and therefore is not generally recommended.

Since
1.0
class MongoSplitVectorPartitioner extends Logging with MongoPartitioner

The SplitVector Partitioner.
Uses the splitVector command on the primary node to generate partitions for a collection. Requires ClusterManager privilege.
Configuration Properties
The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:
- partitionKey, the field to partition the collection by. The field should be indexed and contain unique values. Defaults to _id.
- partitionSizeMB, the size (in MB) for each partition. Defaults to 64.
Since
1.0

Value Members

object DefaultMongoPartitioner extends DefaultMongoPartitioner with Product with Serializable
object MongoPaginateByCountPartitioner extends MongoPaginateByCountPartitioner with Product with Serializable
object MongoPaginateBySizePartitioner extends MongoPaginateBySizePartitioner with Product with Serializable
object MongoPartition extends Serializable

The MongoPartition companion object

Since
1.0
object MongoSamplePartitioner extends MongoSamplePartitioner with Product with Serializable
object MongoShardedPartitioner extends MongoShardedPartitioner with Product with Serializable
object MongoSinglePartitioner extends MongoSinglePartitioner with Product with Serializable
object MongoSplitVectorPartitioner extends MongoSplitVectorPartitioner with Product with Serializable
object PartitionerHelper

:: DeveloperApi ::
Helper methods for partitioner implementations

Annotations
@DeveloperApi()
Since
1.0
