package com.mongodb.spark.rdd.partitioner

Type Members

  1. trait BsonValueOrdering extends Ordering[BsonValue]

    :: DeveloperApi ::

    Ordering implementation for BsonValues.

    Annotations
    @DeveloperApi()
    Since

    1.0

  2. class DefaultMongoPartitioner extends Logging with MongoPartitioner

    The default collection partitioner implementation.

    Wraps the MongoSamplePartitioner and provides detailed information for users of older MongoDB versions.

    Since

    1.0

  3. class MongoPaginateByCountPartitioner extends Logging with MongoPartitioner with MongoPaginationPartitioner

    The pagination by count partitioner.

    Paginates the collection into a maximum number of partitions.

    Configuration Properties

    The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:

    • partitionKey, the field to partition the collection by. The field should be indexed and contain unique values. Defaults to _id.
    • numberOfPartitions, the maximum number of partitions to create. Defaults to 64.

    Note: This can be an expensive operation, as it creates one cursor for every partition.
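As a rough illustration of the naming rule above, the following sketch builds the fully qualified property keys as plain strings, without depending on Spark. The object name and the helper `optionKey` are hypothetical, the value 32 is arbitrary, and the `spark.mongodb.input.partitioner` key used to select the partitioner is an assumption that is not stated on this page.

```scala
// Sketch only: builds the property keys as plain strings.
object PaginateByCountConfig {
  // Prefix taken from the documentation above; it is joined to the property name with a dot.
  val OptionsPrefix = "spark.mongodb.input.partitionerOptions"

  // Hypothetical helper: qualifies a partitioner option name for use in a SparkConf.
  def optionKey(name: String): String = s"$OptionsPrefix.$name"

  val settings: Map[String, String] = Map(
    // Assumption: the partitioner itself is selected with this key.
    "spark.mongodb.input.partitioner" -> "MongoPaginateByCountPartitioner",
    optionKey("partitionKey")         -> "_id", // the documented default
    optionKey("numberOfPartitions")   -> "32"   // arbitrary value below the default of 64
  )
}
```

A map like this could then be passed to `SparkConf.setAll`, which accepts key/value pairs. Lowering numberOfPartitions also lowers the number of cursors the partitioner opens.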

    Since

    1.0

  4. class MongoPaginateBySizePartitioner extends Logging with MongoPartitioner with MongoPaginationPartitioner

    The pagination by size partitioner.

    Paginates the collection into partitions based on their size. Uses the collStats command and the average document size to estimate the partition boundaries.

    Configuration Properties

    The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:

    • partitionKey, the field to partition the collection by. The field should be indexed and contain unique values. Defaults to _id.
    • partitionSizeMB, the size (in MB) for each partition. Defaults to 64.

    Note: This can be an expensive operation, as it creates one cursor for every estimated partitionSizeMB worth of documents.

    Note: Does not support views. Use MongoPaginateByCountPartitioner or create a custom partitioner.
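The size-based estimate described above can be sketched as simple arithmetic. This is not the connector's code: `CollStats` is a hypothetical stand-in for the two collStats fields involved (document count and average object size in bytes), and the rounding rules are assumptions.

```scala
object PaginateBySizeEstimate {
  // Hypothetical stand-in for the collStats fields used by the estimate.
  final case class CollStats(count: Long, avgObjSize: Long)

  // Number of average-sized documents that fit into one partition (at least 1).
  def docsPerPartition(stats: CollStats, partitionSizeMB: Int): Long = {
    require(stats.avgObjSize > 0, "avgObjSize must be positive")
    math.max(1L, partitionSizeMB.toLong * 1024L * 1024L / stats.avgObjSize)
  }

  // Estimated number of partitions, rounded up so a final partial page is counted.
  def estimatedPartitions(stats: CollStats, partitionSizeMB: Int): Long = {
    val perPartition = docsPerPartition(stats, partitionSizeMB)
    (stats.count + perPartition - 1) / perPartition
  }
}
```

For one million documents averaging 1024 bytes and the default partitionSizeMB of 64, this estimate gives 65,536 documents per partition and 16 partitions, which under the note above would mean roughly 16 cursors.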

    Since

    1.0

  5. case class MongoPartition(index: Int, queryBounds: BsonDocument, locations: Seq[String]) extends Partition with Product with Serializable

    An identifier for a partition in a MongoRDD.

    index

    The partition's index within its parent RDD

    queryBounds

    The query bounds for the data within this partition

    locations

    The preferred locations (hostnames) for the data
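To make the three fields more concrete, here is a dependency-free sketch of how a list of split points might map onto partitions. `SketchPartition` is a hypothetical stand-in for MongoPartition that stores the query bounds as a JSON string instead of a BsonDocument, and the half-open `$gte`/`$lt` ranges are an assumption about how bounds could be expressed, not a statement of what the connector emits.

```scala
object PartitionBoundsSketch {
  // Hypothetical stand-in for MongoPartition (queryBounds is a JSON string here).
  final case class SketchPartition(index: Int, queryBounds: String, locations: Seq[String])

  // Builds one range query per gap between consecutive split points.
  // The first partition has no lower bound and the last has no upper bound.
  def fromSplitPoints(key: String, splits: Seq[Int], hosts: Seq[String]): Seq[SketchPartition] = {
    val lowers: Seq[Option[Int]] = None +: splits.map(Option(_))
    val uppers: Seq[Option[Int]] = splits.map(Option(_)) :+ None
    lowers.zip(uppers).zipWithIndex.map { case ((lo, hi), i) =>
      val conds = lo.map(v => "\"$gte\": " + v).toSeq ++ hi.map(v => "\"$lt\": " + v).toSeq
      val bounds = "{\"" + key + "\": {" + conds.mkString(", ") + "}}"
      SketchPartition(i, bounds, hosts)
    }
  }
}
```

Calling `fromSplitPoints("_id", Seq(10, 20), Seq("localhost"))` yields three partitions with indexes 0 to 2, and every document falls into exactly one of them because each lower bound is inclusive and each upper bound is exclusive.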

    Since

    1.0

  6. trait MongoPartitioner extends Logging with Serializable

    The MongoPartitioner provides the partitions of a collection.

  7. class MongoSamplePartitioner extends Logging with MongoPartitioner

    The Sample Partitioner.

    Uses the average document size and random sampling of the collection to determine suitable partitions for the collection.

    Configuration Properties

    The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:

    • partitionKey, the field to partition the collection by. The field should be indexed and contain unique values. Defaults to _id.
    • partitionSizeMB, the size (in MB) for each partition. Defaults to 64.
    • samplesPerPartition, the number of samples for each partition. Defaults to 10.

    Note: Requires MongoDB 3.2+.

    Note: Does not support views. Use MongoPaginateByCountPartitioner or create a custom partitioner.
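The idea of choosing boundaries from a random sample can be sketched over an in-memory sequence of keys. This is a simplified illustration, not the connector's implementation: the object and method names are hypothetical, the number of partitions is passed in directly rather than derived from the average document size, and keys are plain integers.

```scala
import scala.util.Random

object SampleSplitSketch {
  // Picks split points by sampling keys at random, sorting the sample,
  // and keeping every samplesPerPartition-th sampled key.
  def splitPoints(keys: IndexedSeq[Int], partitions: Int,
                  samplesPerPartition: Int, rng: Random): Seq[Int] = {
    require(partitions > 0 && samplesPerPartition > 0)
    val sampleSize = math.min(keys.length, partitions * samplesPerPartition)
    val sample = rng.shuffle(keys).take(sampleSize).sorted
    // Skip index 0 so the first partition is not empty, then keep every n-th key.
    sample.zipWithIndex.collect {
      case (k, i) if i > 0 && i % samplesPerPartition == 0 => k
    }
  }
}
```

With 1000 unique keys, 4 partitions, and the default of 10 samples per partition, the sketch draws 40 samples and returns 3 split points, which delimit 4 ranges. More samples per partition should give more evenly sized ranges at the cost of a larger sample.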

    Since

    1.0

  8. class MongoShardedPartitioner extends Logging with MongoPartitioner

    The Sharded Partitioner.

    Partitions collections by shard and chunk.

    Configuration Properties

    The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:

    • shardKey, the shard key of the collection. Defaults to _id.

    Since

    1.0

  9. class MongoSinglePartitioner extends Logging with MongoPartitioner

    The Single Partitioner.

    Creates a single partition for the whole collection.

    Note: This partitioner gives up all parallelism and is therefore not generally recommended.

    Since

    1.0

  10. class MongoSplitVectorPartitioner extends Logging with MongoPartitioner

    The SplitVector Partitioner.

    Uses the splitVector command on the primary node to generate partitions for a collection. Requires the clusterManager privilege.

    Configuration Properties

    The prefix when using sparkConf is: spark.mongodb.input.partitionerOptions followed by the property name:

    • partitionKey, the field to partition the collection by. The field should be indexed and contain unique values. Defaults to _id.
    • partitionSizeMB, the size (in MB) for each partition. Defaults to 64.

    Since

    1.0

Value Members

  1. object DefaultMongoPartitioner extends DefaultMongoPartitioner with Product with Serializable

  2. object MongoPaginateByCountPartitioner extends MongoPaginateByCountPartitioner with Product with Serializable

  3. object MongoPaginateBySizePartitioner extends MongoPaginateBySizePartitioner with Product with Serializable

  4. object MongoPartition extends Serializable

    The MongoPartition companion object.

    Since

    1.0

  5. object MongoSamplePartitioner extends MongoSamplePartitioner with Product with Serializable

  6. object MongoShardedPartitioner extends MongoShardedPartitioner with Product with Serializable

  7. object MongoSinglePartitioner extends MongoSinglePartitioner with Product with Serializable

  8. object MongoSplitVectorPartitioner extends MongoSplitVectorPartitioner with Product with Serializable

  9. object PartitionerHelper

    :: DeveloperApi ::

    Helper methods for partitioner implementations.

    Annotations
    @DeveloperApi()
    Since

    1.0
