Object

com.netflix.atlas.core.util

Shards

object Shards

Utility functions for mapping ids or indices to a shard. For our purposes, a shard is an instance with a set of server groups. The union of data from all groups comprises a full copy of the overall dataset. To allow for smaller deployment units, an individual group or subset of the groups can be replicated. For redundancy, groups could be replicated all the time. At Netflix, we typically replicate the overall set of server groups in another region or zone instead.

This class specifically focuses on relatively simple sharding schemes where the component making the decision only needs to know the set of instances and a slot for each instance. Edda is one example of a system that provides this information for AWS auto-scaling groups. More complex sharding schemes that require additional infrastructure, e.g., ZooKeeper, are out of scope here. There are two sharding modes supported by this class:

1. Mapping an id for a tagged item to a shard. This is typically done while data is flowing into the system and each datapoint can be routed based on the id.

2. Mapping a positional index to a shard. This is typically done for loading data that has been processed via Hadoop or similar tools and stored in a fixed number of files. There should be a manifest with an ordered list of the files for a given time, and the position of a file in that list can be used to map it to a shard. When using this approach, it is recommended to use a [highly composite number][hcn] for the number of files. This makes it easier to pick a number of groups and sizes for the groups such that each instance will get the same number of files.

When mapping this to AWS, an overall deployment is typically a set of auto-scaling groups (ASGs). Each instance should get the same amount of data if possible given the set of files. Deployments are typically done as a red/black push of one ASG at a time, so the additional capacity needed during a push is the size of one of these groups, provided deployments across the groups are performed serially. While multiple ASGs for a particular group are active, the data will be replicated across them.

[hcn]: https://en.wikipedia.org/wiki/Highly_composite_number
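To illustrate why a highly composite file count makes balanced layouts easier to find, here is a minimal sketch. It assumes a plain round-robin assignment of file positions to instances, which is an illustration only and not necessarily the assignment used by this object. The names `FileCountSketch`, `filesPerInstance`, and `isBalanced` are hypothetical.

```scala
object FileCountSketch {

  // Hypothetical round-robin scheme: the file at position i goes to
  // instance slot (i % numInstances). Illustration only.
  def filesPerInstance(numFiles: Int, numInstances: Int): Vector[Int] = {
    require(numFiles >= 0, "number of files cannot be negative")
    require(numInstances > 0, "need at least one instance")
    val counts = Array.fill(numInstances)(0)
    (0 until numFiles).foreach { i => counts(i % numInstances) += 1 }
    counts.toVector
  }

  // True if every instance receives the same number of files.
  def isBalanced(numFiles: Int, numInstances: Int): Boolean =
    filesPerInstance(numFiles, numInstances).distinct.size == 1
}
```

Under this assumption, 60 files (a highly composite number) split evenly across 2, 3, 4, 5, 6, 10, 12, 15, 20, or 30 instances, whereas 64 files split evenly only across power-of-two instance counts.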

Linear Supertypes
AnyRef, Any

Type Members

  1. case class Group[T](name: String, instances: Array[T]) extends Product with Serializable

    Group of instances representing a subset of the overall deployment.

    name

    Name of the group. In the case of replicas each replica group should have the same name.

    instances

    Instances that are part of the group. The order of this array is important to ensure that an instance will always get the same set of data. The position is used to associate data to the instance.

  2. class LocalMapper[T] extends AnyRef

    Mapper intended to run on a given instance and check to see if data should be loaded there.

  3. class Mapper[T] extends AnyRef

    Mapper for finding the instance that should receive data for an id or index of a file.

  4. class ReplicaMapper[T] extends AnyRef

    Mapper for finding the instance that should receive data for an id or index of a file. There can be multiple groups with a given name and data will be replicated across all of the groups for that name.
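The replication behavior described for `ReplicaMapper` can be sketched as follows. This is a self-contained illustration under stated assumptions, not the library's implementation: the `Group` type here is a stand-in using `Vector` so the example runs on its own, the method name `instancesForIndex` is hypothetical, and the rule of picking a name by `i % names.size` is an assumed scheme.

```scala
object ReplicaSketch {

  // Stand-in for the Group type member so the example is self-contained.
  final case class Group[T](name: String, instances: Vector[T])

  // Assumed scheme: the distinct group names act as the logical shards.
  // An index selects one name, then the same position in every group
  // that carries that name, so each replica group gets a copy.
  def instancesForIndex[T](groups: List[Group[T]], i: Int): List[T] = {
    require(i >= 0, "index cannot be negative")
    require(groups.nonEmpty, "need at least one group")
    val names = groups.map(_.name).distinct.sorted
    val name = names(i % names.size)
    groups.filter(_.name == name).map { g =>
      g.instances((i / names.size) % g.instances.size)
    }
  }
}
```

With two groups named `a` and one named `b`, an index that lands on `a` yields one instance from each `a` group, while an index that lands on `b` yields a single instance.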

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @HotSpotIntrinsicCandidate() @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
    Annotations
    @HotSpotIntrinsicCandidate()
  9. def hashCode(): Int

    Definition Classes
    AnyRef → Any
    Annotations
    @HotSpotIntrinsicCandidate()
  10. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  11. def localMapper[T](groupSize: Int, instanceIdx: Int, groupIdx: Int, numGroups: Int): LocalMapper[T]

    Creates a mapper that can be used on an instance of a group. This is typically used if the local instance needs to figure out what data to load.

    groupSize

    Size of the group that contains the instance.

    instanceIdx

    Index for this instance within the group.

    groupIdx

    Index of this group within the overall set of groups.

    numGroups

    Number of groups that make up the complete deployment.

    returns

    Mapper for checking whether data should be loaded on the local instance.

  12. def mapper[T](groups: List[Group[T]]): Mapper[T]

    Creates a mapper used to route data to the appropriate instance. This form is typically used as data is flowing into the system when replicas are not a concern. If replication is needed, then see replicaMapper instead.

    groups

    Set of groups that makes up the complete data set.

    returns

    Mapper for routing data to instances.

  13. def mapper[T](group: Group[T]): Mapper[T]

    Creates a mapper used to route data to the appropriate instance. This form is typically used as data is flowing into the system when replicas are not a concern. If replication is needed, then see replicaMapper instead.

    group

    Single group that makes up the complete data set.

    returns

    Mapper for routing data to instances.

  14. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. final def notify(): Unit

    Definition Classes
    AnyRef
    Annotations
    @HotSpotIntrinsicCandidate()
  16. final def notifyAll(): Unit

    Definition Classes
    AnyRef
    Annotations
    @HotSpotIntrinsicCandidate()
  17. def replicaMapper[T](groups: List[Group[T]]): ReplicaMapper[T]

    Creates a mapper used to route data to the appropriate instance. This form is used as data is flowing into the system and there can be replicas for the groups.

    groups

    Set of groups that makes up the complete deployment.

    returns

    Mapper for routing data to instances.

  18. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  19. def toString(): String

    Definition Classes
    AnyRef → Any
  20. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  21. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  22. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
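As a rough illustration of how the routing side (`mapper`) and the local side (`localMapper`) relate, the sketch below maps an index to an instance and performs the matching local check. It is a sketch under stated assumptions rather than the library's code: the `Group` stand-in, the method names `instanceForIndex`, `instanceForId`, and `containsIndex`, the use of a `String` id with `hashCode`, and the modulo-based assignment rule are all hypothetical. Only the parameter names `groupSize`, `instanceIdx`, `groupIdx`, and `numGroups` are taken from the `localMapper` signature.

```scala
object MapperSketch {

  // Stand-in for the Group type member so the example is self-contained.
  final case class Group[T](name: String, instances: Vector[T])

  // Assumed routing rule: the index picks a group round-robin, then a
  // position within that group. Instance order must be stable.
  def instanceForIndex[T](groups: Vector[Group[T]], i: Int): T = {
    require(i >= 0, "index cannot be negative")
    require(groups.nonEmpty, "need at least one group")
    val g = groups(i % groups.size)
    g.instances((i / groups.size) % g.instances.size)
  }

  // Assumed id handling: reduce the id to a non-negative index first.
  def instanceForId[T](groups: Vector[Group[T]], id: String): T =
    instanceForIndex(groups, id.hashCode & Int.MaxValue)

  // Local check using only the values passed to localMapper: no
  // knowledge of the other instances is needed.
  def containsIndex(
    groupSize: Int,
    instanceIdx: Int,
    groupIdx: Int,
    numGroups: Int,
    i: Int
  ): Boolean = {
    require(i >= 0, "index cannot be negative")
    i % numGroups == groupIdx && (i / numGroups) % groupSize == instanceIdx
  }
}
```

The design point is that both sides apply the same arithmetic, so a router holding the full group list and an instance that knows only its own position reach the same answer for any index.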

Deprecated Value Members

  1. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @Deprecated @deprecated @throws( classOf[java.lang.Throwable] )
    Deprecated

    (Since version ) see corresponding Javadoc for more information.
