Interface ClusterService

  • All Superinterfaces:
    ClusterFilterSupport<ClusterView>, ClusterTopologySupport, ClusterView, Service

    @DefaultServiceFactory(ClusterServiceFactory.class)
    public interface ClusterService
    extends Service, ClusterView
    Main entry point to the clustering API.

    Overview

    Cluster service provides functionality for getting information about the cluster members and keeping track of membership changes. It uses an eventually consistent gossip-based protocol to propagate changes among the cluster nodes. Eventual consistency in this context means that node join/leave events are not atomic and can be processed by different members at different times. However, it is guaranteed that at some point in time all nodes within the cluster will receive such events and will become aware of cluster topology changes. The time it takes to propagate such events to all cluster members depends on the cluster service configuration options.

    Service Configuration

    ClusterService can be configured and registered within the HekateBootstrap via the ClusterServiceFactory class as in the example below:

    
    // Prepare service factory and configure some options.
    ClusterServiceFactory factory = new ClusterServiceFactory()
        .withGossipInterval(1000)
        .withSeedNodeProvider(new MulticastSeedNodeProvider(
            new MulticastSeedNodeProviderConfig()
                .withGroup("224.1.2.12")
                .withPort(45454)
                .withInterval(200)
                .withWaitTime(1000)
        ));
    
    // ...other options...
    
    // Start node.
    Hekate hekate = new HekateBootstrap()
        .withService(factory)
        .join();
    
    // Access the service.
    ClusterService cluster = hekate.cluster();
    
    The same configuration using the Hekate XML schema for Spring. Note: This example requires Spring Framework integration (see HekateSpringBootstrap).
    
    <beans xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:h="http://www.hekate.io/spring/hekate-core"
        xmlns="http://www.springframework.org/schema/beans"
        xsi:schemaLocation="http://www.springframework.org/schema/beans
            http://www.springframework.org/schema/beans/spring-beans.xsd
            http://www.hekate.io/spring/hekate-core
            http://www.hekate.io/spring/hekate-core.xsd">
    
        <h:node id="hekate">
            <!-- Cluster service. -->
            <h:cluster gossip-interval-ms="1000">
                <h:seed-node-provider>
                    <h:multicast group="224.1.2.12" port="45454" interval-ms="200" wait-time-ms="1000" ttl="3"/>
                </h:seed-node-provider>
                <h:failure-detection>
                    <h:heartbeat interval-ms="500" loss-threshold="6" quorum="2"/>
                </h:failure-detection>
    
                <!-- ...other options... -->
            </h:cluster>
    
            <!-- ...other services... -->
        </h:node>
    </beans>
    
    The same configuration using plain Spring bean definitions. Note: This example requires Spring Framework integration (see HekateSpringBootstrap).
    
    <beans xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.springframework.org/schema/beans"
        xsi:schemaLocation="http://www.springframework.org/schema/beans
            http://www.springframework.org/schema/beans/spring-beans.xsd">
    
        <bean id="hekate" class="io.hekate.spring.bean.HekateSpringBootstrap">
            <property name="services">
                <list>
                    <!-- Cluster service. -->
                    <bean class="io.hekate.cluster.ClusterServiceFactory">
                        <property name="gossipInterval" value="1000"/>
                        <property name="seedNodeProvider">
                            <bean class="io.hekate.cluster.seed.multicast.MulticastSeedNodeProvider">
                                <constructor-arg>
                                    <bean class="io.hekate.cluster.seed.multicast.MulticastSeedNodeProviderConfig">
                                        <property name="group" value="224.1.2.12"/>
                                        <property name="port" value="45454"/>
                                        <property name="interval" value="200"/>
                                        <property name="waitTime" value="1000"/>
                                        <property name="ttl" value="2"/>
                                    </bean>
                                </constructor-arg>
                            </bean>
                        </property>
                        <property name="failureDetector">
                            <bean class="io.hekate.cluster.health.DefaultFailureDetector">
                                <constructor-arg>
                                    <bean class="io.hekate.cluster.health.DefaultFailureDetectorConfig">
                                        <property name="heartbeatInterval" value="500"/>
                                        <property name="heartbeatLossThreshold" value="6"/>
                                        <property name="failureDetectionQuorum" value="2"/>
                                    </bean>
                                </constructor-arg>
                            </bean>
                        </property>
    
                        <!-- ...other options... -->
                    </bean>
    
                    <!-- ...other services... -->
                </list>
            </property>
        </bean>
    </beans>
    

    For more details about the configuration options please see the documentation of the ClusterServiceFactory class.

    Cluster Topology

    Cluster membership information (also known as the cluster topology) is represented by the ClusterTopology interface. Instances of this interface can be obtained via the ClusterTopologySupport.topology() method. This interface provides various methods for getting information about the cluster nodes based on different criteria (e.g. remote nodes, oldest/youngest node, join order, etc.).

    Each node in the cluster topology is represented by the ClusterNode interface. This interface provides information about the node's network address, provided services, roles and user-defined properties.

    Below is an example of accessing the current cluster topology:

    
    // Immutable snapshot of the current cluster topology.
    ClusterTopology topology = hekate.cluster().topology();
    
    System.out.println("   Local node: " + topology.localNode());
    System.out.println("    All nodes: " + topology.nodes());
    System.out.println(" Remote nodes: " + topology.remoteNodes());
    System.out.println("   Join order: " + topology.joinOrder());
    System.out.println("  Oldest node: " + topology.oldest());
    System.out.println("Youngest node: " + topology.youngest());
    

    Filtering can be applied to ClusterTopology by providing an implementation of the ClusterNodeFilter interface as in the example below:

    
    // Immutable copy that contains only nodes with the specified role.
    ClusterTopology filtered = topology.filter(node -> node.hasRole("my_role"));
    

    Cluster service supports topology versioning by managing a counter that gets incremented every time nodes join or leave the cluster. The value of this counter can be obtained via the ClusterTopology.version() method and can be used to tell which of two topology instances is older and which one is newer.

    Please note that topology versioning is local to each node and can differ from node to node (i.e. each node maintains its own counter).
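    As a rough illustration of how the version counter might be used, the sketch below keeps only the newest topology snapshot offered to it and ignores stale ones. SnapshotSketch and NewestSnapshotHolder are hypothetical stand-ins, not Hekate types; in real code the version would come from ClusterTopology.version(). Because versions are local to each node, such a comparison is only meaningful for snapshots obtained from the same node.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a topology snapshot (not a Hekate type).
final class SnapshotSketch {
    final long version;
    final List<String> nodes;

    SnapshotSketch(long version, List<String> nodes) {
        this.version = version;
        this.nodes = Collections.unmodifiableList(nodes);
    }
}

// Remembers the newest snapshot offered so far and ignores stale ones.
final class NewestSnapshotHolder {
    private SnapshotSketch current;

    // Returns true if the candidate was newer and replaced the current snapshot.
    synchronized boolean offer(SnapshotSketch candidate) {
        if (current != null && current.version >= candidate.version) {
            return false; // Same or older version: keep the current snapshot.
        }
        current = candidate;
        return true;
    }

    synchronized SnapshotSketch get() {
        return current;
    }
}
```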

    Cluster Event Listener

    Listening for cluster events can be implemented by registering an instance of the ClusterEventListener interface. This can be done at configuration time or at runtime. The key difference between the two is that listeners added at configuration time stay registered across multiple restarts, while listeners added at runtime stay registered only while the node is in the cluster and are automatically unregistered when the node leaves the cluster.

    Listeners are notified when the local node joins the cluster, leaves the cluster, or when the cluster service detects membership changes. Multiple concurrent changes are aggregated by the cluster service into a single ClusterEvent, i.e. if multiple nodes joined or left at the same time then only a single event will be fired, holding the information about all of those nodes and their state.
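    The aggregation described above can be pictured as a diff between two membership views. The following is a minimal sketch under that assumption; MembershipChangeSketch is a hypothetical class and does not reflect how the cluster service builds ClusterEvent instances internally.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of an aggregated change (not a Hekate type).
final class MembershipChangeSketch {
    final Set<String> added;
    final Set<String> removed;

    private MembershipChangeSketch(Set<String> added, Set<String> removed) {
        this.added = Collections.unmodifiableSet(added);
        this.removed = Collections.unmodifiableSet(removed);
    }

    // Builds one change from two membership views, no matter how many
    // nodes joined or left between them.
    static MembershipChangeSketch between(Set<String> before, Set<String> after) {
        Set<String> added = new TreeSet<>(after);
        added.removeAll(before); // Present now, absent before.

        Set<String> removed = new TreeSet<>(before);
        removed.removeAll(after); // Present before, absent now.

        return new MembershipChangeSketch(added, removed);
    }
}
```

    With this model, two joins and two departures that happen between consecutive views end up in one change object rather than four separate notifications.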

    Below is an example of registering a cluster event listener:

    
    hekate.cluster().addListener(event -> {
        switch (event.type()) {
            case JOIN: {
                ClusterJoinEvent join = event.asJoin();
    
                System.out.println("Joined : " + join.topology());
    
                break;
            }
            case CHANGE: {
                ClusterChangeEvent change = event.asChange();
    
                System.out.println("Topology change :" + change.topology());
                System.out.println("      added nodes=" + change.added());
                System.out.println("    removed nodes=" + change.removed());
    
                break;
            }
            case LEAVE: {
                ClusterLeaveEvent leave = event.asLeave();
    
                System.out.println("Left : " + leave.topology());
    
                break;
            }
            default: {
                throw new IllegalArgumentException("Unsupported event type: " + event);
            }
        }
    });
    

    For more details on cluster event processing please see the documentation of the ClusterEventListener interface.

    Seed Nodes Discovery

    Whenever the local node starts joining the cluster it tries to discover nodes that are already running. If no such nodes are found then the local node assumes that it is the first node in the cluster and switches to the UP state. If some existing nodes are discovered then the local node chooses one of them as a contact node and starts cluster join negotiations with that node.

    Cluster service uses the SeedNodeProvider interface to discover existing nodes. Instances of this interface can be registered via the ClusterServiceFactory.setSeedNodeProvider(SeedNodeProvider) method.
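    The join decision above can be summarized in a few lines. This is only a sketch: JoinDecisionSketch is a hypothetical class, addresses are plain strings, and picking the first discovered remote address is an assumption, since the actual contact node selection strategy is not specified here.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper that mirrors the join decision (not a Hekate type).
final class JoinDecisionSketch {
    // Returns the contact node to negotiate with, or empty if no other node
    // was discovered and the local node should start as the first member.
    static Optional<String> chooseContact(List<String> discovered, String localAddress) {
        return discovered.stream()
            .filter(address -> !address.equals(localAddress)) // Skip ourselves.
            .findFirst(); // Assumption: any discovered node can act as a contact.
    }
}
```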

    The following implementations of this interface are available out of the box:

    Please see the documentation of SeedNodeProvider for more details on providing custom implementations of this interface.

    Failure Detection

    Cluster service relies on FailureDetector interface for node failure detection. Implementations of this interface are typically using a heartbeat-based approach for failure detection, however other algorithms can also be implemented.

    Failure detector can be specified via ClusterServiceFactory.setFailureDetector(FailureDetector) method.

    Default implementation of this interface is provided by the DefaultFailureDetector class. This class organizes all nodes into a monitoring ring with fixed heartbeat interval and loss threshold. Please see its javadoc for implementation details and configuration options.
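    To make the idea of a monitoring ring more concrete, here is a minimal sketch in which every node monitors its successor in a sorted ring. HeartbeatRingSketch is hypothetical, and both the ordering and the "interval times loss threshold" estimate are assumptions for illustration; the actual behavior of DefaultFailureDetector may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of a heartbeat monitoring ring (not a Hekate type).
final class HeartbeatRingSketch {
    // Returns the node that 'self' would monitor: its successor in the
    // sorted ring. In a single-node ring this is the node itself.
    static String monitoredBy(String self, List<String> members) {
        List<String> ring = new ArrayList<>(members);
        Collections.sort(ring);

        int idx = ring.indexOf(self);
        if (idx < 0) {
            throw new IllegalArgumentException("Unknown node: " + self);
        }
        return ring.get((idx + 1) % ring.size());
    }

    // Rough time before a silent node is suspected, assuming it is suspected
    // after 'lossThreshold' consecutive missed heartbeats.
    static long suspicionDelayMs(long heartbeatIntervalMs, int lossThreshold) {
        return heartbeatIntervalMs * lossThreshold;
    }
}
```

    Under these assumptions, the heartbeat interval of 500 and loss threshold of 6 used in the configuration examples would correspond to roughly 3000 milliseconds before a silent node is suspected.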

    Please see the documentation of FailureDetector interface for more details on implementing custom failure detection logic.

    Split-brain Detection

    Cluster service can be configured to automatically detect a split-brain condition and perform appropriate actions when it arises. Detection is controlled by an implementation of the SplitBrainDetector interface that can be registered within the cluster service via the ClusterServiceFactory.setSplitBrainDetector(SplitBrainDetector) method.

    The following implementations of this interface are available out of the box:

    Multiple detectors can be combined with the help of the SplitBrainDetectorGroup class.

    The action to perform upon split-brain is controlled by the HekateFatalErrorPolicy of the Hekate node. When split-brain is detected, the cluster service will apply this policy with ClusterSplitBrainException as the cause.

    Cluster Acceptors

    Whenever a new node tries to join the cluster it can be verified against custom application-specific rules (e.g. authorization and permission checks) and rejected if the verification fails.

    Such verification can be implemented by configuring an implementation of the ClusterAcceptor interface within the cluster service. This interface is used by an existing cluster node when a join request is received from a new node that tries to join the cluster. An implementation of this interface can use the joining node's information to decide whether the new node should be accepted or rejected. If a node gets rejected then it will fail with ClusterRejectedJoinException.

    Please see the documentation of the ClusterAcceptor interface for more details.
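    As an illustration of such a rule, the sketch below accepts a joining node only if it declares a required role. JoinCheckSketch and RequiredRoleCheck are hypothetical types defined here for the example; the real ClusterAcceptor signature is not shown in this document and may differ.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical acceptor contract (the real ClusterAcceptor may differ).
interface JoinCheckSketch {
    // Returns a rejection reason, or empty to accept the joining node.
    Optional<String> check(Set<String> joiningNodeRoles);
}

// Accepts only nodes that declare the required role.
final class RequiredRoleCheck implements JoinCheckSketch {
    private final String requiredRole;

    RequiredRoleCheck(String requiredRole) {
        this.requiredRole = requiredRole;
    }

    @Override
    public Optional<String> check(Set<String> joiningNodeRoles) {
        if (joiningNodeRoles.contains(requiredRole)) {
            return Optional.empty(); // Accept.
        }
        return Optional.of("Missing required role: " + requiredRole);
    }
}
```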

    Gossip Protocol

    Cluster service uses a push-pull gossip protocol for membership state management. At a high level this protocol can be described as follows:

    Periodically (based on the configurable interval) each node in the cluster sends a message with its membership information to a number of randomly selected nodes. Every node that receives such a message compares it with its local membership state information. If the received information is more up to date then the node updates its local membership view. If the local information is more up to date then it is sent back to the originator node (so that it can update its local view with the latest one).

    Note that when the cluster is in a convergent state (i.e. the membership view is consistent across all cluster nodes) nodes do not send the whole membership information over the network. In such cases only small membership digests are exchanged.

    Cluster service can speed up the gossip protocol by using a reactive approach instead of the periodic one during membership information exchange. With the reactive approach, when a node receives a membership exchange message it immediately forwards that message to a node that has not seen this membership information yet. This approach achieves cluster convergence much faster than the periodic approach, but it consumes considerably more network resources.
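    The compare-and-reply step described above can be sketched as follows. This is a simplified model, assuming each membership entry carries a per-node version number; PushPullSketch is hypothetical and the actual message format and state representation used by the cluster service are not described here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of one push-pull exchange (not a Hekate type).
final class PushPullSketch {
    // Merges the received view into the local view (node id -> state version)
    // and returns the entries that should be sent back to the originator.
    static Map<String, Long> receive(Map<String, Long> local, Map<String, Long> received) {
        // Pull: adopt every entry for which the sender is more up to date.
        for (Map.Entry<String, Long> entry : received.entrySet()) {
            local.merge(entry.getKey(), entry.getValue(), Long::max);
        }

        // Push back: entries the sender is missing or has an older version of.
        Map<String, Long> reply = new HashMap<>();
        for (Map.Entry<String, Long> entry : local.entrySet()) {
            Long theirs = received.get(entry.getKey());
            if (theirs == null || theirs < entry.getValue()) {
                reply.put(entry.getKey(), entry.getValue());
            }
        }
        return reply;
    }
}
```

    An empty reply means the sender already knew everything the receiver knows, so after the merge both views agree.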

    Key configuration options of gossip protocol are:

    • Gossip interval - time interval in milliseconds between gossip rounds
    • Speed up size - the maximum number of nodes in the cluster for which the gossip protocol can be sped up by using the reactive approach during message exchange
    See Also:
    ClusterServiceFactory