public class SemiClustering extends Object implements ComputeFunction<Long,Set<SemiClustering.SemiCluster>,Double,Set<SemiClustering.SemiCluster>>
The input to the algorithm is an undirected weighted graph and the output is a set of clusters with each vertex potentially belonging to multiple clusters.
A semi-cluster is assigned a score S = (I - f*B) / (V(V-1)/2), where I is the sum of the weights of all internal edges, B is the sum of the weights of all boundary edges, V is the number of vertices in the semi-cluster, and f is a user-specified boundary edge score factor between 0 and 1.
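As a rough illustration of the score formula, the following Java sketch computes S from I, B, V and f. The class name `SemiClusterScoreSketch` and the method `score` are hypothetical and are not part of `SemiClustering`. The formula divides by zero when V is less than 2, and this page does not say how the library treats that case, so the sketch simply rejects it.

```java
// Hypothetical helper, not part of the SemiClustering API.
final class SemiClusterScoreSketch {

    // Computes S = (I - f*B) / (V(V-1)/2).
    static double score(double internalWeight, double boundaryWeight,
                        int vertexCount, double scoreFactor) {
        if (vertexCount < 2) {
            // Assumption: the formula is undefined here, so the sketch refuses to guess.
            throw new IllegalArgumentException("score is undefined for fewer than 2 vertices");
        }
        if (scoreFactor < 0.0 || scoreFactor > 1.0) {
            throw new IllegalArgumentException("score factor must be between 0 and 1");
        }
        // Number of vertex pairs in the semi-cluster: V(V-1)/2.
        double pairs = (double) vertexCount * (vertexCount - 1) / 2.0;
        return (internalWeight - scoreFactor * boundaryWeight) / pairs;
    }
}
```

For example, with I = 3.0, B = 2.0, V = 3 and f = 0.5, the numerator is 3.0 - 0.5 * 2.0 = 2.0 and the denominator is 3, so `score(3.0, 2.0, 3, 0.5)` evaluates to 2/3.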
Each vertex maintains a list of semi-clusters, sorted by score and capped at a maximum number of entries. The lists are updated greedily and iteratively.
The algorithm finishes when the semi-cluster lists stop changing or after a maximum number of iterations.
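The per-vertex list described above can be pictured as a "keep the best N by score" step. The sketch below is only an illustration under stated assumptions: `BestSemiClustersSketch`, `Candidate` and `keepBest` are hypothetical names, a candidate is reduced to a label and a score, and higher scores are assumed to rank first. It does not reproduce how `SemiClustering` builds or merges candidates.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration, not part of the SemiClustering API.
final class BestSemiClustersSketch {

    // Stand-in for a semi-cluster: just a label and its score.
    static final class Candidate {
        final String label;
        final double score;

        Candidate(String label, double score) {
            this.label = label;
            this.score = score;
        }
    }

    // Returns at most maxClusters candidates, highest score first (assumed ordering).
    static List<Candidate> keepBest(Collection<Candidate> candidates, int maxClusters) {
        if (maxClusters < 0) {
            throw new IllegalArgumentException("maxClusters must not be negative");
        }
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((Candidate c) -> c.score).reversed());
        int keep = Math.min(maxClusters, sorted.size());
        return new ArrayList<>(sorted.subList(0, keep));
    }
}
```

In an iterative scheme of this kind, a vertex would compare the list returned by `keepBest` with the list it held in the previous iteration; if no vertex's list changes, or the iteration limit is reached, the computation can stop.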
Modifier and Type | Class | Description |
---|---|---|
static class | SemiClustering.SemiCluster | This class represents a semi-cluster. |
Nested classes/interfaces inherited from interface ComputeFunction: ComputeFunction.Aggregators, ComputeFunction.Callback<K,VV,EV,Message>, ComputeFunction.InitCallback, ComputeFunction.MasterCallback, ComputeFunction.ReadAggregators, ComputeFunction.ReadWriteAggregators
Modifier and Type | Field | Description |
---|---|---|
static String | CLUSTER_CAPACITY | Maximum number of vertices in a semi-cluster. |
static int | CLUSTER_CAPACITY_DEFAULT | Default value for cluster capacity. |
static String | ITERATIONS | Maximum number of iterations. |
static int | ITERATIONS_DEFAULT | Default value for ITERATIONS. |
static String | MAX_CLUSTERS | Maximum number of semi-clusters. |
static int | MAX_CLUSTERS_DEFAULT | Default value for maximum number of semi-clusters. |
static String | SCORE_FACTOR | Boundary edge score factor. |
static double | SCORE_FACTOR_DEFAULT | Default value for Boundary Edge Score Factor. |
Constructor and Description |
---|
SemiClustering() |
Modifier and Type | Method | Description |
---|---|---|
void | compute(int superstep, VertexWithValue<Long,Set<SemiClustering.SemiCluster>> vertex, Iterable<Set<SemiClustering.SemiCluster>> messages, Iterable<EdgeWithValue<Long,Double>> edges, ComputeFunction.Callback<Long,Set<SemiClustering.SemiCluster>,Double,Set<SemiClustering.SemiCluster>> cb) | Compute method. |
void | init(Map<String,?> configs, ComputeFunction.InitCallback cb) | Initialize the ComputeFunction; this is the place to register aggregators. |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface ComputeFunction: masterCompute, postSuperstep, preSuperstep
public static final String ITERATIONS
public static final int ITERATIONS_DEFAULT
public static final String MAX_CLUSTERS
public static final int MAX_CLUSTERS_DEFAULT
public static final String CLUSTER_CAPACITY
public static final int CLUSTER_CAPACITY_DEFAULT
public static final String SCORE_FACTOR
public static final double SCORE_FACTOR_DEFAULT
public final void init(Map<String,?> configs, ComputeFunction.InitCallback cb)
Specified by: init in interface ComputeFunction<Long,Set<SemiClustering.SemiCluster>,Double,Set<SemiClustering.SemiCluster>>
Parameters:
configs - configuration parameters
cb - a callback for registering aggregators
public void compute(int superstep, VertexWithValue<Long,Set<SemiClustering.SemiCluster>> vertex, Iterable<Set<SemiClustering.SemiCluster>> messages, Iterable<EdgeWithValue<Long,Double>> edges, ComputeFunction.Callback<Long,Set<SemiClustering.SemiCluster>,Double,Set<SemiClustering.SemiCluster>> cb)
Specified by: compute in interface ComputeFunction<Long,Set<SemiClustering.SemiCluster>,Double,Set<SemiClustering.SemiCluster>>
Parameters:
messages - the messages received
superstep - the count of the current superstep
vertex - the current vertex with its value
edges - the adjacent edges with their values
cb - a callback for setting a new vertex value or sending messages to the next superstep
Copyright © 2020. All rights reserved.