weka.clusterers
Class CheckClusterer

java.lang.Object
  extended by weka.core.Check
      extended by weka.core.CheckScheme
          extended by weka.clusterers.CheckClusterer
All Implemented Interfaces:
OptionHandler, RevisionHandler

public class CheckClusterer
extends CheckScheme

Class for examining the capabilities of clusterers and finding problems with them. If you implement a clusterer using the WEKA libraries, you should run these checks on it to ensure robustness and correct operation. Passing all the tests does not mean the clusterer is free of bugs, but it will help find some common ones.

Typical usage:

java weka.clusterers.CheckClusterer -W clusterer_name -- clusterer_options
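
The same command line can be assembled programmatically, for example from a build script or a test harness. The following is a minimal sketch and not part of WEKA: the class name CheckClustererCommand, the helper buildCommand, and the classpath entry weka.jar are assumptions made for illustration. Only the "-W clusterer_name -- clusterer_options" layout is taken from the usage line above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of WEKA) that assembles the CheckClusterer command line.
class CheckClustererCommand {

    /**
     * Builds: java -cp <classpath> weka.clusterers.CheckClusterer -W <clusterer> -- <options...>
     * The "--" separator is always emitted, mirroring the usage line.
     */
    static List<String> buildCommand(String classpath, String clustererName, String... clustererOptions) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(classpath); // assumed location of the WEKA jar
        cmd.add("weka.clusterers.CheckClusterer");
        cmd.add("-W");
        cmd.add(clustererName);
        cmd.add("--");
        cmd.addAll(Arrays.asList(clustererOptions));
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand("weka.jar", "weka.clusterers.SimpleKMeans", "-N", "3");
        // prints: java -cp weka.jar weka.clusterers.CheckClusterer -W weka.clusterers.SimpleKMeans -- -N 3
        System.out.println(String.join(" ", cmd));
        // To actually run the checks, the list could be passed to
        // new ProcessBuilder(cmd).inheritIO().start(), provided weka.jar exists at that path.
    }
}
```

Building the command as a list of separate arguments avoids quoting problems when a clusterer option contains spaces.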

CheckClusterer reports on the following:

Running CheckClusterer with the debug option set will output the training dataset for any failed tests.

The weka.clusterers.AbstractClustererTest class uses this class to test all the clusterers. Any changes made here must also be checked in that abstract test class.

Valid options are:

 -D
  Turn on debugging output.
 -S
  Silent mode - prints nothing to stdout.
 -N <num>
  The number of instances in the datasets (default 20).
 -nominal <num>
  The number of nominal attributes (default 2).
 -nominal-values <num>
  The number of values for nominal attributes (default 1).
 -numeric <num>
  The number of numeric attributes (default 1).
 -string <num>
  The number of string attributes (default 1).
 -date <num>
  The number of date attributes (default 1).
 -relational <num>
  The number of relational attributes (default 1).
 -num-instances-relational <num>
  The number of instances in relational/bag attributes (default 10).
 -words <comma-separated-list>
  The words to use in string attributes.
 -word-separators <chars>
  The word separators to use in string attributes.
 -W
  Full name of the clusterer analyzed.
  eg: weka.clusterers.SimpleKMeans
  (default weka.clusterers.SimpleKMeans)
 
 Options specific to clusterer weka.clusterers.SimpleKMeans:
 
 -N <num>
  number of clusters.
  (default 2).
 -V
  Display std. deviations for centroids.
 
 -M
  Replace missing values with mean/mode.
 
 -S <num>
  Random number seed.
  (default 10)
Options after -- are passed to the designated clusterer.
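
Because the same flag letters appear on both sides of the separator (-N and -S mean different things to CheckClusterer and to SimpleKMeans), it can help to see how an argument list divides at "--". The sketch below is a plain-Java illustration and not WEKA's own parsing code: the class OptionSplitSketch and the method splitAtSeparator are hypothetical names, and splitting at the first "--" is an assumption of this sketch.

```java
import java.util.Arrays;

// Illustration only: shows which arguments belong to CheckClusterer and which
// are handed on to the clusterer. This is not WEKA's option-parsing code.
class OptionSplitSketch {

    /** Returns {checkerOptions, clustererOptions}, split at the first "--". */
    static String[][] splitAtSeparator(String[] args) {
        int sep = Arrays.asList(args).indexOf("--");
        if (sep < 0) {
            // No separator: everything is for the checker, nothing for the clusterer.
            return new String[][] { args.clone(), new String[0] };
        }
        return new String[][] {
            Arrays.copyOfRange(args, 0, sep),
            Arrays.copyOfRange(args, sep + 1, args.length)
        };
    }

    public static void main(String[] ignored) {
        String[] args = { "-N", "30", "-W", "weka.clusterers.SimpleKMeans", "--", "-N", "3", "-S", "42" };
        String[][] parts = splitAtSeparator(args);
        // Before "--": -N is the number of instances in the generated datasets.
        System.out.println("checker:   " + String.join(" ", parts[0]));
        // After "--": -N is the number of clusters and -S is the random number seed.
        System.out.println("clusterer: " + String.join(" ", parts[1]));
    }
}
```

In this example the first "-N 30" sets the dataset size for the checks, while "-N 3 -S 42" after the separator would reach SimpleKMeans as its cluster count and seed.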

Version:
$Revision: 1.11 $
Author:
Len Trigg, FracPete (fracpete at waikato dot ac dot nz)
See Also:
TestInstances

Nested Class Summary
 
Nested classes/interfaces inherited from class weka.core.CheckScheme
CheckScheme.PostProcessor
 
Constructor Summary
CheckClusterer()
          default constructor
 
Method Summary
 void doTests()
          Begin the tests, reporting results to System.out
 Clusterer getClusterer()
          Get the clusterer being tested.
 java.lang.String[] getOptions()
          Gets the current settings of the CheckClusterer.
 java.lang.String getRevision()
          Returns the revision string.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
          Test method for this class
 void setClusterer(Clusterer newClusterer)
          Set the clusterer for testing.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 
Methods inherited from class weka.core.CheckScheme
attributeTypeToString, getNumDate, getNumInstances, getNumInstancesRelational, getNumNominal, getNumNumeric, getNumRelational, getNumString, getPostProcessor, getWords, getWordSeparators, hasClasspathProblems, setNumDate, setNumInstances, setNumInstancesRelational, setNumNominal, setNumNumeric, setNumRelational, setNumString, setPostProcessor, setWords, setWordSeparators
 
Methods inherited from class weka.core.Check
getDebug, getSilent, setDebug, setSilent
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CheckClusterer

public CheckClusterer()
default constructor

Method Detail

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class CheckScheme
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Valid options are:

 -D
  Turn on debugging output.
 -S
  Silent mode - prints nothing to stdout.
 -N <num>
  The number of instances in the datasets (default 20).
 -nominal <num>
  The number of nominal attributes (default 2).
 -nominal-values <num>
  The number of values for nominal attributes (default 1).
 -numeric <num>
  The number of numeric attributes (default 1).
 -string <num>
  The number of string attributes (default 1).
 -date <num>
  The number of date attributes (default 1).
 -relational <num>
  The number of relational attributes (default 1).
 -num-instances-relational <num>
  The number of instances in relational/bag attributes (default 10).
 -words <comma-separated-list>
  The words to use in string attributes.
 -word-separators <chars>
  The word separators to use in string attributes.
 -W
  Full name of the clusterer analyzed.
  eg: weka.clusterers.SimpleKMeans
  (default weka.clusterers.SimpleKMeans)
 
 Options specific to clusterer weka.clusterers.SimpleKMeans:
 
 -N <num>
  number of clusters.
  (default 2).
 -V
  Display std. deviations for centroids.
 
 -M
  Replace missing values with mean/mode.
 
 -S <num>
  Random number seed.
  (default 10)

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class CheckScheme
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the CheckClusterer.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class CheckScheme
Returns:
an array of strings suitable for passing to setOptions

doTests

public void doTests()
Begin the tests, reporting results to System.out

Specified by:
doTests in class CheckScheme

setClusterer

public void setClusterer(Clusterer newClusterer)
Set the clusterer for testing.

Parameters:
newClusterer - the Clusterer to use.

getClusterer

public Clusterer getClusterer()
Get the clusterer being tested.

Returns:
the clusterer being tested

getRevision

public java.lang.String getRevision()
Returns the revision string.

Returns:
the revision

main

public static void main(java.lang.String[] args)
Test method for this class

Parameters:
args - the commandline options