weka.classifiers.trees
Class SimpleCart

java.lang.Object
  extended by weka.classifiers.AbstractClassifier
      extended by weka.classifiers.RandomizableClassifier
          extended by weka.classifiers.trees.SimpleCart
All Implemented Interfaces:
Serializable, Cloneable, weka.classifiers.Classifier, weka.core.AdditionalMeasureProducer, weka.core.CapabilitiesHandler, weka.core.OptionHandler, weka.core.Randomizable, weka.core.RevisionHandler, weka.core.TechnicalInformationHandler

public class SimpleCart
extends weka.classifiers.RandomizableClassifier
implements weka.core.AdditionalMeasureProducer, weka.core.TechnicalInformationHandler

Class implementing minimal cost-complexity pruning.
Note: missing values are handled with the "fractional instances" method rather than the surrogate split method.

For more information, see:

Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone (1984). Classification and Regression Trees. Wadsworth International Group, Belmont, California.

BibTeX:

 @book{Breiman1984,
    address = {Belmont, California},
    author = {Leo Breiman and Jerome H. Friedman and Richard A. Olshen and Charles J. Stone},
    publisher = {Wadsworth International Group},
    title = {Classification and Regression Trees},
    year = {1984}
 }
 

Valid options are:

 -S <num>
  Random number seed.
  (default 1)
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -M <min no>
  The minimal number of instances at the terminal nodes.
  (default 2)
 -N <num folds>
  The number of folds used in the minimal cost-complexity pruning.
  (default 5)
 -U
  Don't use the minimal cost-complexity pruning.
  (default: pruning is used).
 -H
  Don't use the heuristic method for binary split.
  (default: the heuristic method is used).
 -A
  Use 1 SE rule to make pruning decision.
  (default no).
 -C
  Percentage of training data size (0-1].
  (default 1).
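
The -A option refers to the "1 SE rule" without spelling it out. The following is a minimal sketch of the usual textbook rule, not SimpleCart's own code: among candidate trees ordered from largest to smallest, pick the smallest one whose cross-validated error is within one standard error of the minimum. The class name, method name, and input arrays are hypothetical.

```java
class OneSeRuleSketch {
    /**
     * Returns the index of the selected candidate tree.
     * Assumption: candidates are ordered from the largest tree (index 0)
     * to the smallest, and each has a cross-validated error estimate
     * together with its standard error.
     */
    static int select(double[] cvError, double[] stdError, boolean useOneSE) {
        int best = 0;
        for (int i = 1; i < cvError.length; i++) {
            // "<=" prefers the later (smaller) tree when errors tie.
            if (cvError[i] <= cvError[best]) {
                best = i;
            }
        }
        if (!useOneSE) {
            return best; // plain minimum-error choice
        }
        // 1 SE rule: accept any smaller tree whose error stays within
        // one standard error of the minimum.
        double threshold = cvError[best] + stdError[best];
        int chosen = best;
        for (int i = best + 1; i < cvError.length; i++) {
            if (cvError[i] <= threshold) {
                chosen = i;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        double[] err = {0.30, 0.25, 0.24, 0.26, 0.40};
        double[] se = {0.03, 0.03, 0.03, 0.03, 0.03};
        System.out.println(select(err, se, false)); // prints 2
        System.out.println(select(err, se, true));  // prints 3
    }
}
```

With the rule enabled, the example trades a slightly higher estimated error (0.26 instead of 0.24) for a smaller tree.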

Version:
$Revision: 8109 $
Author:
Haijian Shi
See Also:
Serialized Form

Constructor Summary
SimpleCart()
           
 
Method Summary
 void buildClassifier(weka.core.Instances data)
          Build the classifier.
 void calculateAlphas()
          Updates the alpha field for all nodes.
 double[] distributionForInstance(weka.core.Instance instance)
          Computes class probabilities for instance using the decision tree.
 Enumeration enumerateMeasures()
          Return an enumeration of the measure names.
 weka.core.Capabilities getCapabilities()
          Returns default capabilities of the classifier.
 boolean getHeuristic()
          Returns whether heuristic search is used for nominal attributes in multi-class problems.
 double getMeasure(String additionalMeasureName)
          Returns the value of the named measure.
 double getMinNumObj()
          Get minimal number of instances at the terminal nodes.
 int getNumFoldsPruning()
          Get number of folds in internal cross-validation.
 String[] getOptions()
          Gets the current settings of the classifier.
 String getRevision()
          Returns the revision string.
 double getSizePer()
          Get the training set size as a fraction of the full data, in (0-1].
 weka.core.TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 boolean getUseOneSE()
          Returns whether the 1SE rule is used to choose the final model.
 boolean getUsePrune()
          Returns whether minimal cost-complexity pruning is used.
 String globalInfo()
          Return a description suitable for displaying in the explorer/experimenter.
 String heuristicTipText()
          Returns the tip text for this property
 Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(String[] args)
          Main method.
 double measureTreeSize()
          Returns the size of the tree.
 String minNumObjTipText()
          Returns the tip text for this property
 void modelErrors()
          Updates the numIncorrectModel field for all nodes when subtree (to be pruned) is rooted.
 String numFoldsPruningTipText()
          Returns the tip text for this property
 int numInnerNodes()
          Method to count the number of inner nodes in the tree.
 int numLeaves()
          Compute number of leaf nodes.
 int numNodes()
          Compute size of the tree.
 void prune(double alpha)
          Prunes the original tree using the CART pruning scheme, given a cost-complexity parameter alpha.
 int prune(double[] alphas, double[] errors, weka.core.Instances test)
          Method for performing one fold in the cross-validation of minimal cost-complexity pruning.
 void setHeuristic(boolean value)
          Sets whether heuristic search is used for nominal attributes in multi-class problems.
 void setMinNumObj(double value)
          Set minimal number of instances at the terminal nodes.
 void setNumFoldsPruning(int value)
          Set number of folds in internal cross-validation.
 void setOptions(String[] options)
          Parses a given list of options.
 void setSizePer(double value)
          Set the training set size as a fraction of the full data, in (0-1].
 void setUseOneSE(boolean value)
          Sets whether the 1SE rule is used to choose the final model.
 void setUsePrune(boolean value)
          Sets whether minimal cost-complexity pruning is used.
 String sizePerTipText()
          Returns the tip text for this property
 String toString()
          Returns a textual description of the decision tree, built by the protected toString method.
 void treeErrors()
          Updates the numIncorrectTree field for all nodes.
 String useOneSETipText()
          Returns the tip text for this property
 String usePruneTipText()
          Returns the tip text for this property
 
Methods inherited from class weka.classifiers.RandomizableClassifier
getSeed, seedTipText, setSeed
 
Methods inherited from class weka.classifiers.AbstractClassifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, runClassifier, setDebug
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

SimpleCart

public SimpleCart()
Method Detail

globalInfo

public String globalInfo()
Return a description suitable for displaying in the explorer/experimenter.

Returns:
a description suitable for displaying in the explorer/experimenter

getTechnicalInformation

public weka.core.TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Specified by:
getTechnicalInformation in interface weka.core.TechnicalInformationHandler
Returns:
the technical information about this class

getCapabilities

public weka.core.Capabilities getCapabilities()
Returns default capabilities of the classifier.

Specified by:
getCapabilities in interface weka.classifiers.Classifier
Specified by:
getCapabilities in interface weka.core.CapabilitiesHandler
Overrides:
getCapabilities in class weka.classifiers.AbstractClassifier
Returns:
the capabilities of this classifier

buildClassifier

public void buildClassifier(weka.core.Instances data)
                     throws Exception
Build the classifier.

Specified by:
buildClassifier in interface weka.classifiers.Classifier
Parameters:
data - the training instances
Throws:
Exception - if something goes wrong

prune

public void prune(double alpha)
           throws Exception
Prunes the original tree using the CART pruning scheme, given a cost-complexity parameter alpha.

Parameters:
alpha - the cost-complexity parameter
Throws:
Exception - if something goes wrong
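
As an illustration of what pruning at a given alpha means, here is a minimal sketch of textbook weakest-link pruning. It is not SimpleCart's implementation: the Node class, its riskAsLeaf field, and the helper names are hypothetical. The inner node with the smallest alpha is collapsed into a leaf repeatedly until every remaining inner node has an alpha above the supplied threshold.

```java
class PruneSketch {
    /** Hypothetical binary node; not Weka's internal tree representation. */
    static class Node {
        final double riskAsLeaf; // error rate contributed if this node were a leaf
        Node left, right;        // both null for a leaf

        Node(double riskAsLeaf, Node left, Node right) {
            this.riskAsLeaf = riskAsLeaf;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null; }

        int numLeaves() { return isLeaf() ? 1 : left.numLeaves() + right.numLeaves(); }

        double subtreeRisk() {
            return isLeaf() ? riskAsLeaf : left.subtreeRisk() + right.subtreeRisk();
        }

        /** Error increase per leaf removed if this node is collapsed. */
        double alpha() {
            return isLeaf() ? Double.POSITIVE_INFINITY
                            : (riskAsLeaf - subtreeRisk()) / (numLeaves() - 1);
        }
    }

    /** Returns the inner node with the smallest alpha, or null if n is a leaf. */
    static Node weakestLink(Node n) {
        if (n.isLeaf()) return null;
        Node best = n;
        for (Node child : new Node[] {n.left, n.right}) {
            Node candidate = weakestLink(child);
            if (candidate != null && candidate.alpha() < best.alpha()) best = candidate;
        }
        return best;
    }

    /** Collapses weakest links until all remaining inner nodes have alpha above the threshold. */
    static void prune(Node root, double alpha) {
        Node weakest = weakestLink(root);
        while (weakest != null && weakest.alpha() <= alpha) {
            weakest.left = null;  // collapsing both children turns the node into a leaf
            weakest.right = null;
            weakest = weakestLink(root);
        }
    }

    static Node example() {
        Node inner = new Node(0.05, new Node(0.02, null, null), new Node(0.01, null, null));
        return new Node(0.20, inner, new Node(0.03, null, null));
    }

    public static void main(String[] args) {
        Node root = example();
        prune(root, 0.03);
        System.out.println(root.numLeaves()); // prints 2
        prune(root, 0.5);
        System.out.println(root.numLeaves()); // prints 1
    }
}
```

Alphas are recomputed after every collapse because removing a subtree changes the alpha of each of its ancestors.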

prune

public int prune(double[] alphas,
                 double[] errors,
                 weka.core.Instances test)
          throws Exception
Method for performing one fold in the cross-validation of minimal cost-complexity pruning. Generates a sequence of alpha-values with error estimates for the corresponding (partially pruned) trees, given the test set of that fold.

Parameters:
alphas - array to hold the generated alpha-values
errors - array to hold the corresponding error estimates
test - test set of that fold (to obtain error estimates)
Returns:
the iteration of the pruning
Throws:
Exception - if something goes wrong

modelErrors

public void modelErrors()
                 throws Exception
Updates the numIncorrectModel field for all nodes when subtree (to be pruned) is rooted. This is needed for calculating the alpha-values.

Throws:
Exception - if something goes wrong

treeErrors

public void treeErrors()
                throws Exception
Updates the numIncorrectTree field for all nodes. This is needed for calculating the alpha-values.

Throws:
Exception - if something goes wrong

calculateAlphas

public void calculateAlphas()
                     throws Exception
Updates the alpha field for all nodes.

Throws:
Exception - if something goes wrong
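
The three update methods above feed one quantity. The sketch below shows the standard CART formula for a node's alpha, under the assumption that numIncorrectModel counts training errors when the node is treated as a leaf and numIncorrectTree counts the errors of the subtree below it. The Node class and the totalInstances parameter are hypothetical and do not mirror Weka's internals.

```java
class AlphaSketch {
    /** Hypothetical binary node; not Weka's internal tree representation. */
    static class Node {
        final double errorsAsLeaf; // assumed role of numIncorrectModel
        final Node left, right;    // both null for a leaf

        Node(double errorsAsLeaf, Node left, Node right) {
            this.errorsAsLeaf = errorsAsLeaf;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() { return left == null; }

        int numLeaves() { return isLeaf() ? 1 : left.numLeaves() + right.numLeaves(); }

        /** Assumed role of numIncorrectTree: errors summed over the subtree's leaves. */
        double errorsOfSubtree() {
            return isLeaf() ? errorsAsLeaf : left.errorsOfSubtree() + right.errorsOfSubtree();
        }
    }

    /**
     * alpha(t) = (R(t) - R(T_t)) / (|leaves(T_t)| - 1), where R(t) is the error
     * rate with t as a leaf and R(T_t) the error rate of the subtree rooted at t.
     */
    static double alpha(Node t, double totalInstances) {
        if (t.isLeaf()) {
            return Double.POSITIVE_INFINITY; // a leaf cannot be pruned further
        }
        double errorIncrease = (t.errorsAsLeaf - t.errorsOfSubtree()) / totalInstances;
        return errorIncrease / (t.numLeaves() - 1);
    }

    static Node exampleInner() {
        return new Node(5, new Node(2, null, null), new Node(1, null, null));
    }

    static Node exampleRoot() {
        return new Node(20, exampleInner(), new Node(3, null, null));
    }

    public static void main(String[] args) {
        // inner node: (5 - 3) / 100 spread over 1 removed leaf
        System.out.println(alpha(exampleInner(), 100));
        // root: (20 - 6) / 100 spread over 2 removed leaves
        System.out.println(alpha(exampleRoot(), 100));
    }
}
```

A small alpha marks a node whose subtree buys little accuracy per extra leaf, which makes it the first candidate for pruning.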

distributionForInstance

public double[] distributionForInstance(weka.core.Instance instance)
                                 throws Exception
Computes class probabilities for instance using the decision tree.

Specified by:
distributionForInstance in interface weka.classifiers.Classifier
Overrides:
distributionForInstance in class weka.classifiers.AbstractClassifier
Parameters:
instance - the instance for which class probabilities are to be computed
Returns:
the class probabilities for the given instance
Throws:
Exception - if something goes wrong
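
The class description notes that missing values are handled with the "fractional instances" method. A minimal sketch of that general idea follows; it is not SimpleCart's code. When the split attribute is missing, the instance is sent down both branches and the resulting distributions are blended by the fraction of training weight that went each way. The Node class, the leftFraction field, and the use of NaN to mark a missing value are assumptions made for this illustration.

```java
class FractionalInstanceSketch {
    /** Hypothetical binary node with a numeric split; not Weka's internal representation. */
    static class Node {
        double[] classProbs; // class distribution, used at leaves
        int attribute;       // index of the split attribute (inner nodes)
        double splitPoint;   // values below this go left
        double leftFraction; // fraction of training weight that went left
        Node left, right;    // both null for a leaf

        static Node leaf(double... classProbs) {
            Node n = new Node();
            n.classProbs = classProbs;
            return n;
        }

        static Node split(int attribute, double splitPoint, double leftFraction,
                          Node left, Node right) {
            Node n = new Node();
            n.attribute = attribute;
            n.splitPoint = splitPoint;
            n.leftFraction = leftFraction;
            n.left = left;
            n.right = right;
            return n;
        }
    }

    /** Class probabilities for x; Double.NaN marks a missing attribute value. */
    static double[] distribution(Node n, double[] x) {
        if (n.left == null) {
            return n.classProbs.clone();
        }
        double v = x[n.attribute];
        if (!Double.isNaN(v)) {
            return distribution(v < n.splitPoint ? n.left : n.right, x);
        }
        // Missing value: follow both branches and weight the results.
        double[] l = distribution(n.left, x);
        double[] r = distribution(n.right, x);
        double[] out = new double[l.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = n.leftFraction * l[i] + (1 - n.leftFraction) * r[i];
        }
        return out;
    }

    static Node example() {
        return Node.split(0, 5.0, 0.75, Node.leaf(0.9, 0.1), Node.leaf(0.2, 0.8));
    }

    public static void main(String[] args) {
        double[] known = distribution(example(), new double[] {3.0});
        double[] missing = distribution(example(), new double[] {Double.NaN});
        System.out.println(known[0] + " " + known[1]);     // left leaf: 0.9 0.1
        System.out.println(missing[0] + " " + missing[1]); // blend, roughly 0.725 and 0.275
    }
}
```

Unlike surrogate splits, this approach needs no backup attribute at each node, only the branch proportions recorded during training.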

toString

public String toString()
Returns a textual description of the decision tree, built by the protected toString method.

Overrides:
toString in class Object
Returns:
a textual description of the classifier

numNodes

public int numNodes()
Compute size of the tree.

Returns:
size of the tree

numInnerNodes

public int numInnerNodes()
Method to count the number of inner nodes in the tree.

Returns:
the number of inner nodes

numLeaves

public int numLeaves()
Compute number of leaf nodes.

Returns:
number of leaf nodes
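
The three counting methods above are related: the tree size is the number of inner nodes plus the number of leaves. The sketch below shows this on a hypothetical binary Node class; it is an illustration only and does not reproduce SimpleCart's internal fields.

```java
class TreeSizeSketch {
    /** Hypothetical binary node; a leaf has no children. */
    static class Node {
        final Node left, right;

        Node() { this(null, null); }

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    static int numLeaves(Node n) {
        return n.left == null ? 1 : numLeaves(n.left) + numLeaves(n.right);
    }

    static int numInnerNodes(Node n) {
        return n.left == null ? 0 : 1 + numInnerNodes(n.left) + numInnerNodes(n.right);
    }

    /** Size of the tree: every node is either an inner node or a leaf. */
    static int numNodes(Node n) {
        return numInnerNodes(n) + numLeaves(n);
    }

    static Node example() {
        // Root with an inner left child (two leaves) and a leaf right child.
        return new Node(new Node(new Node(), new Node()), new Node());
    }

    public static void main(String[] args) {
        Node root = example();
        System.out.println(numInnerNodes(root)); // prints 2
        System.out.println(numLeaves(root));     // prints 3
        System.out.println(numNodes(root));      // prints 5
    }
}
```

In a tree where every inner node has exactly two children, as in this sketch, the number of leaves is always one more than the number of inner nodes.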

listOptions

public Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface weka.core.OptionHandler
Overrides:
listOptions in class weka.classifiers.RandomizableClassifier
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(String[] options)
                throws Exception
Parses a given list of options.

Valid options are:

 -S <num>
  Random number seed.
  (default 1)
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -M <min no>
  The minimal number of instances at the terminal nodes.
  (default 2)
 -N <num folds>
  The number of folds used in the minimal cost-complexity pruning.
  (default 5)
 -U
  Don't use the minimal cost-complexity pruning.
  (default: pruning is used).
 -H
  Don't use the heuristic method for binary split.
  (default: the heuristic method is used).
 -A
  Use 1 SE rule to make pruning decision.
  (default no).
 -C
  Percentage of training data size (0-1].
  (default 1).

Specified by:
setOptions in interface weka.core.OptionHandler
Overrides:
setOptions in class weka.classifiers.RandomizableClassifier
Parameters:
options - the list of options as an array of strings
Throws:
Exception - if an option is not supported

getOptions

public String[] getOptions()
Gets the current settings of the classifier.

Specified by:
getOptions in interface weka.core.OptionHandler
Overrides:
getOptions in class weka.classifiers.RandomizableClassifier
Returns:
the current setting of the classifier

enumerateMeasures

public Enumeration enumerateMeasures()
Return an enumeration of the measure names.

Specified by:
enumerateMeasures in interface weka.core.AdditionalMeasureProducer
Returns:
an enumeration of the measure names

measureTreeSize

public double measureTreeSize()
Returns the size of the tree.

Returns:
the size of the tree

getMeasure

public double getMeasure(String additionalMeasureName)
Returns the value of the named measure.

Specified by:
getMeasure in interface weka.core.AdditionalMeasureProducer
Parameters:
additionalMeasureName - the name of the measure to query for its value
Returns:
the value of the named measure
Throws:
IllegalArgumentException - if the named measure is not supported

minNumObjTipText

public String minNumObjTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setMinNumObj

public void setMinNumObj(double value)
Set minimal number of instances at the terminal nodes.

Parameters:
value - minimal number of instances at the terminal nodes

getMinNumObj

public double getMinNumObj()
Get minimal number of instances at the terminal nodes.

Returns:
minimal number of instances at the terminal nodes

numFoldsPruningTipText

public String numFoldsPruningTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setNumFoldsPruning

public void setNumFoldsPruning(int value)
Set number of folds in internal cross-validation.

Parameters:
value - number of folds in internal cross-validation.

getNumFoldsPruning

public int getNumFoldsPruning()
Get number of folds in internal cross-validation.

Returns:
number of folds in internal cross-validation.

usePruneTipText

public String usePruneTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui.

setUsePrune

public void setUsePrune(boolean value)
Sets whether minimal cost-complexity pruning is used.

Parameters:
value - true if minimal cost-complexity pruning should be used

getUsePrune

public boolean getUsePrune()
Returns whether minimal cost-complexity pruning is used.

Returns:
true if minimal cost-complexity pruning is used

heuristicTipText

public String heuristicTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui.

setHeuristic

public void setHeuristic(boolean value)
Sets whether heuristic search is used for nominal attributes in multi-class problems.

Parameters:
value - true if heuristic search should be used for nominal attributes in multi-class problems

getHeuristic

public boolean getHeuristic()
Returns whether heuristic search is used for nominal attributes in multi-class problems.

Returns:
true if heuristic search is used for nominal attributes in multi-class problems

useOneSETipText

public String useOneSETipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui.

setUseOneSE

public void setUseOneSE(boolean value)
Sets whether the 1SE rule is used to choose the final model.

Parameters:
value - true if the 1SE rule should be used to choose the final model

getUseOneSE

public boolean getUseOneSE()
Returns whether the 1SE rule is used to choose the final model.

Returns:
true if the 1SE rule is used to choose the final model

sizePerTipText

public String sizePerTipText()
Returns the tip text for this property

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui.

setSizePer

public void setSizePer(double value)
Set the training set size as a fraction of the full data, in (0-1].

Parameters:
value - the fraction of the training data to use

getSizePer

public double getSizePer()
Get the training set size as a fraction of the full data, in (0-1].

Returns:
the fraction of the training data used

getRevision

public String getRevision()
Returns the revision string.

Specified by:
getRevision in interface weka.core.RevisionHandler
Overrides:
getRevision in class weka.classifiers.AbstractClassifier
Returns:
the revision

main

public static void main(String[] args)
Main method.

Parameters:
args - the options for the classifier


Copyright © 2012 University of Waikato, Hamilton, NZ. All Rights Reserved.