java.lang.Object
weka.classifiers.Classifier
weka.classifiers.SingleClassifierEnhancer
weka.classifiers.RandomizableSingleClassifierEnhancer
weka.classifiers.meta.MetaCost
public class MetaCost
This metaclassifier makes its base classifier cost-sensitive using the method specified in
Pedro Domingos: MetaCost: A general method for making classifiers cost-sensitive. In: Fifth International Conference on Knowledge Discovery and Data Mining, 155-164, 1999.
This classifier should produce similar results to one created by passing the base learner to Bagging, which is in turn passed to a CostSensitiveClassifier operating on minimum expected cost. The difference is that MetaCost produces a single cost-sensitive classifier of the base learner, giving the benefits of fast classification and interpretable output (if the base learner itself is interpretable). This implementation uses all bagging iterations when reclassifying training data (the MetaCost paper reports a marginal improvement when only those iterations containing each training instance are used in reclassifying that instance).
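The "minimum expected cost" decision mentioned above can be illustrated with a small, self-contained sketch. It is not Weka's implementation: the class and method names are hypothetical, and it assumes the cost matrix is indexed as `cost[actual][predicted]` and that the class probabilities have already been obtained (for example, by averaging the bagged models' estimates).

```java
public class MinExpectedCostSketch {

    /**
     * Returns the index of the class whose prediction has the lowest expected cost.
     * Assumption: cost[actual][predicted] is the cost of predicting "predicted"
     * when the true class is "actual"; probs[actual] is the estimated probability
     * of each true class.
     */
    public static int minExpectedCostClass(double[] probs, double[][] cost) {
        int best = -1;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int predicted = 0; predicted < cost[0].length; predicted++) {
            double expected = 0.0;
            for (int actual = 0; actual < probs.length; actual++) {
                expected += probs[actual] * cost[actual][predicted];
            }
            // Keep the first class that achieves the lowest expected cost.
            if (expected < bestCost) {
                bestCost = expected;
                best = predicted;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Misclassifying class 1 as class 0 costs 5; the reverse costs 1.
        double[][] cost = {{0, 1}, {5, 0}};
        double[] probs = {0.7, 0.3};
        System.out.println(minExpectedCostClass(probs, cost)); // prints 1
    }
}
```

In this example class 0 is the more probable class, but predicting it has an expected cost of 0.3 * 5 = 1.5, against 0.7 * 1 = 0.7 for class 1, so the cost-sensitive choice is class 1.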
@inproceedings{Domingos1999,
   author = {Pedro Domingos},
   booktitle = {Fifth International Conference on Knowledge Discovery and Data Mining},
   pages = {155-164},
   title = {MetaCost: A general method for making classifiers cost-sensitive},
   year = {1999}
}

Valid options are:
-I <num> Number of bagging iterations. (default 10)
-C <cost file name> File name of a cost matrix to use. If this is not supplied, a cost matrix will be loaded on demand. The name of the on-demand file is the relation name of the training data plus ".cost", and the path to the on-demand file is specified with the -N option.
-N <directory> Name of a directory to search for cost files when loading costs on demand (default current directory).
-cost-matrix <matrix> The cost matrix in Matlab single line format.
-P Size of each bag, as a percentage of the training set size. (default 100)
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.rules.ZeroR)
Options specific to classifier weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated classifier.
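To make the -cost-matrix option more concrete, here is a minimal sketch of reading a matrix written on a single line in the Matlab style, such as `[0 1; 5 0]`. It is an illustration only, not Weka's parser: the class and method names are hypothetical, and it assumes rows are separated by semicolons, cells by whitespace or commas, and that a cost matrix must be square.

```java
public class CostMatrixParseSketch {

    /** Parses text such as "[0 1; 5 0]" into a square matrix of doubles. */
    public static double[][] parse(String text) {
        String body = text.trim();
        if (!body.startsWith("[") || !body.endsWith("]")) {
            throw new IllegalArgumentException("Expected a matrix enclosed in [ ]");
        }
        // Drop the enclosing brackets, then split into rows on ';'.
        body = body.substring(1, body.length() - 1).trim();
        String[] rows = body.split(";");
        double[][] matrix = new double[rows.length][];
        for (int i = 0; i < rows.length; i++) {
            // Cells may be separated by whitespace and/or commas.
            String[] cells = rows[i].trim().split("[\\s,]+");
            if (cells.length != rows.length) {
                throw new IllegalArgumentException("Cost matrix must be square");
            }
            matrix[i] = new double[cells.length];
            for (int j = 0; j < cells.length; j++) {
                matrix[i][j] = Double.parseDouble(cells[j]);
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        double[][] m = parse("[0 1; 5 0]");
        System.out.println(m[1][0]); // prints 5.0
    }
}
```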
Field Summary | |
---|---|
static int | MATRIX_ON_DEMAND: load cost matrix on demand |
static int | MATRIX_SUPPLIED: use explicit matrix |
static Tag[] | TAGS_MATRIX_SOURCE: specify possible sources of the cost matrix |
Constructor Summary | |
---|---|
MetaCost() | |
Method Summary | |
---|---|
java.lang.String | bagSizePercentTipText(): Returns the tip text for this property. |
void | buildClassifier(Instances data): Builds the model of the base learner. |
java.lang.String | costMatrixSourceTipText(): Returns the tip text for this property. |
java.lang.String | costMatrixTipText(): Returns the tip text for this property. |
double[] | distributionForInstance(Instance instance): Classifies a given instance after filtering. |
int | getBagSizePercent(): Gets the size of each bag, as a percentage of the training set size. |
Capabilities | getCapabilities(): Returns default capabilities of the classifier. |
CostMatrix | getCostMatrix(): Gets the misclassification cost matrix. |
SelectedTag | getCostMatrixSource(): Gets the source location method of the cost matrix. |
int | getNumIterations(): Gets the number of bagging iterations. |
java.io.File | getOnDemandDirectory(): Returns the directory that will be searched for cost files when loading on demand. |
java.lang.String[] | getOptions(): Gets the current settings of the Classifier. |
java.lang.String | getRevision(): Returns the revision string. |
TechnicalInformation | getTechnicalInformation(): Returns a TechnicalInformation object containing detailed information about the technical background of this class, e.g., the paper or book this class is based on. |
java.lang.String | globalInfo(): Returns a string describing the classifier. |
java.util.Enumeration | listOptions(): Returns an enumeration describing the available options. |
static void | main(java.lang.String[] argv): Main method for testing this class. |
java.lang.String | numIterationsTipText(): Returns the tip text for this property. |
java.lang.String | onDemandDirectoryTipText(): Returns the tip text for this property. |
void | setBagSizePercent(int newBagSizePercent): Sets the size of each bag, as a percentage of the training set size. |
void | setCostMatrix(CostMatrix newCostMatrix): Sets the misclassification cost matrix. |
void | setCostMatrixSource(SelectedTag newMethod): Sets the source location of the cost matrix. |
void | setNumIterations(int numIterations): Sets the number of bagging iterations. |
void | setOnDemandDirectory(java.io.File newDir): Sets the directory that will be searched for cost files when loading on demand. |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
java.lang.String | toString(): Outputs a representation of this classifier. |
Methods inherited from class weka.classifiers.RandomizableSingleClassifierEnhancer |
---|
getSeed, seedTipText, setSeed |
Methods inherited from class weka.classifiers.SingleClassifierEnhancer |
---|
classifierTipText, getClassifier, setClassifier |
Methods inherited from class weka.classifiers.Classifier |
---|
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final int MATRIX_ON_DEMAND
public static final int MATRIX_SUPPLIED
public static final Tag[] TAGS_MATRIX_SOURCE
Constructor Detail |
---|
public MetaCost()
Method Detail |
---|
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
public java.util.Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class RandomizableSingleClassifierEnhancer
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-I <num> Number of bagging iterations. (default 10)
-C <cost file name> File name of a cost matrix to use. If this is not supplied, a cost matrix will be loaded on demand. The name of the on-demand file is the relation name of the training data plus ".cost", and the path to the on-demand file is specified with the -N option.
-N <directory> Name of a directory to search for cost files when loading costs on demand (default current directory).
-cost-matrix <matrix> The cost matrix in Matlab single line format.
-P Size of each bag, as a percentage of the training set size. (default 100)
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.rules.ZeroR)
Options specific to classifier weka.classifiers.rules.ZeroR:
-D If set, classifier is run in debug mode and may output additional info to the console
Options after -- are passed to the designated classifier.
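The on-demand lookup described for the -C and -N options (file name is the relation name plus ".cost", searched for in the -N directory) can be sketched as follows. This is a hedged illustration rather than Weka's code: the class name, method name, and the sample directory and relation name are hypothetical.

```java
import java.io.File;

public class OnDemandCostFileSketch {

    /**
     * Builds the path of the cost file that would be looked up on demand:
     * the given directory plus "<relation name>.cost".
     */
    public static File onDemandCostFile(File directory, String relationName) {
        return new File(directory, relationName + ".cost");
    }

    public static void main(String[] args) {
        // "costs" and "credit-g" are made-up values for illustration.
        File costFile = onDemandCostFile(new File("costs"), "credit-g");
        System.out.println(costFile.getPath());
    }
}
```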
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class RandomizableSingleClassifierEnhancer
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class RandomizableSingleClassifierEnhancer
public java.lang.String costMatrixSourceTipText()
public SelectedTag getCostMatrixSource()
public void setCostMatrixSource(SelectedTag newMethod)
Parameters: newMethod - the cost matrix location method.
public java.lang.String onDemandDirectoryTipText()
public java.io.File getOnDemandDirectory()
public void setOnDemandDirectory(java.io.File newDir)
Parameters: newDir - the cost file search directory.
public java.lang.String bagSizePercentTipText()
public int getBagSizePercent()
public void setBagSizePercent(int newBagSizePercent)
Parameters: newBagSizePercent - the bag size, as a percentage.
public java.lang.String numIterationsTipText()
public void setNumIterations(int numIterations)
Parameters: numIterations - the number of iterations to use.
public int getNumIterations()
public java.lang.String costMatrixTipText()
public CostMatrix getCostMatrix()
public void setCostMatrix(CostMatrix newCostMatrix)
Parameters: newCostMatrix - the cost matrix.
public Capabilities getCapabilities()
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class SingleClassifierEnhancer
See Also: Capabilities
public void buildClassifier(Instances data) throws java.lang.Exception
Specified by: buildClassifier in class Classifier
Parameters: data - the training data
Throws: java.lang.Exception - if the classifier could not be built successfully
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
Overrides: distributionForInstance in class Classifier
Parameters: instance - the instance to be classified
Throws: java.lang.Exception - if instance could not be classified successfully
public java.lang.String toString()
Overrides: toString in class java.lang.Object
public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class Classifier
public static void main(java.lang.String[] argv)
Parameters: argv - should contain the following arguments: -t training file [-T test file] [-c class index]