weka.classifiers.lazy
Class IBk

java.lang.Object
  extended by weka.classifiers.AbstractClassifier
      extended by weka.classifiers.lazy.IBk
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, Classifier, UpdateableClassifier, AdditionalMeasureProducer, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler

public class IBk
extends AbstractClassifier
implements OptionHandler, UpdateableClassifier, WeightedInstancesHandler, TechnicalInformationHandler, AdditionalMeasureProducer

K-nearest neighbours classifier. Can select an appropriate value of K based on cross-validation, and can also weight neighbours by distance.

For more information, see

D. Aha, D. Kibler (1991). Instance-based learning algorithms. Machine Learning. 6:37-66.

BibTeX:

 @article{Aha1991,
    author = {D. Aha and D. Kibler},
    journal = {Machine Learning},
    pages = {37-66},
    title = {Instance-based learning algorithms},
    volume = {6},
    year = {1991}
 }
 

Valid options are:

 -I
  Weight neighbours by the inverse of their distance
  (use when k > 1)
 -F
  Weight neighbours by 1 - their distance
  (use when k > 1)
 -K <number of neighbors>
  Number of nearest neighbours (k) used in classification.
  (Default = 1)
 -E
  Minimise mean squared error rather than mean absolute
  error when using -X option with numeric prediction.
 -W <window size>
  Maximum number of training instances maintained.
  Training instances are dropped FIFO. (Default = no window)
 -X
  Select the number of nearest neighbours between 1
  and the k value specified using hold-one-out evaluation
  on the training data (use when k > 1)
 -A
  The nearest neighbour search algorithm to use (default: weka.core.neighboursearch.LinearNNSearch).
 

Version:
$Revision: 6572 $
Author:
Stuart Inglis, Len Trigg, Eibe Frank
See Also:
Serialized Form

Field Summary
static Tag[] TAGS_WEIGHTING
          possible instance weighting methods.
static int WEIGHT_INVERSE
          weight by 1/distance.
static int WEIGHT_NONE
          no weighting.
static int WEIGHT_SIMILARITY
          weight by 1-distance.
 
Constructor Summary
IBk()
          IB1 classifier.
IBk(int k)
          IBk classifier.
 
Method Summary
 void buildClassifier(Instances instances)
          Generates the classifier.
 java.lang.String crossValidateTipText()
          Returns the tip text for this property.
 java.lang.String distanceWeightingTipText()
          Returns the tip text for this property.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 java.util.Enumeration enumerateMeasures()
          Returns an enumeration of the additional measure names produced by the neighbour search algorithm, plus the chosen K in case cross-validation is enabled.
 Capabilities getCapabilities()
          Returns default capabilities of the classifier.
 boolean getCrossValidate()
          Gets whether hold-one-out cross-validation will be used to select the best k value.
 SelectedTag getDistanceWeighting()
          Gets the distance weighting method used.
 int getKNN()
          Gets the number of neighbours the learner will use.
 boolean getMeanSquared()
          Gets whether the mean squared error is used rather than mean absolute error when doing cross-validation.
 double getMeasure(java.lang.String additionalMeasureName)
          Returns the value of the named measure: either a measure from the neighbour search algorithm or, when cross-validation is enabled, the chosen K.
 NearestNeighbourSearch getNearestNeighbourSearchAlgorithm()
          Returns the current nearestNeighbourSearch algorithm in use.
 int getNumTraining()
          Get the number of training instances the classifier is currently using.
 java.lang.String[] getOptions()
          Gets the current settings of IBk.
 java.lang.String getRevision()
          Returns the revision string.
 TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 int getWindowSize()
          Gets the maximum number of instances allowed in the training pool.
 java.lang.String globalInfo()
          Returns a string describing the classifier.
 java.lang.String KNNTipText()
          Returns the tip text for this property.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
          Main method for testing this class.
 java.lang.String meanSquaredTipText()
          Returns the tip text for this property.
 java.lang.String nearestNeighbourSearchAlgorithmTipText()
          Returns the tip text for this property.
 Instances pruneToK(Instances neighbours, double[] distances, int k)
          Prunes the list to contain the k nearest neighbors.
 void setCrossValidate(boolean newCrossValidate)
          Sets whether hold-one-out cross-validation will be used to select the best k value.
 void setDistanceWeighting(SelectedTag newMethod)
          Sets the distance weighting method used.
 void setKNN(int k)
          Set the number of neighbours the learner is to use.
 void setMeanSquared(boolean newMeanSquared)
          Sets whether the mean squared error is used rather than mean absolute error when doing cross-validation.
 void setNearestNeighbourSearchAlgorithm(NearestNeighbourSearch nearestNeighbourSearchAlgorithm)
          Sets the nearestNeighbourSearch algorithm to be used for finding nearest neighbour(s).
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setWindowSize(int newWindowSize)
          Sets the maximum number of instances allowed in the training pool.
 java.lang.String toString()
          Returns a description of this classifier.
 void updateClassifier(Instance instance)
          Adds the supplied instance to the training set.
 java.lang.String windowSizeTipText()
          Returns the tip text for this property.
 
Methods inherited from class weka.classifiers.AbstractClassifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, runClassifier, setDebug
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

WEIGHT_NONE

public static final int WEIGHT_NONE
no weighting.

See Also:
Constant Field Values

WEIGHT_INVERSE

public static final int WEIGHT_INVERSE
weight by 1/distance.

See Also:
Constant Field Values

WEIGHT_SIMILARITY

public static final int WEIGHT_SIMILARITY
weight by 1-distance.

See Also:
Constant Field Values

TAGS_WEIGHTING

public static final Tag[] TAGS_WEIGHTING
possible instance weighting methods.
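
As a rough illustration of the three weighting modes described above (none, 1/distance, 1-distance), the following self-contained sketch turns a set of neighbour labels and distances into a class distribution. It is not taken from the Weka source: the class `DistanceWeightingSketch`, the `Weighting` enum, and the small constant added to the distance to avoid division by zero are all assumptions made for this example, and the similarity mode presumes distances already normalised to [0, 1].

```java
import java.util.Arrays;

final class DistanceWeightingSketch {

    // Hypothetical stand-ins for WEIGHT_NONE, WEIGHT_INVERSE and WEIGHT_SIMILARITY.
    enum Weighting { NONE, INVERSE, SIMILARITY }

    /** Weight contributed by a single neighbour at the given distance. */
    static double weight(double distance, Weighting mode) {
        switch (mode) {
            case INVERSE:
                // The small constant guards against division by zero (an assumption of this sketch).
                return 1.0 / (distance + 0.001);
            case SIMILARITY:
                // Presumes the distance has been normalised to the range [0, 1].
                return 1.0 - distance;
            default:
                return 1.0;
        }
    }

    /** Sums the neighbour weights per class label and normalises them to sum to 1. */
    static double[] vote(int[] labels, double[] distances, int numClasses, Weighting mode) {
        double[] distribution = new double[numClasses];
        double total = 0.0;
        for (int i = 0; i < labels.length; i++) {
            double w = weight(distances[i], mode);
            distribution[labels[i]] += w;
            total += w;
        }
        if (total > 0) {
            for (int c = 0; c < numClasses; c++) {
                distribution[c] /= total;
            }
        }
        return distribution;
    }

    public static void main(String[] args) {
        int[] labels = {0, 1, 1};            // class index of each neighbour
        double[] distances = {0.1, 0.6, 0.7}; // distance of each neighbour to the query
        for (Weighting mode : Weighting.values()) {
            System.out.println(mode + " -> " + Arrays.toString(vote(labels, distances, 2, mode)));
        }
    }
}
```

With no weighting the two class-1 neighbours outvote the single class-0 neighbour; with either distance weighting the much closer class-0 neighbour dominates, which is why the options are only meaningful when k > 1.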

Constructor Detail

IBk

public IBk(int k)
IBk classifier. Simple instance-based learner that uses the class of the nearest k training instances for the class of the test instances.

Parameters:
k - the number of nearest neighbors to use for prediction

IBk

public IBk()
IB1 classifier. Instance-based learner. Predicts the class of the single nearest training instance for each test instance.

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing the classifier.

Returns:
a description suitable for displaying in the explorer/experimenter gui

getTechnicalInformation

public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
Returns:
the technical information about this class

KNNTipText

public java.lang.String KNNTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

setKNN

public void setKNN(int k)
Set the number of neighbours the learner is to use.

Parameters:
k - the number of neighbours.

getKNN

public int getKNN()
Gets the number of neighbours the learner will use.

Returns:
the number of neighbours.

windowSizeTipText

public java.lang.String windowSizeTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getWindowSize

public int getWindowSize()
Gets the maximum number of instances allowed in the training pool. The addition of new instances above this value will result in old instances being removed. A value of 0 signifies no limit to the number of training instances.

Returns:
Value of WindowSize.

setWindowSize

public void setWindowSize(int newWindowSize)
Sets the maximum number of instances allowed in the training pool. The addition of new instances above this value will result in old instances being removed. A value of 0 signifies no limit to the number of training instances.

Parameters:
newWindowSize - Value to assign to WindowSize.
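
The window behaviour described for `getWindowSize` and `setWindowSize` (oldest instances are dropped first, and 0 means no limit) can be pictured with a simple queue. This is a minimal sketch under stated assumptions, not the Weka implementation: the class `SlidingWindowSketch` and its methods are hypothetical, and instances are represented as plain `double[]` arrays.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SlidingWindowSketch {

    private final Deque<double[]> pool = new ArrayDeque<>();
    private final int windowSize; // 0 signifies no limit on the number of instances

    SlidingWindowSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Adds an instance and drops the oldest ones (FIFO) while the pool exceeds the window. */
    void add(double[] instance) {
        pool.addLast(instance);
        while (windowSize > 0 && pool.size() > windowSize) {
            pool.removeFirst();
        }
    }

    int size() {
        return pool.size();
    }

    /** Oldest instance still in the pool, or null if the pool is empty. */
    double[] oldest() {
        return pool.peekFirst();
    }

    public static void main(String[] args) {
        SlidingWindowSketch window = new SlidingWindowSketch(2);
        window.add(new double[] {1.0});
        window.add(new double[] {2.0});
        window.add(new double[] {3.0}); // pushes out the instance holding 1.0
        System.out.println("size = " + window.size() + ", oldest = " + window.oldest()[0]);
    }
}
```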

distanceWeightingTipText

public java.lang.String distanceWeightingTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getDistanceWeighting

public SelectedTag getDistanceWeighting()
Gets the distance weighting method used. Will be one of WEIGHT_NONE, WEIGHT_INVERSE, or WEIGHT_SIMILARITY.

Returns:
the distance weighting method used.

setDistanceWeighting

public void setDistanceWeighting(SelectedTag newMethod)
Sets the distance weighting method used. Values other than WEIGHT_NONE, WEIGHT_INVERSE, or WEIGHT_SIMILARITY will be ignored.

Parameters:
newMethod - the distance weighting method to use

meanSquaredTipText

public java.lang.String meanSquaredTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getMeanSquared

public boolean getMeanSquared()
Gets whether the mean squared error is used rather than mean absolute error when doing cross-validation.

Returns:
true if so.

setMeanSquared

public void setMeanSquared(boolean newMeanSquared)
Sets whether the mean squared error is used rather than mean absolute error when doing cross-validation.

Parameters:
newMeanSquared - true if so.

crossValidateTipText

public java.lang.String crossValidateTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getCrossValidate

public boolean getCrossValidate()
Gets whether hold-one-out cross-validation will be used to select the best k value.

Returns:
true if cross-validation will be used.

setCrossValidate

public void setCrossValidate(boolean newCrossValidate)
Sets whether hold-one-out cross-validation will be used to select the best k value.

Parameters:
newCrossValidate - true if cross-validation should be used.
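
To make the idea of hold-one-out selection of k concrete, the sketch below tries every k from 1 up to a maximum, predicts each training point from the remaining points, and keeps the k with the fewest misclassifications. It is an illustration only: the class `HoldOneOutSketch`, the one-dimensional data layout, and the tie-breaking rules (smaller k wins, lower class index wins a vote tie) are assumptions of this example and are not taken from the Weka source. It covers nominal classes only; the mean absolute or mean squared error criteria for numeric prediction are not shown.

```java
import java.util.Arrays;

final class HoldOneOutSketch {

    /** Returns the k in [1, maxK] with the fewest hold-one-out errors (ties go to the smaller k). */
    static int selectK(double[] x, int[] y, int numClasses, int maxK) {
        int bestK = 1;
        int bestErrors = Integer.MAX_VALUE;
        for (int k = 1; k <= maxK; k++) {
            int errors = 0;
            for (int held = 0; held < x.length; held++) {
                if (predict(x, y, numClasses, k, held) != y[held]) {
                    errors++;
                }
            }
            if (errors < bestErrors) {
                bestErrors = errors;
                bestK = k;
            }
        }
        return bestK;
    }

    /** Majority vote among the k points nearest to x[held], with x[held] itself left out. */
    static int predict(double[] x, int[] y, int numClasses, int k, int held) {
        Integer[] order = new Integer[x.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        final double target = x[held];
        // Sort point indices by absolute distance to the held-out point.
        Arrays.sort(order, (a, b) ->
                Double.compare(Math.abs(x[a] - target), Math.abs(x[b] - target)));
        int[] votes = new int[numClasses];
        int used = 0;
        for (int idx : order) {
            if (idx == held) continue; // never let a point vote for itself
            if (used == k) break;
            votes[y[idx]]++;
            used++;
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (votes[c] > votes[best]) best = c; // vote ties keep the lower class index
        }
        return best;
    }

    public static void main(String[] args) {
        // Two clusters plus one mislabelled point (5.15 carries class 0) inside the second cluster.
        double[] x = {1.0, 1.1, 1.2, 5.0, 5.1, 5.2, 5.15};
        int[] y = {0, 0, 0, 1, 1, 1, 0};
        System.out.println("selected k = " + selectK(x, y, 2, 3));
    }
}
```

In this toy data the mislabelled point misleads its immediate neighbours when k is small, so a larger k yields fewer hold-one-out errors.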

nearestNeighbourSearchAlgorithmTipText

public java.lang.String nearestNeighbourSearchAlgorithmTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the explorer/experimenter gui

getNearestNeighbourSearchAlgorithm

public NearestNeighbourSearch getNearestNeighbourSearchAlgorithm()
Returns the current nearestNeighbourSearch algorithm in use.

Returns:
the NearestNeighbourSearch algorithm currently in use.

setNearestNeighbourSearchAlgorithm

public void setNearestNeighbourSearchAlgorithm(NearestNeighbourSearch nearestNeighbourSearchAlgorithm)
Sets the nearestNeighbourSearch algorithm to be used for finding nearest neighbour(s).

Parameters:
nearestNeighbourSearchAlgorithm - the NearestNeighbourSearch class.

getNumTraining

public int getNumTraining()
Get the number of training instances the classifier is currently using.

Returns:
the number of training instances the classifier is currently using

getCapabilities

public Capabilities getCapabilities()
Returns default capabilities of the classifier.

Specified by:
getCapabilities in interface Classifier
Specified by:
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class AbstractClassifier
Returns:
the capabilities of this classifier
See Also:
Capabilities

buildClassifier

public void buildClassifier(Instances instances)
                     throws java.lang.Exception
Generates the classifier.

Specified by:
buildClassifier in interface Classifier
Parameters:
instances - set of instances serving as training data
Throws:
java.lang.Exception - if the classifier has not been generated successfully

updateClassifier

public void updateClassifier(Instance instance)
                      throws java.lang.Exception
Adds the supplied instance to the training set.

Specified by:
updateClassifier in interface UpdateableClassifier
Parameters:
instance - the instance to add
Throws:
java.lang.Exception - if instance could not be incorporated successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Specified by:
distributionForInstance in interface Classifier
Overrides:
distributionForInstance in class AbstractClassifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if an error occurred during the prediction

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class AbstractClassifier
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Valid options are:

 -I
  Weight neighbours by the inverse of their distance
  (use when k > 1)
 -F
  Weight neighbours by 1 - their distance
  (use when k > 1)
 -K <number of neighbors>
  Number of nearest neighbours (k) used in classification.
  (Default = 1)
 -E
  Minimise mean squared error rather than mean absolute
  error when using -X option with numeric prediction.
 -W <window size>
  Maximum number of training instances maintained.
  Training instances are dropped FIFO. (Default = no window)
 -X
  Select the number of nearest neighbours between 1
  and the k value specified using hold-one-out evaluation
  on the training data (use when k > 1)
 -A
  The nearest neighbour search algorithm to use (default: weka.core.neighboursearch.LinearNNSearch).
 

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class AbstractClassifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of IBk.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class AbstractClassifier
Returns:
an array of strings suitable for passing to setOptions()

enumerateMeasures

public java.util.Enumeration enumerateMeasures()
Returns an enumeration of the additional measure names produced by the neighbour search algorithm, plus the chosen K in case cross-validation is enabled.

Specified by:
enumerateMeasures in interface AdditionalMeasureProducer
Returns:
an enumeration of the measure names

getMeasure

public double getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure: either a measure from the neighbour search algorithm or, when cross-validation is enabled, the chosen K.

Specified by:
getMeasure in interface AdditionalMeasureProducer
Parameters:
additionalMeasureName - the name of the measure to query for its value
Returns:
the value of the named measure
Throws:
java.lang.IllegalArgumentException - if the named measure is not supported

toString

public java.lang.String toString()
Returns a description of this classifier.

Overrides:
toString in class java.lang.Object
Returns:
a description of this classifier as a string.

pruneToK

public Instances pruneToK(Instances neighbours,
                          double[] distances,
                          int k)
Prunes the list to contain the k nearest neighbors. If there are multiple neighbors at the k'th distance, all will be kept.

Parameters:
neighbours - the neighbour instances.
distances - the distances of the neighbours from target instance.
k - the number of neighbors to keep.
Returns:
the pruned neighbours.
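
The tie rule for `pruneToK` (all neighbours at the k'th distance are kept) means the pruned list can be longer than k. The sketch below shows only that counting rule. It is not the Weka implementation: the class `PruneToKSketch` and the method `keepCount` are hypothetical, the distances are assumed to be sorted in ascending order already, ties are detected with exact equality, and rejecting k < 1 is a choice made for this example.

```java
final class PruneToKSketch {

    /**
     * Given distances sorted in ascending order, returns how many neighbours remain
     * after pruning to k when every neighbour tied at the k-th distance is kept.
     */
    static int keepCount(double[] sortedDistances, int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        if (sortedDistances.length <= k) {
            return sortedDistances.length; // nothing to prune
        }
        double kthDistance = sortedDistances[k - 1];
        int keep = k;
        // Extend the cut-off past k while further neighbours share the k-th distance.
        while (keep < sortedDistances.length && sortedDistances[keep] == kthDistance) {
            keep++;
        }
        return keep;
    }

    public static void main(String[] args) {
        double[] distances = {0.1, 0.2, 0.2, 0.2, 0.5};
        // Asking for 2 neighbours keeps 4, because three neighbours tie at distance 0.2.
        System.out.println(keepCount(distances, 2));
    }
}
```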

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class AbstractClassifier
Returns:
the revision

main

public static void main(java.lang.String[] argv)
Main method for testing this class.

Parameters:
argv - should contain command line options (see setOptions)