weka.core.converters
Class ArffLoader

java.lang.Object
  extended by weka.core.converters.AbstractLoader
      extended by weka.core.converters.AbstractFileLoader
          extended by weka.core.converters.ArffLoader
All Implemented Interfaces:
java.io.Serializable, BatchConverter, FileSourcedConverter, IncrementalConverter, Loader, URLSourcedLoader, EnvironmentHandler, OptionHandler, RevisionHandler

public class ArffLoader
extends AbstractFileLoader
implements OptionHandler, BatchConverter, IncrementalConverter, URLSourcedLoader

Reads a source that is in ARFF (Attribute-Relation File Format).

Version:
$Revision: 7792 $
Author:
Mark Hall, FracPete (fracpete at waikato dot ac dot nz)
See Also:
Loader, Serialized Form
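
For orientation, an ARFF source is plain text: a header made of an @relation line and one @attribute declaration per column, then an @data marker followed by one comma-separated row per instance. Lines starting with % are comments. The following is a minimal sketch of that layout in plain Java. It is not ArffLoader's parser: the class ArffSketch and its members are hypothetical names, and the sketch ignores quoting, sparse rows, and relational attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration of the ARFF layout; this class is not part of Weka. */
final class ArffSketch {
    final String relation;
    final List<String> attributeNames;
    final int rowCount;

    private ArffSketch(String relation, List<String> attributeNames, int rowCount) {
        this.relation = relation;
        this.attributeNames = attributeNames;
        this.rowCount = rowCount;
    }

    /** Splits simple ARFF text into header information and a count of data rows. */
    static ArffSketch parse(String text) {
        String relation = null;
        List<String> names = new ArrayList<>();
        int rows = 0;
        boolean inData = false;
        for (String raw : text.split("\r?\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) {
                continue; // skip blank lines and comments
            }
            if (inData) {
                rows++; // every remaining non-comment line is one instance
                continue;
            }
            String lower = line.toLowerCase(Locale.ROOT);
            if (lower.startsWith("@relation")) {
                relation = line.substring("@relation".length()).trim();
            } else if (lower.startsWith("@attribute")) {
                // The attribute name is the first token after the keyword.
                String[] parts = line.split("\\s+", 3);
                if (parts.length > 1) {
                    names.add(parts[1]);
                }
            } else if (lower.startsWith("@data")) {
                inData = true; // the header ends here
            }
        }
        return new ArffSketch(relation, names, rows);
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
            "% toy weather data",
            "@relation weather",
            "@attribute outlook {sunny,overcast,rainy}",
            "@attribute temperature numeric",
            "@attribute play {yes,no}",
            "@data",
            "sunny,85,no",
            "overcast,83,yes");
        ArffSketch parsed = parse(sample);
        System.out.println(parsed.relation + " " + parsed.attributeNames
            + " rows=" + parsed.rowCount);
    }
}
```

The header/data split is what the loader's getStructure (header only) and getDataSet (header plus rows) methods reflect.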

Nested Class Summary
static class ArffLoader.ArffReader
          Reads data from an ARFF file in either incremental or batch mode.
 
Field Summary
static java.lang.String FILE_EXTENSION
          the file extension
static java.lang.String FILE_EXTENSION_COMPRESSED
          the file extension for compressed files
 
Fields inherited from interface weka.core.converters.Loader
BATCH, INCREMENTAL, NONE
 
Constructor Summary
ArffLoader()
           
 
Method Summary
 Instances getDataSet()
          Returns the full data set.
 java.lang.String getFileDescription()
          Returns a description of the file type.
 java.lang.String getFileExtension()
          Gets the file extension used for ARFF files.
 java.lang.String[] getFileExtensions()
          Gets all the file extensions used for this type of file
 Instance getNextInstance(Instances structure)
          Reads the data set incrementally: gets the next instance in the data set, or returns null if there are no more instances to get.
 java.lang.String[] getOptions()
          Gets the current settings of the Loader.
 boolean getRetainStringValues()
          Gets whether to retain all string values for string attributes in the header when reading incrementally.
 java.lang.String getRevision()
          Returns the revision string.
 Instances getStructure()
          Determines and returns (if possible) the structure (internally the header) of the data set as an empty set of instances.
 java.lang.String globalInfo()
          Returns a string describing this Loader
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
          Main method.
 void reset()
          Resets the Loader ready to read a new data set or the same data set again.
 java.lang.String retainStringValuesTipText()
          Returns the tip text for this property.
 java.io.File retrieveFile()
          Gets the File specified as the source.
 java.lang.String retrieveURL()
          Returns the current URL.
 void setFile(java.io.File file)
          Sets the source File.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setRetainStringValues(boolean r)
          Sets whether to retain all string values for string attributes in the header when reading incrementally.
 void setSource(java.io.InputStream in)
          Resets the Loader object and sets the source of the data set to be the supplied InputStream.
 void setSource(java.net.URL url)
          Resets the Loader object and sets the source of the data set to be the supplied URL.
 void setURL(java.lang.String url)
          Sets the URL to load from.
 
Methods inherited from class weka.core.converters.AbstractFileLoader
getUseRelativePath, runFileLoader, setEnvironment, setSource, setUseRelativePath, useRelativePathTipText
 
Methods inherited from class weka.core.converters.AbstractLoader
setRetrieval
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

FILE_EXTENSION

public static java.lang.String FILE_EXTENSION
the file extension


FILE_EXTENSION_COMPRESSED

public static java.lang.String FILE_EXTENSION_COMPRESSED
the file extension for compressed files

Constructor Detail

ArffLoader

public ArffLoader()
Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing this Loader

Returns:
a description of the Loader suitable for displaying in the Explorer/Experimenter GUI

getFileExtension

public java.lang.String getFileExtension()
Gets the file extension used for ARFF files.

Specified by:
getFileExtension in interface FileSourcedConverter
Returns:
the file extension

getFileExtensions

public java.lang.String[] getFileExtensions()
Gets all the file extensions used for this type of file

Specified by:
getFileExtensions in interface FileSourcedConverter
Returns:
the file extensions
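
As an illustration of why a loader might expose both a plain and a compressed extension, the sketch below picks a decoder from the file name. This is an assumption-laden example in plain Java, not ArffLoader's code: the class ExtensionSketch and its methods are hypothetical, the values ".arff" and ".arff.gz" are the conventional ones and are assumed here (this reference does not list them), and gzip is assumed as the compression scheme.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical illustration; this class is not part of Weka. */
final class ExtensionSketch {
    // Assumed values; the reference above names the fields but not their contents.
    static final String EXT = ".arff";
    static final String EXT_COMPRESSED = ".arff.gz";

    /** Wraps the raw stream in a gzip decoder when the name has the compressed extension. */
    static Reader open(String fileName, InputStream raw) throws IOException {
        InputStream in = fileName.endsWith(EXT_COMPRESSED) ? new GZIPInputStream(raw) : raw;
        return new InputStreamReader(in, StandardCharsets.UTF_8);
    }

    /** Reads the whole content as text, decoding gzip if the name calls for it. */
    static String readAll(String fileName, byte[] content) {
        try (Reader reader = open(fileName, new ByteArrayInputStream(content))) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Helper for the demo: gzip-compresses a string in memory. */
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String header = "@relation demo\n";
        System.out.print(readAll("demo" + EXT, header.getBytes(StandardCharsets.UTF_8)));
        System.out.print(readAll("demo" + EXT_COMPRESSED, gzip(header)));
    }
}
```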

getFileDescription

public java.lang.String getFileDescription()
Returns a description of the file type.

Specified by:
getFileDescription in interface FileSourcedConverter
Returns:
a short file description

reset

public void reset()
           throws java.io.IOException
Resets the Loader ready to read a new data set or the same data set again.

Specified by:
reset in interface Loader
Overrides:
reset in class AbstractFileLoader
Throws:
java.io.IOException - if something goes wrong

setSource

public void setSource(java.net.URL url)
               throws java.io.IOException
Resets the Loader object and sets the source of the data set to be the supplied URL.

Parameters:
url - the source URL.
Throws:
java.io.IOException - if an error occurs

retrieveFile

public java.io.File retrieveFile()
Gets the File specified as the source.

Specified by:
retrieveFile in interface FileSourcedConverter
Overrides:
retrieveFile in class AbstractFileLoader
Returns:
the source file

setFile

public void setFile(java.io.File file)
             throws java.io.IOException
Sets the source File.

Specified by:
setFile in interface FileSourcedConverter
Overrides:
setFile in class AbstractFileLoader
Parameters:
file - the source file
Throws:
java.io.IOException - if an error occurs

setURL

public void setURL(java.lang.String url)
            throws java.io.IOException
Sets the URL to load from.

Specified by:
setURL in interface URLSourcedLoader
Parameters:
url - the URL to load from
Throws:
java.io.IOException - if the URL can't be set.

retrieveURL

public java.lang.String retrieveURL()
Returns the current URL.

Specified by:
retrieveURL in interface URLSourcedLoader
Returns:
the current URL

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Valid options are:

 -R
  Retain all string attribute values when reading incrementally.

Specified by:
setOptions in interface OptionHandler
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
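
The relationship between setOptions, getOptions, and the -R flag can be sketched with a small stand-in class. This is a hedged illustration in plain Java, not the loader's implementation: OptionsSketch is a hypothetical class, and unlike the documented method it silently ignores unknown options instead of throwing.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an option handler with a single -R flag; not part of Weka. */
final class OptionsSketch {
    private boolean retainStringValues;

    /** Parses the option array; -R switches string-value retention on. */
    void setOptions(String[] options) {
        retainStringValues = false; // the flag is off unless -R is present
        for (String option : options) {
            if ("-R".equals(option)) {
                retainStringValues = true;
            }
        }
    }

    /** Returns the current settings in a form that setOptions accepts. */
    String[] getOptions() {
        List<String> result = new ArrayList<>();
        if (retainStringValues) {
            result.add("-R");
        }
        return result.toArray(new String[0]);
    }

    boolean getRetainStringValues() {
        return retainStringValues;
    }

    public static void main(String[] args) {
        OptionsSketch first = new OptionsSketch();
        first.setOptions(new String[] {"-R"});

        // Round trip: the output of getOptions configures a second object identically.
        OptionsSketch second = new OptionsSketch();
        second.setOptions(first.getOptions());
        System.out.println(second.getRetainStringValues());
    }
}
```

The round trip in main mirrors the documented contract that getOptions returns an array suitable for passing to setOptions.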

setRetainStringValues

public void setRetainStringValues(boolean r)
Sets whether to retain all string values for string attributes in the header when reading incrementally.

Parameters:
r - true if all string values are to be stored (as opposed to just the current one).

getRetainStringValues

public boolean getRetainStringValues()
Gets whether to retain all string values for string attributes in the header when reading incrementally.

Returns:
true if all string values are to be stored (as opposed to just the current one).

retainStringValuesTipText

public java.lang.String retainStringValuesTipText()
Returns the tip text for this property.

Returns:
the tip text

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the Loader.

Specified by:
getOptions in interface OptionHandler
Returns:
an array of strings suitable for passing to setOptions

setSource

public void setSource(java.io.InputStream in)
               throws java.io.IOException
Resets the Loader object and sets the source of the data set to be the supplied InputStream.

Specified by:
setSource in interface Loader
Overrides:
setSource in class AbstractLoader
Parameters:
in - the source InputStream.
Throws:
java.io.IOException - if an error occurs

getStructure

public Instances getStructure()
                       throws java.io.IOException
Determines and returns (if possible) the structure (internally the header) of the data set as an empty set of instances.

Specified by:
getStructure in interface Loader
Specified by:
getStructure in class AbstractLoader
Returns:
the structure of the data set as an empty set of Instances
Throws:
java.io.IOException - if an error occurs

getDataSet

public Instances getDataSet()
                     throws java.io.IOException
Returns the full data set. If the structure hasn't yet been determined by a call to getStructure, this method does so before processing the rest of the data set.

Specified by:
getDataSet in interface Loader
Specified by:
getDataSet in class AbstractLoader
Returns:
the full data set as a set of Instances
Throws:
java.io.IOException - if there is no source or parsing fails

getNextInstance

public Instance getNextInstance(Instances structure)
                         throws java.io.IOException
Reads the data set incrementally: gets the next instance in the data set, or returns null if there are no more instances to get. If the structure hasn't yet been determined by a call to getStructure, this method does so before returning the next instance in the data set.

Specified by:
getNextInstance in interface Loader
Specified by:
getNextInstance in class AbstractLoader
Parameters:
structure - the data set header information; it is updated in the case of string or relational attributes
Returns:
the next instance in the data set as an Instance object or null if there are no more instances to be read
Throws:
java.io.IOException - if there is an error during parsing
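
The incremental contract described above (obtain the structure, then request instances until null comes back) can be sketched without Weka on the classpath. Everything in this example is a hypothetical stand-in: RowSource plays the role of the loader, structure() the role of getStructure, and nextRow(structure) the role of getNextInstance(structure). It is a sketch of the calling pattern only, not of how ArffLoader reads data.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration of a null-terminated incremental read; not part of Weka. */
final class IncrementalSketch {
    /** Stand-in for a loader that is read one row at a time. */
    interface RowSource {
        List<String> structure();                  // header only, no rows
        String[] nextRow(List<String> structure);  // null once the source is exhausted
    }

    /** Builds an in-memory source so the example runs without any files. */
    static RowSource inMemory(final List<String> header, List<String[]> rows) {
        final Deque<String[]> pending = new ArrayDeque<>(rows);
        return new RowSource() {
            public List<String> structure() {
                return header;
            }

            public String[] nextRow(List<String> structure) {
                return pending.poll(); // poll() returns null when nothing is left
            }
        };
    }

    /** Drains the source one row at a time and stops at the first null. */
    static int countRows(RowSource source) {
        List<String> structure = source.structure(); // fetch the header first
        int count = 0;
        while (source.nextRow(structure) != null) {
            count++; // a real consumer would process the row here
        }
        return count;
    }

    public static void main(String[] args) {
        RowSource source = inMemory(
            Arrays.asList("outlook", "play"),
            Arrays.asList(new String[] {"sunny", "no"}, new String[] {"rainy", "yes"}));
        System.out.println(countRows(source));
    }
}
```

Only one row is held at a time, which is the point of incremental reading for data sets too large to load in batch mode.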

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Returns:
the revision

main

public static void main(java.lang.String[] args)
Main method.

Parameters:
args - should contain the name of an input file.