it.unimi.dsi.io
Class LineWordReader

java.lang.Object
  extended by it.unimi.dsi.io.LineWordReader
All Implemented Interfaces:
WordReader, Serializable

public class LineWordReader
extends Object
implements WordReader, Serializable

A trivial WordReader that considers each line of a document a single word.

This class is intended for indexing data such as lists of document identifiers: if the identifiers contain non-alphabetical characters, the default FastBufferedReader might break them into several words.

Note that the non-word returned by next(MutableString, MutableString) is always empty.
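
The difference between line-based and character-class-based word extraction can be illustrated with a small sketch that uses only the standard library. It is not the implementation of this class. The class name LineVersusWordSplit and both helper methods are hypothetical, and the rule "a word is a maximal run of letters and digits" is an assumed stand-in for what a default word reader does.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of it.unimi.dsi.io.
class LineVersusWordSplit {

    // One word per line, mirroring the behaviour described for LineWordReader.
    static List<String> lineWords(String text) {
        List<String> words = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) words.add(line);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    // Assumed default rule: a word is a maximal run of letters and digits.
    static List<String> alphanumericWords(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                words.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) words.add(current.toString());
        return words;
    }

    public static void main(String[] args) {
        String ids = "doc-0001/a\nurn:x:42\n";
        System.out.println(lineWords(ids));         // prints [doc-0001/a, urn:x:42]
        System.out.println(alphanumericWords(ids)); // prints [doc, 0001, a, urn, x, 42]
    }
}
```

With the line-based rule each identifier survives as one indexable term, while the assumed alphanumeric rule scatters it into fragments.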

See Also:
Serialized Form

Constructor Summary
LineWordReader()
           
 
Method Summary
 LineWordReader copy()
          Returns a copy of this word reader.
 boolean next(MutableString word, MutableString nonWord)
          Extracts the next word and non-word.
 LineWordReader setReader(Reader reader)
          Resets the internal state of this word reader, which will start again reading from the given reader.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LineWordReader

public LineWordReader()

Method Detail

next

public boolean next(MutableString word,
                    MutableString nonWord)
             throws IOException
Description copied from interface: WordReader
Extracts the next word and non-word.

If this method returns true, a new word, and possibly a new non-word, have been extracted. The word is non-empty, with one exception: the first call to this method after creation or after a call to WordReader.setReader(Reader) may return an empty word. In other words, both word and nonWord are maximal.

Specified by:
next in interface WordReader
Parameters:
word - the next word returned by the underlying reader.
nonWord - the non-word following the next word returned by the underlying reader.
Returns:
true if a new word was processed, false otherwise (in which case both word and nonWord are unchanged).
Throws:
IOException
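
The calling pattern implied by this contract is a loop that runs until next returns false. The sketch below is a minimal stand-in written against the standard library only, so that it runs without this package: LineWordSketch is a hypothetical class, StringBuilder replaces MutableString, and the drain helper exists purely for illustration. It follows the documented contract (buffers left unchanged when false is returned, non-word always empty) and is not the actual source of LineWordReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a line-based word reader; StringBuilder replaces MutableString.
class LineWordSketch {
    private BufferedReader in;

    // Starts reading from the given reader and returns this object, so calls can be chained.
    LineWordSketch setReader(Reader reader) {
        in = new BufferedReader(reader);
        return this;
    }

    // Fills word with the next line and empties nonWord; returns false at end of input.
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        String line = in.readLine();
        if (line == null) return false; // end of input: both buffers are left unchanged
        word.setLength(0);
        word.append(line);
        nonWord.setLength(0);           // the non-word is always empty
        return true;
    }

    // Illustration helper: joins every extracted word (plus its non-word) with '|'.
    static String drain(String text) {
        LineWordSketch reader = new LineWordSketch().setReader(new StringReader(text));
        StringBuilder word = new StringBuilder();
        StringBuilder nonWord = new StringBuilder();
        StringBuilder out = new StringBuilder();
        try {
            while (reader.next(word, nonWord)) {
                if (out.length() > 0) out.append('|');
                out.append(word).append(nonWord);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(drain("doc-0001/a\nurn:x:42\n")); // prints doc-0001/a|urn:x:42
    }
}
```

Because the two buffers are reused across iterations, a caller that needs to keep a word beyond the next call should copy it first.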

setReader

public LineWordReader setReader(Reader reader)
Description copied from interface: WordReader
Resets the internal state of this word reader, which will start again reading from the given reader.

Specified by:
setReader in interface WordReader
Parameters:
reader - the new reader providing characters.
Returns:
this word reader.

copy

public LineWordReader copy()
Description copied from interface: WordReader
Returns a copy of this word reader.

This method must return a word reader with a behaviour that matches exactly that of this word reader.

Specified by:
copy in interface WordReader
Returns:
a copy of this word reader.