Package it.unimi.dsi.io
Class LineWordReader
java.lang.Object
it.unimi.dsi.io.LineWordReader
- All Implemented Interfaces:
WordReader, Serializable
public class LineWordReader extends Object implements WordReader, Serializable
A trivial WordReader that considers each line of a document a single word.
This class is intended for indexing data such as lists of document
identifiers: if the identifiers contain nonalphabetical characters, the default
FastBufferedReader might break them into several words.
Note that the non-word returned by next(MutableString, MutableString) is
always empty.
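The difference can be illustrated with a small sketch that uses only the Java standard library. It does not call FastBufferedReader or LineWordReader. It assumes a default tokenizer that treats only letters and digits as word constituents, and the class and method names (IdentifierTokenizationSketch, alphanumericWords, lineWords) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IdentifierTokenizationSketch {

    /**
     * Splits the text into maximal runs of letters and digits.
     * Assumption: this mimics a default alphanumeric word definition.
     */
    static List<String> alphanumericWords(String text) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-constituent character ends the current word.
                words.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    /** Treats each line of the text as a single word. */
    static List<String> lineWords(String text) {
        return new ArrayList<>(Arrays.asList(text.split("\n")));
    }

    public static void main(String[] args) {
        String identifiers = "doc-42/a_b\nhttp://x.org/1";
        // The alphanumeric tokenizer breaks each identifier into fragments.
        System.out.println(alphanumericWords(identifiers));
        // The line-based tokenizer keeps each identifier whole.
        System.out.println(lineWords(identifiers));
    }
}
```

Under these assumptions the identifier doc-42/a_b is split into four fragments by the alphanumeric tokenizer, while the line-based one keeps it as a single word.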
- See Also:
- Serialized Form
Constructor Summary
Constructors
- LineWordReader()
Method Summary
- LineWordReader copy(): Returns a copy of this word reader.
- boolean next(MutableString word, MutableString nonWord): Extracts the next word and non-word.
- LineWordReader setReader(Reader reader): Resets the internal state of this word reader, which will start again reading from the given reader.
Constructor Details
LineWordReader
public LineWordReader()
Method Details
next
Description copied from interface: WordReader
Extracts the next word and non-word.
If this method returns true, a new non-empty word, and possibly a new non-word, have been extracted. It is acceptable that the first call to this method after creation or after a call to WordReader.setReader(Reader) returns an empty word. In other words, both word and nonWord are maximal.
- Specified by:
next in interface WordReader
- Parameters:
word - the next word returned by the underlying reader.
nonWord - the nonword following the next word returned by the underlying reader.
- Returns:
- true if a new word was processed, false otherwise (in which case both word and nonWord are unchanged).
- Throws:
IOException
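The contract described above can be sketched with a stand-in built only on the Java standard library. This is not the library's implementation: LinePerWordSketch is a hypothetical class, StringBuilder stands in for MutableString, and the body simply follows the documented behaviour (each line becomes the word, the non-word is always empty, and both arguments are left unchanged when false is returned).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class LinePerWordSketch {
    private BufferedReader in;

    /** Starts reading from the given reader and returns this object for chaining. */
    LinePerWordSketch setReader(Reader reader) {
        in = new BufferedReader(reader);
        return this;
    }

    /** Stores the next line in word and empties nonWord; returns false at end of input. */
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        String line = in.readLine();
        if (line == null) {
            return false; // both arguments are left unchanged
        }
        word.setLength(0);
        word.append(line);
        nonWord.setLength(0); // the non-word is always empty
        return true;
    }

    /** Typical calling loop: collects every word until next returns false. */
    static List<String> words(Reader reader) throws IOException {
        LinePerWordSketch r = new LinePerWordSketch().setReader(reader);
        StringBuilder word = new StringBuilder();
        StringBuilder nonWord = new StringBuilder();
        List<String> result = new ArrayList<>();
        while (r.next(word, nonWord)) {
            result.add(word.toString());
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(words(new StringReader("id:1\nid/2\n")));
    }
}
```

The same word and nonWord buffers are reused across calls, so a caller that needs to keep a word must copy it before calling next again.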
setReader
Description copied from interface: WordReader
Resets the internal state of this word reader, which will start again reading from the given reader.
- Specified by:
setReader in interface WordReader
- Parameters:
reader - the new reader providing characters.
- Returns:
- this word reader.
copy
Description copied from interface: WordReader
Returns a copy of this word reader.
This method must return a word reader with a behaviour that matches exactly that of this word reader.
- Specified by:
copy in interface WordReader
- Returns:
- a copy of this word reader.