public class RuleBasedBreakIterator extends BreakIterator
DONE, KIND_CHARACTER, KIND_LINE, KIND_SENTENCE, KIND_TITLE, KIND_WORD, WORD_IDEO, WORD_IDEO_LIMIT, WORD_KANA, WORD_KANA_LIMIT, WORD_LETTER, WORD_LETTER_LIMIT, WORD_NONE, WORD_NONE_LIMIT, WORD_NUMBER, WORD_NUMBER_LIMIT
Constructor and Description |
---|
RuleBasedBreakIterator(java.lang.String rules)
Construct a RuleBasedBreakIterator from a set of rules supplied as a string.
|
Modifier and Type | Method and Description |
---|---|
protected static void |
checkOffset(int offset,
java.text.CharacterIterator text)
Throw IllegalArgumentException unless begin <= offset < end.
|
java.lang.Object |
clone()
Clones this iterator.
|
static void |
compileRules(java.lang.String rules,
java.io.OutputStream ruleBinary)
Compile a set of source break rules into the binary state tables used
by the break iterator engine.
|
int |
current()
Returns the current iteration position.
|
void |
dump(java.io.PrintStream out)
Deprecated.
This API is ICU internal only.
|
boolean |
equals(java.lang.Object that)
Returns true if both BreakIterators are of the same class, have the same
rules, and iterate over the same text.
|
int |
first()
Sets the current iteration position to the beginning of the text.
|
int |
following(int startPos)
Sets the iterator to refer to the first boundary position following
the specified position.
|
static RuleBasedBreakIterator |
getInstanceFromCompiledRules(java.nio.ByteBuffer bytes)
Deprecated.
This API is ICU internal only.
|
static RuleBasedBreakIterator |
getInstanceFromCompiledRules(java.io.InputStream is)
Create a break iterator from a precompiled set of break rules.
|
int |
getRuleStatus()
Return the status tag from the break rule that determined the most recently
returned break position.
|
int |
getRuleStatusVec(int[] fillInArray)
Get the status (tag) values from the break rule(s) that determined the most
recently returned break position.
|
java.text.CharacterIterator |
getText()
Return a CharacterIterator over the text being analyzed.
|
int |
hashCode()
Computes a hash code for this BreakIterator.
|
boolean |
isBoundary(int offset)
Returns true if the specified position is a boundary position.
|
int |
last()
Sets the current iteration position to the end of the text.
|
int |
next()
Advances the iterator to the next boundary position.
|
int |
next(int n)
Advances the iterator either forward or backward the specified number of steps.
|
int |
preceding(int offset)
Sets the iterator to refer to the last boundary position before the
specified position.
|
int |
previous()
Moves the iterator backwards, to the boundary preceding the current one.
|
void |
setText(java.text.CharacterIterator newText)
Set the iterator to analyze a new piece of text.
|
java.lang.String |
toString()
Returns the description (rules) used to create this iterator.
|
getAvailableLocales, getAvailableULocales, getBreakInstance, getCharacterInstance, getCharacterInstance, getCharacterInstance, getLineInstance, getLineInstance, getLineInstance, getLocale, getSentenceInstance, getSentenceInstance, getSentenceInstance, getTitleInstance, getTitleInstance, getTitleInstance, getWordInstance, getWordInstance, getWordInstance, registerInstance, registerInstance, setText, setText, unregister
public RuleBasedBreakIterator(java.lang.String rules)
rules
- The break rules to be used.
public static RuleBasedBreakIterator getInstanceFromCompiledRules(java.io.InputStream is) throws java.io.IOException
is
- an input stream supplying the compiled binary rules.
java.io.IOException
- if there is an error while reading the rules from the InputStream.
See Also: compileRules(String, OutputStream)
@Deprecated public static RuleBasedBreakIterator getInstanceFromCompiledRules(java.nio.ByteBuffer bytes) throws java.io.IOException
bytes
- a buffer supplying the compiled binary rules.
java.io.IOException
- if there is an error while reading the rules from the buffer.
See Also: compileRules(String, OutputStream)
public java.lang.Object clone()
clone
in class BreakIterator
public boolean equals(java.lang.Object that)
equals
in class java.lang.Object
public java.lang.String toString()
toString
in class java.lang.Object
public int hashCode()
hashCode
in class java.lang.Object
@Deprecated public void dump(java.io.PrintStream out)
public static void compileRules(java.lang.String rules, java.io.OutputStream ruleBinary) throws java.io.IOException
rules
- The source form of the break rules.
ruleBinary
- An output stream to receive the compiled rules.
java.io.IOException
- If there is an error writing the output.
See Also: getInstanceFromCompiledRules(InputStream)
public int first()
first
in class BreakIterator
public int last()
last
in class BreakIterator
public int next(int n)
next
in class BreakIterator
n
- The number of steps to move. The sign indicates the direction
(negative is backwards, and positive is forwards).
public int next()
next
in class BreakIterator
public int previous()
previous
in class BreakIterator
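The sequential methods above (first(), next(), next(int), previous(), last()) are usually combined into a loop that runs until next() returns DONE. The block below is a minimal sketch of that pattern and is not taken from the ICU documentation. It assumes the JDK's java.text.BreakIterator as a stand-in so that it runs without ICU4J on the classpath. The same method names exist on com.ibm.icu.text.BreakIterator, though ICU's rules may place boundaries differently for some text. The class name BoundaryWalk and the helper boundaries are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class BoundaryWalk {
    /** Collects every boundary offset, starting at first() and stopping when next() returns DONE. */
    static List<Integer> boundaries(BreakIterator bi, String text) {
        bi.setText(text);
        List<Integer> out = new ArrayList<>();
        for (int pos = bi.first(); pos != BreakIterator.DONE; pos = bi.next()) {
            out.add(pos);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumption: the JDK word iterator stands in for an ICU word iterator.
        BreakIterator bi = BreakIterator.getWordInstance(Locale.ENGLISH);

        // With the JDK iterator this prints [0, 5, 6, 7, 12].
        System.out.println(boundaries(bi, "Hello, world"));

        bi.last();                          // position at the end of the text (12)
        System.out.println(bi.previous());  // boundary before the end: 7

        bi.first();                         // back to offset 0
        System.out.println(bi.next(2));     // two steps forward: 0 -> 5 -> 6
    }
}
```

A negative argument to next(int) walks backwards by the same number of steps, so bi.next(-1) behaves like bi.previous().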
public int following(int startPos)
following
in class BreakIterator
startPos
- The position from which to begin searching for a break position.
public int preceding(int offset)
preceding
in class BreakIterator
offset
- The position to begin searching for a break from.
protected static final void checkOffset(int offset, java.text.CharacterIterator text)
public boolean isBoundary(int offset)
isBoundary
in class BreakIterator
offset
- the offset to check.
public int current()
current
in class BreakIterator
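The random-access methods following(int), preceding(int), and isBoundary(int), together with current(), can be used to find the segment that contains an arbitrary offset. The block below is a hedged sketch of one way to do that. As an assumption, it uses the JDK's java.text.BreakIterator so that it runs without ICU4J. The method names match those documented here, but boundary placement under ICU's rules may differ for some text. BoundaryLookup and segmentAt are hypothetical names.

```java
import java.text.BreakIterator;
import java.util.Locale;

final class BoundaryLookup {
    /** Returns {start, end} of the segment that contains the given offset. */
    static int[] segmentAt(BreakIterator bi, String text, int offset) {
        bi.setText(text);
        int end = bi.following(offset); // first boundary strictly after offset
        int start = bi.previous();      // boundary immediately before that one
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        // Assumption: the JDK word iterator stands in for an ICU word iterator.
        BreakIterator bi = BreakIterator.getWordInstance(Locale.ENGLISH);
        String text = "Hello, world";

        int[] seg = segmentAt(bi, text, 8);                      // offset 8 is inside "world"
        System.out.println(text.substring(seg[0], seg[1]));      // prints world

        System.out.println(bi.isBoundary(7));  // true: "world" starts at 7
        System.out.println(bi.isBoundary(8));  // false: 8 is inside the word
        System.out.println(bi.preceding(8));   // 7, the last boundary before 8
        System.out.println(bi.current());      // 7, preceding() moved the iterator
    }
}
```

Each of these calls repositions the iterator, so current() reflects whichever lookup ran last.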
public int getRuleStatus()
Of the standard types of ICU break iterators, only the word and line break
iterators provide status values. The values are defined in
class RuleBasedBreakIterator, and make it possible to distinguish between words
that contain alphabetic letters, "words" that appear to be numbers,
punctuation and spaces, words containing ideographic characters, and
more. Call getRuleStatus after obtaining a boundary position from next(),
previous(), or any other break iterator function that returns a boundary position.
getRuleStatus
in class BreakIterator
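The word status constants come in pairs (WORD_LETTER and WORD_LETTER_LIMIT, and so on), and a tag is normally interpreted by testing which half-open range it falls into. The block below is a minimal sketch of that test. As a loudly stated assumption, the numeric values are hand-copied from ICU4J so that the example compiles without the library. They should be checked against the ICU version in use, and real code should reference the constants on RuleBasedBreakIterator directly. WordStatus and classify are hypothetical names.

```java
final class WordStatus {
    // Assumption: values hand-copied from ICU4J's WORD_* constants so that this
    // sketch compiles standalone. Prefer RuleBasedBreakIterator.WORD_* in real code.
    static final int WORD_NONE = 0,     WORD_NONE_LIMIT = 100;
    static final int WORD_NUMBER = 100, WORD_NUMBER_LIMIT = 200;
    static final int WORD_LETTER = 200, WORD_LETTER_LIMIT = 300;
    static final int WORD_KANA = 300,   WORD_KANA_LIMIT = 400;
    static final int WORD_IDEO = 400,   WORD_IDEO_LIMIT = 500;

    /** Maps a rule status tag to a coarse category using half-open ranges [X, X_LIMIT). */
    static String classify(int status) {
        if (status >= WORD_NONE && status < WORD_NONE_LIMIT) return "none";      // spaces, punctuation
        if (status >= WORD_NUMBER && status < WORD_NUMBER_LIMIT) return "number";
        if (status >= WORD_LETTER && status < WORD_LETTER_LIMIT) return "letter";
        if (status >= WORD_KANA && status < WORD_KANA_LIMIT) return "kana";
        if (status >= WORD_IDEO && status < WORD_IDEO_LIMIT) return "ideographic";
        return "unknown"; // tag outside the standard ranges, e.g. from custom rules
    }

    public static void main(String[] args) {
        // With ICU4J on the classpath, the tag would come from the iterator:
        //   int end = bi.next();
        //   String kind = classify(bi.getRuleStatus());
        System.out.println(classify(200)); // prints letter
        System.out.println(classify(0));   // prints none
    }
}
```

Testing against ranges rather than exact values leaves room for rule sets that assign more specific tags inside a category.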
public int getRuleStatusVec(int[] fillInArray)
The status values used by the standard ICU break rules are defined as public constants in class RuleBasedBreakIterator.
If the size of the output array is insufficient to hold the data, the output will be truncated to the available length. No exception will be thrown.
getRuleStatusVec
in class BreakIterator
fillInArray
- an array to be filled in with the status values.
public java.text.CharacterIterator getText()
getText
in class BreakIterator
public void setText(java.text.CharacterIterator newText)
setText
in class BreakIterator
newText
- An iterator over the text to analyze.
Copyright © 2016 Unicode, Inc. and others.