Class UniCharIterator
- java.lang.Object
-
- com.adobe.xfa.ut.UniCharIterator
-
public class UniCharIterator extends java.lang.Object
Allow iteration by Unicode characters over a Java UTF-16 encoded string. A Java character is only a 16-bit quantity. Unicode characters can have values up to 0x10FFFF, which exceeds the range of a single Java character (at most 0xFFFF). When Unicode characters above 0xFFFF appear in a Java string, they are encoded using UTF-16 and occupy two consecutive Java characters, known as a surrogate pair.
This class allows the caller to step through a Java string one true Unicode character at a time. It also provides some static methods to generate Java characters from Unicode characters.
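The surrogate pair encoding mentioned above can be sketched with plain arithmetic. This is an illustration of standard UTF-16 encoding, not code taken from UniCharIterator; the class name `SurrogateMath` and the method `encode` are hypothetical, and the sketch assumes the input is a valid Unicode character (it does not reject values above 0x10FFFF or values in the surrogate range).

```java
final class SurrogateMath {
    // Encode one Unicode character as one or two Java chars (UTF-16).
    // Assumes codePoint is valid; no range checking is done in this sketch.
    static char[] encode(int codePoint) {
        if (codePoint < 0x10000) {
            // Fits in a single 16-bit Java character.
            return new char[] { (char) codePoint };
        }
        int offset = codePoint - 0x10000;                 // 20-bit value
        char high = (char) (0xD800 + (offset >> 10));     // top 10 bits
        char low  = (char) (0xDC00 + (offset & 0x3FF));   // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] pair = encode(0x1F600);
        System.out.println(pair.length);
        System.out.println(Integer.toHexString(pair[0]) + " " + Integer.toHexString(pair[1]));
    }
}
```

The JDK's own `Character.toChars(int)` performs the same conversion and additionally validates its argument.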
An iterator instance is associated with an instance of the Java CharSequence interface. This interface is implemented by both the String and StringBuilder classes.
At any given time, one can think of the iterator as being positioned between characters in the associated character sequence. It can also be positioned before the first character and after the last. Operations move the iterator forward or backward in the underlying character sequence and return the Unicode character passed over.
The iterator carries an index number that can be useful for indexing into the character sequence independently of the iterator. Index values start at zero and count up to the number of Java characters in the sequence. Index zero is before the first character, index one is between the first and second characters, and so on.
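To make the index numbering concrete, the following sketch lists every index at which an iterator could validly sit for a given sequence. It uses only JDK methods as a stand-in for the iterator; `BoundaryIndexDemo` and `boundaries` are hypothetical names, and well-formed UTF-16 input is assumed.

```java
import java.util.ArrayList;
import java.util.List;

final class BoundaryIndexDemo {
    // Collect the Java character indexes that fall between Unicode characters,
    // including index 0 (before the first) and text.length() (after the last).
    static List<Integer> boundaries(CharSequence text) {
        List<Integer> result = new ArrayList<>();
        int index = 0;
        result.add(index);
        while (index < text.length()) {
            int codePoint = Character.codePointAt(text, index);
            index += Character.charCount(codePoint);   // 1, or 2 for a surrogate pair
            result.add(index);
        }
        return result;
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (a surrogate pair), 'b'
        System.out.println(boundaries("a\uD83D\uDE00b"));
    }
}
```

For the sample string, index 2 is absent from the result because it lies between the two halves of the surrogate pair.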
It does not make sense for the iterator to be positioned between the two Java characters making up a surrogate pair. Subsequent operations could lead to assertion errors and unpredictable results.
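A caller that computes indexes independently may want to verify that an index does not split a surrogate pair before passing it to the iterator. The check below is a sketch of one way to do that with JDK methods; `SurrogateBoundaryCheck` and `splitsSurrogatePair` are hypothetical names and are not part of this class.

```java
final class SurrogateBoundaryCheck {
    // True if index lies between the high and low halves of a surrogate pair.
    static boolean splitsSurrogatePair(CharSequence text, int index) {
        if (index <= 0 || index >= text.length()) {
            return false;   // before the first or after the last Java character
        }
        return Character.isHighSurrogate(text.charAt(index - 1))
            && Character.isLowSurrogate(text.charAt(index));
    }

    public static void main(String[] args) {
        String text = "a\uD83D\uDE00b";
        for (int i = 0; i <= text.length(); i++) {
            System.out.println(i + ": " + splitsSurrogatePair(text, i));
        }
    }
}
```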
Note: The iterator caches the length of the given character sequence. If the caller modifies the sequence while using an iterator, in such a way that its length changes, it must call an attach() overload to re-establish the length.
-
-
Constructor Summary
Constructor and Description
UniCharIterator()
Default constructor.
UniCharIterator(java.lang.CharSequence charSequence)
Construct an iterator associated with a given character sequence.
UniCharIterator(java.lang.CharSequence charSequence, int index)
Construct an iterator associated with a given sequence and initially positioned at a specified index.
-
Method Summary
Modifier and Type, Method, and Description
static void append(java.lang.StringBuilder s, int c)
Append a Unicode character to a Java StringBuilder.
void attach(java.lang.CharSequence charSequence)
Attach the iterator to a given character sequence.
void attach(java.lang.CharSequence charSequence, int index)
Attach the iterator to a given sequence, initially positioned at a specified index.
int getIndex()
Get the current Java character index number of the iterator.
boolean isAtEnd()
Query whether the iterator is at the end of the text.
boolean isAtStart()
Query whether the iterator is at the start of the text.
int next()
Advance the iterator by one Unicode character.
int prev()
Back up the iterator by one Unicode character.
void setIndex(int index)
Set the iterator's index.
static java.lang.String toString(int c)
Return a Java string that represents the given Unicode character.
-
-
-
Constructor Detail
-
UniCharIterator
public UniCharIterator()
Default constructor. The iterator is not associated with any character sequence and is not useful until the attach() method is called.
-
UniCharIterator
public UniCharIterator(java.lang.CharSequence charSequence)
Construct an iterator associated with a given character sequence. The iterator is initially positioned before the first character in the sequence.
Parameters:
charSequence - Character sequence to associate the iterator with.
-
UniCharIterator
public UniCharIterator(java.lang.CharSequence charSequence, int index)
Construct an iterator associated with a given sequence and initially positioned at a specified index.
Parameters:
charSequence - Character sequence to associate the iterator with.
index - Index number into the character sequence, interpreted as described in the class description.
-
-
Method Detail
-
append
public static void append(java.lang.StringBuilder s, int c)
Append a Unicode character to a Java StringBuilder. This method determines whether the Unicode character can be represented as a single Java character or must be a surrogate pair. It then adds the appropriate Java character(s) to the given StringBuilder.
Parameters:
s - StringBuilder to append to.
c - Unicode character to be added.
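The decision described above (single Java character or surrogate pair) can be sketched with JDK helpers. This is an assumed equivalent for illustration, not the actual implementation of append(); the class name `AppendCodePointDemo` is hypothetical, and the sketch assumes `c` is a valid Unicode character.

```java
final class AppendCodePointDemo {
    // Append one Unicode character, as one or two Java chars.
    // Assumes c is a valid Unicode character; invalid values are not rejected here.
    static void append(StringBuilder s, int c) {
        if (Character.isBmpCodePoint(c)) {
            s.append((char) c);                      // fits in one Java character
        } else {
            s.append(Character.highSurrogate(c));    // first half of the pair
            s.append(Character.lowSurrogate(c));     // second half of the pair
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        append(sb, 'A');
        append(sb, 0x1F600);
        System.out.println(sb.length());
    }
}
```

The JDK's `StringBuilder.appendCodePoint(int)` offers comparable behavior and throws `IllegalArgumentException` for invalid values.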
-
attach
public void attach(java.lang.CharSequence charSequence)
Attach the iterator to a given character sequence. The iterator is initially positioned before the first character in the sequence.
Parameters:
charSequence - Character sequence to associate the iterator with.
-
attach
public void attach(java.lang.CharSequence charSequence, int index)
Attach the iterator to a given sequence, initially positioned at a specified index.
Parameters:
charSequence - Character sequence to associate the iterator with.
index - Index number into the character sequence, interpreted as described in the class description.
-
getIndex
public int getIndex()
Get the current Java character index number of the iterator.
Returns:
Index number, as described in the class description.
-
isAtEnd
public boolean isAtEnd()
Query whether the iterator is at the end of the text.
Returns:
True if the iterator is positioned after the last character in the underlying text; false if not.
-
isAtStart
public boolean isAtStart()
Query whether the iterator is at the start of the text.
Returns:
True if the iterator is positioned before the first character in the underlying text; false if not.
-
next
public int next()
Advance the iterator by one Unicode character. The iterator will not be advanced if it is already positioned after the last Java character in the sequence. The iterator's index will increase by one or two, depending on the makeup of the Unicode character it advances over.
Returns:
Unicode character advanced over.
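The forward-stepping behavior can be approximated with a small stand-in built on JDK methods. `ForwardCursor` is a hypothetical class written for illustration, not the real UniCharIterator. In particular, returning -1 when already at the end is an assumption of this sketch; the documentation above does not state what next() returns in that case.

```java
final class ForwardCursor {
    private final CharSequence text;
    private int index;   // Java character index, starting before the first character

    ForwardCursor(CharSequence text) {
        this.text = text;
    }

    int getIndex() {
        return index;
    }

    boolean isAtEnd() {
        return index >= text.length();
    }

    // Step over one Unicode character and return it.
    // Assumption: returns -1 and leaves the index unchanged when at the end.
    int next() {
        if (isAtEnd()) {
            return -1;
        }
        int codePoint = Character.codePointAt(text, index);
        index += Character.charCount(codePoint);   // advances by one or two
        return codePoint;
    }

    public static void main(String[] args) {
        ForwardCursor cursor = new ForwardCursor("a\uD83D\uDE00");
        while (!cursor.isAtEnd()) {
            int c = cursor.next();
            System.out.println(Integer.toHexString(c) + " -> index " + cursor.getIndex());
        }
    }
}
```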
-
prev
public int prev()
Back up the iterator by one Unicode character. The iterator will not be moved if it is already positioned before the first Java character in the sequence. The iterator's index will decrease by one or two, depending on the makeup of the Unicode character it moves over.
Returns:
Unicode character passed over.
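Backward stepping mirrors the forward case: the index decreases by two when the character just before it is a surrogate pair, and by one otherwise. The helper below is a sketch using the JDK's `Character.codePointBefore`; `BackwardStepDemo` and `previousIndex` are hypothetical names, and well-formed UTF-16 input is assumed.

```java
final class BackwardStepDemo {
    // Return the index reached after backing up one Unicode character from index.
    // The index is returned unchanged if it is already before the first character.
    static int previousIndex(CharSequence text, int index) {
        if (index <= 0) {
            return index;
        }
        int codePoint = Character.codePointBefore(text, index);
        return index - Character.charCount(codePoint);   // decreases by one or two
    }

    public static void main(String[] args) {
        String text = "a\uD83D\uDE00b";
        int index = text.length();
        while (index > 0) {
            index = previousIndex(text, index);
            System.out.println(index);
        }
    }
}
```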
-
setIndex
public void setIndex(int index)
Set the iterator's index. This method changes the index, but keeps the iterator associated with the same character sequence.
Parameters:
index - New index to set for this iterator.
-
toString
public static java.lang.String toString(int c)
Return a Java string that represents the given Unicode character.
Parameters:
c - Unicode character to convert to a Java string.
Returns:
Resulting String. If the character is less than 0x10000, the result will simply contain the single character passed in. Otherwise it will contain the two characters making up the surrogate pair.
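The 0x10000 threshold can be checked with a JDK-only equivalent. This sketch is an assumption about comparable behavior, not the implementation of toString(int); `CodePointToString` and `fromCodePoint` are hypothetical names. Unlike what is documented above, `Character.toChars` throws `IllegalArgumentException` for values that are not valid Unicode characters.

```java
final class CodePointToString {
    // Build a String holding one Unicode character (one or two Java chars).
    static String fromCodePoint(int c) {
        return new String(Character.toChars(c));
    }

    public static void main(String[] args) {
        System.out.println(fromCodePoint(0xFFFF).length());    // just below the threshold
        System.out.println(fromCodePoint(0x10000).length());   // first value needing a pair
    }
}
```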
-
-