java.lang.Object
java.util.AbstractCollection<K>
it.unimi.dsi.fastutil.objects.AbstractObjectCollection<K>
it.unimi.dsi.fastutil.objects.AbstractObjectList<MutableString>
it.unimi.dsi.util.FrontCodedStringList
public class FrontCodedStringList
Compact storage of strings using front-coding compression.
This class stores a list of strings using front-coding compression. The compression is effective only if the list is sorted, but instances of this class can also serve simply as a handy way to manage a large number of strings. It implements an immutable ObjectList that returns the i-th string (as a MutableString) when the get(int) method is called with argument i. The returned mutable string may be freely modified.
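To illustrate the idea behind front coding, here is a minimal, self-contained sketch in plain Java. It is not the library's implementation: the class FrontCodingSketch, its Entry type, and the encode and get methods are hypothetical names. It also assumes that the ratio means one string in every ratio is stored whole, while each of the others keeps only the length of the prefix it shares with its predecessor plus the remaining suffix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of front coding; not the library's actual code.
class FrontCodingSketch {
    /** One coded entry: length of the prefix shared with the previous string, plus the rest. */
    static final class Entry {
        final int sharedPrefix;
        final String suffix;

        Entry(int sharedPrefix, String suffix) {
            this.sharedPrefix = sharedPrefix;
            this.suffix = suffix;
        }
    }

    /** Front-codes the words; every ratio-th word is stored whole (assumes ratio >= 1). */
    static List<Entry> encode(List<String> words, int ratio) {
        List<Entry> coded = new ArrayList<>();
        String prev = "";
        for (int i = 0; i < words.size(); i++) {
            String w = words.get(i);
            int shared = 0;
            if (i % ratio != 0) {
                // Count the leading characters this word has in common with the previous one.
                int max = Math.min(prev.length(), w.length());
                while (shared < max && prev.charAt(shared) == w.charAt(shared)) shared++;
            }
            coded.add(new Entry(shared, w.substring(shared)));
            prev = w;
        }
        return coded;
    }

    /** Rebuilds the word at index by replaying entries from the nearest fully stored word. */
    static String get(List<Entry> coded, int ratio, int index) {
        int start = index - index % ratio;
        StringBuilder sb = new StringBuilder();
        for (int i = start; i <= index; i++) {
            Entry e = coded.get(i);
            sb.setLength(e.sharedPrefix); // keep only the shared prefix of the previous word
            sb.append(e.suffix);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> words = List.of("car", "card", "care", "cat", "dog");
        List<Entry> coded = encode(words, 3);
        for (int i = 0; i < words.size(); i++) {
            Entry e = coded.get(i);
            System.out.println(e.sharedPrefix + " + \"" + e.suffix + "\" -> " + get(coded, 3, i));
        }
    }
}
```

In this sketch a sorted input lets most entries shrink to a short suffix, which is why sorting matters. A larger ratio stores fewer whole strings but makes get replay more entries.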
As a convenience, this class provides a main method that reads a sequence of newline-separated words from standard input and writes a corresponding serialized front-coded string list.
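The read-then-serialize pipeline described for main can be approximated with the standard library alone. In the sketch below an ArrayList<String> stands in for the front-coded list, and the class and method names are hypothetical. The actual command-line options of main are not documented on this page and are not reproduced here.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Reader;
import java.io.Serializable;
import java.io.StringReader;
import java.util.ArrayList;

// Hypothetical stand-in for the pipeline: words in, serialized list out.
class SerializeWordsSketch {
    /** Reads newline-separated words, one list element per line. */
    static ArrayList<String> readWords(Reader in) throws IOException {
        ArrayList<String> words = new ArrayList<>();
        BufferedReader reader = new BufferedReader(in);
        for (String line; (line = reader.readLine()) != null; ) words.add(line);
        return words;
    }

    /** Serializes an object with standard Java serialization. */
    static byte[] serialize(Serializable o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    /** Reads back an object written by serialize. */
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // A StringReader replaces standard input so the demo runs without any piping.
        ArrayList<String> words = readWords(new StringReader("alpha\nbeta\ngamma\n"));
        byte[] data = serialize(words);
        System.out.println(words.size() + " words, " + data.length + " serialized bytes");
        System.out.println("round trip ok: " + words.equals(deserialize(data)));
    }
}
```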
To store the list of strings, we use either a UTF-8 coded ByteArrayFrontCodedList or a CharArrayFrontCodedList, depending on the value of the utf8 parameter at creation time. In the first case, if the strings are mostly ASCII the resulting array will be much smaller, but access times will be considerably longer, as each string must be decoded from UTF-8 before being returned.
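The space side of this trade-off can be checked with the standard library. The sketch below compares the bytes a string occupies as UTF-8 with the bytes it occupies as a char array (two bytes per UTF-16 code unit), and shows the decoding step that a UTF-8 backed list has to perform on every read. The class and method names are hypothetical, and the figures ignore whatever per-string overhead the front-coded lists add.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper comparing the two storage choices for a single string.
class Utf8StorageSketch {
    /** Bytes needed to hold s as UTF-8. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Bytes needed to hold s as a char array: two bytes per UTF-16 code unit. */
    static int charBytes(String s) {
        return 2 * s.length();
    }

    /** The extra work on each access when the strings are kept as UTF-8 bytes. */
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        for (String s : new String[] { "front-coding", "citt\u00e0" }) {
            System.out.println(s + ": utf8=" + utf8Bytes(s) + " bytes, chars=" + charBytes(s) + " bytes");
        }
    }
}
```

For pure ASCII text the UTF-8 form takes half the space of the char form; the gap narrows as more characters need multi-byte encodings.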
Nested Class Summary |
---|
Nested classes/interfaces inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectList |
---|
AbstractObjectList.ObjectSubList<K> |
Field Summary | |
---|---|
protected ByteArrayFrontCodedList | byteFrontCodedList: The underlying ByteArrayFrontCodedList, or null. |
protected CharArrayFrontCodedList | charFrontCodedList: The underlying CharArrayFrontCodedList, or null. |
static long | serialVersionUID |
protected boolean | utf8: Whether this front-coded list is UTF-8 encoded. |
Constructor Summary | |
---|---|
FrontCodedStringList(Collection<? extends CharSequence> c, int ratio, boolean utf8) | Creates a new front-coded string list containing the character sequences contained in the given collection. |
FrontCodedStringList(Iterator<? extends CharSequence> words, int ratio, boolean utf8) | Creates a new front-coded string list containing the character sequences returned by the given iterator. |
Method Summary | |
---|---|
protected static char[] | byte2Char(byte[] a, char[] s) |
protected static int | countUTF8Chars(byte[] a) |
MutableString | get(int index): Returns the element at the specified position in this front-coded list as a mutable string. |
void | get(int index, MutableString s): Returns the element at the specified position in this front-coded list by storing it in a mutable string. |
ObjectListIterator<MutableString> | listIterator(int k) |
static void | main(String[] arg) |
int | ratio(): Returns the ratio of the underlying front-coded list. |
int | size() |
boolean | utf8(): Returns whether this front-coded string list is storing its strings as UTF-8 encoded bytes. |
Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectList |
---|
add, add, addAll, addAll, addElements, addElements, compareTo, contains, ensureIndex, ensureRestrictedIndex, equals, getElements, hashCode, indexOf, iterator, lastIndexOf, listIterator, objectListIterator, objectListIterator, objectSubList, peek, pop, push, remove, removeElements, set, size, subList, top, toString |
Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectCollection |
---|
containsAll, isEmpty, objectIterator, removeAll, retainAll, toArray, toArray |
Methods inherited from class java.util.AbstractCollection |
---|
clear, remove |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Methods inherited from interface java.util.List |
---|
clear, containsAll, isEmpty, remove, removeAll, retainAll, toArray, toArray |
Methods inherited from interface it.unimi.dsi.fastutil.objects.ObjectCollection |
---|
objectIterator, toArray |
Methods inherited from interface it.unimi.dsi.fastutil.Stack |
---|
isEmpty |
Field Detail |
---|
public static final long serialVersionUID
protected final ByteArrayFrontCodedList byteFrontCodedList
The underlying ByteArrayFrontCodedList, or null.
protected final CharArrayFrontCodedList charFrontCodedList
The underlying CharArrayFrontCodedList, or null.
protected final boolean utf8
Constructor Detail |
---|
public FrontCodedStringList(Iterator<? extends CharSequence> words, int ratio, boolean utf8)
Parameters:
words - an iterator returning character sequences.
ratio - the desired ratio.
utf8 - if true, the strings will be stored as UTF-8 byte arrays.
public FrontCodedStringList(Collection<? extends CharSequence> c, int ratio, boolean utf8)
Parameters:
c - a collection containing character sequences.
ratio - the desired ratio.
utf8 - if true, the strings will be stored as UTF-8 byte arrays.
Method Detail |
---|
public boolean utf8()
public int ratio()
public MutableString get(int index)
Specified by:
get in interface List<MutableString>
Parameters:
index - an index in the list.
Returns:
a MutableString that will contain the string at the specified position. The string may be freely modified.
public void get(int index, MutableString s)
Parameters:
index - an index in the list.
s - a mutable string that will contain the string at the specified position.
protected static int countUTF8Chars(byte[] a)
protected static char[] byte2Char(byte[] a, char[] s)
public ObjectListIterator<MutableString> listIterator(int k)
Specified by:
listIterator in interface ObjectList<MutableString>
listIterator in interface List<MutableString>
Overrides:
listIterator in class AbstractObjectList<MutableString>
public int size()
Specified by:
size in interface Collection<MutableString>
size in interface List<MutableString>
size in class AbstractCollection<MutableString>
public static void main(String[] arg) throws IOException, JSAPException, NoSuchMethodException
Throws:
IOException
JSAPException
NoSuchMethodException