com.ibm.icu.text
Class UnicodeDecompressor

java.lang.Object
  extended by com.ibm.icu.text.UnicodeDecompressor

public final class UnicodeDecompressor
extends Object

A decompression engine implementing the Standard Compression Scheme for Unicode (SCSU) as outlined in Unicode Technical Report #6.

USAGE

The static methods of UnicodeDecompressor decompress a complete byte array in a single call:

  byte [] compressed = ... ; // get compressed bytes from somewhere
  String result = UnicodeDecompressor.decompress(compressed);
 

The static methods have a fairly large memory footprint. For finer-grained control over memory usage, the instance method decompress supports iterative decompression through a caller-supplied buffer:

  // Decompress an array "bytes" of length "len" using a buffer of 512 chars
  // to the Writer "out"

  UnicodeDecompressor myDecompressor         = new UnicodeDecompressor();
  final int           BUFSIZE                = 512; // "static" is not legal on a local variable
  char []             charBuffer             = new char [ BUFSIZE ];
  int                 charsWritten           = 0;
  int []              bytesRead              = new int [1];
  int                 totalBytesDecompressed = 0;
  int                 totalCharsWritten      = 0;

  do {
    // do the decompression
    charsWritten = myDecompressor.decompress(bytes, totalBytesDecompressed, 
                                             len, bytesRead,
                                             charBuffer, 0, BUFSIZE);

    // do something with the current set of chars
    out.write(charBuffer, 0, charsWritten);

    // update the no. of bytes decompressed
    totalBytesDecompressed += bytesRead[0];

    // update the no. of chars written
    totalCharsWritten += charsWritten;

  } while(totalBytesDecompressed < len);

  myDecompressor.reset(); // reuse decompressor
 

Decompression is performed according to the standard set forth in Unicode Technical Report #6.

Author:
Stephen F. Booth
See Also:
UnicodeCompressor
Status:
Stable ICU 2.4.

Field Summary
static int ARMENIANINDEX
           
static int COMPRESSIONOFFSET
           
static int GREEKINDEX
           
static int HALFWIDTHKATAKANAINDEX
           
static int HIRAGANAINDEX
           
static int INVALIDCHAR
           
static int INVALIDWINDOW
           
static int IPAEXTENSIONINDEX
           
static int KATAKANAINDEX
           
static int LATININDEX
           
static int MAXINDEX
           
static int NUMSTATICWINDOWS
           
static int NUMWINDOWS
           
static int RESERVEDINDEX
           
static int SCHANGE0
           
static int SCHANGE1
           
static int SCHANGE2
           
static int SCHANGE3
           
static int SCHANGE4
           
static int SCHANGE5
           
static int SCHANGE6
           
static int SCHANGE7
           
static int SCHANGEU
           
static int SDEFINE0
           
static int SDEFINE1
           
static int SDEFINE2
           
static int SDEFINE3
           
static int SDEFINE4
           
static int SDEFINE5
           
static int SDEFINE6
           
static int SDEFINE7
           
static int SDEFINEX
           
static int SINGLEBYTEMODE
           
static int[] sOffsets
          Static compression window offsets
static int[] sOffsetTable
          For window offset mapping
static int SQUOTE0
           
static int SQUOTE1
           
static int SQUOTE2
           
static int SQUOTE3
           
static int SQUOTE4
           
static int SQUOTE5
           
static int SQUOTE6
           
static int SQUOTE7
           
static int SQUOTEU
           
static int SRESERVED
           
static int UCHANGE0
           
static int UCHANGE1
           
static int UCHANGE2
           
static int UCHANGE3
           
static int UCHANGE4
           
static int UCHANGE5
           
static int UCHANGE6
           
static int UCHANGE7
           
static int UDEFINE0
           
static int UDEFINE1
           
static int UDEFINE2
           
static int UDEFINE3
           
static int UDEFINE4
           
static int UDEFINE5
           
static int UDEFINE6
           
static int UDEFINE7
           
static int UDEFINEX
           
static int UNICODEMODE
           
static int UQUOTEU
           
static int URESERVED
           
 
Constructor Summary
UnicodeDecompressor()
          Create a UnicodeDecompressor.
 
Method Summary
static String decompress(byte[] buffer)
          Decompress a byte array into a String.
static char[] decompress(byte[] buffer, int start, int limit)
          Decompress a byte array into a Unicode character array.
 int decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit)
          Decompress a byte array into a Unicode character array.
 void reset()
          Reset the decompressor to its initial state.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

COMPRESSIONOFFSET

public static final int COMPRESSIONOFFSET
See Also:
Constant Field Values

NUMWINDOWS

public static final int NUMWINDOWS
See Also:
Constant Field Values

NUMSTATICWINDOWS

public static final int NUMSTATICWINDOWS
See Also:
Constant Field Values

INVALIDWINDOW

public static final int INVALIDWINDOW
See Also:
Constant Field Values

INVALIDCHAR

public static final int INVALIDCHAR
See Also:
Constant Field Values

SINGLEBYTEMODE

public static final int SINGLEBYTEMODE
See Also:
Constant Field Values

UNICODEMODE

public static final int UNICODEMODE
See Also:
Constant Field Values

MAXINDEX

public static final int MAXINDEX
See Also:
Constant Field Values

RESERVEDINDEX

public static final int RESERVEDINDEX
See Also:
Constant Field Values

LATININDEX

public static final int LATININDEX
See Also:
Constant Field Values

IPAEXTENSIONINDEX

public static final int IPAEXTENSIONINDEX
See Also:
Constant Field Values

GREEKINDEX

public static final int GREEKINDEX
See Also:
Constant Field Values

ARMENIANINDEX

public static final int ARMENIANINDEX
See Also:
Constant Field Values

HIRAGANAINDEX

public static final int HIRAGANAINDEX
See Also:
Constant Field Values

KATAKANAINDEX

public static final int KATAKANAINDEX
See Also:
Constant Field Values

HALFWIDTHKATAKANAINDEX

public static final int HALFWIDTHKATAKANAINDEX
See Also:
Constant Field Values

SDEFINEX

public static final int SDEFINEX
See Also:
Constant Field Values

SRESERVED

public static final int SRESERVED
See Also:
Constant Field Values

SQUOTEU

public static final int SQUOTEU
See Also:
Constant Field Values

SCHANGEU

public static final int SCHANGEU
See Also:
Constant Field Values

SQUOTE0

public static final int SQUOTE0
See Also:
Constant Field Values

SQUOTE1

public static final int SQUOTE1
See Also:
Constant Field Values

SQUOTE2

public static final int SQUOTE2
See Also:
Constant Field Values

SQUOTE3

public static final int SQUOTE3
See Also:
Constant Field Values

SQUOTE4

public static final int SQUOTE4
See Also:
Constant Field Values

SQUOTE5

public static final int SQUOTE5
See Also:
Constant Field Values

SQUOTE6

public static final int SQUOTE6
See Also:
Constant Field Values

SQUOTE7

public static final int SQUOTE7
See Also:
Constant Field Values

SCHANGE0

public static final int SCHANGE0
See Also:
Constant Field Values

SCHANGE1

public static final int SCHANGE1
See Also:
Constant Field Values

SCHANGE2

public static final int SCHANGE2
See Also:
Constant Field Values

SCHANGE3

public static final int SCHANGE3
See Also:
Constant Field Values

SCHANGE4

public static final int SCHANGE4
See Also:
Constant Field Values

SCHANGE5

public static final int SCHANGE5
See Also:
Constant Field Values

SCHANGE6

public static final int SCHANGE6
See Also:
Constant Field Values

SCHANGE7

public static final int SCHANGE7
See Also:
Constant Field Values

SDEFINE0

public static final int SDEFINE0
See Also:
Constant Field Values

SDEFINE1

public static final int SDEFINE1
See Also:
Constant Field Values

SDEFINE2

public static final int SDEFINE2
See Also:
Constant Field Values

SDEFINE3

public static final int SDEFINE3
See Also:
Constant Field Values

SDEFINE4

public static final int SDEFINE4
See Also:
Constant Field Values

SDEFINE5

public static final int SDEFINE5
See Also:
Constant Field Values

SDEFINE6

public static final int SDEFINE6
See Also:
Constant Field Values

SDEFINE7

public static final int SDEFINE7
See Also:
Constant Field Values

UCHANGE0

public static final int UCHANGE0
See Also:
Constant Field Values

UCHANGE1

public static final int UCHANGE1
See Also:
Constant Field Values

UCHANGE2

public static final int UCHANGE2
See Also:
Constant Field Values

UCHANGE3

public static final int UCHANGE3
See Also:
Constant Field Values

UCHANGE4

public static final int UCHANGE4
See Also:
Constant Field Values

UCHANGE5

public static final int UCHANGE5
See Also:
Constant Field Values

UCHANGE6

public static final int UCHANGE6
See Also:
Constant Field Values

UCHANGE7

public static final int UCHANGE7
See Also:
Constant Field Values

UDEFINE0

public static final int UDEFINE0
See Also:
Constant Field Values

UDEFINE1

public static final int UDEFINE1
See Also:
Constant Field Values

UDEFINE2

public static final int UDEFINE2
See Also:
Constant Field Values

UDEFINE3

public static final int UDEFINE3
See Also:
Constant Field Values

UDEFINE4

public static final int UDEFINE4
See Also:
Constant Field Values

UDEFINE5

public static final int UDEFINE5
See Also:
Constant Field Values

UDEFINE6

public static final int UDEFINE6
See Also:
Constant Field Values

UDEFINE7

public static final int UDEFINE7
See Also:
Constant Field Values

UQUOTEU

public static final int UQUOTEU
See Also:
Constant Field Values

UDEFINEX

public static final int UDEFINEX
See Also:
Constant Field Values

URESERVED

public static final int URESERVED
See Also:
Constant Field Values

sOffsetTable

public static final int[] sOffsetTable
For window offset mapping
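
As a rough illustration of what a window offset mapping involves, the sketch below follows the offset table given in Unicode Technical Report #6, where a one-byte index selects the start of a dynamic window. The names OffsetSketch and offsetFor are hypothetical, and the values are taken from the report rather than read from sOffsetTable, so this is an assumption about the scheme and not a dump of the field.

```java
final class OffsetSketch {
    /**
     * Returns the window start selected by an offset-table index,
     * or -1 when the index is reserved. Values follow UTR #6 and are
     * assumed (not verified) to match what sOffsetTable encodes.
     */
    public static int offsetFor(int index) {
        if (index >= 0x01 && index <= 0x67) {
            return index * 0x80;          // half-blocks from U+0080 to U+3380
        }
        if (index >= 0x68 && index <= 0xA7) {
            return index * 0x80 + 0xAC00; // half-blocks from U+E000 to U+FF80
        }
        switch (index) {
            case 0xF9: return 0x00C0; // Latin-1 letters and Latin Extended-A
            case 0xFA: return 0x0250; // IPA Extensions
            case 0xFB: return 0x0370; // Greek
            case 0xFC: return 0x0530; // Armenian
            case 0xFD: return 0x3040; // Hiragana
            case 0xFE: return 0x30A0; // Katakana
            case 0xFF: return 0xFF60; // Halfwidth Katakana
            default:   return -1;     // 0x00 and 0xA8..0xF8 are reserved
        }
    }
}
```

Under these assumptions, offsetFor(0x08) gives 0x0400, the start of the Cyrillic block. The fixed entries 0xF9 through 0xFF line up with the LATININDEX through HALFWIDTHKATAKANAINDEX field names listed above.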


sOffsets

public static final int[] sOffsets
Static compression window offsets
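
To make the role of the static windows concrete, here is a minimal sketch of how a quoted byte below 0x80 resolves against a static window in single-byte mode, as described in Unicode Technical Report #6. StaticWindowSketch, STATIC_OFFSETS and quoteStatic are hypothetical names, and the offsets are the ones listed in the report; they are assumed, not verified, to equal the contents of sOffsets.

```java
final class StaticWindowSketch {
    // Static window starts per UTR #6 (assumption: same values as sOffsets).
    static final int[] STATIC_OFFSETS = {
        0x0000, // window 0: for quoting tag bytes
        0x0080, // window 1: Latin-1 Supplement
        0x0100, // window 2: Latin Extended-A
        0x0300, // window 3: Combining Diacritical Marks
        0x2000, // window 4: General Punctuation
        0x2080, // window 5: Currency Symbols
        0x2100, // window 6: Letterlike Symbols and Number Forms
        0x3000  // window 7: CJK Symbols and Punctuation
    };

    /** Resolves a quoted byte in 0x00..0x7F against the given static window. */
    public static char quoteStatic(int window, int b) {
        if (window < 0 || window >= STATIC_OFFSETS.length) {
            throw new IllegalArgumentException("window must be 0..7");
        }
        if (b < 0x00 || b > 0x7F) {
            throw new IllegalArgumentException("static windows cover bytes 0x00..0x7F only");
        }
        return (char) (STATIC_OFFSETS[window] + b);
    }
}
```

With these values, quoteStatic(4, 0x14) yields U+2014 and quoteStatic(7, 0x01) yields U+3001. Quoted bytes of 0x80 and above resolve against the dynamic window of the same number instead, which this sketch leaves out.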

Constructor Detail

UnicodeDecompressor

public UnicodeDecompressor()
Create a UnicodeDecompressor. Sets all windows to their default values.

See Also:
reset()
Status:
Stable ICU 2.4.
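
For orientation, the sketch below shows what "default values" plausibly means for the dynamic windows, using the initial window positions given in Unicode Technical Report #6. DefaultWindowSketch, DEFAULT_DYNAMIC_OFFSETS and viaDynamicWindow are hypothetical names; the values come from the report and are assumed, not confirmed, to be what the constructor and reset() install.

```java
final class DefaultWindowSketch {
    // Initial dynamic window starts per UTR #6 (assumption, not ICU internals).
    static final int[] DEFAULT_DYNAMIC_OFFSETS = {
        0x0080, // window 0: Latin-1 Supplement (the initially active window)
        0x00C0, // window 1: Latin-1 letters and Latin Extended-A
        0x0400, // window 2: Cyrillic
        0x0600, // window 3: Arabic
        0x0900, // window 4: Devanagari
        0x3040, // window 5: Hiragana
        0x30A0, // window 6: Katakana
        0xFF00  // window 7: Fullwidth ASCII
    };

    /** Maps a single-byte-mode byte in 0x80..0xFF through a dynamic window. */
    public static char viaDynamicWindow(int window, int b) {
        if (window < 0 || window >= DEFAULT_DYNAMIC_OFFSETS.length) {
            throw new IllegalArgumentException("window must be 0..7");
        }
        if (b < 0x80 || b > 0xFF) {
            throw new IllegalArgumentException("dynamic windows cover bytes 0x80..0xFF only");
        }
        return (char) (DEFAULT_DYNAMIC_OFFSETS[window] + (b - 0x80));
    }
}
```

Because window 0 starts at 0x0080, a byte such as 0xE9 maps to U+00E9 in the initial state, which is why plain Latin-1 text without control characters decodes unchanged.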
Method Detail

decompress

public static String decompress(byte[] buffer)
Decompress a byte array into a String.

Parameters:
buffer - The byte array to decompress.
Returns:
A String containing the decompressed characters.
See Also:
decompress(byte [], int, int)
Status:
Stable ICU 2.4.

decompress

public static char[] decompress(byte[] buffer,
                                int start,
                                int limit)
Decompress a byte array into a Unicode character array.

Parameters:
buffer - The byte array to decompress.
start - The index of the first byte to decompress.
limit - The index just past the last byte to decompress.
Returns:
A character array containing the decompressed characters.
See Also:
decompress(byte [])
Status:
Stable ICU 2.4.

decompress

public int decompress(byte[] byteBuffer,
                      int byteBufferStart,
                      int byteBufferLimit,
                      int[] bytesRead,
                      char[] charBuffer,
                      int charBufferStart,
                      int charBufferLimit)
Decompress a byte array into a Unicode character array. This function will either completely fill the output buffer, or consume the entire input.

Parameters:
byteBuffer - The byte buffer to decompress.
byteBufferStart - The index of the first byte to decompress.
byteBufferLimit - The index just past the last byte to decompress.
bytesRead - A one-element array. If not null, on return its first element holds the number of bytes read from byteBuffer.
charBuffer - A buffer to receive the decompressed data. This buffer must be at least two characters in size.
charBufferStart - The offset in charBuffer at which to start writing decompressed data.
charBufferLimit - The offset in charBuffer just past the last position that may be written.
Returns:
The number of Unicode characters written to charBuffer.
Status:
Stable ICU 2.4.
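
The contract above (fill the output buffer or consume the input, report the bytes consumed through bytesRead, return the characters written) can be exercised without ICU on the classpath. The sketch below uses a hypothetical stand-in, widen, that has the same parameter shape but only widens each byte to a char; it is not an SCSU decoder and is not part of this class. The drain method shows the calling loop that the real instance method would slot into.

```java
final class ChunkedLoopSketch {
    /**
     * Hypothetical stand-in with the same parameter shape as the instance
     * decompress method. It copies bytes to chars one-to-one and stops when
     * either the input is consumed or the output range is full.
     */
    public static int widen(byte[] in, int inStart, int inLimit, int[] bytesRead,
                            char[] out, int outStart, int outLimit) {
        int i = inStart;
        int o = outStart;
        while (i < inLimit && o < outLimit) {
            out[o++] = (char) (in[i++] & 0xFF);
        }
        if (bytesRead != null) {
            bytesRead[0] = i - inStart; // bytes consumed by this call only
        }
        return o - outStart;            // chars written by this call only
    }

    /** Drains the whole input through a fixed-size buffer and collects the output. */
    public static String drain(byte[] in, int bufSize) {
        if (bufSize < 2) {
            throw new IllegalArgumentException("buffer must hold at least two chars");
        }
        char[] buf = new char[bufSize];
        int[] bytesRead = new int[1];
        StringBuilder sink = new StringBuilder();
        int consumed = 0;
        while (consumed < in.length) {
            // Resume from the running total; the limit stays at the end of input.
            int written = widen(in, consumed, in.length, bytesRead, buf, 0, buf.length);
            sink.append(buf, 0, written);
            consumed += bytesRead[0];
        }
        return sink.toString();
    }
}
```

The point of the design is that the caller, not the callee, tracks the running byte position: each call reports only its own progress, so the loop adds bytesRead[0] to its total and passes that total back as the next start index.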

reset

public void reset()
Reset the decompressor to its initial state.

Status:
Stable ICU 2.4.


Copyright (c) 2011 IBM Corporation and others.