Package com.globalmentor.io
Class UTF8
java.lang.Object
com.globalmentor.io.UTF8
Constants and methods for working with the UTF-8 encoding.
- Author:
- Garret Wilson
Field Summary
Modifier and Type    Field    Description
static final int    MAX_ENCODED_BYTE_COUNT_LENGTH    The maximum number of octets used to encode a character in UTF-8.
static final int    MAX_ENCODED_BYTE_COUNT1    The largest code point value that can be encoded in one byte.
static final int    MAX_ENCODED_BYTE_COUNT2    The largest code point value that can be encoded in two bytes.
static final int    MAX_ENCODED_BYTE_COUNT3    The largest code point value that can be encoded in three bytes.
Constructor Summary
Constructor    Description
UTF8()
Method Summary
Modifier and Type    Method    Description
static int    getEncodedByteCountForCodePoint(int c)    Determines how many bytes are needed to encode a single character in UTF-8.
static int    getEncodedByteCountFromInitialByte(byte initialByte)    Determines how many bytes are used to encode a sequence, based on its first encoded byte.
static int    getEncodedByteCountFromInitialOctet(int initialOctet)    Determines how many bytes are used to encode a sequence, based on its first encoded octet.
-
Field Details
-
MAX_ENCODED_BYTE_COUNT1
public static final int MAX_ENCODED_BYTE_COUNT1
The largest code point value that can be encoded in one byte.
-
MAX_ENCODED_BYTE_COUNT2
public static final int MAX_ENCODED_BYTE_COUNT2
The largest code point value that can be encoded in two bytes.
-
MAX_ENCODED_BYTE_COUNT3
public static final int MAX_ENCODED_BYTE_COUNT3
The largest code point value that can be encoded in three bytes.
-
MAX_ENCODED_BYTE_COUNT_LENGTH
public static final int MAX_ENCODED_BYTE_COUNT_LENGTH
The maximum number of octets used to encode a character in UTF-8.
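The boundaries these constants describe can be checked against the JDK's own UTF-8 encoder. The following is a minimal sketch, not the library's code: the class name Utf8BoundarySketch and the constant values are assumptions taken from the UTF-8 definition (RFC 3629), since this page does not list the actual constant values.

```java
import java.nio.charset.StandardCharsets;

final class Utf8BoundarySketch {
    // Assumed boundary values from the UTF-8 definition, not read from com.globalmentor.io.UTF8.
    static final int ONE_BYTE_MAX = 0x7F;     // largest code point that fits in one byte
    static final int TWO_BYTE_MAX = 0x7FF;    // largest code point that fits in two bytes
    static final int THREE_BYTE_MAX = 0xFFFF; // largest code point that fits in three bytes
    static final int MAX_BYTES = 4;           // longest UTF-8 sequence for one code point

    /** Encodes a single (non-surrogate) code point with the JDK and returns the byte count. */
    static int jdkEncodedLength(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Each boundary value and its successor fall on either side of a length change.
        int[] samples = {ONE_BYTE_MAX, ONE_BYTE_MAX + 1, TWO_BYTE_MAX, TWO_BYTE_MAX + 1,
                THREE_BYTE_MAX, THREE_BYTE_MAX + 1, Character.MAX_CODE_POINT};
        for (int cp : samples) {
            System.out.printf("U+%04X -> %d byte(s)%n", cp, jdkEncodedLength(cp));
        }
    }
}
```

Surrogate code points (U+D800 to U+DFFF) are left out of the samples because they cannot be encoded on their own.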
-
-
Constructor Details
-
UTF8
public UTF8()
-
-
Method Details
-
getEncodedByteCountForCodePoint
public static int getEncodedByteCountForCodePoint(int c)
Determines how many bytes are needed to encode a single character in UTF-8.
- Parameters:
c - The character to encode.
- Returns:
- The minimum number of bytes needed to encode a single character.
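A byte count of this kind can be derived by comparing the code point against the range boundaries of UTF-8. The sketch below illustrates that technique under stated assumptions; it is not the source of getEncodedByteCountForCodePoint. The class and method names are hypothetical, and rejecting out-of-range values with an IllegalArgumentException is a choice made here, not documented behavior.

```java
final class CodePointByteCountSketch {
    /**
     * Returns the number of UTF-8 bytes needed for one code point,
     * using the range boundaries from the UTF-8 definition.
     */
    static int encodedByteCount(int codePoint) {
        // Assumption: values outside the Unicode range are rejected.
        if (codePoint < 0 || codePoint > Character.MAX_CODE_POINT) {
            throw new IllegalArgumentException("Not a Unicode code point: " + codePoint);
        }
        if (codePoint <= 0x7F) {
            return 1; // U+0000..U+007F
        }
        if (codePoint <= 0x7FF) {
            return 2; // U+0080..U+07FF
        }
        if (codePoint <= 0xFFFF) {
            return 3; // U+0800..U+FFFF
        }
        return 4; // U+10000..U+10FFFF
    }

    public static void main(String[] args) {
        System.out.println(encodedByteCount('A'));     // 1
        System.out.println(encodedByteCount(0x00E9));  // 2 (e with acute accent)
        System.out.println(encodedByteCount(0x20AC));  // 3 (euro sign)
        System.out.println(encodedByteCount(0x1F600)); // 4 (an emoji code point)
    }
}
```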
-
getEncodedByteCountFromInitialByte
public static int getEncodedByteCountFromInitialByte(byte initialByte)
Determines how many bytes are used to encode a sequence, based on its first encoded byte.
- Parameters:
initialByte - The value of the first byte in a UTF-8 sequence (which in Java may be a negative number, as bytes are signed).
- Returns:
- The number of octets to expect in the sequence beginning with the given byte.
- Throws:
IllegalArgumentException - if the given value is not a valid initial octet of UTF-8.
-
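The note on getEncodedByteCountFromInitialByte about negative values follows from Java's signed byte type: every lead byte of a multi-byte UTF-8 sequence has its high bit set, so it reads as a negative number. A minimal sketch of the usual conversion to an unsigned octet is shown here; the class and method names are hypothetical, and whether the library converts this way internally is an assumption.

```java
import java.nio.charset.StandardCharsets;

final class SignedByteSketch {
    /** Converts a signed Java byte to its unsigned octet value (0..255). */
    static int toOctet(byte b) {
        return b & 0xFF; // equivalent to Byte.toUnsignedInt(b)
    }

    public static void main(String[] args) {
        // The euro sign U+20AC is encoded in UTF-8 as the bytes E2 82 AC.
        byte first = "\u20AC".getBytes(StandardCharsets.UTF_8)[0];
        System.out.println(first);          // prints -30, the signed view of 0xE2
        System.out.println(toOctet(first)); // prints 226, the unsigned octet 0xE2
    }
}
```

Masking with 0xFF before any bit-pattern test avoids comparing a sign-extended negative int against positive hexadecimal constants.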
getEncodedByteCountFromInitialOctet
public static int getEncodedByteCountFromInitialOctet(int initialOctet)
Determines how many bytes are used to encode a sequence, based on its first encoded octet.
- Parameters:
initialOctet - The value of the first octet in a UTF-8 sequence.
- Returns:
- The number of octets to expect in the sequence beginning with the given octet.
- Throws:
IllegalArgumentException - if the given value is not a valid initial octet of UTF-8.
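In UTF-8 the high bits of the first octet announce the length of the sequence, which is what makes a lookup such as getEncodedByteCountFromInitialOctet possible. The following sketch classifies an octet by those bit patterns. It is an illustration with hypothetical names, not the library's implementation; in particular it does not reject the octets 0xC0, 0xC1, and 0xF5 to 0xF7, which match a lead-byte pattern but never occur in valid UTF-8, and whether the library rejects them is not stated on this page.

```java
final class InitialOctetSketch {
    /** Returns the sequence length announced by the first octet of a UTF-8 sequence. */
    static int sequenceLength(int initialOctet) {
        if (initialOctet < 0 || initialOctet > 0xFF) {
            throw new IllegalArgumentException("Not an octet: " + initialOctet);
        }
        if ((initialOctet & 0x80) == 0x00) {
            return 1; // 0xxxxxxx: single-byte (ASCII) sequence
        }
        if ((initialOctet & 0xE0) == 0xC0) {
            return 2; // 110xxxxx: lead byte of a two-byte sequence
        }
        if ((initialOctet & 0xF0) == 0xE0) {
            return 3; // 1110xxxx: lead byte of a three-byte sequence
        }
        if ((initialOctet & 0xF8) == 0xF0) {
            return 4; // 11110xxx: lead byte of a four-byte sequence
        }
        // 10xxxxxx is a continuation octet; 11111xxx is never used in UTF-8.
        throw new IllegalArgumentException(
                "Not a valid initial octet: 0x" + Integer.toHexString(initialOctet));
    }

    public static void main(String[] args) {
        System.out.println(sequenceLength(0x41)); // 1
        System.out.println(sequenceLength(0xC3)); // 2
        System.out.println(sequenceLength(0xE2)); // 3
        System.out.println(sequenceLength(0xF0)); // 4
    }
}
```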
-