Class Utf8Util

java.lang.Object
dev.blaauwendraad.masker.json.util.Utf8Util

public final class Utf8Util extends Object
UTF-8 encoding utilities class
  • Method Summary

    Modifier and Type
    Method
    Description
    static int
    countNonVisibleCharacters(byte[] message, int fromIndex, int length)
    Counts the number of non-visible characters inside the string.
    static int
    getCodePointByteLength(byte input)
    UTF-8: variable width 1-4 byte code points: 1 byte: 0xxxxxxx; 2 bytes: 110xxxxx 10xxxxxx; 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx; 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    static char
    unicodeHexToChar(byte b1, byte b2, byte b3, byte b4)
    Converts the four hex digits of a unicode escape ('\uXXXX'), given as four bytes, into a char.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • getCodePointByteLength

      public static int getCodePointByteLength(byte input)
      UTF-8 is a variable-width encoding that uses 1 to 4 bytes per code point:
      • 1 byte: 0xxxxxxx
      • 2 bytes: 110xxxxx 10xxxxxx
      • 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      • 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      Parameters:
      input - first (or only) code point byte
      Returns:
      code point length in bytes
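
The bit patterns above are enough to derive the length from the leading byte alone. The following is a minimal sketch of that lookup, not the library's source: the class name `Utf8LengthSketch`, the method name `codePointByteLength`, and the choice to throw on a byte that is not a valid leading byte are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not the Utf8Util implementation.
final class Utf8LengthSketch {

    /** Returns the UTF-8 sequence length implied by the leading byte. */
    static int codePointByteLength(byte lead) {
        int b = lead & 0xFF;               // treat the signed byte as unsigned
        if ((b & 0x80) == 0x00) return 1;  // 0xxxxxxx
        if ((b & 0xE0) == 0xC0) return 2;  // 110xxxxx
        if ((b & 0xF0) == 0xE0) return 3;  // 1110xxxx
        if ((b & 0xF8) == 0xF0) return 4;  // 11110xxx
        // Assumption: reject continuation bytes (10xxxxxx) and other invalid values.
        throw new IllegalArgumentException(
                "not a UTF-8 leading byte: 0x" + Integer.toHexString(b));
    }

    public static void main(String[] args) {
        // The euro sign (U+20AC) is encoded as three bytes: E2 82 AC.
        byte[] euro = "\u20AC".getBytes(StandardCharsets.UTF_8);
        System.out.println(codePointByteLength(euro[0])); // prints 3
    }
}
```

Masking with `0xFF` first matters because Java bytes are signed, so every leading byte of a multi-byte sequence is negative when read directly.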
    • unicodeHexToChar

      public static char unicodeHexToChar(byte b1, byte b2, byte b3, byte b4)
      Converts the four hex digits of a unicode escape ('\uXXXX'), given as four bytes, into a char. Each byte MUST represent a valid hexadecimal character, i.e. be
      • in range from 48 ('0') to 57 ('9')
      • in range from 65 ('A') to 70 ('F')
      • in range from 97 ('a') to 102 ('f')
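
The three byte ranges listed above map directly onto a digit-by-digit conversion. The sketch below shows one way to do it, under the assumption that `b1` is the most significant digit; the names `UnicodeHexSketch`, `hexValue`, and `hexToChar` are hypothetical, and the exception on invalid input is an illustrative choice rather than documented behavior.

```java
// Hypothetical illustration; not the Utf8Util implementation.
final class UnicodeHexSketch {

    /** Maps one ASCII hex digit to its numeric value (0 to 15). */
    static int hexValue(byte b) {
        if (b >= '0' && b <= '9') return b - '0';       // bytes 48..57
        if (b >= 'A' && b <= 'F') return b - 'A' + 10;  // bytes 65..70
        if (b >= 'a' && b <= 'f') return b - 'a' + 10;  // bytes 97..102
        throw new IllegalArgumentException("not a hex digit: " + b);
    }

    /** Combines four hex digits, most significant first, into a 16-bit char. */
    static char hexToChar(byte b1, byte b2, byte b3, byte b4) {
        return (char) ((hexValue(b1) << 12)
                | (hexValue(b2) << 8)
                | (hexValue(b3) << 4)
                | hexValue(b4));
    }

    public static void main(String[] args) {
        // The digits 0, 0, 4, 1 form 0x0041, which is the letter A.
        System.out.println(hexToChar((byte) '0', (byte) '0', (byte) '4', (byte) '1')); // prints A
    }
}
```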
    • countNonVisibleCharacters

      public static int countNonVisibleCharacters(byte[] message, int fromIndex, int length)
      Counts the number of non-visible characters inside the string. The range given by fromIndex and length must lie within a single string value, because this method neither performs bounds checks nor stops at the end of the string value.
      Parameters:
      message - the byte array containing the string
      fromIndex - the starting index of the string value (after the quote)
      length - the length of the string value (excluding the quotes)
      Returns:
      the number of non-visible characters in the string; escape characters, unicode escapes ('\uXXXX'), and other characters represented by more than a single byte each count as one character
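
One plausible reading of this description is that each escape sequence or multi-byte character contributes a single visible character, and the remaining bytes of that sequence are the "non-visible" ones. The sketch below counts bytes under that assumption only; the page does not spell out the exact rule. The names `NonVisibleCountSketch` and `countExtraBytes` are hypothetical, and surrogate pairs written as two consecutive unicode escapes are deliberately not handled.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of one interpretation; not the Utf8Util implementation.
final class NonVisibleCountSketch {

    /** Counts the bytes beyond the first in every escape or multi-byte sequence. */
    static int countExtraBytes(byte[] message, int fromIndex, int length) {
        int extra = 0;
        int end = fromIndex + length;
        int i = fromIndex;
        while (i < end) {
            int b = message[i] & 0xFF;
            int width;
            if (b == '\\') {
                // A unicode escape (backslash, 'u', four hex digits) spans 6 bytes;
                // any other escape is assumed to span 2 bytes.
                width = (message[i + 1] == 'u') ? 6 : 2;
            } else if (b < 0x80) {
                width = 1;                 // plain ASCII
            } else if ((b & 0xE0) == 0xC0) {
                width = 2;                 // 110xxxxx
            } else if ((b & 0xF0) == 0xE0) {
                width = 3;                 // 1110xxxx
            } else {
                width = 4;                 // assumed 11110xxx; input is not validated
            }
            extra += width - 1;            // one visible character per sequence
            i += width;
        }
        return extra;
    }

    public static void main(String[] args) {
        // JSON text "a\nb" including its quotes: 6 bytes, string value is bytes 1..4.
        byte[] json = "\"a\\nb\"".getBytes(StandardCharsets.UTF_8);
        System.out.println(countExtraBytes(json, 1, json.length - 2)); // prints 1
    }
}
```

Like the documented method, this sketch trusts the caller: it performs no bounds checks, so `fromIndex` and `length` have to describe a range that sits entirely inside one string value.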