Class CharsetUtil

java.lang.Object
com.alipay.sofa.common.utils.CharsetUtil

public class CharsetUtil extends Object
Utility for checking charsets; the implementation is forked from Guava.
Version:
CharsetUtil.java, v 0.1 2023-04-14 2:06 PM huzijie Exp $
Author:
huzijie
See Also:
  • Utf8.isWellFormed(byte[])
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static org.slf4j.Logger
    CHARSET_MONITOR_LOG
    static final int
    MODE_ASSERT
    static final int
    MODE_MONITOR
  • Constructor Summary

    Constructors
    Constructor
    Description
    CharsetUtil()
  • Method Summary

    Modifier and Type
    Method
    Description
    static void
    assertUTF8WellFormed(byte[] bytes)
    Asserts that the given byte array is well formed in UTF-8 encoding format.
    static void
    assertUTF8WellFormed(byte[] bytes, int off, int len)
    Asserts that a portion of the given byte array is well formed in UTF-8 encoding format.
    static void
    checkUTF8WellFormed(byte[] bytes, int mode)
    Checks whether the given byte array is well formed in UTF-8 encoding format.
    static void
    checkUTF8WellFormed(byte[] bytes, int off, int len, int mode)
    Checks whether a portion of the given byte array is well formed in UTF-8 encoding format.
    static boolean
    isUTF8WellFormed(byte[] bytes)
    Determines whether the given byte array is well formed in UTF-8 encoding format according to Unicode 6.0.
    static boolean
    isUTF8WellFormed(byte[] bytes, int off, int len)
    Determines whether a portion of the given byte array is well formed in UTF-8 encoding format according to Unicode 6.0.
    static void
    monitorUTF8WellFormed(byte[] bytes)
    Monitors whether the given byte array is well formed in UTF-8 encoding format.
    static void
    monitorUTF8WellFormed(byte[] bytes, int off, int len)
    Monitors whether a portion of the given byte array is well formed in UTF-8 encoding format.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • CHARSET_MONITOR_LOG

      public static org.slf4j.Logger CHARSET_MONITOR_LOG
    • MODE_ASSERT

      public static final int MODE_ASSERT
    • MODE_MONITOR

      public static final int MODE_MONITOR
  • Constructor Details

    • CharsetUtil

      public CharsetUtil()
  • Method Details

    • assertUTF8WellFormed

      public static void assertUTF8WellFormed(byte[] bytes)
      Asserts that the given byte array is well formed in UTF-8 encoding format.
      Parameters:
      bytes - The byte array to be checked.
    • assertUTF8WellFormed

      public static void assertUTF8WellFormed(byte[] bytes, int off, int len)
      Asserts that a portion of the given byte array is well formed in UTF-8 encoding format.
      Parameters:
      bytes - The byte array to be checked.
      off - The starting position in the array to be checked.
      len - The length of the portion to be checked.
    • monitorUTF8WellFormed

      public static void monitorUTF8WellFormed(byte[] bytes)
      Monitors whether the given byte array is well formed in UTF-8 encoding format.
      Parameters:
      bytes - The byte array to be checked.
    • monitorUTF8WellFormed

      public static void monitorUTF8WellFormed(byte[] bytes, int off, int len)
      Monitors whether a portion of the given byte array is well formed in UTF-8 encoding format.
      Parameters:
      bytes - The byte array to be checked.
      off - The starting position in the array to be checked.
      len - The length of the portion to be checked.
    • checkUTF8WellFormed

      public static void checkUTF8WellFormed(byte[] bytes, int mode)
      Checks whether the given byte array is well formed in UTF-8 encoding format.
      Parameters:
      bytes - The byte array to be checked.
      mode - The action to take when bytes is not well formed in UTF-8 encoding format.

      In mode 0, it throws IllegalArgumentException.

      In mode 1, it logs an error to the slf4j logger CHARSET_MONITOR_LOG.

    • checkUTF8WellFormed

      public static void checkUTF8WellFormed(byte[] bytes, int off, int len, int mode)
      Checks whether a portion of the given byte array is well formed in UTF-8 encoding format.
      Parameters:
      bytes - The byte array to be checked.
      off - The starting position in the array to be checked.
      len - The length of the portion to be checked.
      mode - The action to take when the checked portion of bytes is not well formed in UTF-8 encoding format.

      In mode 0, it throws IllegalArgumentException.

      In mode 1, it logs an error to the slf4j logger CHARSET_MONITOR_LOG.
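
The two modes described above can be pictured with a small, self-contained sketch. It is not the CharsetUtil source. The class name CheckModeSketch, the helper isWellFormed, and the logger name are hypothetical. The constant values 0 and 1 are an assumption read off the "mode 0" and "mode 1" wording, and java.util.logging stands in for the slf4j logger so that the example runs without extra dependencies. The page does not say what happens for other mode values, so the sketch simply does nothing in that case.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.logging.Logger;

final class CheckModeSketch {

    // Assumed values, inferred from the "mode 0" / "mode 1" wording of the docs.
    static final int MODE_ASSERT = 0;
    static final int MODE_MONITOR = 1;

    // Stand-in for the slf4j CHARSET_MONITOR_LOG logger (hypothetical name).
    static final Logger MONITOR_LOG = Logger.getLogger("charset.monitor.sketch");

    // Hypothetical helper: strict decode with the JDK decoder, which reports malformed input.
    static boolean isWellFormed(byte[] bytes, int off, int len) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes, off, len));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Dispatches on the mode only when the checked range is not well-formed UTF-8.
    static void check(byte[] bytes, int off, int len, int mode) {
        if (isWellFormed(bytes, off, len)) {
            return;
        }
        String message = "Input is not well-formed UTF-8 (off=" + off + ", len=" + len + ")";
        if (mode == MODE_ASSERT) {
            // Mode 0: fail fast.
            throw new IllegalArgumentException(message);
        } else if (mode == MODE_MONITOR) {
            // Mode 1: record the problem and let the caller continue.
            MONITOR_LOG.severe(message);
        }
        // Other mode values: undocumented, so this sketch ignores them.
    }

    public static void main(String[] args) {
        byte[] bad = {(byte) 0xC3, (byte) 0x28}; // lead byte followed by a non-continuation byte
        check(bad, 0, bad.length, MODE_MONITOR); // logs and returns
        try {
            check(bad, 0, bad.length, MODE_ASSERT);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under this reading, the assert variants suit inputs that must be rejected outright, while the monitor variants suit cases where malformed data should be observed in logs without interrupting the caller.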

    • isUTF8WellFormed

      public static boolean isUTF8WellFormed(byte[] bytes)
      Determines whether the given byte array is well formed in UTF-8 encoding format according to Unicode 6.0.
      Parameters:
      bytes - The byte array to be checked.
      Returns:
      true if the byte array is well-formed UTF-8, false otherwise
    • isUTF8WellFormed

      public static boolean isUTF8WellFormed(byte[] bytes, int off, int len)
      Determines whether a portion of the given byte array is well formed in UTF-8 encoding format according to Unicode 6.0.
      Parameters:
      bytes - The byte array to be checked.
      off - The starting position in the array to be checked.
      len - The length of the portion to be checked.
      Returns:
      true if the specified portion of the byte array is well-formed UTF-8, false otherwise
      See Also:
      • Utf8.isWellFormed(byte[], int, int)
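
To illustrate what a well-formedness check over a byte range means in practice, here is a minimal, self-contained sketch. It does not reproduce the Guava-derived scanner that this class is documented as forking. It swaps in the JDK's strict CharsetDecoder instead, and the class and method names (Utf8WellFormedSketch, isWellFormed) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8WellFormedSketch {

    // Hypothetical stand-in for isUTF8WellFormed(bytes, off, len).
    // Uses the JDK decoder in REPORT mode rather than a hand-written byte scanner.
    static boolean isWellFormed(byte[] bytes, int off, int len) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            // wrap() throws IndexOutOfBoundsException if off/len fall outside the array.
            decoder.decode(ByteBuffer.wrap(bytes, off, len));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Whole-array convenience overload.
    static boolean isWellFormed(byte[] bytes) {
        return isWellFormed(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        byte[] ok = "h\u00e9llo".getBytes(StandardCharsets.UTF_8); // 'h', 0xC3 0xA9, 'l', 'l', 'o'
        byte[] bad = {(byte) 0xC3, (byte) 0x28};                   // lead byte, then a non-continuation byte

        System.out.println(isWellFormed(ok));        // the full array
        System.out.println(isWellFormed(ok, 0, 2));  // range that cuts the two-byte sequence in half
        System.out.println(isWellFormed(bad));       // malformed sequence
        System.out.println(isWellFormed(bad, 1, 1)); // only the ASCII byte 0x28
    }
}
```

The off and len arguments matter because well-formedness is a property of the selected range, not of the array as a whole: a range that ends in the middle of a multi-byte sequence is malformed even when the full array is valid.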