Class Characters

java.lang.Object
com.globalmentor.java.Characters

public final class Characters extends Object
An immutable set of characters that supports various searching and other functions. This essentially provides an efficient yet immutable array with object-oriented functionality.

This class is similar to String, except that it discards duplicate characters. Furthermore, this class allows no Unicode surrogates; the characters contained are interpreted as complete Unicode code points. This also makes comparison more efficient. As this class is more similar to an ordered set than to a list, it doesn't implement CharSequence, in order to prevent signature conflicts, and it provides size() to count its contents instead of a "length" property.
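
The set semantics described above can be illustrated with a minimal, self-contained sketch. CharSetSketch is a hypothetical stand-in written for this example, not the actual Characters implementation; keeping the characters sorted so that lookup can use a binary search is an assumption made here.

```java
import java.util.Arrays;

/** Hypothetical stand-in: an immutable, duplicate-free set of non-surrogate characters. */
final class CharSetSketch {

    private final char[] chars; // sorted, no duplicates, never exposed to callers

    private CharSetSketch(char[] sortedUnique) {
        this.chars = sortedUnique;
    }

    /** Creates a set from the given characters, ignoring duplicates and rejecting surrogates. */
    static CharSetSketch of(char... characters) {
        char[] copy = characters.clone(); // defensive copy; throws NullPointerException if null
        Arrays.sort(copy);
        int count = 0;
        for (char c : copy) {
            if (Character.isSurrogate(c)) {
                throw new IllegalArgumentException("Surrogate characters are not allowed.");
            }
            if (count == 0 || copy[count - 1] != c) { // duplicates are adjacent after sorting
                copy[count++] = c;
            }
        }
        return new CharSetSketch(Arrays.copyOf(copy, count));
    }

    int size() {
        return chars.length;
    }

    boolean contains(char c) {
        return Arrays.binarySearch(chars, c) >= 0;
    }

    @Override
    public String toString() {
        return new String(chars);
    }
}
```

Because the backing array is copied on creation and never handed out, instances of this sketch cannot change after construction, which mirrors the immutability the class description promises.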

This class also provides static utilities and constants for interacting with characters in general.

In most cases, names of constants are derived from Unicode names.

Author:
Garret Wilson
  • Field Details

    • NO_CHARS

      public static final char[] NO_CHARS
      A shared instance of an empty array of characters.
    • EMPTY_ARRAY

      @Deprecated public static final char[] EMPTY_ARRAY
      Deprecated. To be removed in favor of NO_CHARS.
      A shared instance of an empty array of characters.
    • NONE

      public static final Characters NONE
      The shared instance of no characters.
    • NULL_CHAR

      public static final char NULL_CHAR
      The character with Unicode code point zero.
    • BACKSPACE_CHAR

      public static final char BACKSPACE_CHAR
      A backspace.
    • CHARACTER_TABULATION_CHAR

      public static final char CHARACTER_TABULATION_CHAR
      A horizontal tab (0009;<control>;Cc;0;S;;;;;N;CHARACTER TABULATION;;;;).
    • LINE_FEED_CHAR

      public static final char LINE_FEED_CHAR
      A line feed (LF).
    • LINE_TABULATION_CHAR

      public static final char LINE_TABULATION_CHAR
      A vertical tab (000B;<control>;Cc;0;S;;;;;N;LINE TABULATION;;;;).
    • FORM_FEED_CHAR

      public static final char FORM_FEED_CHAR
      A form feed (FF).
    • CARRIAGE_RETURN_CHAR

      public static final char CARRIAGE_RETURN_CHAR
      A carriage return.
    • INFORMATION_SEPARATOR_FOUR_CHAR

      public static final char INFORMATION_SEPARATOR_FOUR_CHAR
      The information separator four character.
    • INFORMATION_SEPARATOR_THREE_CHAR

      public static final char INFORMATION_SEPARATOR_THREE_CHAR
      The information separator three character.
    • INFORMATION_SEPARATOR_TWO_CHAR

      public static final char INFORMATION_SEPARATOR_TWO_CHAR
      The information separator two character.
    • INFORMATION_SEPARATOR_ONE_CHAR

      public static final char INFORMATION_SEPARATOR_ONE_CHAR
      The information separator one character.
    • UNIT_SEPARATOR_CHAR

      public static final char UNIT_SEPARATOR_CHAR
      A unit separator character.
    • SPACE_CHAR

      public static final char SPACE_CHAR
      A space character.
    • QUOTATION_MARK_CHAR

      public static final char QUOTATION_MARK_CHAR
      A quotation mark character.
    • PERCENT_SIGN_CHAR

      public static final char PERCENT_SIGN_CHAR
      The percent sign.
    • APOSTROPHE_CHAR

      public static final char APOSTROPHE_CHAR
      An apostrophe character.
    • PLUS_SIGN_CHAR

      public static final char PLUS_SIGN_CHAR
      A plus sign character.
    • COMMA_CHAR

      public static final char COMMA_CHAR
      A comma character.
    • HYPHEN_MINUS_CHAR

      public static final char HYPHEN_MINUS_CHAR
      A hyphen or minus character.
    • SOLIDUS_CHAR

      public static final char SOLIDUS_CHAR
      A solidus or slash character (002F;SOLIDUS;Po;0;CS;;;;;N;SLASH;;;;).
    • COLON_CHAR

      public static final char COLON_CHAR
      A colon character.
    • SEMICOLON_CHAR

      public static final char SEMICOLON_CHAR
      A semicolon character.
    • LESS_THAN_CHAR

      public static final char LESS_THAN_CHAR
      A less-than sign character (003C;LESS-THAN SIGN;Sm;0;ON;;;;;Y;;;;;).
    • EQUALS_SIGN_CHAR

      public static final char EQUALS_SIGN_CHAR
      An equals sign character (003D;EQUALS SIGN;Sm;0;ON;;;;;N;;;;;).
    • GREATER_THAN_CHAR

      public static final char GREATER_THAN_CHAR
      A greater-than sign character (003E;GREATER-THAN SIGN;Sm;0;ON;;;;;Y;;;;;).
    • QUESTION_MARK_CHAR

      public static final char QUESTION_MARK_CHAR
      A question mark character (003F;QUESTION MARK;Po;0;ON;;;;;N;;;;;).
    • GRAVE_ACCENT_CHAR

      public static final char GRAVE_ACCENT_CHAR
      A grave accent character.
    • TILDE_CHAR

      public static final char TILDE_CHAR
      A tilde character (007E;TILDE;Sm;0;ON;;;;;N;;;;;).
    • NEXT_LINE_CHAR

      public static final char NEXT_LINE_CHAR
      A next line (NEL) control character.
    • START_OF_STRING_CHAR

      public static final char START_OF_STRING_CHAR
      A start of string control character.
    • STRING_TERMINATOR_CHAR

      public static final char STRING_TERMINATOR_CHAR
      A string terminator control character.
    • NO_BREAK_SPACE_CHAR

      public static final char NO_BREAK_SPACE_CHAR
      Unicode no-break space (NBSP).
    • LEFT_POINTING_DOUBLE_ANGLE_QUOTATION_MARK_CHAR

      public static final char LEFT_POINTING_DOUBLE_ANGLE_QUOTATION_MARK_CHAR
      A left-pointing guillemet character.
    • PILCROW_SIGN_CHAR

      public static final char PILCROW_SIGN_CHAR
      The pilcrow or paragraph sign.
    • PARAGRAPH_SIGN_CHAR

      public static final char PARAGRAPH_SIGN_CHAR
      The paragraph sign.
    • MIDDLE_DOT_CHAR

      public static final char MIDDLE_DOT_CHAR
      A middle dot character.
    • RIGHT_POINTING_DOUBLE_ANGLE_QUOTATION_MARK_CHAR

      public static final char RIGHT_POINTING_DOUBLE_ANGLE_QUOTATION_MARK_CHAR
      A right-pointing guillemet character.
    • LATIN_CAPITAL_LIGATURE_OE_CHAR

      public static final char LATIN_CAPITAL_LIGATURE_OE_CHAR
      An uppercase oe ligature.
    • LATIN_SMALL_LIGATURE_OE_CHAR

      public static final char LATIN_SMALL_LIGATURE_OE_CHAR
      A lowercase oe ligature.
    • LATIN_CAPITAL_LETTER_Y_WITH_DIAERESIS_CHAR

      public static final char LATIN_CAPITAL_LETTER_Y_WITH_DIAERESIS_CHAR
      A Y umlaut.
    • ZERO_WIDTH_SPACE_CHAR

      public static final char ZERO_WIDTH_SPACE_CHAR
      A zero-width space (ZWSP) that may expand during justification.
    • ZERO_WIDTH_NON_JOINER_CHAR

      public static final char ZERO_WIDTH_NON_JOINER_CHAR
      A zero-width non-joiner (200C;ZERO WIDTH NON-JOINER;Cf;0;BN;;;;;N;;;;;).
    • ZERO_WIDTH_JOINER_CHAR

      public static final char ZERO_WIDTH_JOINER_CHAR
      A zero-width joiner (200D;ZERO WIDTH JOINER;Cf;0;BN;;;;;N;;;;;).
    • LEFT_TO_RIGHT_MARK_CHAR

      public static final char LEFT_TO_RIGHT_MARK_CHAR
      A left-to-right mark (200E;LEFT-TO-RIGHT MARK;Cf;0;L;;;;;N;;;;;).
    • RIGHT_TO_LEFT_MARK_CHAR

      public static final char RIGHT_TO_LEFT_MARK_CHAR
      A right-to-left mark (200F;RIGHT-TO-LEFT MARK;Cf;0;R;;;;;N;;;;;).
    • WORD_JOINER_CHAR

      public static final char WORD_JOINER_CHAR
      A zero-width non-breaking space—word joiner (WJ).
    • LEFT_SINGLE_QUOTATION_MARK_CHAR

      public static final char LEFT_SINGLE_QUOTATION_MARK_CHAR
      A left single quote.
    • RIGHT_SINGLE_QUOTATION_MARK_CHAR

      public static final char RIGHT_SINGLE_QUOTATION_MARK_CHAR
      A right single quote.
    • SINGLE_LOW_9_QUOTATION_MARK_CHAR

      public static final char SINGLE_LOW_9_QUOTATION_MARK_CHAR
      A single low-9 quotation mark.
    • SINGLE_HIGH_REVERSED_9_QUOTATION_MARK_CHAR

      public static final char SINGLE_HIGH_REVERSED_9_QUOTATION_MARK_CHAR
      A single high-reversed-9 quotation mark.
    • LEFT_DOUBLE_QUOTATION_MARK_CHAR

      public static final char LEFT_DOUBLE_QUOTATION_MARK_CHAR
      A left double quote.
    • RIGHT_DOUBLE_QUOTATION_MARK_CHAR

      public static final char RIGHT_DOUBLE_QUOTATION_MARK_CHAR
      A right double quote.
    • DOUBLE_LOW_9_QUOTATION_MARK_CHAR

      public static final char DOUBLE_LOW_9_QUOTATION_MARK_CHAR
      A double low-9 quotation mark.
    • DOUBLE_HIGH_REVERSED_9_QUOTATION_MARK_CHAR

      public static final char DOUBLE_HIGH_REVERSED_9_QUOTATION_MARK_CHAR
      A double high-reversed-9 quotation mark.
    • EN_DASH_CHAR

      public static final char EN_DASH_CHAR
      Unicode en dash character.
    • EM_DASH_CHAR

      public static final char EM_DASH_CHAR
      Unicode em dash character.
    • BULLET_CHAR

      public static final char BULLET_CHAR
      Unicode bullet character.
    • HORIZONTAL_ELLIPSIS_CHAR

      public static final char HORIZONTAL_ELLIPSIS_CHAR
      Unicode horizontal ellipsis.
    • LINE_SEPARATOR_CHAR

      public static final char LINE_SEPARATOR_CHAR
      A line separator character (2028;LINE SEPARATOR;Zl;0;WS;;;;;N;;;;;).
    • PARAGRAPH_SEPARATOR_CHAR

      public static final char PARAGRAPH_SEPARATOR_CHAR
      A paragraph separator character (2029;PARAGRAPH SEPARATOR;Zp;0;B;;;;;N;;;;;).
    • SINGLE_LEFT_POINTING_ANGLE_QUOTATION_MARK_CHAR

      public static final char SINGLE_LEFT_POINTING_ANGLE_QUOTATION_MARK_CHAR
      A left-pointing single guillemet character.
    • SINGLE_RIGHT_POINTING_ANGLE_QUOTATION_MARK_CHAR

      public static final char SINGLE_RIGHT_POINTING_ANGLE_QUOTATION_MARK_CHAR
      A right-pointing single guillemet character.
    • TRADE_MARK_SIGN_CHAR

      public static final char TRADE_MARK_SIGN_CHAR
      Trademark character.
    • INFINITY_CHAR

      public static final char INFINITY_CHAR
      Infinity symbol (221E;INFINITY;Sm;0;ON;;;;;N;;;;;).
    • LEFT_POINTING_ANGLE_BRACKET

      public static final char LEFT_POINTING_ANGLE_BRACKET
      A left-pointing angle bracket character (2329;LEFT-POINTING ANGLE BRACKET;Ps;0;ON;3008;;;;Y;BRA;;;;).
    • RIGHT_POINTING_ANGLE_BRACKET

      public static final char RIGHT_POINTING_ANGLE_BRACKET
      A right-pointing angle bracket character (232A;RIGHT-POINTING ANGLE BRACKET;Pe;0;ON;3009;;;;Y;KET;;;;).
    • NULL_SYMBOL

      public static final char NULL_SYMBOL
      The symbol for NULL (2400;SYMBOL FOR NULL;So;0;ON;;;;;N;GRAPHIC FOR NULL;;;;).
    • LINE_FEED_SYMBOL

      public static final char LINE_FEED_SYMBOL
      The symbol for line feed (240A;SYMBOL FOR LINE FEED;So;0;ON;;;;;N;GRAPHIC FOR LINE FEED;;;;).
    • VERTICAL_TAB_SYMBOL

      public static final char VERTICAL_TAB_SYMBOL
      The symbol for vertical tab (240B;SYMBOL FOR VERTICAL TABULATION;So;0;ON;;;;;N;GRAPHIC FOR VERTICAL TABULATION;;;;).
    • FORM_FEED_SYMBOL

      public static final char FORM_FEED_SYMBOL
      The symbol for form feed (240C;SYMBOL FOR FORM FEED;So;0;ON;;;;;N;GRAPHIC FOR FORM FEED;;;;).
    • CARRIAGE_RETURN_SYMBOL

      public static final char CARRIAGE_RETURN_SYMBOL
      The symbol for carriage return (240D;SYMBOL FOR CARRIAGE RETURN;So;0;ON;;;;;N;GRAPHIC FOR CARRIAGE RETURN;;;;).
    • END_OF_TRANSMISSION_SYMBOL

      public static final char END_OF_TRANSMISSION_SYMBOL
      The symbol for end of transmission (2404;SYMBOL FOR END OF TRANSMISSION;So;0;ON;;;;;N;GRAPHIC FOR END OF TRANSMISSION;;;;).
    • SPACE_SYMBOL

      public static final char SPACE_SYMBOL
      The symbol for space (2420;SYMBOL FOR SPACE;So;0;ON;;;;;N;GRAPHIC FOR SPACE;;;;).
    • BLANK_SYMBOL

      public static final char BLANK_SYMBOL
      The blank symbol (2422;BLANK SYMBOL;So;0;ON;;;;;N;BLANK;;;;).
    • REVERSED_DOUBLE_PRIME_QUOTATION_MARK_CHAR

      public static final char REVERSED_DOUBLE_PRIME_QUOTATION_MARK_CHAR
      A reversed double prime quotation mark.
    • DOUBLE_PRIME_QUOTATION_MARK_CHAR

      public static final char DOUBLE_PRIME_QUOTATION_MARK_CHAR
      A double prime quotation mark.
    • LOW_DOUBLE_PRIME_QUOTATION_MARK_CHAR

      public static final char LOW_DOUBLE_PRIME_QUOTATION_MARK_CHAR
      A low double prime quotation mark.
    • FULLWIDTH_QUOTATION_MARK_CHAR

      public static final char FULLWIDTH_QUOTATION_MARK_CHAR
      A full width quotation mark.
    • ZERO_WIDTH_NO_BREAK_SPACE_CHAR

      public static final char ZERO_WIDTH_NO_BREAK_SPACE_CHAR
      A zero-width no-break space (ZWNBSP), also used as the Byte Order Mark (BOM) (FEFF;ZERO WIDTH NO-BREAK SPACE;Cf;0;BN;;;;;N;BYTE ORDER MARK;;;;). For non-breaking purposes, deprecated in favor of WORD_JOINER_CHAR.
    • BOM_CHAR

      public static final char BOM_CHAR
      The Byte Order Mark (BOM).
    • OBJECT_REPLACEMENT_CHAR

      public static final char OBJECT_REPLACEMENT_CHAR
      A character for a placeholder in text for an otherwise unspecified object.
    • REPLACEMENT_CHAR

      public static final char REPLACEMENT_CHAR
      Represents a character that is unknown or unrepresentable in Unicode.
    • UNDEFINED_CHAR

      public static final char UNDEFINED_CHAR
      An invalid, undefined Unicode character which is "guaranteed not to be a Unicode character at all".
    • CONTROL_CHARS

      public static final String CONTROL_CHARS
      Unicode control characters (0x0000-0x001F, 0x007F-0x009F).
    • PARAGRAPH_SEPARATOR_CHARS

      public static final String PARAGRAPH_SEPARATOR_CHARS
      Unicode paragraph separator characters.
    • SEGMENT_SEPARATOR_CHARS

      public static final String SEGMENT_SEPARATOR_CHARS
      Unicode segment separator characters.
    • NEWLINE_CHARACTERS

      public static final Characters NEWLINE_CHARACTERS
      Unicode newline characters.
    • WHITESPACE_CHARACTERS

      public static final Characters WHITESPACE_CHARACTERS
      Unicode whitespace characters.
    • FORMAT_CHARS

      public static final String FORMAT_CHARS
      Unicode formatting characters; Unicode characters marked with "Cf", such as WORD_JOINER_CHAR.
    • TRIM_CHARACTERS

      public static final Characters TRIM_CHARACTERS
      Characters that do not contain visible "content", and may be trimmed from ends of a string. These include whitespace, control characters, and formatting characters.
    • LIST_DELIMITER_CHARS

      public static final String LIST_DELIMITER_CHARS
      Characters that delimit a list separated by trim characters, commas, and/or semicolons.
    • LEFT_QUOTE_CHARS

      public static final String LEFT_QUOTE_CHARS
      Characters that could be considered the start of a quotation.
    • RIGHT_QUOTE_CHARS

      public static final String RIGHT_QUOTE_CHARS
      Characters that could be considered the end of a quotation.
    • QUOTE_CHARS

      public static final String QUOTE_CHARS
      Characters that start or end quotations.
    • PHRASE_PUNCTUATION_CHARACTERS

      public static final Characters PHRASE_PUNCTUATION_CHARACTERS
      Characters used to punctuate phrases and sentences.
    • DEPENDENT_PUNCTUATION_CHARACTERS

      public static final Characters DEPENDENT_PUNCTUATION_CHARACTERS
      Punctuation that expects a character to follow at some point.
    • LEFT_GROUP_PUNCTUATION_CHARACTERS

      public static final Characters LEFT_GROUP_PUNCTUATION_CHARACTERS
      Left punctuation used to group characters.
    • RIGHT_GROUP_PUNCTUATION_CHARACTERS

      public static final Characters RIGHT_GROUP_PUNCTUATION_CHARACTERS
      Right punctuation used to group characters.
    • GROUP_PUNCTUATION_CHARACTERS

      public static final Characters GROUP_PUNCTUATION_CHARACTERS
      Punctuation used to group characters.
    • PUNCTUATION_CHARS

      public static final Characters PUNCTUATION_CHARS
      Characters used to punctuate phrases and sentences, as well as general punctuation such as quotes.
    • WORD_DELIMITER_CHARACTERS

      public static final Characters WORD_DELIMITER_CHARACTERS
      Characters that separate words.
    • SPACE_SEPARATOR_CHARACTERS

      public static final Characters SPACE_SEPARATOR_CHARACTERS
      Characters in the Unicode Space_Separator (Zs) category as of Unicode 9.0.0.
    • LINE_SEPARATOR_CHARACTERS

      public static final Characters LINE_SEPARATOR_CHARACTERS
      Characters in the Unicode Line_Separator (Zl) category as of Unicode 9.0.0.
    • PARAGRAPH_SEPARATOR_CHARACTERS

      public static final Characters PARAGRAPH_SEPARATOR_CHARACTERS
      Characters in the Unicode Paragraph_Separator (Zp) category as of Unicode 9.0.0.
    • EOL_CHARACTERS

      public static final Characters EOL_CHARACTERS
      Characters considered to be end-of-line markers (e.g. CR and LF).
    • SEPARATOR_CHARACTERS

      public static final Characters SEPARATOR_CHARACTERS
      Characters in the Unicode Separator (Z) group as of Unicode 9.0.0.
    • WORD_WRAP_CHARS

      public static final String WORD_WRAP_CHARS
      Characters that allow words to wrap.
  • Method Details

    • of

      public static Characters of(char... characters)
      Characters factory method. Duplicates are ignored.
      Parameters:
      characters - The characters to store.
      Returns:
      An instance of Characters with the given characters stored.
      Throws:
      NullPointerException - if the given characters is null.
      IllegalArgumentException - if the given characters contain Unicode surrogate characters.
    • of

      public static Characters of(char[] characters, int start, int end)
      Characters factory method. Duplicates are ignored.
      Parameters:
      characters - The characters to store.
      start - The start index, inclusive.
      end - The end index, exclusive.
      Returns:
      An instance of Characters with the given characters stored.
      Throws:
      NullPointerException - if the given characters is null.
      IllegalArgumentException - if the given characters contain Unicode surrogate characters.
    • ofRange

      public static Characters ofRange(char first, char last)
      Creates a range of characters.
      Parameters:
      first - The first of the range, inclusive.
      last - The last of the range, inclusive.
      Returns:
      Characters representing the indicated range.
      Throws:
      IllegalArgumentException - if the last character comes before the first character.
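
The inclusive range and its validation rule can be sketched as follows. CharRangeSketch and rangeOf are hypothetical names used only for illustration; the sketch returns a plain char[] rather than a Characters instance and makes no claim about how ofRange is actually implemented.

```java
/** Hypothetical helper illustrating an inclusive character range with bounds validation. */
final class CharRangeSketch {

    /** Returns every character from first to last, both inclusive. */
    static char[] rangeOf(char first, char last) {
        if (last < first) {
            throw new IllegalArgumentException("Last character comes before first character.");
        }
        char[] range = new char[last - first + 1];
        for (int i = 0; i < range.length; i++) {
            range[i] = (char) (first + i);
        }
        return range;
    }
}
```

With this sketch, rangeOf('a', 'e') yields the five characters a through e, and a single-character range such as rangeOf('x', 'x') is valid.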
    • of

      public static Characters of(Characters... multipleCharacters)
      Characters factory method from existing Characters instances. Duplicates are ignored.
      Parameters:
      multipleCharacters - The Characters instances containing characters to store.
      Returns:
      An instance of Characters with the given characters stored.
      Throws:
      NullPointerException - if the given characters is null.
      IllegalArgumentException - if the given characters contain Unicode surrogate characters.
    • from

      public static Characters from(CharSequence charSequence)
      Character sequence factory method. Duplicates are ignored.
      Parameters:
      charSequence - The character sequence containing characters to store.
      Returns:
      An instance of Characters with the characters contained in the given character sequence.
      Throws:
      NullPointerException - if the given character sequence is null.
      IllegalArgumentException - if the given character sequence contains Unicode surrogate characters.
    • isEmpty

      public boolean isEmpty()
      Returns:
      true if this object contains no characters.
    • size

      public int size()
      Returns:
      The number of characters.
    • add

      public Characters add(Characters characters)
      Creates a new object with these characters and the given characters. Duplicates are ignored.
      Parameters:
      characters - The characters to add.
      Returns:
      A new object containing these characters and the given characters.
      Throws:
      NullPointerException - if the given characters is null.
      IllegalArgumentException - if the given characters contain Unicode surrogate characters.
    • add

      public Characters add(char... characters)
      Creates a new object with these characters and the given characters. Duplicates are ignored.
      Parameters:
      characters - The characters to add.
      Returns:
      A new object containing these characters and the given characters.
      Throws:
      NullPointerException - if the given characters is null.
      IllegalArgumentException - if the given characters contain Unicode surrogate characters.
    • add

      public Characters add(CharSequence charSequence)
      Creates a new object with these characters and the given characters. Duplicates are ignored.
      Parameters:
      charSequence - The characters to add.
      Returns:
      A new object containing these characters and the given characters.
      Throws:
      NullPointerException - if the given character sequence is null.
      IllegalArgumentException - if the given character sequence contains Unicode surrogate characters.
    • addRange

      public Characters addRange(char first, char last)
      Adds a range of characters.
      Parameters:
      first - The first of the range, inclusive.
      last - The last of the range, inclusive.
      Returns:
      A new object containing these characters and the given range of characters.
      Throws:
      IllegalArgumentException - if the last character comes before the first character.
    • remove

      public Characters remove(Characters characters)
      Creates a new object with these characters, with the given characters removed.
      Parameters:
      characters - The characters to remove.
      Returns:
      A new object containing these characters without the given characters.
      Throws:
      NullPointerException - if the given characters is null.
    • remove

      public Characters remove(char... characters)
      Creates a new object with these characters, with the given characters removed.
      Parameters:
      characters - The characters to remove.
      Returns:
      A new object containing these characters without the given characters.
      Throws:
      NullPointerException - if the given characters is null.
    • remove

      public Characters remove(CharSequence charSequence)
      Creates a new object with these characters, with the given characters removed.
      Parameters:
      charSequence - The characters to remove.
      Returns:
      A new object containing these characters without the given characters.
      Throws:
      NullPointerException - if the given character sequence is null.
    • split

      public List<String> split(CharSequence charSequence)
      Splits a character sequence on these characters. Runs of matching characters are removed and the interspersed tokens are returned.
      API Note:
      This method produces the same result without regard to whether one or more character sequences begin and/or end with the delimiter.
      Implementation Specification:
      The current implementation does not support surrogate characters.
      Implementation Note:
      This method is likely more efficient than a regular expression-based approach, especially in situations in which splitting is likely to occur at a small frequency, because the setup cost is low and individual character testing is efficient.
      Parameters:
      charSequence - The character sequence to split.
      Returns:
      A list of subsequences; the list may not be mutable.
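
One way to read the splitting behavior described above is the single-pass scan below. SplitSketch is a hypothetical stand-in that takes the delimiters as a String; it is not the library's code, and returning an empty list for empty input is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of splitting on a set of delimiter characters. */
final class SplitSketch {

    /** Splits the input on any character in delimiters, dropping runs of delimiters. */
    static List<String> split(CharSequence input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        int tokenStart = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < input.length(); i++) {
            boolean isDelimiter = delimiters.indexOf(input.charAt(i)) >= 0;
            if (isDelimiter) {
                if (tokenStart >= 0) { // a token just ended
                    tokens.add(input.subSequence(tokenStart, i).toString());
                    tokenStart = -1;
                }
            } else if (tokenStart < 0) { // a token just started
                tokenStart = i;
            }
        }
        if (tokenStart >= 0) { // trailing token with no delimiter after it
            tokens.add(input.subSequence(tokenStart, input.length()).toString());
        }
        return tokens;
    }
}
```

Because delimiters only ever close a token that is already open, leading, trailing, and repeated delimiters produce no empty tokens in this sketch, so ",,a,b," and "a,b" give the same result.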
    • toString

      public String toString()
      Overrides:
      toString in class Object
      Returns:
      A string containing these characters.
    • toLabelArrayString

      public String toLabelArrayString()
      Returns a string representing an array of these characters, each character represented as 'x', or if the character is a control character, the Unicode code point of this character, e.g. "U+1234". Example: "['a', 0x0020]"
      Implementation Specification:
      This method does not treat surrogate characters specially.
      Returns:
      A string containing an array representation of these characters.
    • toStringBuilder

      public StringBuilder toStringBuilder()
      A string builder containing these characters. This implementation provides an initial capacity for 16 more characters.
      Returns:
      A string builder containing these characters.
    • toStringBuilder

      public StringBuilder toStringBuilder(int extraCapacity)
      A string builder containing these characters, with an initial capacity with room for the specified number of extra characters.
      Parameters:
      extraCapacity - The extra initial capacity.
      Returns:
      A string builder containing these characters.
      Throws:
      IllegalArgumentException - if the given capacity is negative.
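
The extra-capacity behavior can be sketched with the standard StringBuilder(int) constructor. BuilderSketch is a hypothetical helper that takes the characters as a char[] parameter; how Characters stores its contents internally is not documented here.

```java
/** Hypothetical sketch of creating a pre-sized string builder. */
final class BuilderSketch {

    /** Returns a builder holding chars, sized to accept extraCapacity more without reallocation. */
    static StringBuilder toStringBuilder(char[] chars, int extraCapacity) {
        if (extraCapacity < 0) {
            throw new IllegalArgumentException("Extra capacity cannot be negative: " + extraCapacity);
        }
        return new StringBuilder(chars.length + extraCapacity).append(chars);
    }
}
```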
    • contains

      public boolean contains(char character)
      Determines whether the given character is contained in these characters.
      Parameters:
      character - The character to check.
      Returns:
      true if the character exists in these characters.
    • isCharInRange

      public static boolean isCharInRange(char c, char[][] ranges)
      Sees if the specified character is in one of the specified ranges.
      Parameters:
      c - The character to check.
      ranges - An array of character pair arrays, in order, the first of each pair specifying the bottom inclusive character of a range and the second specifying the top inclusive character of the range.
      Returns:
      true if the character is in one of the ranges, else false.
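
The shape of the ranges parameter is easiest to see in a small example. RangeCheckSketch is a hypothetical illustration using a plain linear scan; it does not rely on the pairs being in order, which the real method may exploit.

```java
/** Hypothetical sketch of testing a character against inclusive {low, high} pairs. */
final class RangeCheckSketch {

    /** Returns true if c falls within any {low, high} pair, both bounds inclusive. */
    static boolean isInRange(char c, char[][] ranges) {
        for (char[] range : ranges) {
            if (c >= range[0] && c <= range[1]) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would pass something like new char[][] { { 'A', 'Z' }, { 'a', 'z' } } to test for basic Latin letters.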
    • isASCII

      public static boolean isASCII(char c)
      Determines whether a character is in the ASCII character range (0x00-0x7F).
      Parameters:
      c - The character to examine.
      Returns:
      true if the character is an ASCII character.
    • isLatinDigit

      @Deprecated public static final boolean isLatinDigit(char c)
      Deprecated.
      Determines whether a character is one of the digits '0'-'9'.
      Parameters:
      c - The character to examine.
      Returns:
      true if the character is an ISO_LATIN_1 digit.
    • isPunctuation

      public static boolean isPunctuation(char c)
      Specifies whether or not a given character is a punctuation mark.
      Parameters:
      c - Character to analyze.
      Returns:
      true if the character is punctuation.
    • isRomanNumeral

      public static boolean isRomanNumeral(char c)
      Determines whether a character is a Roman numeral.
      Parameters:
      c - The character to examine.
      Returns:
      true if the character is a Roman numeral.
    • isWhitespace

      public static boolean isWhitespace(char c)
      Specifies whether or not a given character is whitespace.
      Parameters:
      c - Character to analyze.
      Returns:
      true if the character is whitespace.
    • isWordDelimiter

      public static boolean isWordDelimiter(char c)
      Specifies whether or not a given character is a word delimiter, such as whitespace or punctuation.
      Parameters:
      c - Character to analyze.
      Returns:
      true if the character is a word delimiter.
    • isWordWrap

      public static boolean isWordWrap(char c)
      Specifies whether or not a given character allows a word wrap.
      Parameters:
      c - Character to analyze.
      Returns:
      true if the character allows word wrapping.
    • getLabel

      public static String getLabel(int c)
      Returns a string representing the character as 'x', or if the character is a control character, the Unicode code point of this character, e.g. "U+1234".
      Implementation Specification:
      This method supports Unicode supplementary code points.
      Parameters:
      c - The code point for which to return a string representation.
      Returns:
      The string label representing the character.
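
The labeling rule described above can be approximated as follows. LabelSketch is hypothetical, and using Character.isISOControl to decide what counts as a control character is an assumption of this sketch; the library may apply a different test or special-case some characters.

```java
/** Hypothetical sketch of labeling a code point as 'x' or U+XXXX. */
final class LabelSketch {

    /** Returns 'x' for a non-control code point, or U+XXXX for a control character. */
    static String label(int codePoint) {
        if (Character.isISOControl(codePoint)) {
            return String.format("U+%04X", codePoint);
        }
        // Character.toChars handles supplementary code points by producing a surrogate pair
        return "'" + new String(Character.toChars(codePoint)) + "'";
    }
}
```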
    • toLabelArrayString

      public static String toLabelArrayString(char... characters)
      Returns a string representing an array of these characters, each character represented as 'x', or if the character is a control character, the Unicode code point of this character, e.g. "U+1234". Example: "['a', 0x0020]"
      Implementation Specification:
      This method does not treat surrogate characters specially.
      Parameters:
      characters - The characters to return as a string of an array.
      Returns:
      A string containing an array representation of these characters.
    • toLabelArrayString

      public static String toLabelArrayString(CharSequence characters)
      Returns a string representing an array of these characters, each character represented as 'x', or if the character is a control character, the Unicode code point of this character, e.g. "U+1234". Example: "['a', 0x0020]"
      Implementation Specification:
      This method does not treat surrogate characters specially.
      Parameters:
      characters - The characters to return as a string of an array.
      Returns:
      A string containing an array representation of these characters.
    • toByteArray

      public static byte[] toByteArray(char[] characters)
      Converts an array of characters to an array of bytes, using the UTF-8 charset.
      Parameters:
      characters - The characters to convert to bytes.
      Returns:
      An array of bytes representing the given characters in the UTF-8 charset.
    • toByteArray

      public static byte[] toByteArray(char[] characters, Charset charset)
      Converts an array of characters to an array of bytes, using the given character encoding.
      Parameters:
      characters - The characters to convert to bytes.
      charset - The charset to use when converting characters to bytes.
      Returns:
      An array of bytes representing the given characters in the specified encoding.
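
A conversion of this kind can be written with the standard java.nio.charset API. EncodeSketch below is a hypothetical illustration of the two overloads, not the library's implementation.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of encoding a char array to bytes. */
final class EncodeSketch {

    /** Encodes the characters with the given charset and returns exactly the encoded bytes. */
    static byte[] toBytes(char[] characters, Charset charset) {
        ByteBuffer encoded = charset.encode(CharBuffer.wrap(characters));
        byte[] bytes = new byte[encoded.remaining()];
        encoded.get(bytes); // copy only the filled portion of the buffer
        return bytes;
    }

    /** Convenience overload defaulting to UTF-8. */
    static byte[] toBytes(char[] characters) {
        return toBytes(characters, StandardCharsets.UTF_8);
    }
}
```

Copying encoded.remaining() bytes, rather than calling array() on the buffer, avoids returning unused trailing capacity.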
    • appendLabelArrayString

      public static StringBuilder appendLabelArrayString(StringBuilder stringBuilder, char[] characters)
      Appends a string representing an array of characters, each character represented as 'x', or if the character is a control character, the Unicode code point of this character, e.g. "U+1234". Example: "['a', 0x0020]"
      Implementation Specification:
      This method does not treat surrogate characters specially.
      Parameters:
      stringBuilder - The string builder to which the string will be appended.
      characters - The characters whose string representations are to be appended.
      Returns:
      The string builder.
      Throws:
      NullPointerException - if the given string builder is null.
    • appendLabel

      public static StringBuilder appendLabel(StringBuilder stringBuilder, int c)
      Appends a string representing the character as 'x', or if the character is a control character or a surrogate, either a special representation such as '\n' or the Unicode code point of this character, e.g. "U+1234".
      Implementation Specification:
      This method supports Unicode supplementary code points.
      Parameters:
      stringBuilder - The string builder to which the string will be appended.
      c - The code point whose string representation is to be appended.
      Returns:
      The string builder.
      Throws:
      NullPointerException - if the given string builder is null.
    • appendUnicodeString

      public static StringBuilder appendUnicodeString(StringBuilder stringBuilder, int c)
      Appends a string representing the Unicode code point of this character, e.g. "U+1234". The length of the added string depends on the Unicode code point; most code points will result in four hex characters.
      Parameters:
      stringBuilder - The string builder to which the string will be appended.
      c - The code point whose Unicode string is to be appended.
      Returns:
      The string builder.
      Throws:
      NullPointerException - if the given string builder is null.
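
The "U+1234" form can be produced with String.format. UnicodeStringSketch is a hypothetical illustration; uppercase hex digits and zero-padding to at least four digits are assumptions drawn from the example in the description.

```java
/** Hypothetical sketch of appending a code point in U+XXXX notation. */
final class UnicodeStringSketch {

    /** Appends "U+" followed by the code point in uppercase hex, padded to at least four digits. */
    static StringBuilder appendUnicodeString(StringBuilder stringBuilder, int codePoint) {
        return stringBuilder.append("U+").append(String.format("%04X", codePoint));
    }
}
```

Code points above U+FFFF produce five or six hex digits, which matches the note that the length of the added string depends on the code point.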
    • parseCharacter

      public static final Character parseCharacter(String string)
      Parses a string and returns its character value.
      Parameters:
      string - A string expected to contain a single character.
      Returns:
      The single character contained by the string.
      Throws:
      NullPointerException - if the given string is null.
      IllegalArgumentException - if the string is not composed of a single character.
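
The contract above amounts to a length check followed by charAt(0). ParseSketch is a hypothetical illustration of that contract, not the library's code.

```java
/** Hypothetical sketch of parsing a one-character string. */
final class ParseSketch {

    /** Returns the single character in the string, or throws if there is not exactly one. */
    static Character parseCharacter(String string) {
        if (string.length() != 1) { // throws NullPointerException if string is null
            throw new IllegalArgumentException("Expected exactly one character: \"" + string + "\"");
        }
        return Character.valueOf(string.charAt(0));
    }
}
```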