Class StringUtil

java.lang.Object
htsjdk.samtools.util.StringUtil

public class StringUtil extends Object
Grab-bag of stateless String-oriented utilities.
  • Field Details

  • Constructor Details

    • StringUtil

      public StringUtil()
  • Method Details

    • join

      public static <T> String join(String separator, Collection<T> objs)
      Parameters:
      separator - String to insert between consecutive items of objs
      objs - Collection of objects to be joined
      Returns:
      String that concatenates the result of each item's toString() method for all items in objs, with separator between each pair of adjacent items.
    • join

      public static <T> String join(String separator, T... objs)
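      The two join overloads can be illustrated with a minimal stand-alone sketch. The class name JoinSketch is hypothetical and the body is an assumption about how such a method might be written, not the htsjdk source; it only mirrors the documented contract (each item's toString() result, with separator between adjacent items).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class JoinSketch {
    // Concatenates the string form of each item, placing separator between neighbours.
    static <T> String join(String separator, Collection<T> objs) {
        StringBuilder sb = new StringBuilder();
        Iterator<T> it = objs.iterator();
        while (it.hasNext()) {
            sb.append(it.next());              // uses toString() for non-null items
            if (it.hasNext()) {
                sb.append(separator);          // no trailing separator
            }
        }
        return sb.toString();
    }

    // Varargs convenience overload that delegates to the Collection version.
    @SafeVarargs
    static <T> String join(String separator, T... objs) {
        return join(separator, Arrays.asList(objs));
    }

    public static void main(String[] args) {
        System.out.println(join(", ", Arrays.asList(1, 2, 3))); // prints "1, 2, 3"
        System.out.println(join("-", "a", "b"));                // prints "a-b"
    }
}
```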
    • split

      public static int split(String aString, String[] tokens, char delim)
      Split the string into tokens separated by the given delimiter. Profiling has revealed that the standard String.split() method typically takes more than half the total time when used for parsing ASCII files. Note that if the tokens array is not large enough to hold all the tokens in the string, excess tokens are discarded.
      Parameters:
      aString - the string to split
      tokens - an array to hold the parsed tokens
      delim - character that delimits tokens
      Returns:
      the number of tokens parsed
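      The "excess tokens are discarded" behaviour can be sketched as follows. SplitSketch is a hypothetical stand-alone class, not the htsjdk code, and its handling of edge cases such as empty or trailing fields is an assumption that may differ from the library.

```java
// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class SplitSketch {
    // Fills tokens with the delim-separated fields of s and returns how many were stored.
    // Fields that do not fit into tokens are silently dropped.
    static int split(String s, String[] tokens, char delim) {
        int count = 0;
        int start = 0;
        while (count < tokens.length) {
            int end = s.indexOf(delim, start);
            if (end < 0) {
                tokens[count++] = s.substring(start);   // last field, no delimiter follows
                break;
            }
            tokens[count++] = s.substring(start, end);
            start = end + 1;                            // skip past the delimiter
        }
        return count;
    }

    public static void main(String[] args) {
        String[] fields = new String[2];
        int n = split("chr1\t100\t200", fields, '\t');
        // Only the first two fields fit; the third is discarded.
        System.out.println(n + " " + fields[0] + " " + fields[1]);
    }
}
```

      Reusing a preallocated tokens array across lines avoids the per-call array and regex overhead that motivates this method.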
    • splitConcatenateExcessTokens

      public static int splitConcatenateExcessTokens(String aString, String[] tokens, char delim)
      Split the string into tokens separated by the given delimiter. Profiling has revealed that the standard String.split() method typically takes more than half the total time when used for parsing ASCII files. Note that the string is split into no more elements than the tokens array will hold, so the final tokenized element may contain delimiter characters.
      Parameters:
      aString - the string to split
      tokens - an array to hold the parsed tokens
      delim - character that delimits tokens
      Returns:
      the number of tokens parsed
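      In contrast to split, the final slot here absorbs whatever remains of the input. The sketch below shows one way to get that behaviour; SplitConcatSketch is a hypothetical class written for illustration, and its edge-case handling is an assumption rather than the library's.

```java
// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class SplitConcatSketch {
    // Splits s on delim into at most tokens.length pieces.
    // The last stored piece keeps any remaining delimiters.
    static int splitConcatenateExcessTokens(String s, String[] tokens, char delim) {
        int count = 0;
        int start = 0;
        // Fill every slot except the last with exactly one field.
        while (count < tokens.length - 1) {
            int end = s.indexOf(delim, start);
            if (end < 0) {
                break;                                  // fewer fields than slots
            }
            tokens[count++] = s.substring(start, end);
            start = end + 1;
        }
        // Whatever is left (possibly containing delimiters) goes into the next slot.
        if (count < tokens.length) {
            tokens[count++] = s.substring(start);
        }
        return count;
    }

    public static void main(String[] args) {
        String[] fields = new String[2];
        int n = splitConcatenateExcessTokens("a,b,c,d", fields, ',');
        System.out.println(n + " " + fields[0] + " " + fields[1]); // prints "2 a b,c,d"
    }
}
```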
    • toLowerCase

      public static byte toLowerCase(byte b)
      Parameters:
      b - ASCII character
      Returns:
      lowercase version of arg if it was uppercase, otherwise returns arg
    • toUpperCase

      public static byte toUpperCase(byte b)
      Parameters:
      b - ASCII character
      Returns:
      uppercase version of arg if it was lowercase, otherwise returns arg
    • toUpperCase

      public static void toUpperCase(byte[] bytes)
      Converts in place all lower case letters to upper case in the byte array provided.
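      The byte-level case conversions above operate on ASCII only. A minimal sketch of that idea, using the hypothetical class AsciiCaseSketch (an illustration, not the htsjdk source), might look like this:

```java
// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class AsciiCaseSketch {
    private static final int CASE_OFFSET = 'a' - 'A';   // 32 in ASCII

    // Returns the uppercase form of an ASCII lowercase letter; other bytes pass through.
    static byte toUpperCase(byte b) {
        return (b >= 'a' && b <= 'z') ? (byte) (b - CASE_OFFSET) : b;
    }

    // Returns the lowercase form of an ASCII uppercase letter; other bytes pass through.
    static byte toLowerCase(byte b) {
        return (b >= 'A' && b <= 'Z') ? (byte) (b + CASE_OFFSET) : b;
    }

    // Uppercases every ASCII lowercase letter in the array, in place.
    static void toUpperCase(byte[] bytes) {
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = toUpperCase(bytes[i]);
        }
    }

    public static void main(String[] args) {
        byte[] bases = {'a', 'c', 'g', 't', 'N', '1'};
        toUpperCase(bases);
        System.out.println(new String(bases, java.nio.charset.StandardCharsets.US_ASCII)); // prints "ACGTN1"
    }
}
```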
    • assertCharactersNotInString

      public static String assertCharactersNotInString(String illegalChars, char... chars)
      Checks that a String doesn't contain one or more characters of interest.
      Parameters:
      illegalChars - the String to check
      chars - the characters to check for
      Returns:
      the input String, returned for convenience
      Throws:
      IllegalArgumentException - if the String contains one or more of the characters
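      Because the checked String is returned, the call can be used inline in an assignment. The sketch below illustrates the documented contract; CharCheckSketch is a hypothetical class, and the exception message text is an assumption.

```java
// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class CharCheckSketch {
    // Returns s unchanged if it contains none of chars; otherwise throws.
    static String assertCharactersNotInString(String s, char... chars) {
        for (char c : chars) {
            if (s.indexOf(c) >= 0) {
                // Message wording is an assumption made for this sketch.
                throw new IllegalArgumentException(
                        "Illegal character (code " + (int) c + ") in string: " + s);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        // Typical inline use: validate and assign in one expression.
        String sampleName = assertCharactersNotInString("sample_1", '\t', '\n');
        System.out.println(sampleName); // prints "sample_1"
    }
}
```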
    • wordWrap

      public static String wordWrap(String s, int maxLineLength)
      Return the input string with newlines inserted to ensure that all lines have length <= maxLineLength. If a word is too long, it is simply broken at maxLineLength. Does not handle tabs intelligently (due to implementer laziness).
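      One way to realise this contract is a greedy wrap that breaks at the last space that fits and falls back to a hard break for over-long words. The sketch below is an assumption about the approach, not the htsjdk source; WordWrapSketch and wrapSingleLine are hypothetical names, and details such as whether the space at a break is dropped may differ from the library.

```java
// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class WordWrapSketch {
    // Wraps each existing line of s independently so no output line exceeds maxLineLength.
    static String wordWrap(String s, int maxLineLength) {
        if (maxLineLength < 1) {
            throw new IllegalArgumentException("maxLineLength must be >= 1");
        }
        StringBuilder out = new StringBuilder();
        String[] lines = s.split("\n", -1);             // -1 keeps trailing empty lines
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n');
            }
            out.append(wrapSingleLine(lines[i], maxLineLength));
        }
        return out.toString();
    }

    // Greedy wrap of a line that contains no newline characters.
    private static String wrapSingleLine(String line, int max) {
        StringBuilder out = new StringBuilder();
        int start = 0;
        while (line.length() - start > max) {
            int breakAt = line.lastIndexOf(' ', start + max);
            if (breakAt <= start) {
                // No usable space: the word is too long, so break it at max.
                out.append(line, start, start + max).append('\n');
                start += max;
            } else {
                // Break at the space and drop it (an assumption of this sketch).
                out.append(line, start, breakAt).append('\n');
                start = breakAt + 1;
            }
        }
        out.append(line, start, line.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(wordWrap("the quick brown fox", 10));
    }
}
```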
    • wordWrapSingleLine

      public static String wordWrapSingleLine(String s, int maxLineLength)
    • intValuesToString

      public static String intValuesToString(int[] intVals)
    • intValuesToString

      public static String intValuesToString(short[] shortVals)
    • bytesToString

      public static String bytesToString(byte[] data)
    • bytesToString

      public static String bytesToString(byte[] buffer, int offset, int length)
    • stringToBytes

      public static byte[] stringToBytes(String s)
    • stringToBytes

      public static byte[] stringToBytes(String s, int offset, int length)
    • readNullTerminatedString

      public static String readNullTerminatedString(BinaryCodec binaryCodec)
    • charsToBytes

      public static void charsToBytes(char[] chars, int charOffset, int length, byte[] bytes, int byteOffset)
      Convert chars to bytes merely by casting.
      Parameters:
      chars - input chars
      charOffset - where to start converting from chars array
      length - how many chars to convert
      bytes - where to put the converted output
      byteOffset - where to start writing the converted output.
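      "Merely by casting" means each char is narrowed to its low 8 bits, which is lossless for ASCII but not for other characters. A minimal sketch of the loop, using the hypothetical class CharCastSketch (not the htsjdk source):

```java
// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class CharCastSketch {
    // Copies length chars starting at charOffset into bytes starting at byteOffset,
    // narrowing each char with a plain cast (no charset encoding is applied).
    static void charsToBytes(char[] chars, int charOffset, int length,
                             byte[] bytes, int byteOffset) {
        for (int i = 0; i < length; i++) {
            bytes[byteOffset + i] = (byte) chars[charOffset + i];
        }
    }

    public static void main(String[] args) {
        char[] in = "xACGTx".toCharArray();
        byte[] out = new byte[6];
        charsToBytes(in, 1, 4, out, 2);                 // writes 'A','C','G','T' at out[2..5]
        System.out.println(java.util.Arrays.toString(out));
        // A non-ASCII char keeps only its low byte: U+20AC becomes 0xAC.
        System.out.println((byte) '\u20AC' == (byte) 0xAC);
    }
}
```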
    • charToByte

      public static byte charToByte(char c)
      Convert ASCII char to byte.
    • byteToChar

      public static char byteToChar(byte b)
      Convert ASCII byte to ASCII char.
    • bytesToHexString

      public static String bytesToHexString(byte[] data)
      Convert a byte array into a String hex representation.
      Parameters:
      data - Input to be converted.
      Returns:
      String twice as long as data.length with hex representation of data.
    • hexStringToBytes

      public static byte[] hexStringToBytes(String s) throws NumberFormatException
      Convert a String containing hex characters into an array of bytes with the binary representation of the hex string.
      Parameters:
      s - Hex string. Length must be even because each pair of hex chars is converted into a byte.
      Returns:
      byte array with binary representation of hex string.
      Throws:
      NumberFormatException
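      The hex conversions are inverses of each other: every byte maps to two hex characters and back. The sketch below illustrates the round trip with the hypothetical class HexSketch; it is not the htsjdk source, and the use of uppercase digits and the exception messages are assumptions.

```java
// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class HexSketch {
    // Uppercase digits are an assumption of this sketch.
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

    // Produces a string of length 2 * data.length.
    static String bytesToHexString(byte[] data) {
        char[] out = new char[data.length * 2];
        for (int i = 0; i < data.length; i++) {
            int v = data[i] & 0xFF;                     // treat the byte as unsigned
            out[2 * i] = DIGITS[v >>> 4];               // high nibble
            out[2 * i + 1] = DIGITS[v & 0x0F];          // low nibble
        }
        return new String(out);
    }

    // Parses pairs of hex characters; rejects odd lengths and non-hex characters.
    static byte[] hexStringToBytes(String s) {
        if (s.length() % 2 != 0) {
            throw new NumberFormatException("Hex string must have even length: " + s);
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = fromHexDigit(s.charAt(2 * i));
            int lo = fromHexDigit(s.charAt(2 * i + 1));
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Returns 0..15 for a hex digit of either case, otherwise throws.
    static int fromHexDigit(char c) {
        int v = Character.digit(c, 16);
        if (v < 0) {
            throw new NumberFormatException("Not a hex digit: " + c);
        }
        return v;
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x7F, (byte) 0xFF};
        String hex = bytesToHexString(data);
        System.out.println(hex);                                            // prints "007FFF"
        System.out.println(java.util.Arrays.equals(data, hexStringToBytes(hex))); // prints "true"
    }
}
```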
    • toHexDigit

      public static char toHexDigit(int value)
    • fromHexDigit

      public static int fromHexDigit(char c) throws NumberFormatException
      Throws:
      NumberFormatException
    • reverseString

      public static String reverseString(String s)
      Reverse the given string. Does not check for null.
      Parameters:
      s - String to be reversed.
      Returns:
      New string that is the reverse of the input string.
    • isBlank

      public static boolean isBlank(String str)

      Checks if a String is whitespace, empty ("") or null.

       StringUtil.isBlank(null)      = true
       StringUtil.isBlank("")        = true
       StringUtil.isBlank(" ")       = true
       StringUtil.isBlank("sam")     = false
       StringUtil.isBlank("  sam  ") = false
       
      Parameters:
      str - the String to check, may be null
      Returns:
      true if the String is null, empty or whitespace
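      The cases in the table can be reproduced with a short null-safe loop. The sketch below uses the hypothetical class BlankSketch and assumes that "whitespace" means Character.isWhitespace; the library's exact definition is not stated here.

```java
// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class BlankSketch {
    // True for null, the empty string, or a string made up only of whitespace.
    static boolean isBlank(String str) {
        if (str == null) {
            return true;
        }
        for (int i = 0; i < str.length(); i++) {
            // Assumption: whitespace is defined by Character.isWhitespace.
            if (!Character.isWhitespace(str.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isBlank(null));      // prints "true"
        System.out.println(isBlank("  sam  ")); // prints "false"
    }
}
```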
    • repeatCharNTimes

      public static String repeatCharNTimes(char c, int repeatNumber)
    • asEmptyIfNull

      public static String asEmptyIfNull(Object string)
    • levenshteinDistance

      public static int levenshteinDistance(String string1, String string2, int swap, int substitution, int insertion, int deletion)
    • hammingDistance

      public static int hammingDistance(String s1, String s2)
      Calculates the Hamming distance (number of character mismatches) between two strings s1 and s2. Since Hamming distance is not defined for strings of differing lengths, we throw an exception if the two strings are of different lengths. Hamming distance is case sensitive and does not have any special treatment for DNA.
      Parameters:
      s1 - The first string to compare
      s2 - The second string to compare. The distance is symmetric, so swapping s1 and s2 returns the same value.
      Returns:
      Hamming distance between s1 and s2.
      Throws:
      IllegalArgumentException - If the two strings have differing lengths.
    • isWithinHammingDistance

      public static boolean isWithinHammingDistance(String s1, String s2, int maxHammingDistance)
      Determines if two strings s1 and s2 are within maxHammingDistance of each other using the Hamming distance metric. Since Hamming distance is not defined for strings of differing lengths, we throw an exception if the two strings are of different lengths. Hamming distance is case sensitive and does not have any special treatment for DNA.
      Parameters:
      s1 - The first string to compare
      s2 - The second string to compare. The result is symmetric, so swapping s1 and s2 returns the same value.
      maxHammingDistance - The largest Hamming distance the strings can have for this function to return true.
      Returns:
      true if the two strings are within maxHammingDistance of each other, false otherwise.
      Throws:
      IllegalArgumentException - If the two strings have differing lengths.
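      The two Hamming methods share the same comparison loop; the bounded variant can stop as soon as the limit is exceeded. The sketch below illustrates both under that assumption. HammingSketch is a hypothetical class, not the htsjdk source, and whether the library exits early is not stated here.

```java
// Hypothetical stand-in for illustration only; not the htsjdk implementation.
final class HammingSketch {
    // Counts positions at which the two equal-length strings differ (case sensitive).
    static int hammingDistance(String s1, String s2) {
        requireSameLength(s1, s2);
        int distance = 0;
        for (int i = 0; i < s1.length(); i++) {
            if (s1.charAt(i) != s2.charAt(i)) {
                distance++;
            }
        }
        return distance;
    }

    // True if the strings differ at no more than maxHammingDistance positions.
    static boolean isWithinHammingDistance(String s1, String s2, int maxHammingDistance) {
        requireSameLength(s1, s2);
        int distance = 0;
        for (int i = 0; i < s1.length(); i++) {
            if (s1.charAt(i) != s2.charAt(i) && ++distance > maxHammingDistance) {
                return false;                           // early exit: limit already exceeded
            }
        }
        return distance <= maxHammingDistance;
    }

    private static void requireSameLength(String s1, String s2) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException(
                    "Strings differ in length: " + s1.length() + " vs " + s2.length());
        }
    }

    public static void main(String[] args) {
        System.out.println(hammingDistance("ACGT", "ACGA"));            // prints "1"
        System.out.println(hammingDistance("acgt", "ACGT"));            // prints "4"
        System.out.println(isWithinHammingDistance("ACGT", "AGGA", 2)); // prints "true"
    }
}
```

      The second call in main shows the case sensitivity noted above: callers comparing DNA sequences of mixed case would need to normalise case first.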
    • humanReadableByteCount

      public static String humanReadableByteCount(long bytes)
      Takes a long value representing the number of bytes and produces a human readable byte count.
      Parameters:
      bytes - The number of bytes to create a human readable string for.
      Returns:
      A human readable string of the number of bytes given.