Package htsjdk.samtools.util
Class StringUtil
java.lang.Object
htsjdk.samtools.util.StringUtil
Grab-bag of stateless String-oriented utilities.
-
Field Summary
Fields
Modifier and Type  Field  Description
static final String  EMPTY_STRING  Returns Object.toString() of the provided value if it isn't null; "" otherwise.
-
Constructor Summary
Constructors
StringUtil()
-
Method Summary
Modifier and Type  Method  Description
static String  asEmptyIfNull(Object string)
static String  assertCharactersNotInString(String illegalChars, char... chars)  Checks that a String doesn't contain one or more characters of interest.
static String  bytesToHexString(byte[] data)  Convert a byte array into a String hex representation.
static String  bytesToString(byte[] data)
static String  bytesToString(byte[] buffer, int offset, int length)
static char  byteToChar(byte b)  Convert ASCII byte to ASCII char.
static void  charsToBytes(char[] chars, int charOffset, int length, byte[] bytes, int byteOffset)  Convert chars to bytes merely by casting.
static byte  charToByte(char c)  Convert ASCII char to byte.
static int  fromHexDigit(char c)
static int  hammingDistance(String s1, String s2)  Calculates the Hamming distance (number of character mismatches) between two strings s1 and s2.
static byte[]  hexStringToBytes(String s)  Convert a String containing hex characters into an array of bytes with the binary representation of the hex string.
static String  humanReadableByteCount(long bytes)  Takes a long value representing the number of bytes and produces a human readable byte count.
static String  intValuesToString(int[] intVals)
static String  intValuesToString(short[] shortVals)
static boolean  isBlank(String str)  Checks if a String is whitespace, empty ("") or null.
static boolean  isWithinHammingDistance(String s1, String s2, int maxHammingDistance)  Determines if two strings s1 and s2 are within maxHammingDistance of each other using the Hamming distance metric.
static <T> String  join(String separator, Collection<T> objs)
static <T> String  join(String separator, T... objs)
static int  levenshteinDistance(String string1, String string2, int swap, int substitution, int insertion, int deletion)
static String  readNullTerminatedString(BinaryCodec binaryCodec)
static String  repeatCharNTimes(char c, int repeatNumber)
static String  reverseString(String s)  Reverse the given string.
static int  split(String aString, String[] tokens, char delim)  Split the string into tokens separated by the given delimiter.
static int  splitConcatenateExcessTokens(String aString, String[] tokens, char delim)  Split the string into tokens separated by the given delimiter.
static byte[]  stringToBytes(String s)
static byte[]  stringToBytes(String s, int offset, int length)
static char  toHexDigit(int value)
static byte  toLowerCase(byte b)
static byte  toUpperCase(byte b)
static void  toUpperCase(byte[] bytes)  Converts in place all lower case letters to upper case in the byte array provided.
static String  wordWrap(String s, int maxLineLength)  Return input string with newlines inserted to ensure that all lines have length <= maxLineLength.
static String  wordWrapSingleLine(String s, int maxLineLength)
-
Field Details
-
EMPTY_STRING
Returns Object.toString() of the provided value if it isn't null; "" otherwise.
-
-
Constructor Details
-
StringUtil
public StringUtil()
-
-
Method Details
-
join
- Parameters:
separator
- String to insert between consecutive items of objs
objs
- Collection of objects to be joined
- Returns:
- String that concatenates the result of each item's toString() method for all items in objs, with separator between each of them.
-
join
-
split
Split the string into tokens separated by the given delimiter. Profiling has revealed that the standard String.split() method typically takes more than half of the total time when used for parsing ASCII files. Note that if the tokens array is not large enough to hold all the tokens in the string, excess tokens are discarded.
- Parameters:
aString
- the string to split
tokens
- an array to hold the parsed tokens
delim
- character that delimits tokens
- Returns:
- the number of tokens parsed
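The discard-excess behaviour described above can be illustrated with a minimal standalone sketch. This is not the htsjdk source: the class name SplitSketch is hypothetical, and the handling of empty and trailing tokens (kept as empty strings here) is an assumption that the documentation does not confirm.

```java
final class SplitSketch {
    /**
     * Fills tokens with at most tokens.length delimiter-separated pieces of s.
     * Pieces that do not fit are dropped. Returns the number of slots filled.
     */
    static int split(String s, String[] tokens, char delim) {
        int count = 0;
        int start = 0;
        while (count < tokens.length) {
            int end = s.indexOf(delim, start);
            if (end < 0) {
                // No further delimiter: the rest of the string is the last piece.
                tokens[count++] = s.substring(start);
                break;
            }
            tokens[count++] = s.substring(start, end);
            start = end + 1;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] tokens = new String[2];
        int n = split("chr1\t100\t200", tokens, '\t');
        // The third field does not fit in the array and is discarded.
        System.out.println(n + " " + tokens[0] + " " + tokens[1]); // prints "2 chr1 100"
    }
}
```

Reusing a preallocated array avoids the regular-expression machinery and per-call array allocation of String.split(), which is the likely reason for this API shape.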
-
splitConcatenateExcessTokens
Split the string into tokens separated by the given delimiter. Profiling has revealed that the standard String.split() method typically takes more than half of the total time when used for parsing ASCII files. Note that the string is split into no more elements than the tokens array will hold, so the final tokenized element may contain delimiter chars.
- Parameters:
aString
- the string to split
tokens
- an array to hold the parsed tokens
delim
- character that delimits tokens
- Returns:
- the number of tokens parsed
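For contrast with split, the following sketch shows one way the "final element keeps the remaining delimiters" behaviour could work. It is a standalone illustration, not the htsjdk implementation; the class name SplitConcatSketch is hypothetical, and edge cases such as empty input are handled in whatever way the simple loop implies.

```java
final class SplitConcatSketch {
    /**
     * Splits s at delim into at most tokens.length pieces. Once only one slot
     * is left, the whole remainder (delimiters included) goes into that slot.
     */
    static int splitConcatenateExcessTokens(String s, String[] tokens, char delim) {
        int count = 0;
        int start = 0;
        // Fill every slot except the last by cutting at delimiters.
        while (count < tokens.length - 1) {
            int end = s.indexOf(delim, start);
            if (end < 0) {
                break;
            }
            tokens[count++] = s.substring(start, end);
            start = end + 1;
        }
        // Whatever remains is stored unsplit in the next free slot.
        if (count < tokens.length) {
            tokens[count++] = s.substring(start);
        }
        return count;
    }

    public static void main(String[] args) {
        String[] tokens = new String[2];
        int n = splitConcatenateExcessTokens("key=a=b", tokens, '=');
        System.out.println(n + " " + tokens[0] + " " + tokens[1]); // prints "2 key a=b"
    }
}
```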
-
toLowerCase
public static byte toLowerCase(byte b)
- Parameters:
b
- ASCII character
- Returns:
- lowercase version of arg if it was uppercase, otherwise returns arg
-
toUpperCase
public static byte toUpperCase(byte b)
- Parameters:
b
- ASCII character
- Returns:
- uppercase version of arg if it was lowercase, otherwise returns arg
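As an illustration of ASCII-only case conversion on raw bytes, here is a minimal sketch. It assumes plain range checks on 'a'..'z' and 'A'..'Z'; the class name AsciiCaseSketch is hypothetical and the code is not taken from htsjdk.

```java
import java.nio.charset.StandardCharsets;

final class AsciiCaseSketch {
    private static final int CASE_OFFSET = 'a' - 'A';

    /** Returns the uppercase form of an ASCII lowercase letter; any other byte is returned unchanged. */
    static byte toUpperCase(byte b) {
        return (b >= 'a' && b <= 'z') ? (byte) (b - CASE_OFFSET) : b;
    }

    /** Returns the lowercase form of an ASCII uppercase letter; any other byte is returned unchanged. */
    static byte toLowerCase(byte b) {
        return (b >= 'A' && b <= 'Z') ? (byte) (b + CASE_OFFSET) : b;
    }

    /** Uppercases the array in place, with no allocation. */
    static void toUpperCase(byte[] bytes) {
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = toUpperCase(bytes[i]);
        }
    }

    public static void main(String[] args) {
        byte[] bases = "acgtN".getBytes(StandardCharsets.US_ASCII);
        toUpperCase(bases);
        System.out.println(new String(bases, StandardCharsets.US_ASCII)); // prints "ACGTN"
    }
}
```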
-
toUpperCase
public static void toUpperCase(byte[] bytes)
Converts in place all lowercase letters to uppercase in the byte array provided.
-
assertCharactersNotInString
Checks that a String doesn't contain one or more characters of interest.
- Parameters:
illegalChars
- the String to check
chars
- the characters to check for
- Returns:
- the input String, for convenience
- Throws:
IllegalArgumentException
- if the String contains one or more of the characters
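The contract above (return the input unchanged, or throw) can be sketched as follows. The class name IllegalCharSketch and the exception message are hypothetical; only the parameter names and the exception type come from the documentation.

```java
final class IllegalCharSketch {
    /**
     * Returns illegalChars unchanged if it contains none of chars;
     * otherwise throws IllegalArgumentException.
     */
    static String assertCharactersNotInString(String illegalChars, char... chars) {
        for (char c : chars) {
            if (illegalChars.indexOf(c) >= 0) {
                // Message text is an assumption made for this sketch.
                throw new IllegalArgumentException(
                        "Illegal character (code " + (int) c + ") in string: " + illegalChars);
            }
        }
        return illegalChars;
    }

    public static void main(String[] args) {
        // Returning the input lets the check be used inline in an assignment.
        String sampleName = assertCharactersNotInString("sample1", '\t', '\n');
        System.out.println(sampleName); // prints "sample1"
    }
}
```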
-
wordWrap
Return the input string with newlines inserted to ensure that all lines have length <= maxLineLength. If a word is too long, it is simply broken at maxLineLength. Does not handle tabs intelligently (due to implementer laziness).
-
wordWrapSingleLine
-
intValuesToString
-
intValuesToString
-
bytesToString
-
bytesToString
-
stringToBytes
-
stringToBytes
-
readNullTerminatedString
-
charsToBytes
public static void charsToBytes(char[] chars, int charOffset, int length, byte[] bytes, int byteOffset)
Convert chars to bytes merely by casting.
- Parameters:
chars
- input chars
charOffset
- where to start converting from chars array
length
- how many chars to convert
bytes
- where to put the converted output
byteOffset
- where to start writing the converted output.
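To make the offset parameters concrete, here is a minimal sketch of a cast-based copy. The class name CharsToBytesSketch is hypothetical. Note that a plain cast keeps only the low 8 bits of each char, so this is only safe for ASCII input.

```java
import java.util.Arrays;

final class CharsToBytesSketch {
    /** Copies length chars starting at charOffset into bytes starting at byteOffset, casting each one. */
    static void charsToBytes(char[] chars, int charOffset, int length, byte[] bytes, int byteOffset) {
        for (int i = 0; i < length; i++) {
            bytes[byteOffset + i] = (byte) chars[charOffset + i]; // keeps the low 8 bits only
        }
    }

    public static void main(String[] args) {
        char[] in = "ACGT".toCharArray();
        byte[] out = new byte[6];
        charsToBytes(in, 1, 2, out, 3); // copies 'C' and 'G' into out[3] and out[4]
        System.out.println(Arrays.toString(out)); // prints "[0, 0, 0, 67, 71, 0]"
    }
}
```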
-
charToByte
public static byte charToByte(char c)
Convert ASCII char to byte.
-
byteToChar
public static char byteToChar(byte b)
Convert ASCII byte to ASCII char.
-
bytesToHexString
Convert a byte array into a String hex representation.
- Parameters:
data
- Input to be converted.
- Returns:
- String twice as long as data.length with hex representation of data.
-
hexStringToBytes
Convert a String containing hex characters into an array of bytes with the binary representation of the hex string.
- Parameters:
s
- Hex string. Length must be even because each pair of hex chars is converted into a byte.
- Returns:
- byte array with binary representation of hex string.
- Throws:
NumberFormatException
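The two hex conversions are inverses of each other, which a small round-trip sketch can show. This is a standalone illustration with a hypothetical class name, HexSketch. Emitting uppercase digits and accepting either case on input are assumptions, as the documentation above does not specify the case.

```java
final class HexSketch {
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

    /** Two hex digits per byte, high nibble first. */
    static String bytesToHexString(byte[] data) {
        char[] out = new char[data.length * 2];
        for (int i = 0; i < data.length; i++) {
            int v = data[i] & 0xFF; // treat the byte as unsigned
            out[2 * i] = DIGITS[v >>> 4];
            out[2 * i + 1] = DIGITS[v & 0x0F];
        }
        return new String(out);
    }

    /** Inverse of bytesToHexString; throws NumberFormatException on odd length or non-hex chars. */
    static byte[] hexStringToBytes(String s) {
        if (s.length() % 2 != 0) {
            throw new NumberFormatException("Hex string must have even length: " + s);
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new NumberFormatException("Not a hex string: " + s);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        String hex = bytesToHexString(new byte[] {0x0F, (byte) 0xA0});
        System.out.println(hex); // prints "0FA0"
        System.out.println(hexStringToBytes(hex).length); // prints "2"
    }
}
```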
-
toHexDigit
public static char toHexDigit(int value)
-
fromHexDigit
- Throws:
NumberFormatException
-
reverseString
Reverse the given string. Does not check for null.
- Parameters:
s
- String to be reversed.
- Returns:
- New string that is the reverse of the input string.
-
isBlank
Checks if a String is whitespace, empty ("") or null.
StringUtil.isBlank(null) = true
StringUtil.isBlank("") = true
StringUtil.isBlank(" ") = true
StringUtil.isBlank("sam") = false
StringUtil.isBlank(" sam ") = false
- Parameters:
str
- the String to check, may be null
- Returns:
true if the String is null, empty or whitespace
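The truth table above can be reproduced with a short null-safe check. This sketch assumes whitespace is defined by Character.isWhitespace; the class name BlankSketch is hypothetical.

```java
final class BlankSketch {
    /** True for null, "", or a string made up only of whitespace characters. */
    static boolean isBlank(String str) {
        if (str == null) {
            return true;
        }
        for (int i = 0; i < str.length(); i++) {
            if (!Character.isWhitespace(str.charAt(i))) {
                return false; // found a visible character
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isBlank(null));    // prints "true"
        System.out.println(isBlank(" "));     // prints "true"
        System.out.println(isBlank(" sam ")); // prints "false"
    }
}
```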
-
repeatCharNTimes
-
asEmptyIfNull
-
levenshteinDistance
-
hammingDistance
Calculates the Hamming distance (number of character mismatches) between two strings s1 and s2. Since Hamming distance is not defined for strings of differing lengths, an exception is thrown if the two strings are of different lengths. Hamming distance is case sensitive and does not have any special treatment for DNA.
- Parameters:
s1
- The first string to compare
s2
- The second string to compare. Note that if s1 and s2 are swapped, the value returned will be identical.
- Returns:
- Hamming distance between s1 and s2.
- Throws:
IllegalArgumentException
- If the two strings have differing lengths.
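A minimal sketch of the position-by-position comparison described above. The class name HammingSketch and the exception message are hypothetical; the length check and the case sensitivity follow the description.

```java
final class HammingSketch {
    /** Counts the positions at which s1 and s2 differ (case sensitive). */
    static int hammingDistance(String s1, String s2) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException(
                    "Strings must have equal length: " + s1.length() + " vs " + s2.length());
        }
        int mismatches = 0;
        for (int i = 0; i < s1.length(); i++) {
            if (s1.charAt(i) != s2.charAt(i)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println(hammingDistance("ACGT", "ACCA")); // prints "2"
        System.out.println(hammingDistance("ACGT", "acgt")); // prints "4" (case sensitive)
    }
}
```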
-
isWithinHammingDistance
Determines if two strings s1 and s2 are within maxHammingDistance of each other using the Hamming distance metric. Since Hamming distance is not defined for strings of differing lengths, an exception is thrown if the two strings are of different lengths. Hamming distance is case sensitive and does not have any special treatment for DNA.
- Parameters:
s1
- The first string to compare
s2
- The second string to compare. Note that if s1 and s2 are swapped, the value returned will be identical.
maxHammingDistance
- The largest Hamming distance the strings can have for this function to return true.
- Returns:
- true if the two strings are within maxHammingDistance of each other, false otherwise.
- Throws:
IllegalArgumentException
- If the two strings have differing lengths.
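A bounded check does not need the full distance: it can stop as soon as the mismatch count exceeds the limit. The sketch below illustrates that idea. Whether htsjdk exits early is not stated in the documentation, so treat the early exit as an assumption; the class name HammingBoundSketch is hypothetical.

```java
final class HammingBoundSketch {
    /** True if s1 and s2 differ in at most maxHammingDistance positions. */
    static boolean isWithinHammingDistance(String s1, String s2, int maxHammingDistance) {
        if (s1.length() != s2.length()) {
            throw new IllegalArgumentException("Strings must have equal length");
        }
        int mismatches = 0;
        for (int i = 0; i < s1.length(); i++) {
            if (s1.charAt(i) != s2.charAt(i) && ++mismatches > maxHammingDistance) {
                return false; // limit exceeded, no need to scan the rest
            }
        }
        return mismatches <= maxHammingDistance;
    }

    public static void main(String[] args) {
        System.out.println(isWithinHammingDistance("ACGT", "ACCA", 2)); // prints "true"
        System.out.println(isWithinHammingDistance("ACGT", "ACCA", 1)); // prints "false"
    }
}
```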
-
humanReadableByteCount
Takes a long value representing the number of bytes and produces a human readable byte count.
- Parameters:
bytes
- The number of bytes to create a human readable string for.
- Returns:
- A human readable string of the number of bytes given.
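The documentation does not specify the output format, so the following is only one plausible sketch. It assumes binary units (1 KiB = 1024 bytes), one decimal place, and the unit labels shown; the class name ByteCountSketch is hypothetical, and the strings htsjdk actually returns may differ.

```java
import java.util.Locale;

final class ByteCountSketch {
    // Unit labels are an assumption of this sketch.
    private static final String[] UNITS = {"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};

    /** Formats a byte count with the largest binary unit that keeps the value at or above 1. */
    static String humanReadableByteCount(long bytes) {
        if (bytes < 1024) {
            return bytes + " B";
        }
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        // Locale.ROOT keeps the decimal separator a '.' regardless of system locale.
        return String.format(Locale.ROOT, "%.1f %s", value, UNITS[unit]);
    }

    public static void main(String[] args) {
        System.out.println(humanReadableByteCount(512));      // prints "512 B"
        System.out.println(humanReadableByteCount(1536));     // prints "1.5 KiB"
        System.out.println(humanReadableByteCount(1L << 20)); // prints "1.0 MiB"
    }
}
```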
-