Class StringUtils
- java.lang.Object
-
- org.apache.druid.java.util.common.StringUtils
-
public class StringUtils extends Object
As of OpenJDK / Oracle JDK 8, the JVM is optimized around passing the charset as a String name rather than as a Charset object; toUtf8(String) and fromUtf8(byte[]) exploit this.
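The two call styles contrasted in the note above can be sketched with plain JDK classes. The class and method names here (`CharsetPassingSketch`, `toUtf8ByName`, and so on) are hypothetical and this is not Druid's source; it only illustrates naming the charset with a String versus passing a `Charset` object.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

final class CharsetPassingSketch {
    // Charset identified by its name (a String), the style the note refers to.
    static byte[] toUtf8ByName(String s) {
        try {
            return s.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    static String fromUtf8ByName(byte[] bytes) {
        try {
            return new String(bytes, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Charset passed as an object, shown for comparison.
    static byte[] toUtf8ByCharset(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
```

Both styles produce the same bytes; any performance difference depends on the JDK version in use.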
-
-
Field Summary
Fields (modifier and type, field, description)
- static byte[] EMPTY_BYTES
- static Charset UTF8_CHARSET: Deprecated.
- static String UTF8_STRING
-
Constructor Summary
Constructors
- StringUtils()
-
Method Summary
All Methods (modifier and type, method, description)
- static String chop(String s, int maxBytes): Returns the string truncated to maxBytes.
- static int compareUnicode(String a, String b): Compares two Java Strings in Unicode code-point order.
- static int compareUtf8(byte[] a, byte[] b): Compares two UTF-8 byte strings in Unicode code-point order.
- static int compareUtf8UsingJavaStringOrdering(byte[] a, byte[] b): Compares two UTF-8 byte strings in UTF-16 code-unit order.
- static int compareUtf8UsingJavaStringOrdering(byte byte1, byte byte2): Compares two bytes from UTF-8 strings in such a way that the entire byte arrays are compared in UTF-16 code-unit order.
- static int compareUtf8UsingJavaStringOrdering(ByteBuffer buf1, int position1, int length1, ByteBuffer buf2, int position2, int length2): Compares two UTF-8 byte strings in UTF-16 code-unit order.
- static byte[] decodeBase64(byte[] input): Decodes an input byte array using the Base64 encoding scheme and returns a newly-allocated byte array.
- static byte[] decodeBase64String(String input): Decodes an input string using the Base64 encoding scheme and returns a newly-allocated byte array.
- static String emptyToNullNonDruidDataString(String string): Returns the given string if it is nonempty; null otherwise.
- static byte[] encodeBase64(byte[] input): Converts an input byte array into a newly-allocated byte array using the Base64 encoding scheme.
- static String encodeBase64String(byte[] input): Converts an input byte array into a string using the Base64 encoding scheme.
- static String encodeForFormat(String s): Encodes a string "s" for insertion into a format string.
- static String escapeSql(String str): This method is removed from commons lang3.
- static int estimatedBinaryLengthAsUTF8(String value)
- static String fastLooseChop(String s, int maxBytes): Shorten "s" to "maxBytes" chars.
- static String format(String message, Object... formatArgs): Equivalent of String.format(Locale.ENGLISH, message, formatArgs).
- static String fromUtf8(byte[] bytes)
- static String fromUtf8(byte[] bytes, int offset, int length)
- static String fromUtf8(ByteBuffer buffer): Decodes a UTF-8 string from the remaining bytes of a non-null buffer.
- static String fromUtf8(ByteBuffer buffer, int numBytes): Decodes a UTF-8 String from numBytes bytes starting at the current position of a buffer.
- static String fromUtf8Nullable(ByteBuffer buffer): Decodes a UTF-8 string from the remaining bytes of a buffer.
- static String getResource(Object ref, String resource)
- static String lpad(String base, int len, String pad): Returns the string left-padded with the string pad to a length of len characters.
- static String maybeAppendTrailingSlash(String s)
- static String maybeRemoveLeadingSlash(String s)
- static String maybeRemoveTrailingSlash(String s)
- static String nonStrictFormat(String message, Object... formatArgs): Formats the string as format(String, Object...), but instead of failing on illegal format, returns the concatenated format string and format arguments.
- static String nullToEmptyNonDruidDataString(String string): Returns the given string if it is non-null; the empty string otherwise.
- static String removeChar(String s, char c): Removes all occurrences of the given char from the given string.
- static String repeat(String s, int count): Returns a string whose value is the concatenation of the string s repeated count times.
- static String replace(String s, String target, String replacement): Replaces all occurrences of the given target substring in the given string with the given replacement string.
- static String replaceChar(String s, char c, String replacement): Replaces all occurrences of the given char in the given string with the given replacement string.
- static String rpad(String base, int len, String pad): Returns the string right-padded with the string pad to a length of len characters.
- static String toLowerCase(String s)
- static String toUpperCase(String s)
- static byte[] toUtf8(String string): Converts a string to a UTF-8 byte array.
- static ByteBuffer toUtf8ByteBuffer(String string): Converts a string to UTF-8 bytes, returning them as a newly-allocated on-heap ByteBuffer.
- static byte[] toUtf8Nullable(String string)
- static int toUtf8WithLimit(String string, ByteBuffer byteBuffer): Encodes "string" into the buffer "byteBuffer", using no more than the number of bytes remaining in the buffer.
- static byte[] toUtf8WithNullToEmpty(String string)
- static String urlDecode(String s)
- static String urlEncode(String s): Encodes a String in application/x-www-form-urlencoded format, with one exception: "+" in the encoded form is replaced with "%20".
- static String utf8Base64(String input): Converts an input to Base64 and returns the UTF-8 string of that byte array.
-
-
-
Field Detail
-
EMPTY_BYTES
public static final byte[] EMPTY_BYTES
-
UTF8_CHARSET
@Deprecated public static final Charset UTF8_CHARSET
Deprecated.
-
UTF8_STRING
public static final String UTF8_STRING
-
-
Method Detail
-
estimatedBinaryLengthAsUTF8
public static int estimatedBinaryLengthAsUTF8(String value)
-
toUtf8WithNullToEmpty
public static byte[] toUtf8WithNullToEmpty(String string)
-
compareUnicode
public static int compareUnicode(String a, String b)
Compares two Java Strings in Unicode code-point order. Order is consistent with compareUtf8(byte[], byte[]), but is not consistent with String.compareTo(String).
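To make the difference from String.compareTo(String) concrete, here is a minimal sketch of a code-point comparison built on JDK methods only. `CodePointOrderSketch` and `compareByCodePoint` are hypothetical names, and the loop is an illustration of the ordering, not Druid's implementation.

```java
final class CodePointOrderSketch {
    // Compares two strings by Unicode code point rather than by UTF-16 code unit.
    static int compareByCodePoint(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int cpA = a.codePointAt(i);
            int cpB = b.codePointAt(j);
            if (cpA != cpB) {
                return Integer.compare(cpA, cpB);
            }
            // A supplementary code point occupies two chars, so advance by charCount.
            i += Character.charCount(cpA);
            j += Character.charCount(cpB);
        }
        // One string is a prefix of the other, or they are equal: the shorter sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    }
}
```

The orderings diverge for a BMP character above the surrogate range versus a supplementary character: U+FF61 ("\uFF61") sorts after U+1F600 ("\uD83D\uDE00") under String.compareTo, because the code unit 0xFF61 is greater than the high surrogate 0xD83D, but before it in code-point order.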
-
compareUtf8
public static int compareUtf8(byte[] a, byte[] b)
Compares two UTF-8 byte strings in Unicode code-point order. Equivalent to a comparison of the two byte arrays as if they were unsigned bytes. Order is consistent with compareUnicode(String, String), but is not consistent with String.compareTo(String). For an ordering consistent with String.compareTo(String), use compareUtf8UsingJavaStringOrdering(byte[], byte[]) instead.
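The "as if they were unsigned bytes" comparison described above can be sketched as a plain lexicographic loop. `UnsignedByteOrderSketch` and `compareUnsigned` are hypothetical names; this shows the ordering rule under that description, not the library's code.

```java
final class UnsignedByteOrderSketch {
    // Lexicographic comparison that treats each byte as a value in 0..255.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            // Mask with 0xFF so that bytes >= 0x80 are not seen as negative.
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        // Common prefix: the shorter array sorts first.
        return Integer.compare(a.length, b.length);
    }
}
```

For valid UTF-8 this yields code-point order, because a larger code point never encodes to a lexicographically smaller unsigned byte sequence. U+FF61 encodes as EF BD A1 and U+1F600 as F0 9F 98 80, so the former sorts first here even though String.compareTo orders the decoded strings the other way.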
-
compareUtf8UsingJavaStringOrdering
public static int compareUtf8UsingJavaStringOrdering(byte[] a, byte[] b)
Compares two UTF-8 byte strings in UTF-16 code-unit order. Order is consistent with String.compareTo(String), but is not consistent with compareUnicode(String, String) or compareUtf8(byte[], byte[]).
-
compareUtf8UsingJavaStringOrdering
public static int compareUtf8UsingJavaStringOrdering(ByteBuffer buf1, int position1, int length1, ByteBuffer buf2, int position2, int length2)
Compares two UTF-8 byte strings in UTF-16 code-unit order. Order is consistent with String.compareTo(String), but is not consistent with compareUnicode(String, String) or compareUtf8(byte[], byte[]).
-
compareUtf8UsingJavaStringOrdering
public static int compareUtf8UsingJavaStringOrdering(byte byte1, byte byte2)
Compares two bytes from UTF-8 strings in such a way that the entire byte arrays are compared in UTF-16 code-unit order. Compatible with compareUtf8UsingJavaStringOrdering(byte[], byte[]) and compareUtf8UsingJavaStringOrdering(ByteBuffer, int, int, ByteBuffer, int, int).
-
fromUtf8
public static String fromUtf8(byte[] bytes)
-
fromUtf8
public static String fromUtf8(byte[] bytes, int offset, int length)
-
fromUtf8
public static String fromUtf8(ByteBuffer buffer, int numBytes)
Decodes a UTF-8 String from numBytes bytes starting at the current position of a buffer. Advances the position of the buffer by numBytes.
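The position-advancing behavior described above can be sketched with a relative bulk get. `BufferDecodeSketch` and `decode` are hypothetical names, and copying into a temporary array is a simplification of this sketch rather than a statement about how the library does it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BufferDecodeSketch {
    // Decodes numBytes bytes starting at the buffer's current position.
    static String decode(ByteBuffer buffer, int numBytes) {
        byte[] bytes = new byte[numBytes];
        // Relative bulk get: copies numBytes bytes and advances position by numBytes.
        buffer.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

After decoding 5 bytes from a buffer holding "hello world", the buffer's position is 5 and the remaining 6 bytes are still available to the next read.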
-
fromUtf8
public static String fromUtf8(ByteBuffer buffer)
Decodes a UTF-8 string from the remaining bytes of a non-null buffer. Advances the position of the buffer by Buffer.remaining(). Use fromUtf8Nullable(ByteBuffer) if the buffer might be null.
-
fromUtf8Nullable
@Nullable public static String fromUtf8Nullable(@Nullable ByteBuffer buffer)
Decodes a UTF-8 string from the remaining bytes of a buffer. Advances the position of the buffer by Buffer.remaining(). If the buffer is null, this method returns null. If the buffer will never be null, use fromUtf8(ByteBuffer) instead.
-
toUtf8
public static byte[] toUtf8(String string)
Converts a string to a UTF-8 byte array.
- Throws:
NullPointerException - if "string" is null
-
toUtf8ByteBuffer
@Nullable public static ByteBuffer toUtf8ByteBuffer(@Nullable String string)
Converts a string to UTF-8 bytes, returning them as a newly-allocated on-heap ByteBuffer. If "string" is null, returns null.
-
toUtf8WithLimit
public static int toUtf8WithLimit(String string, ByteBuffer byteBuffer)
Encodes "string" into the buffer "byteBuffer", using no more than the number of bytes remaining in the buffer. Will only encode whole characters. The byteBuffer's position and limit may be changed during operation, but will be reset before this method call ends.
- Returns:
- the number of bytes written, which may be shorter than the full encoded string length if there is not enough room in the output buffer.
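A bounded encode of this kind can be sketched with the JDK's CharsetEncoder, which stops before the first character that does not fit. `LimitedEncodeSketch` and `encodeWithLimit` are hypothetical names; encoding into a duplicate of the buffer is an assumption of this sketch for keeping the caller's position and limit intact, not a description of the actual method body.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class LimitedEncodeSketch {
    // Writes as many whole characters of "string" as fit; returns the byte count written.
    static int encodeWithLimit(String string, ByteBuffer byteBuffer) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);
        // The duplicate shares content with byteBuffer but has its own position and limit.
        ByteBuffer target = byteBuffer.duplicate();
        int start = target.position();
        // On overflow the encoder returns without writing a partial character.
        encoder.encode(CharBuffer.wrap(string), target, true);
        return target.position() - start;
    }
}
```

With a 4-byte buffer and the input "a\u00e9\u20ac" (1 + 2 + 3 bytes in UTF-8), only the first two characters fit, so 3 bytes are written and the original buffer's position is unchanged.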
-
format
public static String format(String message, Object... formatArgs)
Equivalent of String.format(Locale.ENGLISH, message, formatArgs).
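Pinning the locale matters because number formatting is locale-sensitive. The following sketch uses only String.format; `LocaleFormatSketch` and `formatEnglish` are hypothetical names standing in for the method described above.

```java
import java.util.Locale;

final class LocaleFormatSketch {
    // Formats with a fixed locale so output does not depend on the JVM default.
    static String formatEnglish(String message, Object... formatArgs) {
        return String.format(Locale.ENGLISH, message, formatArgs);
    }
}
```

For example, `formatEnglish("%.2f", 1.5)` uses a period as the decimal separator, whereas `String.format(Locale.GERMANY, "%.2f", 1.5)` uses a comma.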
-
nonStrictFormat
public static String nonStrictFormat(String message, Object... formatArgs)
Formats the string as format(String, Object...), but instead of failing on an illegal format, returns the concatenated format string and format arguments. Intended for low-stakes formatting such as log and exception messages, and typically not called directly.
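The fallback behavior described above can be sketched by catching IllegalFormatException. `LenientFormatSketch` and `lenientFormat` are hypothetical names, and the "; " separator used when concatenating is an assumption of this sketch; the text above does not specify the exact concatenation format.

```java
import java.util.IllegalFormatException;
import java.util.Locale;

final class LenientFormatSketch {
    // Tries a normal format; on an illegal format, falls back to concatenation.
    static String lenientFormat(String message, Object... formatArgs) {
        try {
            return String.format(Locale.ENGLISH, message, formatArgs);
        } catch (IllegalFormatException e) {
            // Assumed fallback layout: the raw message, then each argument after "; ".
            StringBuilder sb = new StringBuilder(message);
            for (Object arg : formatArgs) {
                sb.append("; ").append(arg);
            }
            return sb.toString();
        }
    }
}
```

A mismatched specifier such as `%d` with a String argument then produces a readable message instead of an exception, which is the property that makes this style suitable for logging paths.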
-
encodeForFormat
@Nullable public static String encodeForFormat(@Nullable String s)
Encodes a string "s" for insertion into a format string. Returns null if the input is null.
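A minimal sketch of such an encoding, assuming it amounts to doubling each percent sign (in java.util.Formatter syntax, "%%" produces a literal "%"). `FormatEscapeSketch` and `escapeForFormat` are hypothetical names, and the exact escaping performed by the real method is not stated above.

```java
final class FormatEscapeSketch {
    // Escapes "%" so the string can be embedded in a format string verbatim.
    static String escapeForFormat(String s) {
        if (s == null) {
            return null; // null in, null out, as described above
        }
        return s.replace("%", "%%");
    }
}
```

Without the escape, concatenating user text such as "100%" into a format string can raise a format exception or consume an argument meant for a later specifier.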
-
urlEncode
@Nullable public static String urlEncode(@Nullable String s)
Encodes a String in application/x-www-form-urlencoded format, with one exception: "+" in the encoded form is replaced with "%20". application/x-www-form-urlencoded encodes spaces as "+", but we use this to encode non-form data as well.
- Parameters:
s - String to be encoded
- Returns:
- application/x-www-form-urlencoded format encoded String, but with "+" replaced with "%20".
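The described encoding can be approximated with java.net.URLEncoder followed by a replacement of "+". `UrlEncodeSketch` and `encode` are hypothetical names; this is a sketch of the documented result, not necessarily how the method is written.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class UrlEncodeSketch {
    // Form-encodes s, then rewrites "+" (an encoded space) as "%20".
    static String encode(String s) {
        if (s == null) {
            return null;
        }
        try {
            // A literal "+" in the input is already encoded as "%2B" at this point,
            // so every remaining "+" stands for a space.
            return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

For the input "a b+c", URLEncoder alone yields "a+b%2Bc"; after the replacement the result is "a%20b%2Bc".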
-
removeChar
public static String removeChar(String s, char c)
Removes all occurrences of the given char from the given string. This method is an optimized version of s.replace("c", "").
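One way a char-specific removal can avoid the general replace machinery is an indexOf loop over a StringBuilder. This is a sketch under that assumption; `RemoveCharSketch` is a hypothetical name and the real method may be written differently.

```java
final class RemoveCharSketch {
    // Removes every occurrence of c from s by copying the segments between matches.
    static String removeChar(String s, char c) {
        int pos = s.indexOf(c);
        if (pos < 0) {
            return s; // nothing to remove, so no new string is needed
        }
        StringBuilder sb = new StringBuilder(s.length() - 1);
        int prev = 0;
        do {
            sb.append(s, prev, pos); // copy the segment before the match
            prev = pos + 1;          // skip the matched char
            pos = s.indexOf(c, prev);
        } while (pos >= 0);
        sb.append(s, prev, s.length()); // copy the tail after the last match
        return sb.toString();
    }
}
```

The early return means strings that do not contain the char are handed back without any allocation.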
-
replaceChar
public static String replaceChar(String s, char c, String replacement)
Replaces all occurrences of the given char in the given string with the given replacement string. This method is an optimized version of s.replace("c", replacement).
-
replace
public static String replace(String s, String target, String replacement)
Replaces all occurrences of the given target substring in the given string with the given replacement string. This method is an optimized version of s.replace(target, replacement).
-
nullToEmptyNonDruidDataString
public static String nullToEmptyNonDruidDataString(@Nullable String string)
Returns the given string if it is non-null; the empty string otherwise. This method should only be used at places where null to empty conversion is irrelevant to null handling of the data.
- Parameters:
string - the string to test and possibly return
- Returns:
string itself if it is non-null; "" if it is null
-
emptyToNullNonDruidDataString
@Nullable public static String emptyToNullNonDruidDataString(@Nullable String string)
Returns the given string if it is nonempty; null otherwise. This method should only be used at places where empty to null conversion is irrelevant to null handling of the data.
- Parameters:
string - the string to test and possibly return
- Returns:
string itself if it is nonempty; null if it is empty or null
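The two conversions documented above are small enough to state directly. `NullEmptySketch` and its method names are hypothetical; the bodies follow the documented return values.

```java
final class NullEmptySketch {
    // Returns s itself if non-null, otherwise the empty string.
    static String nullToEmpty(String s) {
        return s == null ? "" : s;
    }

    // Returns s itself if nonempty, otherwise null (for both null and "").
    static String emptyToNull(String s) {
        return (s == null || s.isEmpty()) ? null : s;
    }
}
```

Note that the two are not inverses: null and "" both map to null under `emptyToNull`, so the distinction between them is lost.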
-
utf8Base64
public static String utf8Base64(String input)
Converts an input string to Base64 and returns the UTF-8 string of the resulting byte array.
- Parameters:
input - The string to convert to Base64
- Returns:
- the Base64 of the input in string form
-
encodeBase64
public static byte[] encodeBase64(byte[] input)
Converts an input byte array into a newly-allocated byte array using the Base64 encoding scheme.
- Parameters:
input - The byte array to convert to Base64
- Returns:
- the Base64 of the input in byte array form
-
encodeBase64String
public static String encodeBase64String(byte[] input)
Converts an input byte array into a string using the Base64 encoding scheme.
- Parameters:
input - The byte array to convert to Base64
- Returns:
- the Base64 of the input in string form
-
decodeBase64
public static byte[] decodeBase64(byte[] input)
Decodes an input byte array using the Base64 encoding scheme and returns a newly-allocated byte array.
- Parameters:
input - The byte array to decode from Base64
- Returns:
- a newly-allocated byte array
-
decodeBase64String
public static byte[] decodeBase64String(String input)
Decodes an input string using the Base64 encoding scheme and returns a newly-allocated byte array.
- Parameters:
input - The string to decode from Base64
- Returns:
- a newly-allocated byte array
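The Base64 helpers above can be pictured as thin wrappers over java.util.Base64. That the basic (RFC 4648) encoder and decoder are the ones involved is an assumption of this sketch, and `Base64Sketch` with its method names is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Base64Sketch {
    // byte[] -> Base64 text
    static String encodeToString(byte[] input) {
        return Base64.getEncoder().encodeToString(input);
    }

    // Base64 text -> newly-allocated byte[]
    static byte[] decodeFromString(String input) {
        return Base64.getDecoder().decode(input);
    }

    // String -> UTF-8 bytes -> Base64 text
    static String utf8ToBase64(String input) {
        return encodeToString(input.getBytes(StandardCharsets.UTF_8));
    }
}
```

Under these assumptions, `utf8ToBase64("hello")` gives "aGVsbG8=", and decoding that text returns the original UTF-8 bytes.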
-
repeat
public static String repeat(String s, int count)
Returns a string whose value is the concatenation of the string s repeated count times. If count or length is zero then the empty string is returned.
This method may be used to create space padding for formatting text or zero padding for formatting numbers.
- Parameters:
count - number of times to repeat
- Returns:
- A string composed of this string repeated count times or the empty string if count or length is zero.
- Throws:
IllegalArgumentException - if the count is negative.
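The contract above (empty result for zero count or empty input, exception for a negative count) can be sketched with a StringBuilder loop. `RepeatSketch` is a hypothetical name, and the exception message text is an assumption of this sketch.

```java
final class RepeatSketch {
    // Concatenates s with itself count times.
    static String repeat(String s, int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        if (count == 0 || s.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(s);
        }
        return sb.toString();
    }
}
```

For the padding use case mentioned above, `repeat("0", 3) + "7"` yields "0007".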
-
lpad
@Nonnull public static String lpad(@Nonnull String base, int len, @Nonnull String pad)
Returns the string left-padded with the string pad to a length of len characters. If base is longer than len, the return value is shortened to len characters. This function is migrated from Flink's Scala function with minor refactoring: https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/runtime/functions/ScalarFunctions.scala
- Modified to handle an empty pad string.
- Padding to a negative length returns an empty string.
- Parameters:
base - The base string to be padded
len - The length of the padded string
pad - The pad string
- Returns:
- the string left-padded with pad to a length of len. The method is annotated @Nonnull; per the notes above, a negative len yields an empty string.
-
rpad
@Nonnull public static String rpad(@Nonnull String base, int len, @Nonnull String pad)
Returns the string right-padded with the string pad to a length of len characters. If base is longer than len, the return value is shortened to len characters. This function is migrated from Flink's Scala function with minor refactoring: https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/runtime/functions/ScalarFunctions.scala
- Modified to handle an empty pad string.
- Modified to only copy the pad string if needed (this implementation mimics lpad).
- Padding to a negative length returns an empty string.
- Parameters:
base - The base string to be padded
len - The length of the padded string
pad - The pad string
- Returns:
- the string right-padded with pad to a length of len. The method is annotated @Nonnull; per the notes above, a negative len yields an empty string.
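The padding and truncation rules for lpad and rpad can be sketched as follows. `PadSketch` is a hypothetical name. Two details are assumptions of this sketch rather than facts stated above: truncation keeps the leading len characters of base, and an empty pad leaves base unpadded (truncated to len if needed).

```java
final class PadSketch {
    // Left-pads base with repetitions of pad up to len characters.
    static String lpad(String base, int len, String pad) {
        if (len <= 0) {
            return "";
        }
        if (base.length() >= len || pad.isEmpty()) {
            // Too long, or nothing to pad with: keep at most len leading characters.
            return base.substring(0, Math.min(len, base.length()));
        }
        StringBuilder sb = new StringBuilder(len);
        int padLen = len - base.length();
        for (int i = 0; i < padLen; i++) {
            sb.append(pad.charAt(i % pad.length())); // cycle through the pad string
        }
        return sb.append(base).toString();
    }

    // Right-pads base with repetitions of pad up to len characters.
    static String rpad(String base, int len, String pad) {
        if (len <= 0) {
            return "";
        }
        if (base.length() >= len || pad.isEmpty()) {
            return base.substring(0, Math.min(len, base.length()));
        }
        StringBuilder sb = new StringBuilder(len).append(base);
        int padLen = len - base.length();
        for (int i = 0; i < padLen; i++) {
            sb.append(pad.charAt(i % pad.length()));
        }
        return sb.toString();
    }
}
```

With these rules, `lpad("hi", 5, "xy")` is "xyxhi", `rpad("hi", 5, "xy")` is "hixyx", and either function applied to "hello" with len 3 gives "hel".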
-
chop
@Nullable public static String chop(@Nullable String s, int maxBytes)
Returns the string truncated to maxBytes. If the input string is shorter than maxBytes, it is returned unchanged.
- Parameters:
s - The input string to possibly be truncated
maxBytes - The maximum number of bytes the input string will be truncated to
- Returns:
- the string after truncation to maxBytes
-
fastLooseChop
@Nullable public static String fastLooseChop(@Nullable String s, int maxBytes)
Shorten "s" to "maxBytes" chars. Fast and loose because these are *chars* not *bytes*. Use chop(String, int) for slower, but accurate chopping.
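The difference between byte-accurate and char-based chopping can be sketched side by side. `ChopSketch`, `chopBytes`, and `chopChars` are hypothetical names; that the byte limit refers to the UTF-8 encoding is an assumption of this sketch, and the encoder-based approach is one possible way to get whole-character truncation.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class ChopSketch {
    // Byte-accurate: keeps only whole characters whose UTF-8 encoding fits in maxBytes.
    static String chopBytes(String s, int maxBytes) {
        if (s == null) {
            return null;
        }
        ByteBuffer out = ByteBuffer.allocate(maxBytes);
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        // The encoder stops before the first character that would overflow the buffer.
        encoder.encode(CharBuffer.wrap(s), out, true);
        return new String(out.array(), 0, out.position(), StandardCharsets.UTF_8);
    }

    // Fast and loose: limits the number of chars, ignoring their encoded size.
    static String chopChars(String s, int maxChars) {
        if (s == null || s.length() <= maxChars) {
            return s;
        }
        return s.substring(0, maxChars);
    }
}
```

For a string of four euro signs (3 bytes each in UTF-8) and a limit of 3, `chopBytes` keeps one character (3 bytes) while `chopChars` keeps three characters (9 bytes). The char-based variant can also split a surrogate pair, which is part of what makes it "loose".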
-
-