Package nl.vpro.util
Class TextUtil
java.lang.Object
    nl.vpro.util.TextUtil
Field Summary
static Pattern ILLEGAL_PATTERN
    Reusable pattern for matching text against illegal characters.
-
Method Summary
static @PolyNull String controlEach(@PolyNull CharSequence s, @NonNull Character control)
static @PolyNull String getLexico(@PolyNull String title, Locale locale)
    Returns the 'lexicographic' presentation of a title.
static boolean isValid(@NonNull String input)
    Checks if the given text input complies with the POMS standard.
static @PolyNull String normalizeWhiteSpace(@PolyNull String input)
    Replaces any occurrence of one or more white space characters by one space.
static @PolyNull String normalizeWhiteSpacePreserveNewlines(@PolyNull String input)
static @PolyNull String overLine(@PolyNull CharSequence s)
    Gives a representation of the string which is completely 'overlined' (using Unicode control characters).
static @PolyNull String overLineDouble(@PolyNull CharSequence s)
    Gives a representation of the string which is completely 'double overlined' (using Unicode control characters).
static @PolyNull String replaceHtmlEscapedNonBreakingSpace(@PolyNull String input)
    Replaces all non-breaking space entities (&nbsp;) with a normal white space character.
static @PolyNull String replaceLineBreaks(@PolyNull String input)
    Replaces all line separators with a single white space character.
static @PolyNull String replaceNonBreakingSpace(@PolyNull String input)
    Replaces all non-breaking space characters (U+00A0) with a normal white space character.
static @PolyNull String replaceOdd(@PolyNull String input)
    Replaces 'odd' characters with a normal white space character.
static @PolyNull String sanitize(@PolyNull String input)
    Aggressively removes all tags and escaped HTML characters from the given input and replaces some characters that might lead to problems for end users.
static String select(String... options)
    Deprecated. Can easily be achieved with a stream filter on Objects.nonNull(Object).
static @PolyNull String strikeThrough(@PolyNull CharSequence s)
    Gives a representation of the string which is completely 'struck through' (using Unicode control characters).
static @PolyNull String stripHtml(@PolyNull String input)
    Strips HTML-like tags from the input.
static @PolyNull String truncate(@PolyNull String text, int max)
static @PolyNull String truncate(@PolyNull String text, int max, boolean ellipses)
static @PolyNull String underDiaeresis(@PolyNull CharSequence s)
    Gives a representation of the string which is completely 'diaeresised under' (using Unicode control characters).
static @PolyNull String underLine(@PolyNull CharSequence s)
    Gives a representation of the string which is completely 'underlined' (using Unicode control characters).
static @PolyNull String underLineDouble(@PolyNull CharSequence s)
    Gives a representation of the string which is completely 'double underlined' (using Unicode control characters).
static @PolyNull String unescapeHtml(@PolyNull String input)
    Un-escapes all HTML escape entities.
static @PolyNull String unhtml(@PolyNull String input)
-
Field Detail
-
ILLEGAL_PATTERN
public static final Pattern ILLEGAL_PATTERN
Reusable pattern for matching text against illegal characters.
-
Method Detail
-
isValid
public static boolean isValid(@NonNull String input)
Checks if the given text input complies with the POMS standard.
- See Also:
for a rough check
-
normalizeWhiteSpace
public static @PolyNull String normalizeWhiteSpace(@PolyNull String input)
Replaces any occurrence of one or more white space characters by one space.
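The behaviour described above can be approximated with a single regular expression. The following is a minimal sketch using only the JDK, not the library's implementation: the class name WhitespaceSketch is hypothetical, and it is an assumption that no trimming of leading or trailing white space takes place.

```java
final class WhitespaceSketch {

    // Hypothetical stand-in for TextUtil.normalizeWhiteSpace; not the library's code.
    static String normalizeWhiteSpace(String input) {
        if (input == null) {
            return null; // mirrors the @PolyNull contract: null in, null out
        }
        // \s+ matches a run of one or more white space characters
        // (space, tab, newline, carriage return, form feed, vertical tab).
        return input.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(normalizeWhiteSpace("a \t\n  b   c")); // prints "a b c"
    }
}
```

Because each run is collapsed to one space rather than removed, leading or trailing white space in this sketch survives as a single space.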
-
normalizeWhiteSpacePreserveNewlines
public static @PolyNull String normalizeWhiteSpacePreserveNewlines(@PolyNull String input)
-
replaceLineBreaks
public static @PolyNull String replaceLineBreaks(@PolyNull String input)
Replaces all line separators with a single white space character. The line separator character (U+2028) is forbidden in most modern browsers. These browsers won't render any text containing this character.
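As an illustration of the replacement described above, here is a minimal JDK-only sketch. The class and method names are hypothetical, and it is an assumption that only the Unicode LINE SEPARATOR (U+2028) is targeted; the library method may handle other characters as well.

```java
final class LineSeparatorSketch {

    // Unicode LINE SEPARATOR, written as a cast to avoid a Unicode escape in source.
    static final char LINE_SEPARATOR = (char) 0x2028;

    // Hypothetical stand-in for TextUtil.replaceLineBreaks; not the library's code.
    static String replaceLineSeparators(String input) {
        if (input == null) {
            return null; // null in, null out
        }
        // Each line separator character becomes exactly one ordinary space.
        return input.replace(LINE_SEPARATOR, ' ');
    }

    public static void main(String[] args) {
        String text = "first" + LINE_SEPARATOR + "second";
        System.out.println(replaceLineSeparators(text)); // prints "first second"
    }
}
```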
-
replaceNonBreakingSpace
public static @PolyNull String replaceNonBreakingSpace(@PolyNull String input)
Replaces all non-breaking space characters (U+00A0) with a normal white space character.
-
replaceOdd
public static @PolyNull String replaceOdd(@PolyNull String input)
Replaces 'odd' characters with a normal white space character.
-
replaceHtmlEscapedNonBreakingSpace
public static @PolyNull String replaceHtmlEscapedNonBreakingSpace(@PolyNull String input)
Replaces all non-breaking space entities (&nbsp;) with a normal white space character.
-
unescapeHtml
public static @PolyNull String unescapeHtml(@PolyNull String input)
Un-escapes all HTML escape entities. For example: replaces "&amp;" with "&".
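To make the idea of un-escaping concrete, the sketch below handles a handful of named entities in a single pass. It is an illustration under stated assumptions only: the class UnescapeSketch and its five-entry table are hypothetical, whereas the library method is documented to cover all HTML escape entities.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnescapeSketch {

    // Deliberately tiny entity table; a real implementation covers far more.
    private static final Map<String, String> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("amp", "&");
        ENTITIES.put("lt", "<");
        ENTITIES.put("gt", ">");
        ENTITIES.put("quot", "\"");
        ENTITIES.put("apos", "'");
    }

    private static final Pattern ENTITY = Pattern.compile("&([a-z]+);");

    static String unescape(String input) {
        if (input == null) {
            return null; // null in, null out
        }
        Matcher m = ENTITY.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = ENTITIES.get(m.group(1));
            // Unknown entities are copied through unchanged.
            String text = replacement != null ? replacement : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("Tom &amp; Jerry")); // prints "Tom & Jerry"
    }
}
```

Scanning once, rather than chaining String.replace calls, avoids double un-escaping: "&amp;lt;" becomes "&lt;" and is not decoded a second time into "<".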
-
stripHtml
public static @PolyNull String stripHtml(@PolyNull String input)
Strips HTML-like tags from the input. All content between tags, even non-HTML content, is removed.
- Parameters:
input - a piece of HTML or text containing some HTML markup
- Returns:
- One line representing only the textual content of the input
- See Also:
for multiline interpretation
-
unhtml
public static @PolyNull String unhtml(@PolyNull String input)
- Parameters:
input - A piece of HTML
- Returns:
- A piece of plain text, currently only supporting breaks, paragraphs, and lists. Empty paragraphs and multiple linebreaks are removed.
- Since:
- 2.30
-
sanitize
public static @PolyNull String sanitize(@PolyNull String input)
Aggressively removes all tags and escaped HTML characters from the given input and replaces some characters that might lead to problems for end users.
- Returns:
- A single line of text
-
getLexico
public static @PolyNull String getLexico(@PolyNull String title, Locale locale)
Returns the 'lexicographic' presentation of a title. This means that articles are stripped and moved to the end of the string. Currently only supported for Dutch.
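The article handling described above might look roughly like the following sketch. Everything specific here is an assumption made for illustration: the class LexicoSketch, the article list (de, het, een), and the "Title, Article" output format are hypothetical and may differ from what the library actually produces.

```java
import java.util.Locale;

final class LexicoSketch {

    // Assumed list of Dutch articles; the library's list is not documented here.
    private static final String[] DUTCH_ARTICLES = {"de", "het", "een"};

    static String lexico(String title, Locale locale) {
        if (title == null) {
            return null; // null in, null out
        }
        if (!"nl".equals(locale.getLanguage())) {
            return title; // only Dutch is handled, as in the description above
        }
        for (String article : DUTCH_ARTICLES) {
            int len = article.length();
            // The article must be a whole leading word: matched ignoring case
            // and followed by a space, with more text after it.
            if (title.length() > len + 1
                    && title.regionMatches(true, 0, article, 0, len)
                    && title.charAt(len) == ' ') {
                // Move the article, as written, to the end after a comma.
                return title.substring(len + 1) + ", " + title.substring(0, len);
            }
        }
        return title;
    }

    public static void main(String[] args) {
        System.out.println(lexico("De Avonden", Locale.forLanguageTag("nl"))); // prints "Avonden, De"
    }
}
```

Requiring a space after the article keeps words that merely start with the same letters, such as "Deense", untouched.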
-
select
@Deprecated public static String select(String... options)
Deprecated. Can easily be achieved with a stream filter on Objects.nonNull(Object).
Selects the first non-null of the parameters.
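The stream-based replacement mentioned in the deprecation note can be sketched as follows, using only java.util.stream.Stream and java.util.Objects. The class and method names are hypothetical, and returning null when every option is null is an assumption about the intended behaviour.

```java
import java.util.Objects;
import java.util.stream.Stream;

final class SelectSketch {

    // Returns the first non-null option, or null if there is none.
    static String firstNonNull(String... options) {
        return Stream.of(options)
                .filter(Objects::nonNull) // drop null candidates
                .findFirst()              // keep the first survivor, if any
                .orElse(null);
    }

    public static void main(String[] args) {
        System.out.println(firstNonNull(null, "fallback", "ignored")); // prints "fallback"
    }
}
```

Filtering before findFirst matters: findFirst throws a NullPointerException if the element it selects is null.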
-
strikeThrough
public static @PolyNull String strikeThrough(@PolyNull CharSequence s)
Gives a representation of the string which is completely 'struck through' (using Unicode control characters).
- Since:
- 2.11
-
underLine
public static @PolyNull String underLine(@PolyNull CharSequence s)
Gives a representation of the string which is completely 'underlined' (using Unicode control characters).
- Since:
- 2.11
-
underLineDouble
public static @PolyNull String underLineDouble(@PolyNull CharSequence s)
Gives a representation of the string which is completely 'double underlined' (using Unicode control characters).
- Since:
- 2.11
-
overLine
public static @PolyNull String overLine(@PolyNull CharSequence s)
Gives a representation of the string which is completely 'overlined' (using Unicode control characters).
- Since:
- 2.11
-
overLineDouble
public static @PolyNull String overLineDouble(@PolyNull CharSequence s)
Gives a representation of the string which is completely 'double overlined' (using Unicode control characters).
- Since:
- 2.11
-
underDiaeresis
public static @PolyNull String underDiaeresis(@PolyNull CharSequence s)
Gives a representation of the string which is completely 'diaeresised under' (using Unicode control characters).
- Since:
- 2.11
-
controlEach
public static @PolyNull String controlEach(@PolyNull CharSequence s, @NonNull Character control)
- Since:
- 2.11
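The decoration methods above presumably share one mechanism: placing a Unicode combining mark next to every character of the input. The sketch below illustrates that idea with JDK classes only. The class CombiningSketch is hypothetical, and both the chosen code points (U+0336 for strike-through, U+0332 for underline) and the placement of the mark after each character are assumptions, not confirmed details of the library.

```java
final class CombiningSketch {

    static final char STRIKE = (char) 0x0336;    // COMBINING LONG STROKE OVERLAY
    static final char UNDERLINE = (char) 0x0332; // COMBINING LOW LINE

    // Hypothetical stand-in for TextUtil.controlEach; not the library's code.
    static String controlEach(CharSequence s, char control) {
        if (s == null) {
            return null; // null in, null out
        }
        StringBuilder out = new StringBuilder(s.length() * 2);
        // Iterate by code point so surrogate pairs are not split apart,
        // and append the combining mark after each one.
        s.codePoints().forEach(cp -> out.appendCodePoint(cp).append(control));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(controlEach("sold out", STRIKE));
        System.out.println(controlEach("important", UNDERLINE));
    }
}
```

How the result looks depends on the font and renderer, since combining marks are drawn by the display layer rather than stored as styling.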
-