Package nl.vpro.util
Class TextUtil
java.lang.Object
nl.vpro.util.TextUtil
See JIRA
- Since:
- 1.5
- Author:
- Roelof Jan Koekoek
-
Field Summary
Modifier and Type | Field | Description
static final Pattern | ILLEGAL_PATTERN | Reusable pattern for matching text against illegal characters
-
Method Summary
Modifier and Type | Method | Description
static @PolyNull String | controlEach(@PolyNull CharSequence s, @NonNull Character control) |
static @PolyNull String | getLexico | Returns the 'lexicographic' presentation of a title.
static boolean | isValid | Checks if the given text input complies with the POMS standard.
static boolean | isValid | Checks if the given text input complies with the POMS standard.
static @PolyNull String | normalizeWhiteSpace(@PolyNull String input) | Replaces any occurrence of one or more white space characters with one space.
static @PolyNull String | normalizeWhiteSpacePreserveNewlines(@PolyNull String input) |
static @PolyNull String | overLine(@PolyNull CharSequence s) | Gives a representation of the string which is completely 'overlined' (using Unicode control characters)
static @PolyNull String | overLineDouble(@PolyNull CharSequence s) | Gives a representation of the string which is completely 'double overlined' (using Unicode control characters)
static @PolyNull String | replaceHtmlEscapedNonBreakingSpace(@PolyNull String input) | Replaces all non-breaking space entities (&nbsp;) with a normal white space character.
static @PolyNull String | replaceLineBreaks(@PolyNull String input) | Replaces all line separators with a single white space character.
static @PolyNull String | replaceNonBreakingSpace(@PolyNull String input) | Replaces all non-breaking space characters (U+00A0) with a normal white space character.
static @PolyNull String | replaceOdd(@PolyNull String input) | Replaces 'odd' characters with a normal white space character.
static @PolyNull String | sanitize | Aggressively removes all tags and escaped HTML characters from the given input and replaces some characters that might lead to problems for end users.
static String | select | Deprecated.
static @PolyNull String | strikeThrough(@PolyNull CharSequence s) | Gives a representation of the string which is completely 'struck through' (using Unicode control characters)
static @PolyNull String | stripHtml | Strips HTML-like tags from the input.
static @PolyNull String | truncate |
static @PolyNull String | truncate |
static @PolyNull String | underDiaeresis(@PolyNull CharSequence s) | Gives a representation of the string which is completely 'diaeresised under' (using Unicode control characters)
static @PolyNull String | underLine(@PolyNull CharSequence s) | Gives a representation of the string which is completely 'underlined' (using Unicode control characters)
static @PolyNull String | underLineDouble(@PolyNull CharSequence s) | Gives a representation of the string which is completely 'double underlined' (using Unicode control characters)
static @PolyNull String | unescapeHtml(@PolyNull String input) | Un-escapes all HTML escape entities.
static @PolyNull String | unhtml |
-
Field Details
-
ILLEGAL_PATTERN
Reusable pattern for matching text against illegal characters
-
-
Method Details
-
isValid
Checks if the given text input complies with the POMS standard.
-
isValid
Checks if the given text input complies with the POMS standard.
-
normalizeWhiteSpace
Replaces any occurrence of one or more white space characters with one space.
-
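The behaviour described for normalizeWhiteSpace can be illustrated with a minimal sketch. The class and method names below are hypothetical and this is not the TextUtil source; it assumes "white space" means whatever `\s` matches in java.util.regex and that no trimming takes place.

```java
import java.util.regex.Pattern;

// Hypothetical illustration, not the nl.vpro.util.TextUtil implementation.
final class WhiteSpaceSketch {

    // One or more white space characters as matched by the regex class \s
    private static final Pattern WHITE_SPACE = Pattern.compile("\\s+");

    // Null in, null out, mirroring the @PolyNull contract in the signature.
    static String normalize(String input) {
        if (input == null) {
            return null;
        }
        // Each run of white space collapses to exactly one space; nothing is trimmed.
        return WHITE_SPACE.matcher(input).replaceAll(" ");
    }
}
```

With this sketch, `WhiteSpaceSketch.normalize("a \t\n  b")` yields `"a b"`.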
normalizeWhiteSpacePreserveNewlines
-
replaceLineBreaks
Replaces all line separators with a single white space character. The line separator character (U+2028) is forbidden in most modern browsers. These browsers won't render any text containing this character.
-
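As a rough sketch of what replaceLineBreaks describes, assuming the "line separator" meant here is the Unicode character U+2028 LINE SEPARATOR. The class and method names are hypothetical, and the real method may handle more characters than this.

```java
// Hypothetical illustration, not the nl.vpro.util.TextUtil implementation.
final class LineSeparatorSketch {

    // Unicode LINE SEPARATOR, written as a numeric cast to keep the source ASCII-only
    private static final char LINE_SEPARATOR = (char) 0x2028;

    // Null in, null out, mirroring the @PolyNull contract in the signature.
    static String replaceLineSeparators(String input) {
        if (input == null) {
            return null;
        }
        // Every occurrence becomes one ordinary space character.
        return input.replace(LINE_SEPARATOR, ' ');
    }
}
```

Ordinary `\n` and `\r` characters are left alone in this sketch; only the single code point is replaced.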
replaceNonBreakingSpace
Replaces all non-breaking space characters (U+00A0) with a normal white space character.
-
replaceOdd
Replaces 'odd' characters with a normal white space character.
-
replaceHtmlEscapedNonBreakingSpace
Replaces all non-breaking space entities (&nbsp;) with a normal white space character.
-
unescapeHtml
Un-escapes all HTML escape entities. For example: replaces "&amp;" with "&".
-
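To make the unescapeHtml example concrete, here is a small sketch that un-escapes a handful of named entities in a single pass. It is not the library's implementation: the class name, the method name and the entity table are assumptions, and a real un-escaper would also cover numeric references and the full set of named entities.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration, not the nl.vpro.util.TextUtil implementation.
final class UnescapeSketch {

    // Deliberately tiny entity table; the real set of HTML entities is far larger.
    private static final Map<String, String> ENTITIES = Map.of(
        "amp", "&",
        "lt", "<",
        "gt", ">",
        "quot", "\"",
        "nbsp", String.valueOf((char) 0x00A0));

    private static final Pattern ENTITY = Pattern.compile("&([a-zA-Z]+);");

    static String unescape(String input) {
        if (input == null) {
            return null;
        }
        Matcher matcher = ENTITY.matcher(input);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            // Unknown entities are copied through unchanged.
            String replacement = ENTITIES.getOrDefault(matcher.group(1), matcher.group());
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Because the input is scanned once, `"&amp;lt;"` becomes `"&lt;"` rather than `"<"`; the output of one replacement is never un-escaped again.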
stripHtml
Strips HTML-like tags from the input. All content between tags, even non-HTML content, is removed.
- Parameters:
input - a piece of HTML or text containing some HTML markup
- Returns:
- One line representing only the textual content of the input
-
unhtml
- Parameters:
input - A piece of HTML
- Returns:
- A piece of plain text, currently only supporting breaks, paragraphs, and lists. Empty paragraphs and multiple line breaks are removed.
- Since:
- 2.30
-
sanitize
Aggressively removes all tags and escaped HTML characters from the given input and replaces some characters that might lead to problems for end users.
- Returns:
- A single line of text
-
getLexico
Returns the 'lexicographic' presentation of a title. This means that articles are stripped and moved to the end of the string. Currently only supported for Dutch.
-
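The idea behind getLexico, moving a leading article to the end of the title, might look roughly like the sketch below. Everything in it is an assumption made for illustration: the list of Dutch articles, the ", " separator, the preserved capitalisation, and the class and method names. The actual output format of getLexico is not documented on this page.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical illustration, not the nl.vpro.util.TextUtil implementation.
final class LexicoSketch {

    // Assumed set of Dutch articles; the real method may recognise others.
    private static final List<String> DUTCH_ARTICLES = List.of("de", "het", "een");

    static String lexico(String title) {
        if (title == null) {
            return null;
        }
        int space = title.indexOf(' ');
        // A title without a second word is returned unchanged.
        if (space <= 0 || space == title.length() - 1) {
            return title;
        }
        String first = title.substring(0, space);
        if (!DUTCH_ARTICLES.contains(first.toLowerCase(Locale.ROOT))) {
            return title;
        }
        // Assumed output shape: the rest of the title, a comma, then the article.
        return title.substring(space + 1) + ", " + first;
    }
}
```

Under these assumptions `LexicoSketch.lexico("De Avonden")` gives `"Avonden, De"`, so the title sorts under A rather than D.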
select
Deprecated. Can easily be achieved with a stream filter on Objects.nonNull(Object).
Selects the first non-null of the parameters.
-
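The replacement suggested by the deprecation note can be sketched with the standard stream API. The helper name `firstNonNull` is hypothetical, and returning null when every argument is null is an assumption about the desired behaviour.

```java
import java.util.Objects;
import java.util.stream.Stream;

// Hypothetical helper showing the stream-based alternative to the deprecated select.
final class SelectSketch {

    @SafeVarargs
    static <T> T firstNonNull(T... options) {
        return Stream.of(options)
            .filter(Objects::nonNull) // drop null candidates
            .findFirst()              // first remaining candidate, if any
            .orElse(null);            // assumption: null when nothing is left
    }
}
```

For example, `SelectSketch.firstNonNull(null, "b", "c")` returns `"b"`.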
truncate
-
truncate
-
strikeThrough
Gives a representation of the string which is completely 'struck through' (using Unicode control characters)
- Since:
- 2.11
-
underLine
Gives a representation of the string which is completely 'underlined' (using Unicode control characters)
- Since:
- 2.11
-
underLineDouble
Gives a representation of the string which is completely 'double underlined' (using Unicode control characters)
- Since:
- 2.11
-
overLine
Gives a representation of the string which is completely 'overlined' (using Unicode control characters)
- Since:
- 2.11
-
overLineDouble
Gives a representation of the string which is completely 'double overlined' (using Unicode control characters)
- Since:
- 2.11
-
underDiaeresis
Gives a representation of the string which is completely 'diaeresised under' (using Unicode control characters)
- Since:
- 2.11
-
controlEach
- Since:
- 2.11
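controlEach has no description on this page, but its signature and the decoration methods above suggest that it pairs every character of the input with a given Unicode character. The sketch below assumes that reading, and further assumes the character is a combining mark appended after each code point (for instance U+0336 COMBINING LONG STROKE OVERLAY for a strike-through effect). The class name and constants are hypothetical, and the code points actually used by TextUtil are not stated here.

```java
// Hypothetical illustration, not the nl.vpro.util.TextUtil implementation.
final class ControlEachSketch {

    // Assumed code points from the Combining Diacritical Marks block
    static final char COMBINING_LONG_STROKE_OVERLAY = (char) 0x0336;
    static final char COMBINING_LOW_LINE = (char) 0x0332;

    // Appends the given mark after every code point of the input; null in, null out.
    static String controlEach(CharSequence s, char control) {
        if (s == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(s.length() * 2);
        // Iterating by code point keeps surrogate pairs intact.
        s.codePoints().forEach(cp -> out.appendCodePoint(cp).append(control));
        return out.toString();
    }
}
```

Called as `ControlEachSketch.controlEach("ab", ControlEachSketch.COMBINING_LONG_STROKE_OVERLAY)`, the result has four chars: each letter followed by the mark. How it is displayed depends on the font and renderer.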
-