Class Prose


  • public class Prose
    extends java.lang.Object
    Constants and utilities for working with the structure of text.

    Work in progress.

    Author:
    Garret Wilson
    • Constructor Detail

      • Prose

        public Prose()
    • Method Detail

      • getHeadingType

        public static int getHeadingType​(java.lang.String text)
        Checks to see which type of heading is represented by the given text.

        The following types of headings are checked as numbered sections:

        • VOLUME_HEADING
        • BOOK_HEADING
        • ARTICLE_HEADING
        • PART_HEADING
        • CHAPTER_HEADING
        • ACT_HEADING
        • SCENE_HEADING

        The following types of headings are checked by the presence of title labels:

        • CONTENTS_HEADING
        • PREFACE_HEADING
        • FOREWORD_HEADING
        • INTRODUCTION_HEADING
        • AFTERWORD_HEADING
        • BIBLIOGRAPHY_HEADING
        • GLOSSARY_HEADING
        • INDEX_HEADING
        • GOSPEL_HEADING

        The following type of heading is recognized by being entirely in uppercase and not consisting entirely of digits:

        • SUB_HEADING

        The following type of heading is recognized by appearing on a single line with correct title capitalization:

        • TITLE_HEADING

        The following type of heading is recognized by appearing on a single line and containing only the characters '*' and/or '-', or only the string "page" once punctuation is removed:

        • PAGE_BREAK_HEADING
        Parameters:
        text - The text to check for heading type.
        Returns:
        The type of heading, or NO_HEADING if the heading could not be determined.
        See Also:
        containsTitleLabel(java.lang.String, java.lang.String), getSectionNumber(java.lang.String, java.lang.String)
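        The cascade of checks described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the enum Type is a hypothetical stand-in for a handful of the int constants, the order in which the rules are tried is a guess, and each predicate is deliberately simplified (for instance, the title rule here does not allow lowercase exception words).

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Predicate;

final class HeadingTypeSketch {
    // Hypothetical stand-in for a few of the int constants named in the documentation.
    enum Type { PAGE_BREAK_HEADING, CHAPTER_HEADING, PREFACE_HEADING, SUB_HEADING, TITLE_HEADING, NO_HEADING }

    // Rules are tried in insertion order; this order is an assumption.
    private static final Map<Type, Predicate<String>> RULES = new LinkedHashMap<>();
    static {
        RULES.put(Type.PAGE_BREAK_HEADING, t -> t.matches("[*-]+"));
        RULES.put(Type.CHAPTER_HEADING, t -> t.matches("(?i)chapter\\s+\\d+.*"));
        RULES.put(Type.PREFACE_HEADING, t -> t.matches("(?i).*\\bpreface\\b.*"));
        RULES.put(Type.SUB_HEADING, t -> t.equals(t.toUpperCase(Locale.ROOT)) && !t.matches("\\d+"));
        RULES.put(Type.TITLE_HEADING, t -> t.matches("(?:\\p{Lu}\\S*\\s*)+"));
    }

    // Returns the first rule that accepts the trimmed text, or NO_HEADING.
    static Type getHeadingType(String text) {
        String t = text.trim();
        if (t.isEmpty()) {
            return Type.NO_HEADING;
        }
        for (Map.Entry<Type, Predicate<String>> rule : RULES.entrySet()) {
            if (rule.getValue().test(t)) {
                return rule.getKey();
            }
        }
        return Type.NO_HEADING;
    }

    public static void main(String[] args) {
        System.out.println(getHeadingType("Chapter 11: The Frightened Goat")); // CHAPTER_HEADING
        System.out.println(getHeadingType("it was a dark night"));             // NO_HEADING
    }
}
```

        Trying the more specific rules first keeps a line such as "PREFACE" from being reported as a generic subheading in this sketch.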
      • isBreak

        public static boolean isBreak​(java.lang.String text)
        Determines if the given text is a page break.

        A break is one of the following conditions:

        • Text composed solely of the following characters: '*', '-', '_', em-dash, and/or en-dash.
        • The word "page" surrounded by only punctuation and/or whitespace.
        Parameters:
        text - The text to check.
        Returns:
        true if the text is a page break heading.
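        The two conditions above can be approximated with a pair of regular expressions. The sketch below is not the library's implementation; the class name BreakSketch is hypothetical, and it assumes that empty text is not a break, that "page" is matched case-insensitively, and that "punctuation" means ASCII punctuation.

```java
import java.util.regex.Pattern;

final class BreakSketch {
    // Rule characters: '*', '-', '_', em dash (U+2014), en dash (U+2013).
    private static final Pattern RULE = Pattern.compile("[*\\-_\\u2014\\u2013]+");

    // The word "page" with only ASCII punctuation and/or whitespace around it.
    // Assumption: case is ignored.
    private static final Pattern PAGE =
            Pattern.compile("[\\p{Punct}\\s]*page[\\p{Punct}\\s]*", Pattern.CASE_INSENSITIVE);

    static boolean isBreak(String text) {
        return RULE.matcher(text).matches() || PAGE.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isBreak("***"));        // true
        System.out.println(isBreak("-- Page --")); // true
        System.out.println(isBreak("Page 12"));    // false
    }
}
```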
      • isPageNumber

        public static boolean isPageNumber​(java.lang.String text)
        Determines if the given text is a page number. A page number appears on a single line and has a page indicator and a number, optionally along with a select set of symbols, but no other letters.
        Parameters:
        text - The text to check.
        Returns:
        true if the text is a page number.
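        The documentation does not list the accepted page indicators or symbols, so the following is only a sketch under loudly stated assumptions: the indicators "page", "pg", and "p", and the symbol set - [ ] ( ) . : # are all hypothetical choices, as is the class name PageNumberSketch.

```java
import java.util.regex.Pattern;

final class PageNumberSketch {
    // Hypothetical symbol set (plus tab and space); the real set is not documented here.
    private static final String SYMBOLS = "[-\\[\\]().:#\\t ]*";

    // Hypothetical indicators "page", "pg", "p", followed by an ASCII number.
    // Line breaks are not in SYMBOLS, so only single-line text can match.
    private static final Pattern PAGE_NUMBER = Pattern.compile(
            SYMBOLS + "(?:page|pg|p)" + SYMBOLS + "\\d+" + SYMBOLS,
            Pattern.CASE_INSENSITIVE);

    static boolean isPageNumber(String text) {
        return PAGE_NUMBER.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isPageNumber("- p. 7 -"));    // true
        System.out.println(isPageNumber("Page twelve")); // false
    }
}
```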
      • isSubHeading

        public static boolean isSubHeading​(java.lang.String text)
        Determines if the given text is a subheading. A subheading is in all uppercase, is not composed completely of digits, and is not surrounded by quotation marks.
        Parameters:
        text - The text to check.
        Returns:
        true if the text is a subheading.
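        The three conditions can be checked directly, as in this minimal sketch. The class name SubHeadingSketch and the set of quotation marks are assumptions, as is the choice to treat empty text as not being a subheading.

```java
import java.util.Locale;

final class SubHeadingSketch {
    // Assumed set of quotation marks: straight, curly, and angle quotes.
    private static final String QUOTES = "\"'\u201C\u201D\u2018\u2019\u00AB\u00BB";

    static boolean isSubHeading(String text) {
        String t = text.trim();
        if (t.isEmpty()) {
            return false; // assumption: empty text is not a subheading
        }
        // "All uppercase" is read here as "unchanged by upper-casing".
        boolean upper = t.equals(t.toUpperCase(Locale.ROOT));
        boolean allDigits = t.chars().allMatch(Character::isDigit);
        // "Surrounded" is read as a quotation mark at both ends.
        boolean quoted = QUOTES.indexOf(t.charAt(0)) >= 0
                && QUOTES.indexOf(t.charAt(t.length() - 1)) >= 0;
        return upper && !allDigits && !quoted;
    }

    public static void main(String[] args) {
        System.out.println(isSubHeading("THE STORM")); // true
        System.out.println(isSubHeading("1984"));      // false
    }
}
```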
      • isTitleHeading

        public static boolean isTitleHeading​(java.lang.String text)
        Determines if the given text is a title heading. A title heading appears on a single line and capitalizes each word except for certain exception words (such as "the" and "of").

        The list of common prepositions is taken from Heather MacFadyen, University of Ottawa, at http://www.uottawa.ca/academic/arts/writcent/hypergrammar/preposit.html

        Parameters:
        text - The text to check.
        Returns:
        true if the text is a title heading.
      • isTitleCapitalizationRequired

        public static boolean isTitleCapitalizationRequired​(java.lang.String word)
        Determines if the given string should be capitalized if appearing in a title.
        Parameters:
        word - The word to test for title capitalization.
        Returns:
        true if the word should be capitalized in a title.
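        The word-level check and the title-heading rule of isTitleHeading fit together roughly as sketched here. The exception list is a short hypothetical one (the actual list is not reproduced in this documentation), the class name TitleSketch is invented, and requiring the first word to be capitalized even when it is an exception word is an assumption.

```java
import java.util.Locale;
import java.util.Set;

final class TitleSketch {
    // Hypothetical, deliberately short list of words that stay lowercase in titles.
    private static final Set<String> LOWERCASE_WORDS = Set.of(
            "a", "an", "the", "and", "or", "but", "of", "in", "on", "to", "for", "with", "by", "at");

    static boolean isTitleCapitalizationRequired(String word) {
        return !LOWERCASE_WORDS.contains(word.toLowerCase(Locale.ROOT));
    }

    static boolean isTitleHeading(String text) {
        String t = text.trim();
        if (t.isEmpty() || t.contains("\n") || t.contains("\r")) {
            return false; // must be non-empty and on a single line
        }
        String[] words = t.split("\\s+");
        for (int i = 0; i < words.length; i++) {
            String word = words[i];
            // Assumption: the first word is always capitalized.
            boolean required = i == 0 || isTitleCapitalizationRequired(word);
            char first = word.charAt(0);
            if (required && Character.isLetter(first) && !Character.isUpperCase(first)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isTitleHeading("The Return of the King")); // true
        System.out.println(isTitleHeading("A tale of two cities"));   // false
    }
}
```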
      • containsTitleLabel

        public static boolean containsTitleLabel​(java.lang.String text,
                                                 java.lang.String label)
        Determines if the given line contains a title label. The given label is matched case-insensitively, except that the first character of the matched label must be capitalized.
        Parameters:
        text - The text to check.
        label - The label to match.
        Returns:
        true if the given text contains the given title label.
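        One way to read the matching rule is sketched below: the label may appear in any case, provided the occurrence in the text starts with an uppercase letter. The class name TitleLabelSketch is hypothetical, and the word-boundary requirement is an added assumption that the documentation does not state.

```java
final class TitleLabelSketch {
    static boolean containsTitleLabel(String text, String label) {
        int n = label.length();
        if (n == 0) {
            return false;
        }
        for (int i = 0; i + n <= text.length(); i++) {
            // Case-insensitive comparison of the label against text[i, i + n).
            if (!text.regionMatches(true, i, label, 0, n)) {
                continue;
            }
            // The occurrence in the text must begin with an uppercase character.
            if (!Character.isUpperCase(text.charAt(i))) {
                continue;
            }
            // Assumption: the occurrence must not be part of a longer word.
            boolean startOk = i == 0 || !Character.isLetterOrDigit(text.charAt(i - 1));
            boolean endOk = i + n == text.length() || !Character.isLetterOrDigit(text.charAt(i + n));
            if (startOk && endOk) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsTitleLabel("AUTHOR'S PREFACE", "preface"));       // true
        System.out.println(containsTitleLabel("in the preface he says", "preface")); // false
    }
}
```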
      • containsOnlyIgnoreCase

        public static boolean containsOnlyIgnoreCase​(java.lang.String text,
                                                     java.lang.String[] requiredLabels,
                                                     java.lang.String[] optionalLabels)
        Determines if the given line contains all of the required labels and nothing other than the required and optional labels, compared case-insensitively.
        Parameters:
        text - The text to check.
        requiredLabels - The labels that must be present.
        optionalLabels - The labels that are optional.
        Returns:
        true if the given text contains the required labels and only the required and optional labels.
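        A minimal sketch of this check follows. It assumes that each label is a single word and that any run of characters other than letters and digits separates words; neither detail is confirmed by the documentation, and the class name LabelSetSketch is hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class LabelSetSketch {
    static boolean containsOnlyIgnoreCase(String text, String[] requiredLabels, String[] optionalLabels) {
        Set<String> required = lower(requiredLabels);
        Set<String> allowed = lower(optionalLabels);
        allowed.addAll(required);
        Set<String> seen = new HashSet<>();
        // Assumption: words are separated by anything that is not a letter or digit.
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // a leading separator yields an empty first token
            }
            String word = token.toLowerCase(Locale.ROOT);
            if (!allowed.contains(word)) {
                return false; // a word that is neither required nor optional
            }
            seen.add(word);
        }
        return seen.containsAll(required);
    }

    private static Set<String> lower(String[] labels) {
        Set<String> result = new HashSet<>();
        for (String label : labels) {
            result.add(label.toLowerCase(Locale.ROOT));
        }
        return result;
    }

    public static void main(String[] args) {
        String[] required = {"contents"};
        String[] optional = {"table", "of"};
        System.out.println(containsOnlyIgnoreCase("Table of Contents", required, optional)); // true
        System.out.println(containsOnlyIgnoreCase("Table of Figures", required, optional));  // false
    }
}
```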
      • getSectionNumber

        public static int getSectionNumber​(java.lang.String text,
                                           java.lang.String sectionLabel)
        Determines whether the given text is a numbered section heading, based upon the specified heading label. If so, the section number is returned.

        A numbered section appears in one of the following formats on a single line, with "chapter" used to represent the section label:

        • Chapter 11
        • Chapter 11: The Frightened Goat
        • Chapter XI
        • Chapter XI: The Frightened Goat
        • Eleventh Chapter
        • The Eleventh Chapter
        Parameters:
        text - The string to test for a section heading.
        sectionLabel - The label used for this type of section, case-insensitive.
        Returns:
        The number of this section, or -1 if the string does not appear to be of the given section type.
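        The six listed formats could be recognized along the lines of the sketch below. It is an illustration only: the class name SectionNumberSketch is invented, the ordinal list is a hypothetical one truncated at "twelfth", Roman numerals are accepted in uppercase only, and a title is accepted only after a colon, as in the examples.

```java
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SectionNumberSketch {
    // Hypothetical, truncated ordinal list; index + 1 is the section number.
    private static final List<String> ORDINALS = List.of(
            "first", "second", "third", "fourth", "fifth", "sixth",
            "seventh", "eighth", "ninth", "tenth", "eleventh", "twelfth");

    // A well-formed, non-empty uppercase Roman numeral below 4000.
    private static final String ROMAN =
            "(?=[IVXLCDM])M{0,3}(?:CM|CD|D?C{0,3})(?:XC|XL|L?X{0,3})(?:IX|IV|V?I{0,3})";

    static int getSectionNumber(String text, String sectionLabel) {
        String label = "(?i:" + Pattern.quote(sectionLabel) + ")";
        // "Chapter 11" or "Chapter XI", optionally followed by ": Title".
        Matcher numbered = Pattern.compile(
                "\\s*" + label + "\\s+(\\d{1,9}|" + ROMAN + ")(?:\\s*:.*)?\\s*").matcher(text);
        if (numbered.matches()) {
            String number = numbered.group(1);
            return Character.isDigit(number.charAt(0)) ? Integer.parseInt(number) : romanToInt(number);
        }
        // "Eleventh Chapter" or "The Eleventh Chapter".
        Matcher ordinal = Pattern.compile(
                "\\s*(?i:the\\s+)?(\\p{Alpha}+)\\s+" + label + "\\s*").matcher(text);
        if (ordinal.matches()) {
            int index = ORDINALS.indexOf(ordinal.group(1).toLowerCase(Locale.ROOT));
            return index >= 0 ? index + 1 : -1;
        }
        return -1;
    }

    // Right-to-left conversion: a symbol smaller than its right neighbor is subtracted.
    private static int romanToInt(String roman) {
        int[] values = {1, 5, 10, 50, 100, 500, 1000};
        int total = 0;
        int previous = 0;
        for (int i = roman.length() - 1; i >= 0; i--) {
            int value = values["IVXLCDM".indexOf(roman.charAt(i))];
            total += value < previous ? -value : value;
            previous = value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(getSectionNumber("Chapter XI: The Frightened Goat", "chapter")); // 11
        System.out.println(getSectionNumber("The Frightened Goat", "chapter"));             // -1
    }
}
```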
      • isQuoted

        public static boolean isQuoted​(java.lang.String string)
        Determines if the given string is quoted.
        Parameters:
        string - The string to check.
        Returns:
        true if the string's first or last non-whitespace character is some sort of quotation mark.
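        Note that the return condition uses "or": a quotation mark at either end is enough. A minimal sketch of that reading follows; the class name QuotedSketch and the particular set of quotation marks are assumptions, and whitespace is trimmed with String.trim().

```java
final class QuotedSketch {
    // Assumed set of quotation marks: straight, curly, and angle quotes.
    private static final String QUOTES = "\"'\u201C\u201D\u2018\u2019\u00AB\u00BB";

    static boolean isQuoted(String string) {
        String t = string.trim();
        if (t.isEmpty()) {
            return false; // nothing to inspect
        }
        // A quotation mark at either end counts.
        return QUOTES.indexOf(t.charAt(0)) >= 0
                || QUOTES.indexOf(t.charAt(t.length() - 1)) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isQuoted("  \u201CHello")); // true
        System.out.println(isQuoted("Hello"));         // false
    }
}
```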