Class XWPFWordExtractor

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public class XWPFWordExtractor
    extends org.apache.poi.ooxml.extractor.POIXMLTextExtractor
    Helper class to extract text from an OOXML Word file
    • Field Detail

      • SUPPORTED_TYPES

        public static final XWPFRelation[] SUPPORTED_TYPES
    • Method Detail

      • setFetchHyperlinks

        public void setFetchHyperlinks(boolean fetch)
        Sets whether hyperlink targets are also output when extracting the text content. By default, only the hyperlink label is output, not the link target.
      • setConcatenatePhoneticRuns

        public void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns)
        Sets whether phonetic runs are concatenated into the extracted text. Default is true.
        Parameters:
        concatenatePhoneticRuns - true to concatenate phonetic runs during extraction, false otherwise
      • getText

        public String getText()
        Description copied from class: POITextExtractor
        Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation-specific; see the Javadocs for a specific project for details.
        Specified by:
        getText in class POITextExtractor
        Returns:
        All the text from the document
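
The methods above can be combined as in the following minimal sketch. It assumes Apache POI (the poi-ooxml artifact) is on the classpath. The class name ExtractWordText and the helpers extract and demoText are illustrative only and are not part of POI. The document is built in memory so the example needs no input file.

```java
import java.io.IOException;

import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

// Illustrative class; not part of Apache POI.
public class ExtractWordText {

    // Configures an extractor for the given document and returns its text.
    // Closing the extractor also releases the underlying document package.
    static String extract(XWPFDocument doc,
                          boolean fetchHyperlinks,
                          boolean concatenatePhoneticRuns) throws IOException {
        try (XWPFWordExtractor extractor = new XWPFWordExtractor(doc)) {
            extractor.setFetchHyperlinks(fetchHyperlinks);
            extractor.setConcatenatePhoneticRuns(concatenatePhoneticRuns);
            return extractor.getText();
        }
    }

    // Builds a one-paragraph document in memory and extracts its text.
    static String demoText() throws IOException {
        XWPFDocument doc = new XWPFDocument();
        doc.createParagraph().createRun().setText("Hello from POI");
        return extract(doc, true, true);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demoText());
    }
}
```

To read an existing .docx file instead, the document can be created with the XWPFDocument(InputStream) constructor and passed to extract in the same way.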