Class HWPFDocumentCore

    • Constructor Detail

      • HWPFDocumentCore

        public HWPFDocumentCore(InputStream istream)
                         throws IOException
        This constructor loads a Word document from an InputStream.
        Parameters:
        istream - The InputStream that contains the Word document.
        Throws:
        IOException - If there is an unexpected IOException from the passed-in InputStream.
      • HWPFDocumentCore

        public HWPFDocumentCore(POIFSFileSystem pfilesystem)
                         throws IOException
        This constructor loads a Word document from a POIFSFileSystem.
        Parameters:
        pfilesystem - The POIFSFileSystem that contains the Word document.
        Throws:
        IOException - If there is an unexpected IOException from the passed-in POIFSFileSystem.
      • HWPFDocumentCore

        public HWPFDocumentCore(DirectoryNode directory)
                         throws IOException
        This constructor loads a Word document from a specific directory within a POIFSFileSystem, which is usually not the root. It is typically used to open embedded documents.
        Parameters:
        directory - The DirectoryNode that contains the Word document.
        Throws:
        IOException - If there is an unexpected IOException from the POIFSFileSystem that contains the passed-in DirectoryNode.
    • Method Detail

      • verifyAndBuildPOIFS

        public static POIFSFileSystem verifyAndBuildPOIFS(InputStream istream)
                                                   throws IOException
        Takes an InputStream, verifies that it's not RTF or PDF, builds a POIFSFileSystem from it, and returns that.
        Throws:
        IOException
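        The "not RTF or PDF" check described above can be illustrated with a minimal, standard-library-only sketch. It is not the library's implementation: the class HeaderSniffer, the method rejectRtfOrPdf, the six-byte peek size, and the choice of IllegalArgumentException are all assumptions made for illustration. The idea is to peek at the first bytes, compare them with the well-known "{\rtf" and "%PDF" signatures, and push the bytes back so a later parser still sees the stream from its start.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
final class HeaderSniffer {
    // Number of leading bytes to inspect (an assumption; enough for both signatures).
    private static final int PEEK = 6;

    /**
     * Peeks at the start of the stream, rejects RTF and PDF signatures,
     * and returns a stream that still begins at the first byte.
     */
    static InputStream rejectRtfOrPdf(InputStream in) throws IOException {
        PushbackInputStream pis = new PushbackInputStream(in, PEEK);
        byte[] head = new byte[PEEK];

        // Read up to PEEK bytes; a single read() may return fewer than requested.
        int n = 0;
        while (n < PEEK) {
            int r = pis.read(head, n, PEEK - n);
            if (r < 0) {
                break; // end of stream
            }
            n += r;
        }
        // Push the inspected bytes back so the caller can re-read them.
        if (n > 0) {
            pis.unread(head, 0, n);
        }

        String prefix = new String(head, 0, n, StandardCharsets.ISO_8859_1);
        if (prefix.startsWith("{\\rtf")) {
            throw new IllegalArgumentException("Input looks like an RTF file");
        }
        if (prefix.startsWith("%PDF")) {
            throw new IllegalArgumentException("Input looks like a PDF file");
        }
        return pis;
    }

    public static void main(String[] args) throws IOException {
        // First bytes of the OLE2 compound-file signature, used here as sample input.
        byte[] ole2 = {(byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0};
        InputStream checked = rejectRtfOrPdf(new ByteArrayInputStream(ole2));
        System.out.println("Accepted; first byte = 0x" + Integer.toHexString(checked.read()));
    }
}
```

        In this sketch the returned stream, not the original one, should be handed to whatever builds the file system, because the pushed-back bytes live in the wrapper.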
      • getRange

        public abstract Range getRange()
        Returns the range which covers the whole of the document, but excludes any headers and footers.
      • getOverallRange

        public abstract Range getOverallRange()
        Returns the range that covers all text in the file, including main text, footnotes, headers, and comments.
      • getDocumentText

        public String getDocumentText()
        Returns the document text, i.e. text information from all text pieces, including OLE descriptions and field codes.
      • getCharacterTable

        public CHPBinTable getCharacterTable()
      • getParagraphTable

        public PAPBinTable getParagraphTable()
      • getStyleSheet

        public StyleSheet getStyleSheet()
      • getListTables

        public ListTables getListTables()
      • getFontTable

        public FontTable getFontTable()
      • getMainStream

        @Internal
        public byte[] getMainStream()