Package org.apache.poi.extractor
Interface POIOLE2TextExtractor
- All Superinterfaces:
java.lang.AutoCloseable, java.io.Closeable, POITextExtractor
- All Known Implementing Classes:
EventBasedExcelExtractor, ExcelExtractor, HPSFPropertiesExtractor, OutlookTextExtractor, PublisherTextExtractor, VisioTextExtractor, Word6Extractor, WordExtractor
public interface POIOLE2TextExtractor extends POITextExtractor
Common parent for OLE2-based text extractors of POI documents, such as .doc and .xls. You will typically find the implementation of a given format's text extractor under org.apache.poi.[format].extractor.
- See Also:
ExcelExtractor, VisioTextExtractor, WordExtractor
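As an illustration of how an implementing class might be used through this interface, here is a minimal sketch. It assumes Apache POI 5.x is on the classpath; the class name `Ole2ExtractSketch` and the method `extractSampleText` are hypothetical, and the workbook is built in memory only so the example needs no input file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.extractor.POIOLE2TextExtractor;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// Hypothetical demo class, not part of POI.
public class Ole2ExtractSketch {

    /** Builds a one-cell .xls workbook in memory and returns its extracted text. */
    static String extractSampleText() {
        HSSFWorkbook workbook = new HSSFWorkbook();
        workbook.createSheet("Data").createRow(0).createCell(0).setCellValue("hello");

        // ExcelExtractor is one of the known implementing classes. Holding it
        // through the interface type keeps the calling code format-agnostic.
        try (POIOLE2TextExtractor extractor = new ExcelExtractor(workbook)) {
            return extractor.getText();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extractSampleText());
    }
}
```

Because the interface extends java.io.Closeable, try-with-resources is a natural fit. By default, closing the extractor also closes the underlying document (see setCloseFilesystem).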
Method Summary
Modifier and Type | Method | Description
default DocumentSummaryInformation | getDocSummaryInformation() | Returns the document information metadata for the document.
POIDocument | getDocument() | Returns the underlying POIDocument.
default POITextExtractor | getMetadataTextExtractor() | Returns an HPSF-powered text extractor for the document properties metadata, such as title and author.
default DirectoryEntry | getRoot() | Returns the underlying DirectoryEntry of this document.
default SummaryInformation | getSummaryInformation() | Returns the summary information metadata for the document.
Methods inherited from interface org.apache.poi.extractor.POITextExtractor
close, getFilesystem, getText, isCloseFilesystem, setCloseFilesystem
Method Detail
-
getDocSummaryInformation
default DocumentSummaryInformation getDocSummaryInformation()
Returns the document information metadata for the document.
- Returns:
- The Document Summary Information, or null if it could not be read for this document.
-
getSummaryInformation
default SummaryInformation getSummaryInformation()
Returns the summary information metadata for the document.
- Returns:
- The Summary Information for the document, or null if it could not be read for this document.
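Since both summary getters may return null, callers will usually want a guard. The following is a sketch only, assuming Apache POI 5.x; `SummaryInfoSketch`, `titleOrDefault`, and `sampleTitle` are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.extractor.POIOLE2TextExtractor;
import org.apache.poi.hpsf.SummaryInformation;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// Hypothetical demo class, not part of POI.
public class SummaryInfoSketch {

    /** Returns the document title, or the fallback when no title can be read. */
    static String titleOrDefault(POIOLE2TextExtractor extractor, String fallback) {
        SummaryInformation summary = extractor.getSummaryInformation();
        if (summary == null || summary.getTitle() == null) {
            return fallback;
        }
        return summary.getTitle();
    }

    /** Builds an in-memory workbook, optionally with property sets, and reads its title. */
    static String sampleTitle(boolean withProperties) {
        HSSFWorkbook workbook = new HSSFWorkbook();
        workbook.createSheet("Data");
        if (withProperties) {
            // A freshly created workbook has no property sets until this call.
            workbook.createInformationProperties();
            workbook.getSummaryInformation().setTitle("Quarterly report");
        }
        try (POIOLE2TextExtractor extractor = new ExcelExtractor(workbook)) {
            return titleOrDefault(extractor, "(untitled)");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sampleTitle(true));
        System.out.println(sampleTitle(false));
    }
}
```

The same guard applies to getDocSummaryInformation(), which is documented with the same null contract.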
-
getMetadataTextExtractor
default POITextExtractor getMetadataTextExtractor()
Returns an HPSF-powered text extractor for the document properties metadata, such as title and author.
- Specified by:
getMetadataTextExtractor in interface POITextExtractor
- Returns:
- an instance of POITextExtractor that can extract metadata.
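A short sketch of reading document properties as text through the metadata extractor, assuming Apache POI 5.x. The class `MetadataTextSketch` and the method `sampleMetadataText` are hypothetical. The exact layout of the returned text is left to the library, so the example only prints it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.poi.extractor.POIOLE2TextExtractor;
import org.apache.poi.extractor.POITextExtractor;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// Hypothetical demo class, not part of POI.
public class MetadataTextSketch {

    /** Returns the property metadata of an in-memory workbook rendered as text. */
    static String sampleMetadataText() {
        HSSFWorkbook workbook = new HSSFWorkbook();
        workbook.createSheet("Data");
        workbook.createInformationProperties();
        workbook.getSummaryInformation().setTitle("Quarterly report");

        try (POIOLE2TextExtractor extractor = new ExcelExtractor(workbook)) {
            // The metadata extractor reads from the same underlying document,
            // so only the main extractor is closed here.
            POITextExtractor metadata = extractor.getMetadataTextExtractor();
            return metadata.getText();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sampleMetadataText());
    }
}
```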
-
getRoot
default DirectoryEntry getRoot()
Returns the underlying DirectoryEntry of this document.
- Returns:
- the DirectoryEntry that is associated with the POIDocument of this extractor.
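To show what the root DirectoryEntry gives access to, the sketch below lists the top-level OLE2 entry names of a workbook that has been round-tripped through a POIFSFileSystem. It assumes Apache POI 5.x; `RootEntrySketch` and `sampleEntryNames` are hypothetical names. It also assumes, without the page above confirming it, that an extractor created from a POIFSFileSystem keeps its directory, so the code still guards against a null root.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collections;
import java.util.Set;

import org.apache.poi.extractor.POIOLE2TextExtractor;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.poifs.filesystem.DirectoryEntry;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

// Hypothetical demo class, not part of POI.
public class RootEntrySketch {

    /** Returns the names of the top-level OLE2 entries of a sample workbook. */
    static Set<String> sampleEntryNames() {
        try {
            // Serialize a small workbook so that a real OLE2 container exists.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (HSSFWorkbook workbook = new HSSFWorkbook()) {
                workbook.createSheet("Data");
                workbook.write(bytes);
            }

            POIFSFileSystem fs =
                    new POIFSFileSystem(new ByteArrayInputStream(bytes.toByteArray()));
            try (POIOLE2TextExtractor extractor = new ExcelExtractor(fs)) {
                DirectoryEntry root = extractor.getRoot();
                // Guard against a missing directory rather than assume one exists.
                return root == null ? Collections.<String>emptySet() : root.getEntryNames();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sampleEntryNames());
    }
}
```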
-
getDocument
POIDocument getDocument()
Returns the underlying POIDocument.
- Specified by:
getDocument in interface POITextExtractor
- Returns:
- the underlying POIDocument