Package org.apache.poi.xssf.extractor
Class XSSFEventBasedExcelExtractor
- java.lang.Object
-
- org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, POITextExtractor, POIXMLTextExtractor, ExcelExtractor
- Direct Known Subclasses:
XSSFBEventBasedExcelExtractor
public class XSSFEventBasedExcelExtractor extends java.lang.Object implements POIXMLTextExtractor, ExcelExtractor
A text extractor for OOXML Excel files that uses SAX event-based parsing.
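The description above mentions SAX event-based parsing. As a rough illustration of that technique only, the sketch below streams through a simplified, worksheet-like XML fragment with the JDK's built-in SAX parser and collects cell text as the events arrive. It is not POI's implementation: the class name `SaxSheetTextSketch`, both helper methods, and the sample XML are assumptions made for this example, and real sheets usually refer to shared strings and styles, which the sketch ignores. The `formulasNotResults` parameter loosely mirrors the idea behind setFormulasNotResults.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch class; not part of Apache POI.
class SaxSheetTextSketch {

    /** Streams the XML and returns cells tab-separated, one row per line. */
    static String extractText(InputStream sheetXml, boolean formulasNotResults)
            throws IOException, SAXException, ParserConfigurationException {
        StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder value = new StringBuilder();   // text of the current <v>
            private final StringBuilder formula = new StringBuilder(); // text of the current <f>
            private StringBuilder target;   // buffer receiving character data, or null
            private boolean firstCellInRow;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                switch (qName) {
                    case "row": firstCellInRow = true; break;
                    case "c": value.setLength(0); formula.setLength(0); break;
                    case "v": target = value; break;
                    case "f": target = formula; break;
                    default: break;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text outside <v> and <f> (for example whitespace) is ignored.
                if (target != null) {
                    target.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                switch (qName) {
                    case "v":
                    case "f":
                        target = null;
                        break;
                    case "c":
                        if (!firstCellInRow) {
                            out.append('\t');
                        }
                        firstCellInRow = false;
                        boolean useFormula = formulasNotResults && formula.length() > 0;
                        out.append(useFormula ? formula.toString() : value.toString());
                        break;
                    case "row":
                        out.append('\n');
                        break;
                    default:
                        break;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(sheetXml, handler);
        return out.toString();
    }

    /** Convenience wrapper for in-memory XML; wraps checked exceptions. */
    static String extractText(String sheetXml, boolean formulasNotResults) {
        try (InputStream in = new ByteArrayInputStream(sheetXml.getBytes(StandardCharsets.UTF_8))) {
            return extractText(in, formulasNotResults);
        } catch (IOException | SAXException | ParserConfigurationException e) {
            throw new IllegalStateException("Could not parse sheet XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<worksheet><sheetData><row>"
                + "<c><v>1</v></c><c><v>2</v></c><c><f>A1+B1</f><v>3</v></c>"
                + "</row></sheetData></worksheet>";
        System.out.print(extractText(xml, false)); // cached values
        System.out.print(extractText(xml, true));  // formula text where a cell has one
    }
}
```

Because the handler only keeps the current cell in memory, the whole sheet never has to be loaded as an object model, which is the usual motivation for an event-based approach on large files.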
-
-
Constructor Summary
Constructors
- XSSFEventBasedExcelExtractor(java.lang.String path)
- XSSFEventBasedExcelExtractor(OPCPackage container)
-
Method Summary
Modifier and Type, Method, Description
- POIXMLProperties.CoreProperties getCoreProperties(): Returns the core document properties.
- POIXMLProperties.CustomProperties getCustomProperties(): Returns the custom document properties.
- POIXMLDocument getDocument(): Returns opened document.
- POIXMLProperties.ExtendedProperties getExtendedProperties(): Returns the extended document properties.
- OPCPackage getFilesystem()
- boolean getFormulasNotResults()
- boolean getIncludeCellComments()
- boolean getIncludeHeadersFooters()
- boolean getIncludeSheetNames()
- boolean getIncludeTextBoxes()
- java.util.Locale getLocale()
- OPCPackage getPackage(): Returns the opened OPCPackage container.
- java.lang.String getText(): Processes the file and returns the text.
- boolean isCloseFilesystem()
- void processSheet(XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor, Styles styles, Comments comments, SharedStrings strings, java.io.InputStream sheetInputStream): Processes the given sheet.
- void setCloseFilesystem(boolean doCloseFilesystem)
- void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns): Concatenate text from <rPh> text elements in SharedStringsTable. Default is true.
- void setFormulasNotResults(boolean formulasNotResults): Should we return the formula itself, and not the result it produces? Default is false.
- void setIncludeCellComments(boolean includeCellComments): Should cell comments be included? Default is false.
- void setIncludeHeadersFooters(boolean includeHeadersFooters): Should headers and footers be included? Default is true.
- void setIncludeSheetNames(boolean includeSheetNames): Should sheet names be included? Default is true.
- void setIncludeTextBoxes(boolean includeTextBoxes): Should text from textboxes be included? Default is true.
- void setLocale(java.util.Locale locale)
-
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.poi.ooxml.extractor.POIXMLTextExtractor
checkMaxTextSize, close, getMetadataTextExtractor
-
-
-
-
Constructor Detail
-
XSSFEventBasedExcelExtractor
public XSSFEventBasedExcelExtractor(java.lang.String path) throws XmlException, OpenXML4JException, java.io.IOException
- Throws:
XmlException
OpenXML4JException
java.io.IOException
-
XSSFEventBasedExcelExtractor
public XSSFEventBasedExcelExtractor(OPCPackage container) throws XmlException, OpenXML4JException, java.io.IOException
- Throws:
XmlException
OpenXML4JException
java.io.IOException
-
-
Method Detail
-
setIncludeSheetNames
public void setIncludeSheetNames(boolean includeSheetNames)
Should sheet names be included? Default is true.
- Specified by:
setIncludeSheetNames in interface ExcelExtractor
- Parameters:
includeSheetNames - true if the sheet names should be included
-
getIncludeSheetNames
public boolean getIncludeSheetNames()
- Returns:
- whether to include sheet names
- Since:
- 3.16-beta3
-
setFormulasNotResults
public void setFormulasNotResults(boolean formulasNotResults)
Should we return the formula itself, and not the result it produces? Default is false.
- Specified by:
setFormulasNotResults in interface ExcelExtractor
- Parameters:
formulasNotResults - true if the formula itself is returned
-
getFormulasNotResults
public boolean getFormulasNotResults()
- Returns:
- whether to include formulas but not results
- Since:
- 3.16-beta3
-
setIncludeHeadersFooters
public void setIncludeHeadersFooters(boolean includeHeadersFooters)
Should headers and footers be included? Default is true.
- Specified by:
setIncludeHeadersFooters in interface ExcelExtractor
- Parameters:
includeHeadersFooters - true if headers and footers should be included
-
getIncludeHeadersFooters
public boolean getIncludeHeadersFooters()
- Returns:
- whether or not to include headers and footers
- Since:
- 3.16-beta3
-
setIncludeTextBoxes
public void setIncludeTextBoxes(boolean includeTextBoxes)
Should text from textboxes be included? Default is true
-
getIncludeTextBoxes
public boolean getIncludeTextBoxes()
- Returns:
- whether or not to extract textboxes
- Since:
- 3.16-beta3
-
setIncludeCellComments
public void setIncludeCellComments(boolean includeCellComments)
Should cell comments be included? Default is false.
- Specified by:
setIncludeCellComments in interface ExcelExtractor
- Parameters:
includeCellComments - true if cell comments should be included
-
getIncludeCellComments
public boolean getIncludeCellComments()
- Returns:
- whether cell comments should be included
- Since:
- 3.16-beta3
-
setConcatenatePhoneticRuns
public void setConcatenatePhoneticRuns(boolean concatenatePhoneticRuns)
Concatenate text from <rPh> text elements in SharedStringsTable. Default is true.
- Parameters:
concatenatePhoneticRuns - true if runs should be concatenated, false otherwise
-
setLocale
public void setLocale(java.util.Locale locale)
-
getLocale
public java.util.Locale getLocale()
- Returns:
- locale
- Since:
- 3.16-beta3
-
getPackage
public OPCPackage getPackage()
Returns the opened OPCPackage container.
- Specified by:
getPackage in interface POIXMLTextExtractor
- Returns:
- the opened OPCPackage
-
getCoreProperties
public POIXMLProperties.CoreProperties getCoreProperties()
Returns the core document properties.
- Specified by:
getCoreProperties in interface POIXMLTextExtractor
- Returns:
- the core document properties
-
getExtendedProperties
public POIXMLProperties.ExtendedProperties getExtendedProperties()
Returns the extended document properties.
- Specified by:
getExtendedProperties in interface POIXMLTextExtractor
- Returns:
- the extended document properties
-
getCustomProperties
public POIXMLProperties.CustomProperties getCustomProperties()
Returns the custom document properties.
- Specified by:
getCustomProperties in interface POIXMLTextExtractor
- Returns:
- the custom document properties
-
processSheet
public void processSheet(XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor, Styles styles, Comments comments, SharedStrings strings, java.io.InputStream sheetInputStream) throws java.io.IOException, org.xml.sax.SAXException
Processes the given sheet.
- Throws:
java.io.IOException
org.xml.sax.SAXException
-
getText
public java.lang.String getText()
Processes the file and returns the text.
- Specified by:
getText in interface ExcelExtractor
- Specified by:
getText in interface POITextExtractor
- Returns:
- All the text from the document
-
getDocument
public POIXMLDocument getDocument()
Description copied from interface: POIXMLTextExtractor
Returns opened document.
- Specified by:
getDocument in interface POITextExtractor
- Specified by:
getDocument in interface POIXMLTextExtractor
- Returns:
- the opened document
-
setCloseFilesystem
public void setCloseFilesystem(boolean doCloseFilesystem)
- Specified by:
setCloseFilesystem in interface POITextExtractor
- Parameters:
doCloseFilesystem - true (default), if underlying resources/filesystem should be closed on POITextExtractor.close()
-
isCloseFilesystem
public boolean isCloseFilesystem()
- Specified by:
isCloseFilesystem in interface POITextExtractor
- Returns:
- true, if resources/filesystem should be closed on POITextExtractor.close()
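To make the documented close behavior of setCloseFilesystem and isCloseFilesystem concrete, here is a minimal sketch of the flag pattern using stand-in types. Everything in it (`CloseFlagSketch`, `FakePackage`, `SketchExtractor`, `packageClosedAfterUse`) is hypothetical and exists only for illustration; it models the documented contract, not POI's actual code.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch class; not part of Apache POI.
class CloseFlagSketch {

    /** Stand-in for the underlying package; it only records whether it was closed. */
    static final class FakePackage implements Closeable {
        private boolean closed;

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** Stand-in extractor that models only the close flag described above. */
    static final class SketchExtractor implements Closeable {
        private final Closeable filesystem;
        private boolean closeFilesystem = true; // mirrors the documented default

        SketchExtractor(Closeable filesystem) {
            this.filesystem = filesystem;
        }

        void setCloseFilesystem(boolean doCloseFilesystem) {
            this.closeFilesystem = doCloseFilesystem;
        }

        boolean isCloseFilesystem() {
            return closeFilesystem;
        }

        @Override
        public void close() throws IOException {
            // Only release the underlying resource when the flag asks for it.
            if (closeFilesystem) {
                filesystem.close();
            }
        }
    }

    /** Uses the extractor in try-with-resources and reports whether the package ended up closed. */
    static boolean packageClosedAfterUse(boolean doCloseFilesystem) {
        FakePackage pkg = new FakePackage();
        try (SketchExtractor extractor = new SketchExtractor(pkg)) {
            extractor.setCloseFilesystem(doCloseFilesystem);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pkg.isClosed();
    }

    public static void main(String[] args) {
        System.out.println(packageClosedAfterUse(true));  // prints "true"
        System.out.println(packageClosedAfterUse(false)); // prints "false"
    }
}
```

Setting the flag to false is the natural choice when the caller opened the OPCPackage itself and wants to keep using it after the extractor is closed.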
-
getFilesystem
public OPCPackage getFilesystem()
- Specified by:
getFilesystem in interface POITextExtractor
- Returns:
- The underlying resources/filesystem
-
-