Package org.apache.poi.ooxml.extractor
Class POIXMLExtractorFactory
java.lang.Object
org.apache.poi.ooxml.extractor.POIXMLExtractorFactory
All Implemented Interfaces: ExtractorProvider
public final class POIXMLExtractorFactory extends java.lang.Object implements ExtractorProvider
Figures out the correct POITextExtractor for your supplied document, and returns it.
Note 1 - will fail for many file formats if the POI Scratchpad jar is not present on the runtime classpath.
Note 2 - rather than using this, for most cases you would be better off switching to Apache Tika instead!
-
-
Constructor Summary
Constructors
POIXMLExtractorFactory()
-
Method Summary
Methods (modifier and type, signature, then description)
-
boolean accepts(FileMagic fm)
-
POITextExtractor create(java.io.File f, java.lang.String password)
Create Extractor via file
-
POITextExtractor create(java.io.InputStream inp, java.lang.String password)
Create Extractor via InputStream
-
POIXMLTextExtractor create(OPCPackage pkg)
Tries to determine the actual type of file and produces a matching text-extractor for it.
-
POITextExtractor create(DirectoryNode poifsDir, java.lang.String password)
Create Extractor from POIFS node
-
POITextExtractor create(POIFSFileSystem fs)
-
static java.lang.Boolean getAllThreadsPreferEventExtractors()
Should all threads prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is to use the thread level setting, which defaults to false.
-
static boolean getPreferEventExtractor()
Should this thread use event based extractors if available? Checks the all-threads one first, then thread specific.
-
static boolean getThreadPrefersEventExtractors()
Should this thread prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is false.
-
static void setAllThreadsPreferEventExtractors(java.lang.Boolean preferEventExtractors)
Should all threads prefer event based over usermodel based extractors? If set, will take preference over the Thread level setting.
-
static void setThreadPrefersEventExtractors(boolean preferEventExtractors)
Should this thread prefer event based over usermodel based extractors? Will only be used if the All Threads setting is null.
-
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.poi.extractor.ExtractorProvider
identifyEmbeddedResources
-
Method Detail
-
accepts
public boolean accepts(FileMagic fm)
Specified by: accepts in interface ExtractorProvider
-
getThreadPrefersEventExtractors
public static boolean getThreadPrefersEventExtractors()
Should this thread prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is false.
-
getAllThreadsPreferEventExtractors
public static java.lang.Boolean getAllThreadsPreferEventExtractors()
Should all threads prefer event based over usermodel based extractors? (usermodel extractors tend to be more accurate, but use more memory) Default is to use the thread level setting, which defaults to false.
-
setThreadPrefersEventExtractors
public static void setThreadPrefersEventExtractors(boolean preferEventExtractors)
Should this thread prefer event based over usermodel based extractors? Will only be used if the All Threads setting is null.
-
setAllThreadsPreferEventExtractors
public static void setAllThreadsPreferEventExtractors(java.lang.Boolean preferEventExtractors)
Should all threads prefer event based over usermodel based extractors? If set, will take preference over the Thread level setting.
-
getPreferEventExtractor
public static boolean getPreferEventExtractor()
Should this thread use event based extractors if available? Checks the all-threads one first, then thread specific.
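The precedence between the all-threads setting and the thread-level setting can be illustrated with a small standalone model. This is a sketch only, not POI's actual implementation: the class `EventExtractorPreference` and its method names are hypothetical stand-ins for the five static methods documented above, and it assumes that a null all-threads value means "defer to the thread-level setting", as the descriptions state.

```java
/** Hypothetical model of the preference flags described above; not POI's code. */
final class EventExtractorPreference {
    // null means "no global override": each thread's own setting applies
    private static volatile Boolean allThreads = null;

    // per-thread flag; every thread starts with false
    private static final ThreadLocal<Boolean> perThread =
            ThreadLocal.withInitial(() -> Boolean.FALSE);

    private EventExtractorPreference() {}

    /** Models setAllThreadsPreferEventExtractors(Boolean). */
    static void setAllThreads(Boolean prefer) {
        allThreads = prefer;
    }

    /** Models setThreadPrefersEventExtractors(boolean). */
    static void setThread(boolean prefer) {
        perThread.set(prefer);
    }

    /** Models getPreferEventExtractor(): all-threads value first, then thread value. */
    static boolean effective() {
        Boolean global = allThreads;
        return (global != null) ? global : perThread.get();
    }

    public static void main(String[] args) {
        System.out.println(effective());   // false: both settings at their defaults
        setThread(true);
        System.out.println(effective());   // true: thread-level setting is used
        setAllThreads(Boolean.FALSE);
        System.out.println(effective());   // false: all-threads setting takes preference
        setAllThreads(null);
        System.out.println(effective());   // true: back to the thread-level setting
    }
}
```

Using a nullable `Boolean` for the global flag is what allows three states (force on, force off, defer), which is why the all-threads getter and setter use `java.lang.Boolean` while the thread-level ones use the primitive `boolean`.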
-
create
public POITextExtractor create(java.io.File f, java.lang.String password) throws java.io.IOException
Description copied from interface: ExtractorProvider
Create Extractor via file
Specified by: create in interface ExtractorProvider
Parameters:
f - the file
password - the password or null if not encrypted
Returns:
the extractor
Throws:
java.io.IOException - if file can't be read or parsed
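A minimal usage sketch for the file-based variant follows. It assumes the poi-ooxml jar and its dependencies are on the classpath, that the document path arrives as the first command-line argument, and that the file is not encrypted (hence the null password). The class name `ExtractFromFile` is hypothetical.

```java
import java.io.File;
import java.io.IOException;

import org.apache.poi.extractor.POITextExtractor;
import org.apache.poi.ooxml.extractor.POIXMLExtractorFactory;

public class ExtractFromFile {
    public static void main(String[] args) throws IOException {
        // Assumption: args[0] is the path to an OOXML document such as a .docx or .xlsx
        File file = new File(args[0]);

        POIXMLExtractorFactory factory = new POIXMLExtractorFactory();

        // Pass null as the password because the file is assumed to be unencrypted.
        // try-with-resources closes the extractor when the block exits.
        try (POITextExtractor extractor = factory.create(file, null)) {
            System.out.println(extractor.getText());
        }
    }
}
```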
-
create
public POITextExtractor create(java.io.InputStream inp, java.lang.String password) throws java.io.IOException
Description copied from interface: ExtractorProvider
Create Extractor via InputStream
Specified by: create in interface ExtractorProvider
Parameters:
inp - the stream
password - the password or null if not encrypted
Returns:
the extractor
Throws:
java.io.IOException - if stream can't be read or parsed
-
create
public POIXMLTextExtractor create(OPCPackage pkg) throws java.io.IOException
Tries to determine the actual type of file and produces a matching text-extractor for it.
Parameters:
pkg - An OPCPackage.
Returns:
A POIXMLTextExtractor for the given file.
Throws:
java.io.IOException - If an error occurs while reading the file
java.lang.IllegalArgumentException - If no matching file type could be found.
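The "determine the type, otherwise fail" behaviour described above can be sketched as a lookup keyed on a content type. This is a simplified standalone model, not POI's code: it assumes the decision is driven by the content type of the package's main document part, it lists only three well-known OOXML main-part content types, and the names `ExtractorKindResolver`, `Kind`, and `resolve` are hypothetical.

```java
import java.util.Map;

/** Hypothetical model of type detection by content type; not POI's code. */
final class ExtractorKindResolver {
    /** Which family of extractor a package would need. */
    enum Kind { WORD, EXCEL, POWERPOINT }

    // Main-part content types for the three most common OOXML document families
    private static final Map<String, Kind> BY_CONTENT_TYPE = Map.of(
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml",
        Kind.WORD,
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml",
        Kind.EXCEL,
        "application/vnd.openxmlformats-officedocument.presentationml.presentation.main+xml",
        Kind.POWERPOINT);

    private ExtractorKindResolver() {}

    /**
     * Returns the extractor family for a content type, or throws
     * IllegalArgumentException when no matching file type is found.
     */
    static Kind resolve(String contentType) {
        // Map.of maps reject null keys on lookup, so check null up front
        Kind kind = (contentType == null) ? null : BY_CONTENT_TYPE.get(contentType);
        if (kind == null) {
            throw new IllegalArgumentException(
                "No matching file type for content type: " + contentType);
        }
        return kind;
    }
}
```

In this model an unrecognised content type surfaces as an unchecked IllegalArgumentException, matching the documented Throws entry, so callers that accept arbitrary packages may want to catch it alongside IOException.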
-
create
public POITextExtractor create(POIFSFileSystem fs) throws java.io.IOException
- Throws:
java.io.IOException
-
create
public POITextExtractor create(DirectoryNode poifsDir, java.lang.String password) throws java.io.IOException
Description copied from interface: ExtractorProvider
Create Extractor from POIFS node
Specified by: create in interface ExtractorProvider
Parameters:
poifsDir - the node
password - the password or null if not encrypted
Returns:
the extractor
Throws:
java.io.IOException - if node can't be parsed
-
-