Package 

Class PdfboxTextExtractor

  • All Implemented Interfaces:
    io.mfj.textricator.extractor.TextExtractor, java.lang.AutoCloseable

    
    public final class PdfboxTextExtractor
     implements TextExtractor
                        
    • Method Summary

      Modifier and Type   Method                        Description
      List<Text>          extract(Integer pageNumber)   Extract text from the given page of the PDF, returning the text blocks as a list.
      Integer             getPageCount()                Get the number of pages.
      Unit                close()
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • PdfboxTextExtractor

        PdfboxTextExtractor(InputStream input)
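
        The methods summarized above suggest a simple usage pattern: open the extractor, loop over the pages reported by getPageCount(), collect what extract(pageNumber) returns, and let try-with-resources call close(). The following is a minimal sketch of that pattern, not the library's own code. So that it compiles without the textricator jar, it uses a hypothetical stand-in interface (PageExtractor) and an in-memory fake (FakeExtractor), and it simplifies List<Text> to List<String>. It also assumes page numbers start at 1, which this page does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExtractAllPagesSketch {

    // Hypothetical stand-in that mirrors the method summary above.
    // The real extract() returns List<Text>; List<String> is used here
    // only so that the sketch is self-contained.
    interface PageExtractor extends AutoCloseable {
        List<String> extract(Integer pageNumber);
        Integer getPageCount();
        @Override
        void close();
    }

    // Collects the blocks of every page, in page order.
    // Assumption: page numbers are 1-based.
    static List<String> extractAll(PageExtractor extractor) {
        List<String> blocks = new ArrayList<>();
        int pageCount = extractor.getPageCount();
        for (int page = 1; page <= pageCount; page++) {
            blocks.addAll(extractor.extract(page));
        }
        return blocks;
    }

    // In-memory fake, used only to exercise the loop without a real PDF.
    static final class FakeExtractor implements PageExtractor {
        private final List<List<String>> pages;
        boolean closed = false;

        FakeExtractor(List<List<String>> pages) {
            this.pages = pages;
        }

        @Override
        public List<String> extract(Integer pageNumber) {
            return pages.get(pageNumber - 1);
        }

        @Override
        public Integer getPageCount() {
            return pages.size();
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        List<List<String>> pages = Arrays.asList(
                Arrays.asList("Title", "Intro"),
                Arrays.asList("Body"));
        // try-with-resources guarantees close() runs, even if extraction throws.
        try (FakeExtractor extractor = new FakeExtractor(pages)) {
            System.out.println(extractAll(extractor)); // prints [Title, Intro, Body]
        }
    }
}
```

        With the real class, the resource in the try statement would presumably be new PdfboxTextExtractor(input), where input is an InputStream over the PDF, and the loop body would receive Text objects rather than strings.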