Package 

Interface TextExtractor

  • All Superinterfaces:
    java.lang.AutoCloseable

    
    public interface TextExtractor
     extends AutoCloseable
                        

    Interface to extract text from a PDF.

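    Obtain an instance of an implementing class, call getPageCount() to find the number of pages, then call extract once per page. Close the extractor when finished.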
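
    The per-page loop described above can be sketched as follows. This is a minimal illustration, not the library's own code: the Text class and the TextExtractor interface here are hypothetical stand-ins that mirror only the two documented method signatures, InMemoryExtractor is an invented implementation that lets the example run without a PDF, and 1-based page numbering is an assumption that this page does not confirm.

```java
import java.util.ArrayList;
import java.util.List;

public class TextExtractorDemo {

    // Hypothetical stand-in for the library's Text type; the real class has its own fields.
    static final class Text {
        final String content;
        Text(String content) { this.content = content; }
    }

    // Mirrors the two documented methods, plus close() from AutoCloseable.
    interface TextExtractor extends AutoCloseable {
        Integer getPageCount();
        List<Text> extract(Integer pageNumber);
        @Override void close(); // narrowed to throw no checked exception, for brevity
    }

    // Hypothetical in-memory implementation so the sketch runs without a PDF.
    static final class InMemoryExtractor implements TextExtractor {
        private final List<List<Text>> pages;
        boolean closed = false;

        InMemoryExtractor(List<List<Text>> pages) { this.pages = pages; }

        @Override public Integer getPageCount() { return pages.size(); }

        @Override public List<Text> extract(Integer pageNumber) {
            return pages.get(pageNumber - 1); // assumption: pages are numbered from 1
        }

        @Override public void close() { closed = true; }
    }

    // Walks every page and collects "page: content" lines; closes the extractor when done.
    static List<String> extractAll(TextExtractor extractor) {
        List<String> lines = new ArrayList<>();
        try (TextExtractor ex = extractor) {
            int pageCount = ex.getPageCount();
            for (int page = 1; page <= pageCount; page++) {
                for (Text text : ex.extract(page)) {
                    lines.add(page + ": " + text.content);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        TextExtractor extractor = new InMemoryExtractor(List.of(
                List.of(new Text("Title"), new Text("First paragraph")),
                List.of(new Text("Second page body"))));
        extractAll(extractor).forEach(System.out::println);
    }
}
```

    The try-with-resources block relies on TextExtractor being AutoCloseable, so the extractor is closed even if extract throws partway through the document.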

    • Method Summary

      Modifier and Type Method Description
      abstract Integer getPageCount() Get the number of pages.
      abstract List<Text> extract(Integer pageNumber) Extract the text blocks from one page of the PDF.
      • Methods inherited from interface java.lang.AutoCloseable

        close
    • Method Detail

      • extract

         abstract List<Text> extract(Integer pageNumber)

        Extract the text blocks from one page of the PDF and return them as a list.

        Parameters:
        pageNumber - Page to extract text from