PdfOcr

Abstract Value Members

abstract def loadPdfDocument(path: Path)(implicit ec: ExecutionContext): Future[PdfDocument]

Attributes
protected
abstract val tesseract: Tesseract

Attributes
protected

Concrete Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def makeSearchablePdf(in: Path, out: Path, languages: Seq[Locale], progress: (Int, Int) ⇒ Boolean)(implicit ec: ExecutionContext): Future[Unit]

Runs OCR on each page of the input that has fewer than 100 characters of text, and outputs a valid, searchable PDF.
This method can throw some exceptions that are entirely natural:
* PdfEncryptedException: the input PDF needs a password.
* PdfInvalidException: the input PDF contains unrecoverable errors.
* TesseractMissingException: Tesseract cannot be run.
* TesseractLanguageMissingException: Tesseract needs a language file.
It may also throw exceptions you should probably never see:
* FileNotFoundException: the input file or output directory is missing.
* SecurityException: you cannot read the input or write the output.
* TesseractFailedException: Tesseract did not run properly.
* OutOfMemoryException: PDFBox has an evil bug.
If the returned Future fails, or if progress() returns false, out will not be written.
in
Path to input, which must be a valid PDF file.
out
Path to output, which will be overwritten or deleted.
languages
Languages to use for OCR.
progress
Function to call with (nPagesCompleted, nPagesTotal) after every page. The first call will be (0, nPagesTotal) and the last call will be (nPagesTotal, nPagesTotal). If the function ever returns false, the future will complete and out will not be written. (This is how callers can cancel a lengthy OCR process.)
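
The progress and failure contract described above can be sketched as follows. This is an illustration only, not the library's implementation: SearchablePdfMaker, FakePdfOcr and ProgressDemo are hypothetical names introduced here, and the fake performs no OCR. Only the makeSearchablePdf signature is copied from this page, so that the example compiles without the library on the classpath.

```scala
import java.io.FileNotFoundException
import java.nio.file.{Files, Path, Paths}
import java.util.Locale
import scala.collection.mutable.ArrayBuffer
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical stand-in: mirrors only the documented method signature.
trait SearchablePdfMaker {
  def makeSearchablePdf(in: Path, out: Path, languages: Seq[Locale],
                        progress: (Int, Int) => Boolean)
                       (implicit ec: ExecutionContext): Future[Unit]
}

// Hypothetical fake: pretends every input has `nPages` pages and does no OCR.
// It only reproduces the documented calling pattern of `progress`.
class FakePdfOcr(nPages: Int) extends SearchablePdfMaker {
  override def makeSearchablePdf(in: Path, out: Path, languages: Seq[Locale],
                                 progress: (Int, Int) => Boolean)
                                (implicit ec: ExecutionContext): Future[Unit] = Future {
    if (!Files.exists(in)) throw new FileNotFoundException(in.toString)
    // Calls progress(0, n), progress(1, n), ... and stops at the first false.
    val completed = (0 to nPages).forall(i => progress(i, nPages))
    // A real implementation would write the searchable PDF here.
    if (completed) Files.write(out, Array.emptyByteArray)
    ()
  }
}

object ProgressDemo {
  import ExecutionContext.Implicits.global

  // Cancels after two pages; returns the recorded calls and whether `out` exists.
  def runCancelled(): (List[(Int, Int)], Boolean) = {
    val in = Files.createTempFile("input", ".pdf")
    val out = in.resolveSibling(in.getFileName.toString + ".out.pdf")
    val calls = ArrayBuffer.empty[(Int, Int)]
    val progress: (Int, Int) => Boolean = { (done, total) =>
      calls += ((done, total))
      done < 2 // returning false asks the OCR run to stop
    }
    val future = new FakePdfOcr(nPages = 3)
      .makeSearchablePdf(in, out, Seq(Locale.ENGLISH), progress)
    Await.result(future, 10.seconds)
    val written = Files.exists(out)
    Files.deleteIfExists(in)
    (calls.toList, written)
  }

  // Turns a failed Future into a message instead of letting it propagate.
  def runMissingInput(): String = {
    val future = new FakePdfOcr(nPages = 3)
      .makeSearchablePdf(Paths.get("definitely-missing-input.pdf"),
                         Paths.get("unused-output.pdf"),
                         Seq(Locale.ENGLISH), (_, _) => true)
      .map(_ => "ok")
      .recover { case e: FileNotFoundException => s"missing input: ${e.getMessage}" }
    Await.result(future, 10.seconds)
  }

  def main(args: Array[String]): Unit = {
    println(runCancelled())
    println(runMissingInput())
  }
}
```

With the real trait, the same pattern should apply: pass a callback that returns false to cancel, and attach recover (or onComplete) to the returned Future to handle the exceptions listed above.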
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Docs: object PdfOcr | package pdfocr

trait PdfOcr extends AnyRef

