docTR, powered by TensorFlow 2 and PyTorch, makes OCR seamless and accessible to anyone.


Get the pre-trained model

End-to-end OCR is implemented in docTR using a two-stage approach: text detection (locating words), then text recognition (recognizing all characters in words). Therefore, an architecture for text detection and an architecture for text recognition can be selected from the list of available implementations.

from doctr.models import ocr_predictor

model = ocr_predictor(det_arch='db_resnet50', reco_arch='crnn_vgg16_bn', pretrained=True)
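To make the two-stage idea concrete, here is a toy sketch in plain Python. It is not docTR's implementation: `run_two_stage`, `crop`, `toy_detect`, and `toy_recognize` are hypothetical names introduced only for illustration, and the "image" is just a nested list of pixel values. It shows the control flow described above: a detector proposes word boxes, then a recognizer reads each cropped box.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels
Image = List[List[int]]          # toy grayscale image as nested lists


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of `image`."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def run_two_stage(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 locates words; stage 2 reads the characters in each crop."""
    return [(box, recognize(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins for the two models (assumptions, not docTR code).
def toy_detect(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 0, 6, 4)]


def toy_recognize(word_crop: Image) -> str:
    # Pretend the "text" is just the crop size, as "<width>x<height>".
    return f"{len(word_crop[0])}x{len(word_crop)}"


blank_page = [[0] * 6 for _ in range(4)]  # 4 rows, 6 columns
words = run_two_stage(blank_page, toy_detect, toy_recognize)
```

Because the two stages only meet at the list of boxes, either one can be swapped independently, which is why `ocr_predictor` accepts separate `det_arch` and `reco_arch` arguments.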

Read files

Documents can be read from PDF files or images:

from doctr.io import DocumentFile
# PDF
pdf_doc = DocumentFile.from_pdf("path/to/your/doc.pdf").as_images()
# Image
single_img_doc = DocumentFile.from_images("path/to/your/img.jpg")
# Webpage
webpage_doc = DocumentFile.from_url("https://www.yoursite.com").as_images()
# Multiple page images
multi_img_doc = DocumentFile.from_images(["path/to/page1.jpg", "path/to/page2.jpg"])
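When the input type is not known in advance, a small dispatcher can choose among the constructors shown above. The sketch below is a hypothetical helper, not part of docTR: `pick_loader` and the `IMAGE_SUFFIXES` set are assumptions made for illustration, and the function only returns the name of the `DocumentFile` method to call, so it runs without docTR installed.

```python
from pathlib import Path
from urllib.parse import urlparse

# Assumed set of image extensions; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def pick_loader(source: str) -> str:
    """Return the name of the DocumentFile constructor suited to `source`."""
    # Web pages are identified by their URL scheme.
    if urlparse(source).scheme in ("http", "https"):
        return "from_url"
    suffix = Path(source).suffix.lower()
    if suffix == ".pdf":
        return "from_pdf"
    if suffix in IMAGE_SUFFIXES:
        return "from_images"
    raise ValueError(f"Unsupported input: {source!r}")
```

With docTR installed, the returned name could then be resolved with `getattr(DocumentFile, pick_loader(source))(source)`, followed by `.as_images()` for PDFs and web pages as in the snippet above.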

Putting the pieces together with the default pre-trained model:

from doctr.io import DocumentFile
from doctr.models import ocr_predictor

model = ocr_predictor(pretrained=True)
doc = DocumentFile.from_pdf("path/to/your/doc.pdf").as_images()
# Analyze
result = model(doc)
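Once the document has been analyzed, the recognized text usually needs to be pulled out of the result. The sketch below assumes the result has been converted to a nested dictionary with `result.export()`, and that the dictionary follows a pages → blocks → lines → words layout with each word's text under a `"value"` key. That layout is an assumption here, so check it against the version of docTR you have installed. `export_to_lines` and `sample_export` are hypothetical names used only for illustration.

```python
from typing import Any, Dict, List


def export_to_lines(export: Dict[str, Any]) -> List[str]:
    """Flatten an exported OCR result into one string per text line."""
    text_lines: List[str] = []
    for page in export.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                words = [word["value"] for word in line.get("words", [])]
                if words:  # skip lines without any recognized word
                    text_lines.append(" ".join(words))
    return text_lines


# Hand-written stand-in for `result.export()` (assumed layout, made-up text).
sample_export = {
    "pages": [
        {
            "blocks": [
                {
                    "lines": [
                        {"words": [{"value": "Hello"}, {"value": "world"}]},
                        {"words": []},
                        {"words": [{"value": "docTR"}]},
                    ]
                }
            ]
        }
    ]
}

text_lines = export_to_lines(sample_export)
```

In practice the call would be `export_to_lines(result.export())`; working on the plain dictionary keeps the helper independent of the deep-learning backend.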


Installing docTR requires Python 3.6 (or higher) and pip.

Because docTR relies on weasyprint, additional dependencies are required on systems other than Linux.

On macOS, they can be installed with Homebrew as follows:

brew install cairo pango gdk-pixbuf libffi

For Windows users, these dependencies are included with GTK.

The latest version

The latest release of the package can be installed from PyPI as follows:

pip install python-doctr
