Kelvin OCR

Kelvin OCR is a library that provides a modular document and image OCR pipeline for the Kelvin Legal Data OS. It currently implements both local and remote engines using Tesseract, PaddleOCR, and API services from AWS, Azure, and Google. It is designed to be simple, powerful, and seamless to integrate into existing workflows.

Kelvin OCR automatically pre-processes input images, tunes OCR hyperparameters, and calculates quality scores using Kelvin NLP and Kelvin Speller. Unlike other OCR libraries, Kelvin OCR is designed to be used in production workflows and is optimized for speed, accuracy, and downstream UI or process integration.
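
To give a feel for what a spelling-based quality score can look like, here is a minimal sketch in plain Python. It is not Kelvin OCR's scoring method and does not use Kelvin NLP or Kelvin Speller; the `quality_score` function and the tiny `VOCAB` word list are hypothetical stand-ins chosen for illustration. The idea is simply that OCR output with more recognizable words is probably of higher quality.

```python
import re

# Hypothetical mini-vocabulary for illustration only; a real scorer would
# rely on a full dictionary or language model.
VOCAB = frozenset({"the", "court", "finds", "that", "defendant", "is", "liable"})


def quality_score(text: str, vocab: frozenset = VOCAB) -> float:
    """Return the fraction of alphabetic tokens that appear in vocab.

    Returns 0.0 when the text contains no alphabetic tokens.
    """
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in vocab)
    return known / len(tokens)


clean = "The court finds that the defendant is liable"
noisy = "Tbe c0urt flnds that tne defendant is liab1e"

# The clean text scores higher than the text with OCR-style character errors.
print(quality_score(clean), quality_score(noisy))
```

A score like this could be used to compare the output of different engines or pre-processing settings on the same page and keep the best result, which is one plausible reason to compute quality scores in an OCR pipeline.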

Input Formats

Kelvin OCR supports the input formats below:

Format | API (docker) | Python
PNG
JPEG
TIFF
PDF⚠️

⚠ Note that PDF support is best provided by the Kelvin OCR API. This is because Kelvin uses a high-performance, secure PDF library that is difficult to compile or vendor on Windows and Mac OS X, so this functionality is most easily used via the API, which uses a Linux container.
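
One way to act on this note in calling code is to route files by extension before OCR. The sketch below is an assumption-laden illustration, not part of Kelvin OCR: `choose_backend` and the `"api"` / `"python"` labels are hypothetical names, and it assumes the image formats listed above can be handled by either route while PDFs go to the containerized API.

```python
from pathlib import Path

# File suffixes for the image formats listed above (hypothetical mapping).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def choose_backend(path: str) -> str:
    """Pick a hypothetical backend label for an input file.

    PDFs are sent to the API (Linux container); images stay in-process.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        return "api"
    if suffix in IMAGE_SUFFIXES:
        return "python"
    raise ValueError(f"Unsupported input format: {suffix or path!r}")


for name in ("contract.pdf", "scan.TIFF", "page-001.png"):
    print(name, "->", choose_backend(name))
```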

Output Formats

Kelvin OCR supports the output formats below:

Format | API (docker) | Python
Text
TSV
PDF
HOCR PDF

⚠ Note that PDF support is best provided by the Kelvin OCR API. This is because Kelvin uses a high-performance, secure PDF library that is difficult to compile or vendor on Windows and Mac OS X, so this functionality is most easily used via the API, which uses a Linux container.

API and Library Documentation

You can learn more about Kelvin OCR by reviewing the PyDoc documentation and OpenAPI schemas provided with the library. To access the PyDoc documentation, you can run the following command:

kelvin --docs kelvin.ocr

Examples