Kelvin Conversion Engine
All Formats, One API
Legal data workflows almost always involve converting documents from one format to another. In cases like M&A diligence, source documents often include PDFs, Word documents, Excel spreadsheets, PowerPoint presentations, HTML pages, and images. In other situations, such as CLM migrations or DMS search, the source documents might span many historical versions of Word, WordPerfect, EML, and PDF files.
Converting these documents often takes significant manual or engineering effort. Whether the target is plain text for natural language processing or PDF/Word for display and editing, the process is often frustrating.
The Kelvin Conversion Engine solves this problem by providing a single API endpoint for a wide variety of document conversion and extraction tasks, including:
- identifying the exact format of a document
- extracting textual information from a document
- extracting images or diagrams from a document
- converting between common word processing formats like .doc, .docx, and .wpd
- converting between word processing and display formats like PDF or HTML
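As an illustration of what the first task in the list above involves, here is a minimal sketch of format identification by leading "magic" bytes. It is not the Kelvin Conversion Engine's implementation or API; the function names `sniff_format` and `_classify_ooxml` and the returned labels are hypothetical, chosen only for this example. It relies on well-known file signatures: PDF files begin with `%PDF-`, legacy Office files (.doc, .xls, .ppt) share the OLE2 container signature, WordPerfect files begin with `\xffWPC`, and .docx/.xlsx/.pptx files are ZIP packages that differ in their internal folder names.

```python
import io
import zipfile

# Leading "magic" bytes for some of the formats discussed on this page.
# The labels on the right are hypothetical names used only in this sketch.
MAGIC_SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # container for legacy .doc/.xls/.ppt
    (b"\xffWPC", "wpd"),
    (b"PK\x03\x04", "zip"),  # .docx/.xlsx/.pptx are ZIP packages
]


def _classify_ooxml(data: bytes) -> str:
    """Look inside a ZIP package to tell .docx, .xlsx, and .pptx apart."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        return "zip"  # ZIP signature present, but the archive is unreadable
    for prefix, label in (("word/", "docx"), ("xl/", "xlsx"), ("ppt/", "pptx")):
        if any(name.startswith(prefix) for name in names):
            return label
    return "zip"  # a plain ZIP archive, not an Office package


def sniff_format(data: bytes) -> str:
    """Return a coarse format label based on the file's leading bytes."""
    for magic, label in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return _classify_ooxml(data) if label == "zip" else label
    return "unknown"
```

A caller would read a file's bytes and pass them to `sniff_format`. Note that the OLE2 signature alone cannot distinguish .doc from .xls or .ppt, which is one reason exact format identification takes more than a file extension or a few leading bytes.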
The Kelvin Conversion Engine currently supports the following formats:
Format | Text Extraction | Image Extraction | Format Conversion |
---|---|---|---|
Word (.doc, 97-2003) | ✅ | ✅ | ✅ |
Word (.docx, 2007-) | ✅ | ✅ | ✅ |
WordPerfect (.wpd) | ✅ | ⚙️ | ✅ |
Excel (.xls, 97-2003) | ✅ | ⚙️ | ✅ |
Excel (.xlsx, 2007-) | ✅ | ⚙️ | ✅ |
PowerPoint (.ppt, 97-2003) | ✅ | ✅ | ✅ |
PowerPoint (.pptx, 2007-) | ✅ | ✅ | ✅ |
PDF ("digital") | ✅ | ✅ | ✅ |
PDF ("image or OCR") | ✅ | ⚙️ | ✅ |
HTML | ✅ | ✅ | ✅ |
EML | ✅ | ✅ | ✅ |
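
For readers who want to check the table above programmatically, for example to route documents before submitting them, the following sketch encodes it as a lookup structure. The format keys, task names, and helper functions are hypothetical and not part of the Kelvin API. The table does not define the ⚙️ marker, so the sketch records it as a separate `GEAR` status rather than guessing at its meaning.

```python
from enum import Enum


class Support(Enum):
    CHECK = "check"  # shown as a check mark in the table
    GEAR = "gear"    # shown as a gear in the table; meaning not defined there


# Column order of the table; these task names are assumptions for this sketch.
TASKS = ("text_extraction", "image_extraction", "format_conversion")

C, G = Support.CHECK, Support.GEAR

# One entry per table row, in table order; keys are hypothetical labels.
SUPPORT_MATRIX = {
    "doc": (C, C, C),
    "docx": (C, C, C),
    "wpd": (C, G, C),
    "xls": (C, G, C),
    "xlsx": (C, G, C),
    "ppt": (C, C, C),
    "pptx": (C, C, C),
    "pdf_digital": (C, C, C),
    "pdf_image_or_ocr": (C, G, C),
    "html": (C, C, C),
    "eml": (C, C, C),
}


def support_for(fmt: str, task: str) -> Support:
    """Return the table's marker for one format and one task."""
    return SUPPORT_MATRIX[fmt][TASKS.index(task)]


def formats_with(task: str, level: Support) -> list[str]:
    """List formats whose marker for `task` equals `level`, in table order."""
    return [fmt for fmt in SUPPORT_MATRIX if support_for(fmt, task) is level]
```

Calling `formats_with("image_extraction", Support.GEAR)` returns the four rows marked with a gear in the Image Extraction column: `wpd`, `xls`, `xlsx`, and `pdf_image_or_ocr`.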