Kelvin Document Index

Introduction to key concepts

End-to-end document analytics in a single API

The Kelvin Document Index is a single API endpoint that supports a wide variety of document analytics tasks. Organizations can build Kelvin libraries and APIs into their own architectures independently, but the Document Index provides a simple, turnkey solution for many common use cases.

For example, the Document Index can be used to:

  • Upload and centrally store documents for collaboration
  • Identify exact and near-duplicate documents
  • Extract text from documents
  • Extract metadata from documents
  • Extract images and tables from documents
  • Extract common information like named entities (parties), dates, or financial amounts
  • Convert documents to common formats like PDF, HTML, and plain text
  • Find similar text and documents
  • Cluster and classify text and documents
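
To make the difference between exact and near-duplicate identification concrete, the following is a minimal, generic sketch in Python. It does not show the Document Index API or how Kelvin implements duplicate detection; the function names (exact_fingerprint, shingles, jaccard) and the choice of three-word shingles are assumptions made purely for illustration. Exact duplicates are found by comparing content hashes, while near-duplicates are scored by the overlap between sets of word n-grams.

```python
import hashlib


def exact_fingerprint(text: str) -> str:
    """Return the SHA-256 hex digest of the text (hypothetical helper).

    Two texts with the same digest are treated as exact duplicates.
    """
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams in lowercased text."""
    words = text.lower().split()
    if len(words) < size:
        # Short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Return the Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = "The parties agree to the terms set out in this agreement."
    revised = "The parties agree to the terms set out in this Agreement."

    # The hashes differ because one character changed case...
    print(exact_fingerprint(original) == exact_fingerprint(revised))
    # ...but the shingle overlap still flags the pair as near-duplicates.
    print(jaccard(original, revised))
```

In a sketch like this, a pair would be reported as a near-duplicate when its similarity exceeds a chosen threshold; the right threshold depends on the document collection and is not specified by the source.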

The Document Index also natively supports many state-of-the-art workflows using large language models (LLMs) and generative AI. For example, the Document Index supports:

  • Direct summarization of documents
  • Direct Q&A on documents
  • Memo-writing and other work product generation tasks
  • Retrieval-augmented tasks
  • "AutoGPT"-style workflows

Customization and extension

The Document Index is itself built on the Kelvin API and Kelvin libraries, including Kelvin Source, Kelvin Vector, Kelvin NLP, Kelvin Speller, Kelvin OCR, and the Kelvin Office and Kelvin PDF APIs. As a result, the Document Index serves as a reference implementation for Kelvin, and it can be extended and customized with the same tools that are available to all Kelvin users.