Kelvin NLP
Kelvin NLP is a comprehensive natural language processing library designed to provide both low-level and high-level functionality via Python or API access. The core functionality of the library is implemented without dependence on external frameworks like spaCy, NLTK, or OpenNLP, allowing for high-performance, low-impact, permissive use at scale.
While not required, for convenience and compatibility, Kelvin NLP does provide an interface to libraries like spaCy or transformers and to external APIs like GPT from OpenAI/Azure Cognitive Services or Claude from Anthropic.
Kelvin NLP can be used as:
- A high-performance Python library with minimal external dependencies and no remote APIs
- A convenient Python library with integration into frameworks like spaCy or transformers
- A stateless, high-performance API with an OpenAPI schema compatible with nearly 50 languages
The following durable storage options are available:
- Filesystem
- Postgres
- SQL Server
Use Cases and Tasks
Kelvin NLP is designed to simplify the following common use cases:
- Document Search
- Document or Text Clustering (see the sketch after this list)
- Document or Text Classification
- Feature Engineering
- Feature Storage
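A generic open-source sketch of the clustering and feature engineering use cases is shown below. It does not use Kelvin NLP's own API; it only illustrates the underlying workflow (TF-IDF features plus k-means) with scikit-learn and made-up example documents.

```python
# Illustrative only: a generic document clustering workflow using scikit-learn,
# not Kelvin NLP's own API. The documents below are made-up examples.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

documents = [
    "This agreement shall be governed by the laws of the State of New York.",
    "The term of this lease shall commence on January 1 and continue for two years.",
    "Either party may terminate this agreement upon thirty days written notice.",
    "Tenant shall pay rent on the first day of each calendar month.",
]

# Feature engineering: build a sparse TF-IDF matrix over word unigrams and bigrams.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
features = vectorizer.fit_transform(documents)

# Cluster the documents into two groups and print the assignments.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(features)

for text, label in zip(documents, labels):
    print(label, text[:60])
```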
Kelvin NLP provides support for the following tasks:
Low-Level Tasks
- tokenization of legal documents
- segmentation of legal documents into sentences, paragraphs, clauses, sections, or subdocuments
- classification of characters or tokens into common types (e.g., punctuation, enumeration, Roman numerals)
- generation of token or character n-gram sequences (see the sketch after this list)
- efficient lookup structure construction (e.g., DAWGs, B-K trees)
- efficient similarity structure construction (e.g., locality-sensitive hashes or forests)
- sparse or dense frequency matrix construction
- dictionary or stopword construction
- JSONL or XML training sample construction
- custom NLP model training via gensim, scikit-learn, and transformers
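To make the n-gram and sparse-matrix primitives above concrete, the following sketch uses only the Python standard library; it illustrates the concepts rather than Kelvin NLP's actual functions.

```python
# Illustrative only: character/token n-gram generation and a sparse frequency
# count using the standard library, not Kelvin NLP's own API.
from collections import Counter

def char_ngrams(text: str, n: int):
    """Yield overlapping character n-grams from a string."""
    for i in range(len(text) - n + 1):
        yield text[i : i + n]

def token_ngrams(tokens, n):
    """Yield overlapping token n-grams from a token sequence."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i : i + n])

# A sparse representation: one Counter per document, storing only observed n-grams.
documents = ["force majeure", "force of law"]  # made-up examples
frequencies = [Counter(char_ngrams(doc, 3)) for doc in documents]

print(frequencies[0].most_common(3))
print(list(token_ngrams("the party of the first part".split(), 2)))
```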
High-Level Tasks
- word or phrase similarity (string or token distance metrics; see the sketch after this list)
- document or segment embedding construction (shallow via gensim, deep via transformers)
- clustering by term, n-gram, embeddings, or LLM
- classification by term, n-gram, embeddings, or LLM
- information extraction (built-in offline models or external LLM)
  - addresses
  - dates
  - durations
  - entities (persons, companies, etc.)
  - money or currency
  - numeric values
  - percentages or ratios
- summarization via LLM (e.g., GPT, T5, or other transformers models)
- knowledge graph extraction via LLM
- question answering via LLM
- generative patterns like drafting emails or memos via LLM
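As a rough illustration of the string-distance style of similarity listed above, the snippet below uses the standard library's difflib rather than Kelvin NLP's own metrics; embedding- or LLM-based approaches would swap in a different scoring function.

```python
# Illustrative only: string-distance similarity via the standard library's
# difflib, not Kelvin NLP's own similarity metrics.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

phrases = ["governing law", "governing laws", "termination for convenience"]
query = "Governing Law"

for phrase in phrases:
    print(f"{similarity(query, phrase):.2f}  {phrase}")
```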
LLM Support
For more information about which large language models can be used with Kelvin NLP, please see the Supported LLMs page.
API and Library Documentation
You can learn more about Kelvin NLP by reviewing the PyDoc documentation and OpenAPI schemas provided with the library. To access the PyDoc documentation, you can run the following command:
kelvin --docs kelvin.nlp
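If you are using the stateless API, one way to explore it is to read the bundled OpenAPI schema directly. The sketch below lists the declared endpoints; the schema path is a placeholder, so substitute the location of the schema file shipped with your installation.

```python
# Illustrative only: list the endpoints declared in an OpenAPI schema.
# "openapi.json" is a placeholder path, not the actual file name.
import json

with open("openapi.json", "r", encoding="utf-8") as schema_file:
    schema = json.load(schema_file)

for path, operations in schema.get("paths", {}).items():
    print(path, "->", ", ".join(op.upper() for op in operations))
```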
Examples
- Segmenting and tokenizing documents with Kelvin NLP
- Working with tokens and n-grams in Kelvin NLP
- Using Kelvin NLP embedding models
- Creating your own embedding models with Kelvin NLP
- Question answering with Kelvin NLP
- Summarizing documents with Kelvin NLP
- Finding named entities like people and companies with Kelvin NLP
- Finding similar clauses in contracts with Kelvin embeddings
- Summarizing invoices with Kelvin Billing and Kelvin NLP
- Summarize and Search Federal Register documents with Kelvin NLP
- Summarize and Search EDGAR Filings with Kelvin NLP
- M&A Deal Room with Kelvin Source and NLP