Expertise Management

Finding the right person for the job

Law firms create value and build strong relationships by providing access to the right people at the right time. The larger the firm, however, the greater the challenge.

As many firms grow to hundreds or thousands of partners and associates, it becomes increasingly difficult to find the right person with the right expertise, experience, and availability to meet a client's needs. This is especially true for new associates or lateral hires who may not have had the opportunity to build a reputation within the firm.

How then can a firm ensure that its clients are connected with the right people? How can corporate buyers find the right firm or attorney for their matter?

Expertise and experience management systems

The solution, in theory, is for firms to implement knowledge management systems that capture and organize information about their attorneys' expertise and experience. In practice, however, this is easier said than done.

The challenge is that attorneys are busy and don't have time to manually enter data into knowledge management systems. As a result, these systems are often incomplete and out of date. This makes it difficult for firms to find the right people for the job and for corporate buyers to find the right firm or attorney for their matter.

The real solution, then, is to implement or enrich knowledge management systems through the use of automation. While such systems have existed in some form for years, the dramatic improvements in large language models have unlocked a new level of accuracy and utility.

Thankfully, the Kelvin Legal Data OS makes it easy to build and deploy such systems. Keep reading to see how simple it is to create a system that automatically extracts expertise and experience data.

Using a standard taxonomy like SALI

Firms have historically managed their own standards related to expertise and experience, like the taxonomy of practice areas or industries. However, this "home-grown" approach leads to ongoing costs and inefficiencies. Furthermore, as more clients push firms to standardize their reporting or submit data for Requests for Proposals (RFPs), the value of open, shared standards has become clear.

The Standards Advancement for the Legal Industry (SALI) provides an answer to these problems. Motivated by the need for a common language to describe legal matters, SALI has developed a standard taxonomy for practice areas, industries, and other legal concepts: the Legal Matter Standard Specification (LMSS).

The Kelvin Legal Data OS supports the SALI taxonomy out of the box, making it easy to access information about areas of law, industries, or other legal concepts. This means that you can build knowledge management systems that are compatible with the SALI standard without worrying about the underlying data.

Let's look at how easy it is to use Kelvin to achieve this goal. We'll demonstrate how to use Kelvin Source, Kelvin NLP, and Kelvin Graph to retrieve firm profiles, match them up to SALI's Areas of Law and Industries, and then add Languages.

Retrieving Area of Law and Industry taxonomies

First, we'll load the SALI LMSS taxonomy data related to Areas of Law and Industries using Kelvin Graph. This will allow us to "ground" or "guide" a large language model like GPT-4 or MPT-7B.

# imports
from kelvin.graph.models.sali.lmss import LMSSGraph

# create a graph from current LMSS standard
# see:
lmss_graph = LMSSGraph()

# get the list of top-level areas of law from the graph
aol_list = []
for concept in lmss_graph.get_areas_of_law(max_depth=1):

# get the list of top-level industries
industry_list = []
for concept in lmss_graph.get_industries(max_depth=1):
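
To picture what "grounding" means here, the sketch below turns a list of taxonomy labels into a bulleted section that can be pasted into a prompt. It assumes the labels are available as plain strings; the helper name `format_taxonomy_section` is hypothetical and not part of Kelvin, and the sample labels are borrowed from the example output later in this walkthrough.

```python
def format_taxonomy_section(title: str, labels: list[str]) -> str:
    """Render a heading followed by one bullet per unique label."""
    # strip whitespace, drop empty strings and duplicates, sort for stable output
    unique_labels = sorted({label.strip() for label in labels if label.strip()})
    bullets = "\n".join(f"- {label}" for label in unique_labels)
    return f"### {title}\n{bullets}"


# sample labels only; a real list would come from the taxonomy graph
aol_section = format_taxonomy_section(
    "Areas of Law",
    ["Corporate Law", "Securities and Financial Instruments Law", "Corporate Law"],
)
print(aol_section)
```

Restricting the model to a closed list like this is what lets the labels it returns be checked against the taxonomy afterwards.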

Retrieving firm profiles

Next, we'll use Kelvin Source to retrieve firm profiles from the Shearman & Sterling M&A Practice. Kelvin Source supports both simple HTTP clients and headless browser automation for more complex site scraping.
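
The link-filtering idea can be tried on its own, without Kelvin or a browser. The sketch below swaps in Python's standard-library `html.parser` in place of lxml and runs against a hypothetical HTML fragment. The class name `ProfileLinkParser` and the sample paths are invented for illustration, and skipping vCard links is an assumption about what the `vcard` check in the walkthrough is for.

```python
from html.parser import HTMLParser


class ProfileLinkParser(HTMLParser):
    """Collect hrefs that look like attorney profile links."""

    def __init__(self) -> None:
        super().__init__()
        self.profile_urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        # keep /en/people/ links, skip vCard downloads, avoid duplicates
        if "/en/people/" in href and "vcard" not in href.lower():
            if href not in self.profile_urls:
                self.profile_urls.append(href)


# hypothetical directory markup, not taken from any real site
sample_html = """
<ul>
  <li><a href="/en/people/a/example-attorney">Example Attorney</a></li>
  <li><a href="/en/people/a/example-attorney/vcard">vCard</a></li>
  <li><a href="/en/offices/new-york">New York</a></li>
</ul>
"""

parser = ProfileLinkParser()
parser.feed(sample_html)
profile_urls = parser.profile_urls
print(profile_urls)
```

A plain parser like this only sees the HTML it is given; pages that build their directory with JavaScript still need the rendered HTML from a headless browser.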

# imports
import lxml.html

# load the page in a headless Chrome browser and return the rendered HTML
directory_url = ""
firm_source = HttpBrowserSource(directory_url)

# get the directory page
for page in firm_source:
    # parse the HTML to retrieve the attorneys
    html_doc = lxml.html.fromstring(

    # iterate through links that have /en/people/...
    for link in html_doc.xpath("//a[contains(@href, '/en/people/')]"):
        # get the url and link text
        url = link.attrib["href"]
        if "vcard" in url:


The output of this code is a list of URLs that point to individual attorney profiles, like this:


Using a large language model to label firm profiles

So far, we've managed to retrieve the list of attorneys and the URL to their profile. Next, we need to combine our SALI taxonomy data with a large language model to label each attorney profile with the appropriate Areas of Law and Industries.

To do this, we'll use Kelvin NLP to create a JSON-validating large language model engine. Then, we'll show how a simple step-by-step instruction prompt can be used to guide the model to perform the labeling task.

# load the page in a headless Chrome browser and return the rendered HTML
profile_source = HttpBrowserSource(profile_url)
for profile_page in profile_source:
  # parse the rendered HTML
  profile_doc = lxml.html.fromstring(

  # retrieve the overview and qualifications text
  overview_text = lxml.html.tostring(
      method="text", encoding="utf-8"

  qualifications_text = lxml.html.tostring(

  # combine them in markup for the LLM
  combined_text = f"""## Overview


The output of this code is a single string that contains the text from the overview and qualifications sections of the attorney's profile like this:

## Overview

George Casey is Global Managing Partner of the firm...

## Qualifications
        Boston University School of Law
        J.D., cum laude

Next, we'll use kelvin.nlp to create a JSON-validating large language model engine. Kelvin provides a number of other engines and related utilities, including engines for open-source, on-premises LLMs like MPT-7B and engines that validate XML or YAML output.

# create the LLM engine with a JSON-validating harness
# this will ensure that valid JSON is returned as a proper Python object
llm_gpt35 = JSONEngine(OpenAIEngine("gpt-3.5-turbo"))
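
For intuition about what a JSON-validating harness does, here is a minimal sketch of the general pattern: call the model, try to parse the reply, and retry if parsing fails. This is not Kelvin's implementation; `get_json_completion` and `fake_complete` are hypothetical names, and the "model" is a canned stand-in so the example runs offline.

```python
import json
from typing import Any, Callable


def get_json_completion(
    complete: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Any:
    """Call `complete` until its output parses as JSON, then return the object."""
    last_error = None
    for _ in range(max_attempts):
        raw_text = complete(prompt)
        try:
            return json.loads(raw_text)
        except json.JSONDecodeError as error:
            last_error = error  # malformed output: ask again
    raise ValueError(f"no valid JSON after {max_attempts} attempts") from last_error


# hypothetical stand-in for a model: truncated JSON first, valid JSON second
canned_responses = iter(['{"areas": [', '{"areas": ["Corporate Law"]}'])


def fake_complete(prompt: str) -> str:
    return next(canned_responses)


result = get_json_completion(fake_complete, "label this profile")
print(result)
```

The caller always receives a parsed Python object or an exception, never a raw string that might fail to parse later.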

We support dozens of LLMs currently, including GPT-4 through OpenAI or Azure, Claude from Anthropic, or transformers models like MPT-7B, GPT4All, Dolly2, and more. It's easy to add your own LLM engine for any other text completion or chat assistant-style model.

Finally, we'll construct a system prompt and user prompt for each attorney profile, sending this as input to the LLM engine. The system prompt will guide the LLM to perform the labeling task, while the user prompt will provide the text from the attorney's profile. The Kelvin NLP engine will then ensure that a valid JSON object is returned based on our SALI taxonomy.

# create the system prompt to guide the chat completion model
system_prompt = f"""You are building a database of law firm personnel.

### Areas of Law

### Industries

### Instructions
1. Read the text content containing Overview and Qualifications.
2. Summarize the information about the attorney as a JSON object:
    - areas: one or more Areas of Law from above that the attorney practices
    - industries: one or more Industries from above that the attorney has experience in
    - languages: languages listed in Qualifications that the attorney speaks
3. If you cannot find any structured data, return an empty JSON object.
4. Do not return any other data."""

# create the user prompt with the attorney's profile text
user_prompt = f"""
### Text Content

### Structured Data

Executing this code will return a JSON object that contains the structured data for the attorney's profile, like this:

# execute the prompt
profile_data = llm_gpt35.get_completion_values(


  'areas': [
    'Corporate Law',
    'Securities and Financial Instruments Law'
  'industries': [
    'Finance and Insurance Services Industry',
    'Professional Services Industry'
  'languages': [
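
Even with valid JSON, a model can return a label that is not in the taxonomy. A simple post-check, sketched below under the assumption that the allowed labels are held as sets of strings, drops anything outside the list. The function name `keep_known_labels` and the "Made-Up Practice Area" value are hypothetical.

```python
def keep_known_labels(profile_data: dict, allowed_labels: dict) -> dict:
    """Drop any label that is not in the allowed set for its field."""
    cleaned = {}
    for field_name, allowed in allowed_labels.items():
        values = profile_data.get(field_name, [])
        if not isinstance(values, list):
            values = []  # tolerate a missing or malformed field
        cleaned[field_name] = [value for value in values if value in allowed]
    return cleaned


# allowed labels reuse the values shown in the example output
allowed_labels = {
    "areas": {"Corporate Law", "Securities and Financial Instruments Law"},
    "industries": {
        "Finance and Insurance Services Industry",
        "Professional Services Industry",
    },
}
raw_profile = {
    "areas": ["Corporate Law", "Made-Up Practice Area"],
    "industries": "Professional Services Industry",  # wrong type: string, not list
}
cleaned_profile = keep_known_labels(raw_profile, allowed_labels)
print(cleaned_profile)
```

Languages are left out of this check because they are read from the profile text rather than chosen from a closed list.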

Integrating LLMs into your document management and timekeeping systems

Firm profiles are an excellent source of curated information. However, they may still be incomplete or out of date, and they may not be available for all attorneys. More importantly, firm profile pages do not capture information about contracting experience, whether an attorney currently has spare capacity, or whether there are any potential conflicts.

To address these issues, the Kelvin Legal Data OS can integrate into a number of other systems like document management systems, timekeeping systems, CRM, or practice management systems. This allows you to build a complete picture of your firm's capabilities and experience, including information drawn from documents, matters, and time entries.

For example, you can use Kelvin Billing to query recent Aderant time data and determine an attorney's average utilization over the last few days. See the Kelvin Billing examples for more information on how to do this.
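
As a rough illustration of the utilization calculation itself, independent of Kelvin Billing or Aderant, the sketch below averages billable hours over a short window. The record layout (name, date, hours), the attorney name, the hours, and the 8-hour daily capacity are all assumptions made for the example.

```python
from datetime import date, timedelta


def average_utilization(
    entries: list[tuple[str, date, float]],
    attorney: str,
    as_of: date,
    window_days: int = 5,
    daily_capacity_hours: float = 8.0,
) -> float:
    """Billable hours in the window divided by the capacity of the window."""
    window_start = as_of - timedelta(days=window_days - 1)
    billed_hours = sum(
        hours
        for name, worked_on, hours in entries
        if name == attorney and window_start <= worked_on <= as_of
    )
    return billed_hours / (window_days * daily_capacity_hours)


# hypothetical time entries: (attorney, date worked, billable hours)
time_entries = [
    ("Attorney A", date(2023, 6, 1), 9.0),  # falls outside the 3-day window
    ("Attorney A", date(2023, 6, 5), 6.0),
    ("Attorney A", date(2023, 6, 6), 8.0),
    ("Attorney A", date(2023, 6, 7), 4.0),
    ("Attorney B", date(2023, 6, 7), 8.0),
]
utilization = average_utilization(
    time_entries, "Attorney A", as_of=date(2023, 6, 7), window_days=3
)
print(f"{utilization:.0%}")
```

This version counts every calendar day in the window as a working day; a production calculation would likely exclude weekends and leave.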

Next level: “air traffic control” for work allocation

Once you've built a complete picture of your firm's capabilities and experience, you can use Kelvin to take your project management and resource allocation to the next level. For example, you can build triage and routing systems that use LLMs to match incoming work to the best available attorney based on their experience, capacity, and conflicts. See the litigation and M&A/due diligence examples for two specific applications of this approach.
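
To show how experience, capacity, and conflicts might combine into a routing decision, here is a deliberately simple rule-based sketch. It stands in for the LLM-based matching described above and is not how Kelvin does it: the `Attorney` record, the `route_matter` function, and all names and numbers are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Attorney:
    name: str
    areas: set[str]
    utilization: float  # 0.0 = idle, 1.0 = fully booked
    conflicted_parties: set[str] = field(default_factory=set)


def route_matter(
    area: str, parties: set[str], attorneys: list[Attorney]
) -> Optional[Attorney]:
    """Pick the least-utilized attorney with matching experience and no conflict."""
    eligible = [
        attorney
        for attorney in attorneys
        if area in attorney.areas and not (attorney.conflicted_parties & parties)
    ]
    return min(eligible, key=lambda attorney: attorney.utilization, default=None)


attorneys = [
    Attorney("Attorney A", {"Corporate Law"}, utilization=0.9),
    Attorney("Attorney B", {"Corporate Law"}, utilization=0.4,
             conflicted_parties={"Acme Corp"}),
    Attorney("Attorney C", {"Corporate Law"}, utilization=0.6),
]
chosen = route_matter("Corporate Law", {"Acme Corp"}, attorneys)
print(chosen.name if chosen else "no eligible attorney")
```

Attorney B has the most spare capacity but is excluded by the conflict check, so the matter goes to Attorney C. Treating conflicts as a hard filter before ranking keeps a conflicted attorney from ever being suggested, however strong the match.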