Legal Knowledge Graphs with Python — Core Concepts

How Python builds and queries knowledge graphs that connect statutes, case law, judges, and legal concepts into a navigable legal intelligence system

Why graphs fit law perfectly

Legal knowledge is inherently relational. A statute is interpreted by court opinions. Those opinions cite earlier opinions. Judges write opinions and serve on specific courts. Regulations implement statutes. Parties appear in cases across jurisdictions and time periods. This web of relationships is exactly what graph databases are designed to store and query.

Relational databases store legal information in tables: one for cases, one for statutes, one for judges. Finding connections requires joining tables, and multi-hop queries become slow and complex as relationships multiply. A graph database stores relationships as first-class data, so traversal queries (“find all cases within two citation hops of Brown v. Board of Education”) are fast and natural to express.
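
As a rough illustration of the two-hop traversal quoted above, the sketch below walks a tiny citation graph held in a plain dictionary. The case identifiers and the within_hops helper are hypothetical, and a graph database would run the equivalent traversal natively rather than in application code.

```python
from collections import deque

# Hypothetical data: maps each case to the cases that cite it.
CITED_BY = {
    "brown_v_board": ["case_a", "case_b"],
    "case_a": ["case_c"],
    "case_b": ["case_c", "case_d"],
    "case_c": ["case_e"],
}

def within_hops(graph, start, max_hops):
    """Return {case: hop_count} for cases reachable in at most max_hops."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distances[current] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(current, []):
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    del distances[start]  # the starting case is not part of the answer
    return distances

nearby = within_hops(CITED_BY, "brown_v_board", 2)
print(nearby)
```

Each extra hop is one more pass of the same loop, whereas the relational version needs an additional self-join per hop.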

The legal knowledge graph schema

A legal knowledge graph typically contains these entity types (nodes):

Statutes — laws passed by legislatures, with version history
Opinions — court decisions, with metadata like court, date, and judge
Judges — biographical information and court assignments
Courts — hierarchical structure (district → circuit → Supreme Court)
Parties — individuals and organizations involved in cases
Legal concepts — topics like “due process,” “fair use,” or “negligence”
Regulations — administrative rules implementing statutes

And these relationship types (edges):

CITES — one opinion references another
INTERPRETS — an opinion interprets a statute
OVERRULES / DISTINGUISHES — how opinions relate to precedent
DECIDED_BY — links opinions to judges
FILED_IN — links cases to courts
IMPLEMENTS — links regulations to authorizing statutes
ABOUT — links opinions and statutes to legal concepts
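
One way to make this schema concrete is a small typed container that rejects node and edge types outside the lists above. This is a minimal sketch using only the standard library; the LegalGraph class, its method names, and the node identifiers are illustrative assumptions, not an established API.

```python
from dataclasses import dataclass, field

NODE_TYPES = {"Statute", "Opinion", "Judge", "Court", "Party", "Concept", "Regulation"}
EDGE_TYPES = {"CITES", "INTERPRETS", "OVERRULES", "DISTINGUISHES",
              "DECIDED_BY", "FILED_IN", "IMPLEMENTS", "ABOUT"}

@dataclass
class LegalGraph:
    nodes: dict = field(default_factory=dict)  # node id -> (type, properties)
    edges: list = field(default_factory=list)  # (source id, edge type, target id)

    def add_node(self, node_id, node_type, **props):
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.nodes[node_id] = (node_type, props)

    def add_edge(self, source, edge_type, target):
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, edge_type, target))

graph = LegalGraph()
graph.add_node("brown", "Opinion", name="Brown v. Board of Education", year=1954)
graph.add_node("plessy", "Opinion", name="Plessy v. Ferguson", year=1896)
graph.add_node("scotus", "Court", name="Supreme Court of the United States")
graph.add_node("equal_protection", "Concept", name="equal protection")
graph.add_edge("brown", "CITES", "plessy")
graph.add_edge("brown", "FILED_IN", "scotus")
graph.add_edge("brown", "ABOUT", "equal_protection")
```

Validating types at insertion time keeps typos such as "CITE" from silently creating a parallel edge type that later queries would miss.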

How Python builds the graph

Entity extraction

Python NLP pipelines extract entities from legal texts. spaCy with custom legal models identifies case names, statutory references, judge names, and legal concepts. eyecite extracts citation relationships. LexNLP pulls out dates, monetary values, and jurisdictions.
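
To show the shape of citation extraction without depending on any of those libraries, here is a deliberately simplified regular-expression stand-in. It is not how eyecite works internally: the pattern assumes a "volume reporter page" layout and recognizes only a handful of reporter abbreviations, so treat it as a sketch of the idea.

```python
import re

# Simplified assumption: volume, one of a few reporter abbreviations, first page.
CITATION_RE = re.compile(
    r"\b(?P<volume>\d{1,3})\s+"
    r"(?P<reporter>U\.S\.|S\. Ct\.|F\.(?:2d|3d)?)\s+"
    r"(?P<page>\d{1,4})\b"
)

def find_citations(text):
    """Return one dict per reporter citation found in the text."""
    return [match.groupdict() for match in CITATION_RE.finditer(text)]

sample = (
    "See Brown v. Board of Education, 347 U.S. 483 (1954), which departed "
    "from Plessy v. Ferguson, 163 U.S. 537 (1896)."
)
citations = find_citations(sample)
for cite in citations:
    print(cite["volume"], cite["reporter"], cite["page"])
```

A production pipeline would rely on a maintained reporter database instead of a hand-written pattern, since real opinions use short forms, "id." references, and parallel citations that this sketch ignores.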

Relationship extraction

Once entities are identified, Python determines how they relate. Citation extraction creates CITES edges. Parsing opinion headers creates DECIDED_BY edges. Analyzing opinion text for statutory references such as “pursuant to Section 5 of the Act” creates INTERPRETS edges, since INTERPRETS links an opinion to the statute it construes.
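
The step from extracted text spans to typed edges might look like the following sketch. The three patterns, the extract_edges helper, and the sample opinion text are all hypothetical simplifications. In particular, treating every statutory mention as an INTERPRETS edge is an assumption that a real pipeline would refine.

```python
import re

# Simplified, assumed patterns for one citation style each.
CASE_CITE_RE = re.compile(r"\b\d{1,3} U\.S\. \d{1,4}\b")
STATUTE_RE = re.compile(r"\b\d{1,2} U\.S\.C\. § \d+[a-z]?")
AUTHOR_RE = re.compile(
    r"(?:Justice|Judge)\s+(?P<name>[A-Z][\w'-]+)\s+delivered the opinion"
)

def extract_edges(opinion_id, text):
    """Turn pattern matches in one opinion into (source, type, target) triples."""
    edges = []
    for match in CASE_CITE_RE.finditer(text):
        edges.append((opinion_id, "CITES", match.group(0)))
    for match in STATUTE_RE.finditer(text):
        # Assumption: any statutory mention counts as a candidate INTERPRETS edge.
        edges.append((opinion_id, "INTERPRETS", match.group(0)))
    author = AUTHOR_RE.search(text)
    if author:
        edges.append((opinion_id, "DECIDED_BY", author.group("name")))
    return edges

# Hypothetical opinion text; the judge name is invented for illustration.
text = (
    "Judge Rivera delivered the opinion of the court. The plaintiff sues "
    "under 42 U.S.C. § 1983. We follow Monroe v. Pape, 365 U.S. 167 (1961)."
)
edges = extract_edges("op-1", text)
for edge in edges:
    print(edge)
```

Emitting plain triples keeps extraction independent of the storage layer, so the same output can be loaded into Neo4j, an RDF store, or an in-memory graph.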

Graph storage

Python works with graph stores through libraries such as neo4j (the official Neo4j driver), rdflib (for RDF data queried with SPARQL), or networkx (for in-memory analysis rather than persistent storage). Neo4j is a common choice for production legal knowledge graphs because of its traversal performance and its Cypher query language.
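
For the Neo4j route, loading usually comes down to sending parameterized Cypher MERGE statements. The sketch below only builds the statement and its parameters so that it runs without a database. The merge_edge_query helper, the id property, and the identifiers are assumptions made for illustration. Because Cypher does not accept parameters for labels or relationship types, the helper checks them against an allowlist before interpolating.

```python
ALLOWED_LABELS = {"Statute", "Opinion", "Judge", "Court", "Party", "Concept", "Regulation"}
ALLOWED_RELATIONSHIPS = {"CITES", "INTERPRETS", "OVERRULES", "DISTINGUISHES",
                         "DECIDED_BY", "FILED_IN", "IMPLEMENTS", "ABOUT"}

def merge_edge_query(src_label, rel_type, dst_label):
    """Build an idempotent Cypher statement that upserts one typed edge."""
    if src_label not in ALLOWED_LABELS or dst_label not in ALLOWED_LABELS:
        raise ValueError("unknown node label")
    if rel_type not in ALLOWED_RELATIONSHIPS:
        raise ValueError("unknown relationship type")
    return (
        f"MERGE (a:{src_label} {{id: $src_id}}) "
        f"MERGE (b:{dst_label} {{id: $dst_id}}) "
        f"MERGE (a)-[:{rel_type}]->(b)"
    )

query = merge_edge_query("Opinion", "CITES", "Opinion")
params = {"src_id": "brown-1954", "dst_id": "plessy-1896"}
print(query)

# With the neo4j package installed and a server running, the statement could
# be executed roughly like this (URI and credentials are placeholders):
#
#   from neo4j import GraphDatabase
#   driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
#   with driver.session() as session:
#       session.run(query, params)
#   driver.close()
```

MERGE rather than CREATE means re-running the loader on the same opinions does not duplicate nodes or edges.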

Querying for legal intelligence

The graph enables queries that would be impractical in traditional databases:

“What statutes has the 9th Circuit interpreted differently from the 5th Circuit?” — traverse INTERPRETS edges filtered by court
“Show the chain of precedent from Miranda v. Arizona to the most recent Supreme Court citation” — shortest path traversal
“Which judges most frequently cite Justice Scalia’s opinions?” — aggregation over citation patterns
“Find all cases about fair use that cite both Sony v. Universal and Campbell v. Acuff-Rose” — multi-hop pattern matching
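
Two of these query shapes, the "cites both" pattern match and the shortest citation chain, can be sketched over an in-memory list of edge triples. The edge data and the helper functions below are hypothetical. In Neo4j the same questions would be written as Cypher patterns instead of Python loops.

```python
from collections import deque

# Hypothetical edges as (source, type, target) triples.
EDGES = [
    ("case_x", "CITES", "sony"),
    ("case_x", "CITES", "campbell"),
    ("case_x", "ABOUT", "fair_use"),
    ("case_y", "CITES", "sony"),
    ("case_y", "ABOUT", "fair_use"),
    ("case_z", "CITES", "case_x"),
    ("campbell", "CITES", "sony"),
]

def targets(source, edge_type):
    """All nodes reached from source over edges of the given type."""
    return {dst for src, rel, dst in EDGES if src == source and rel == edge_type}

def cases_citing_all(required, concept):
    """Cases tagged with the concept that cite every case in required."""
    sources = {src for src, _, _ in EDGES}
    return sorted(
        s for s in sources
        if required <= targets(s, "CITES") and concept in targets(s, "ABOUT")
    )

def citation_chain(start, goal):
    """Shortest CITES path from start to goal (breadth-first), or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in sorted(targets(current, "CITES")):
            if nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    return None

both = cases_citing_all({"sony", "campbell"}, "fair_use")
chain = citation_chain("case_z", "sony")
print(both, chain)
```

Scanning the full edge list on every lookup is fine for a toy dataset, but a real system would index edges by source node or delegate the traversal to the database.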

Common misconception

A common assumption is that building a legal knowledge graph requires entering every relationship by hand. In practice, Python automates most of the extraction because court opinions follow structured conventions: citations are formatted predictably, opinion headers list judges and courts consistently, and statutory references use standard numbering. Automated extraction can handle roughly 85-90% of relationships, leaving human review for complex or ambiguous cases.

The one thing to remember: Legal knowledge graphs store the web of relationships between statutes, opinions, judges, and concepts as first-class data, enabling Python to answer complex legal research questions through graph traversal instead of keyword search.
