Named Entity Recognition in Python — Core Concepts

Named Entity Recognition (NER) identifies and classifies named entities in text into predefined categories. It is one of the foundational tasks in information extraction and powers applications from knowledge graph construction to automated compliance checking.

Standard Entity Types

Most NER systems recognize a common set of entity types:

  • PERSON — individual names (Marie Curie, Satya Nadella).
  • ORG — organizations (NASA, Goldman Sachs, UEFA).
  • GPE — geopolitical entities (France, New York, European Union).
  • DATE — absolute or relative dates (January 5th, last Tuesday).
  • MONEY — monetary values ($4.2 billion, €50).
  • LOC — non-political locations (Mount Everest, the Pacific Ocean).

Domain-specific NER adds custom types: drug names in healthcare, gene symbols in biology, product SKUs in retail.

How NER Works

Rule-Based Approaches

Pattern rules match entities using dictionaries and regular expressions. If you have a list of 500 drug names, a rule-based system can find exact mentions of them with very high precision, although a name that doubles as an ordinary word can still produce false positives. The tradeoff: it misses anything not on the list, and maintaining those lists is manual work.
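A minimal sketch of the dictionary approach, using only the standard library. The drug list, the DRUG label, and the function names are illustrative assumptions, not part of any particular library.

```python
import re

# Hypothetical drug list; a real system would load hundreds of names from a file.
DRUG_NAMES = ["ibuprofen", "metformin", "acetylsalicylic acid"]

def build_pattern(terms):
    """Compile one case-insensitive regex that matches any term as a whole word."""
    # Longest terms first so a multi-word name wins over any shorter prefix.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)

def find_entities(text, pattern, label):
    """Return (start, end, matched_text, label) for every dictionary hit."""
    return [(m.start(), m.end(), m.group(0), label) for m in pattern.finditer(text)]

DRUG_PATTERN = build_pattern(DRUG_NAMES)
text = "She takes Metformin daily and ibuprofen as needed; naproxen was stopped."
hits = find_entities(text, DRUG_PATTERN, "DRUG")
for start, end, matched, label in hits:
    print(start, end, matched, label)
```

Note that "naproxen" is not reported because it is not in the list, which is exactly the recall limitation described above.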

Statistical Models

Machine learning models learn to recognize entities from annotated training data. They consider features like:

  • The word itself and its neighbors.
  • Capitalization and word shape (Xxxx, dd/dd/dddd).
  • Part-of-speech tags.
  • Position in the sentence.
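The features above can be sketched as a small extraction function. This is an illustrative assumption about how such features might be encoded, not the feature set of any specific library; part-of-speech tags are left out because they would require a separate tagger.

```python
import re

def word_shape(token):
    """Map characters to a coarse shape: uppercase -> X, lowercase -> x, digit -> d."""
    shape = re.sub(r"[A-Z]", "X", token)
    shape = re.sub(r"[a-z]", "x", shape)
    # Digits are replaced last so the inserted "d" is not rewritten to "x".
    return re.sub(r"[0-9]", "d", shape)

def token_features(tokens, i):
    """Build a feature dict for the token at index i (hypothetical feature names)."""
    token = tokens[i]
    return {
        "word": token.lower(),
        "shape": word_shape(token),
        "is_capitalized": token[:1].isupper(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>",
        "position": i,
    }

tokens = ["Marie", "Curie", "was", "born", "on", "07/11/1867"]
features = [token_features(tokens, i) for i in range(len(tokens))]
print(features[0])
print(features[-1]["shape"])
```

A classifier such as a conditional random field would consume one such dict per token, together with the gold tag, during training.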

spaCy’s default NER model uses a transition-based neural network. It reads tokens left-to-right and predicts whether each token begins, continues, or is outside an entity.

Transformer Models

Models like BERT treat NER as a token classification task. Each token gets a label (B-PER for the beginning of a person name, I-PER for continuation, O for outside). Transformer-based NER generally achieves the highest accuracy of the three approaches, especially on ambiguous cases where context matters.

Python Libraries for NER

Library          Approach             Speed    Accuracy    GPU Needed
spaCy (sm/lg)    Statistical          Fast     Good        No
spaCy (trf)      Transformer          Slow     Very good   Recommended
Hugging Face     Transformer          Slow     Best        Yes
NLTK             Rule + statistical   Medium   Fair        No
Stanza           Neural               Medium   Very good   Optional
Flair            Stacked embeddings   Slow     Very good   Recommended

For most projects, spaCy is the right starting point. Its models are fast enough for production and accurate enough for general-purpose entity types.

BIO Tagging Scheme

NER models use a tagging scheme to handle multi-word entities. The most common is BIO:

  • B-TYPE — beginning of an entity of TYPE.
  • I-TYPE — inside (continuation) of the entity.
  • O — not part of any entity.

Example: “Barack Obama visited New York”

Token      Tag
Barack     B-PER
Obama      I-PER
visited    O
New        B-GPE
York       I-GPE

This encoding lets models handle entities of any length and distinguish adjacent entities of the same type.
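To make the encoding concrete, here is a minimal sketch that decodes a BIO tag sequence back into entity spans. The function name and the handling of a stray I- tag (treated as the start of a new entity) are assumptions; some decoders drop such tags instead.

```python
def bio_to_spans(tags):
    """Convert a list of BIO tags into (start, end, type) spans; end is exclusive."""
    spans = []
    start, current_type = None, None
    for i, tag in enumerate(tags):
        prefix, _, ent_type = tag.partition("-")
        # An I- tag extends the open entity only if the type matches.
        if prefix == "I" and ent_type == current_type:
            continue
        # Anything else closes the entity that was open, if there was one.
        if current_type is not None:
            spans.append((start, i, current_type))
            start, current_type = None, None
        # B- opens a new entity; a stray I- is also treated as an opening (assumption).
        if prefix in ("B", "I"):
            start, current_type = i, ent_type
    if current_type is not None:
        spans.append((start, len(tags), current_type))
    return spans

tokens = ["Barack", "Obama", "visited", "New", "York"]
tags = ["B-PER", "I-PER", "O", "B-GPE", "I-GPE"]
for start, end, ent_type in bio_to_spans(tags):
    print(" ".join(tokens[start:end]), ent_type)
```

Because every entity starts with its own B- tag, two back-to-back entities of the same type, such as B-PER followed by B-PER, decode into two separate spans rather than one.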

Evaluation Metrics

NER is evaluated at the entity level, not the token level:

  • Exact match — the predicted entity must have the correct type AND the correct span boundaries. “Barack” alone when the gold label is “Barack Obama” counts as wrong.
  • Partial match — gives credit for overlapping spans. Useful during development but not standard for benchmarks.

Metrics are reported per entity type (precision, recall, F1 for PERSON, ORG, etc.) and as a micro-average across all types.
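A minimal sketch of exact-match, entity-level scoring under these definitions. The input format (one set of (start, end, type) spans per sentence) and the function name are assumptions made for illustration; established evaluation scripts differ in their input formats.

```python
from collections import Counter

def entity_prf(gold, pred):
    """Exact-match scores. gold and pred are lists (one item per sentence)
    of sets of (start, end, type) spans."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold_spans, pred_spans in zip(gold, pred):
        for span in pred_spans:
            # A prediction counts only if boundaries AND type both match.
            if span in gold_spans:
                tp[span[2]] += 1
            else:
                fp[span[2]] += 1
        for span in gold_spans - pred_spans:
            fn[span[2]] += 1

    def prf(t, p, n):
        precision = t / (t + p) if t + p else 0.0
        recall = t / (t + n) if t + n else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f1

    types = set(tp) | set(fp) | set(fn)
    report = {t: prf(tp[t], fp[t], fn[t]) for t in sorted(types)}
    # Micro-average: pool the counts across all types before computing the scores.
    report["micro"] = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return report

# Gold: "Barack Obama" (PER), "New York" (GPE). Prediction truncates the person name.
gold = [{(0, 2, "PER"), (3, 5, "GPE")}]
pred = [{(0, 1, "PER"), (3, 5, "GPE")}]
report = entity_prf(gold, pred)
for name, (p, r, f1) in report.items():
    print(name, p, r, f1)
```

The truncated "Barack" span is counted as both a false positive and a false negative for PER, which is why boundary errors are costly under exact match.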

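State-of-the-art models score roughly 90-93 F1 on English news benchmarks (CoNLL-2003). On domain-specific text without fine-tuning, scores are usually much lower; 70-80 F1 is a reasonable expectation, though the exact drop depends on how far the domain is from the training data.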

Common Misunderstanding

People assume NER works equally well across all text types. It does not. Models trained on news articles struggle with social media (abbreviated names, slang), legal documents (unusual entity structures), and scientific papers (specialized nomenclature). Fine-tuning on even a few hundred annotated examples from your domain typically improves F1 by 10-15 points.

The one thing to remember: NER detects and classifies named entities in text — choose spaCy for speed and general use, fine-tune a transformer when accuracy on your specific domain matters most.

