Literate Programming — Deep Dive

Implement literate programming workflows in Python with nbdev, Quarto, and Pweave — from library development to reproducible research.

From Knuth’s WEB to modern Python

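Donald Knuth’s original WEB system, described in his 1984 paper "Literate Programming", targeted Pascal. Its successor CWEB, written with Silvio Levy, handled C. Both worked the same way: a single source file (.web for WEB, .w for CWEB) contained interleaved documentation (in TeX) and code chunks. The WEAVE program produced a typeset document; TANGLE produced compilable source. CWEB's equivalents are CWEAVE and CTANGLE.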

The key insight was order independence. In a language like Pascal, declarations must appear in an order the compiler accepts. In literate programming, you present the code in whatever order makes the explanation clearest, and the tangle step reorders it for the machine.
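To make the tangle step concrete, here is a toy sketch in Python. It is not Knuth's TANGLE: the chunk names, the chunk contents, and the noweb-style `<<name>>` reference syntax are all assumptions chosen for illustration, and indentation of nested chunks is ignored.

```python
import re

# Hypothetical chunks, stored in the order a reader might meet them:
# the overall program first, supporting details afterwards.
CHUNKS = {
    "program": "<<imports>>\n\n<<helpers>>\n\nprint(mean([1, 2, 3]))",
    "helpers": "def mean(xs):\n    return fsum(xs) / len(xs)",
    "imports": "from math import fsum",
}

CHUNK_REF = re.compile(r"<<(\w+)>>")


def tangle(name, chunks, _seen=()):
    """Expand chunk `name`, recursively replacing <<ref>> markers."""
    if name in _seen:
        raise ValueError(f"circular chunk reference: {name}")
    return CHUNK_REF.sub(
        lambda match: tangle(match.group(1), chunks, _seen + (name,)),
        chunks[name],
    )


# The tangled result lists imports and helpers before the code that uses them.
source = tangle("program", CHUNKS)
print(source)
```

The narrative order (program, helpers, imports) and the machine order (imports, helpers, program) differ, and the expansion step bridges the two.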

Python partially sidesteps this problem. A module still runs top to bottom, but names inside a function body are looked up only when the function is called, so a function can refer to helpers defined later in the file as long as they exist by call time. The narrative-first philosophy still applies: you can organise explanations for human understanding rather than interpreter convenience.
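A minimal illustration of that ordering freedom, using two made-up functions: the high-level function comes first and the helper it depends on is defined afterwards.

```python
def report(values):
    # summarize() is defined further down; Python looks the name up
    # only when report() is actually called.
    return f"mean={summarize(values):.2f}"


def summarize(values):
    return sum(values) / len(values)


# By the time this call runs, both functions exist.
print(report([1, 2, 3]))
```

Calling `report` before the `def summarize` line has executed would raise `NameError`, which is why the sidestep is only partial.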

nbdev: full library development in notebooks

Architecture

nbdev, created by Jeremy Howard and the fast.ai team, takes literate programming to its logical conclusion. Your notebook is the source of truth. From it, nbdev generates:

Python modules — Cells marked with #| export become part of the package.
Documentation — Markdown cells and cell outputs become Quarto-rendered docs.
Tests — Cells without #| export become tests that run via nbdev_test.
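The split between exported and test cells can be sketched in a few lines. This is a simplified illustration, not nbdev's implementation: `split_cells` and the in-memory notebook are hypothetical, and only the plain `#| export` directive on the first line of a cell is recognised.

```python
def split_cells(notebook):
    """Partition code cells into exported source and test cells."""
    exported, tests = [], []
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue  # markdown cells feed the docs, not the module
        source = cell["source"]
        if isinstance(source, list):  # .ipynb files may store source as a list of lines
            source = "".join(source)
        lines = source.splitlines()
        if lines and lines[0].strip() == "#| export":
            exported.append("\n".join(lines[1:]))
        else:
            tests.append(source)
    return exported, tests


# Hypothetical notebook content, using the cell layout of the .ipynb JSON format.
notebook = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Core utilities"]},
        {"cell_type": "code", "source": ["#| export\n", "def double(x):\n", "    return 2 * x"]},
        {"cell_type": "code", "source": ["assert double(2) == 4"]},
    ]
}

exported, tests = split_cells(notebook)

# Exported cells define the module; test cells then run against it.
namespace = {}
exec("\n\n".join(exported), namespace)
for test_source in tests:
    exec(test_source, namespace)
```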

Directory structure:

nbs/
  00_core.ipynb       # notebook source
  01_transforms.ipynb
my_package/
  core.py             # auto-generated by nbdev_export
  transforms.py
docs/
  core.html           # auto-generated by nbdev_docs

Workflow

# Create a new project
nbdev_new --lib_name my_package

# After editing notebooks:
nbdev_export    # notebook → .py modules
nbdev_test      # execute the notebooks; a failing cell fails the run
nbdev_docs      # generate documentation site
nbdev_clean     # strip notebook metadata for clean diffs
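To show why cleaning matters for diffs, the sketch below strips the fields that change on every run. It is written in the spirit of nbdev_clean but is not its implementation: the function name and the choice of which fields to drop are assumptions.

```python
import copy


def clean_notebook(notebook):
    """Return a copy of a notebook dict with volatile fields removed."""
    cleaned = copy.deepcopy(notebook)
    # Assumption: keep only the metadata needed to pick a kernel.
    cleaned["metadata"] = {
        key: value
        for key, value in cleaned.get("metadata", {}).items()
        if key in ("kernelspec", "language_info")
    }
    for cell in cleaned["cells"]:
        cell["metadata"] = {}
        if cell["cell_type"] == "code":
            # Execution counters change on every run and pollute diffs.
            cell["execution_count"] = None
            for output in cell.get("outputs", []):
                if "execution_count" in output:
                    output["execution_count"] = None
    return cleaned


# Hypothetical notebook with run-specific noise.
noisy = {
    "metadata": {"kernelspec": {"name": "python3"}, "widgets": {"state": {}}},
    "cells": [
        {
            "cell_type": "code",
            "source": ["1 + 1"],
            "execution_count": 7,
            "metadata": {"collapsed": True},
            "outputs": [{"output_type": "execute_result", "execution_count": 7, "data": {"text/plain": ["2"]}}],
        }
    ],
}

clean = clean_notebook(noisy)
print(clean["cells"][0]["execution_count"])
```

Two runs of the same notebook then serialize to the same JSON, so Git only shows changes to source and prose.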

Directives

nbdev uses comment directives at the top of cells:

#| export
def normalize(data: list[float]) -> list[float]:
    """Min-max normalize a list of numbers."""
    lo, hi = min(data), max(data)
    span = hi - lo
    if span == 0:
        return [0.0] * len(data)
    return [(x - lo) / span for x in data]

This cell becomes part of the package and appears in the documentation with its docstring. Cells without #| export are test/exploration cells:

# This cell is a test — nbdev_test runs it
result = normalize([10, 20, 30])
assert result == [0.0, 0.5, 1.0], f"Got {result}"
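Test cells are also a natural place to pin down edge cases. The sketch below repeats `normalize` only so that it runs on its own; in a notebook the exported cell above would already have defined it. The specific edge cases are suggestions, not part of the original example.

```python
def normalize(data: list[float]) -> list[float]:
    """Min-max normalize a list of numbers (repeated from the exported cell)."""
    lo, hi = min(data), max(data)
    span = hi - lo
    if span == 0:
        return [0.0] * len(data)
    return [(x - lo) / span for x in data]


# Constant input: the span is zero, so every value maps to 0.0.
assert normalize([5, 5, 5]) == [0.0, 0.0, 0.0]

# Negative values are scaled like any others.
assert normalize([-1, 0, 1]) == [0.0, 0.5, 1.0]

# Empty input: min() raises ValueError. This cell documents that behaviour.
try:
    normalize([])
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for empty input")
```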

Real-world adoption

The fastai deep learning library (over 25,000 GitHub stars) is developed entirely with nbdev. Every function, class, and module originates in a notebook, which shows that the approach scales beyond toy examples to production machine-learning frameworks.

Quarto for reproducible documents

How it works

A .qmd file is markdown with executable code blocks:

---
title: "Customer Churn Analysis"
format: html
jupyter: python3
---

## Data Loading

We load the telco churn dataset and inspect its shape.

```{python}
import pandas as pd
df = pd.read_csv("telco_churn.csv")
print(f"Rows: {df.shape[0]:,}, Columns: {df.shape[1]}")
```

## Key Finding

Customers on month-to-month contracts churn at 3× the rate
of those on two-year contracts.

```{python}
#| label: fig-churn-rate
#| fig-cap: "Churn rate by contract type"
import matplotlib.pyplot as plt
# Assumes Churn holds "Yes"/"No" strings; convert to booleans before averaging
rates = df["Churn"].eq("Yes").groupby(df["Contract"]).mean()
rates.plot.bar()
plt.ylabel("Churn Rate")
plt.show()
```

Run quarto render analysis.qmd to produce a polished HTML report with executed code, rendered plots, and cross-references. The source file is plain text — it diffs cleanly in Git, unlike .ipynb JSON.
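The structure of a .qmd file is simple enough to take apart by hand. The sketch below is not Quarto's parser: `split_qmd` is a hypothetical helper that separates prose from `{python}` chunks and reads each chunk's `#|` options, handling only the simple cases shown in the example above.

```python
FENCE = "`" * 3  # three backticks, built programmatically so this listing nests safely


def split_qmd(text):
    """Split a .qmd-like string into prose lines and Python chunks.

    Each chunk is a dict holding its '#|' options and its remaining code lines.
    """
    prose, chunks, current = [], [], None
    for line in text.splitlines():
        stripped = line.strip()
        if current is None and stripped == FENCE + "{python}":
            current = {"options": {}, "code": []}
        elif current is not None and stripped == FENCE:
            chunks.append(current)
            current = None
        elif current is not None and stripped.startswith("#|"):
            key, _, value = stripped[2:].partition(":")
            current["options"][key.strip()] = value.strip().strip('"')
        elif current is not None:
            current["code"].append(line)
        else:
            prose.append(line)
    return prose, chunks


doc = "\n".join([
    "## Key Finding",
    "",
    FENCE + "{python}",
    "#| label: fig-churn-rate",
    '#| fig-cap: "Churn rate by contract type"',
    "rates.plot.bar()",
    FENCE,
])

prose, chunks = split_qmd(doc)
print(prose)
print(chunks[0]["options"])
print(chunks[0]["code"])
```

The `label` option is what lets prose elsewhere in the document refer to the figure as `@fig-churn-rate`.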

Quarto vs Jupyter Notebooks

| Feature | Jupyter | Quarto |
| --- | --- | --- |
| Source format | JSON (`.ipynb`) | Plain text (`.qmd`) |
| Git diffs | Noisy | Clean |
| Cross-references | Manual links | Automatic (`@fig-churn-rate`) |
| Multi-language | One kernel per notebook | Python, R, and Julia supported; R and Python can share one doc via the knitr engine |
| Output formats | HTML, PDF, Markdown, slides (via nbconvert) | HTML, PDF, Word, slides, books |
| Interactive execution | Yes (live kernel) | Yes (via Jupyter kernel) |

For publishing and reproducible research, Quarto is superior. For interactive exploration, Jupyter notebooks remain more ergonomic.

Pweave: lightweight literate Python

Pweave processes .pmd files (Python-flavoured markdown):

# Data Summary

<<>>=
import numpy as np
data = np.random.randn(1000)
print(f"Mean: {data.mean():.4f}, Std: {data.std():.4f}")
@

pweave report.pmd executes the code and produces a markdown file with outputs inline. It is simpler than nbdev or Quarto — useful for one-off reports where full toolchain setup is overkill.
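The weave step itself is easy to picture. The following is a toy sketch, not Pweave's implementation: `weave` is a hypothetical function that finds the noweb-style chunks shown above, executes them in a shared namespace, and inlines whatever they print. It uses `exec`, so it should only ever see trusted input.

```python
import contextlib
import io


def weave(text):
    """Execute <<>>= ... @ chunks and inline their printed output."""
    woven, code, namespace = [], None, {}
    for line in text.splitlines():
        stripped = line.strip()
        if code is None and stripped.startswith("<<") and stripped.endswith(">>="):
            code = []  # chunk header: start collecting code lines
        elif code is not None and stripped == "@":
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                exec("\n".join(code), namespace)  # chunks share one namespace
            # Emit the code, then its output, as an indented block.
            woven.extend("    " + code_line for code_line in code)
            woven.extend("    " + out_line for out_line in buffer.getvalue().splitlines())
            code = None
        elif code is not None:
            code.append(line)
        else:
            woven.append(line)
    return "\n".join(woven)


report = "# Data Summary\n\n<<>>=\nprint(2 + 3)\n@\n"
print(weave(report))
```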

Building a literate programming workflow

Step 1: Choose your tool

Building a library → nbdev
Publishing reports/papers → Quarto
Quick internal documents → Jupyter + nbconvert
Legacy compatibility → Pweave or Sphinx doctest
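For the doctest route, the examples live in the docstring and double as tests. The sketch below uses the standard-library `doctest` module, which Sphinx's doctest extension builds on; reusing `normalize` from the nbdev section is an illustrative choice, not something the tools require.

```python
import doctest


def normalize(data):
    """Min-max normalize a list of numbers.

    >>> normalize([10, 20, 30])
    [0.0, 0.5, 1.0]
    >>> normalize([5, 5, 5])
    [0.0, 0.0, 0.0]
    """
    lo, hi = min(data), max(data)
    span = hi - lo
    if span == 0:
        return [0.0] * len(data)
    return [(x - lo) / span for x in data]


# Runs the docstring examples; prints nothing when they all pass.
doctest.run_docstring_examples(normalize, {"normalize": normalize}, name="normalize")
```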

Step 2: Establish conventions

One concept per notebook/document section.
Export only clean, tested functions — exploration stays in non-export cells.
Use meaningful headings that serve as a table of contents.
Pin dependencies in a requirements.txt or pyproject.toml alongside the notebooks.

Step 3: Integrate with CI

# GitHub Actions
- name: Export and test
  run: |
    nbdev_export
    nbdev_test --n_workers 4
    # Fail the job if the committed modules differ from what the notebooks generate
    git diff --exit-code -- my_package/

CI ensures the generated modules always match the notebook source. If someone edits the .py file directly (a common temptation), the diff step catches the divergence.
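The same drift check can be expressed in Python when a shell one-liner is not enough, for example to report exactly which modules diverged. This is a sketch under stated assumptions: `drifted_modules` is a hypothetical helper that compares two directories of `.py` files, one freshly generated and one committed.

```python
import filecmp
from pathlib import Path


def drifted_modules(generated_dir, committed_dir):
    """Return names of .py files that differ or exist on only one side."""
    generated_dir, committed_dir = Path(generated_dir), Path(committed_dir)
    generated = {path.name for path in generated_dir.glob("*.py")}
    committed = {path.name for path in committed_dir.glob("*.py")}
    changed = {
        name
        for name in generated & committed
        # shallow=False compares file contents, not just size and mtime
        if not filecmp.cmp(generated_dir / name, committed_dir / name, shallow=False)
    }
    return sorted(changed | (generated ^ committed))
```

A CI script could call this after exporting into a scratch directory and exit with a non-zero status when the returned list is not empty.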

Step 4: Publish documentation

nbdev and Quarto both generate static sites deployable to GitHub Pages, Netlify, or any static host. Automate this in CI so documentation updates on every merge to main.

Tradeoffs

| Benefit | Cost |
| --- | --- |
| Code and docs always in sync | Learning a new toolchain (nbdev/Quarto) |
| Narrative-first ordering aids understanding | Not all code maps neatly to a linear story |
| Tests live beside the code they test | IDE support for notebooks lags behind `.py` files |
| Great for onboarding and knowledge transfer | Team buy-in required; mixed workflows cause friction |

When literate programming is not the answer

Literate programming works best for code with a strong narrative: data analyses, research, tutorials, and library development. It works less well for:

Large application codebases with hundreds of interacting modules. A Django web app does not benefit from being written as a story.
Performance-critical inner loops where the code speaks for itself and narrative adds noise.
Rapid prototyping where the goal is to ship fast, not explain deeply.

The pragmatic approach: use literate programming for the parts of your project that benefit from explanation, and traditional development for the rest.

One thing to remember: Literate programming is not a formatting style — it is a design philosophy. When you write code as a story, you think more carefully about why each piece exists, and that thinking produces better software.
