HTTP/2 Server Push — Core Concepts

HTTP/2 fundamentals

HTTP/2 replaced the text-based HTTP/1.1 protocol with a binary framing layer. The biggest change: multiplexing. In HTTP/1.1, a browser opens multiple TCP connections (typically 6 per domain) and sends one request per connection at a time. HTTP/2 sends multiple requests and responses over a single connection simultaneously using streams.
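
To make the framing layer concrete, here is a minimal sketch of the fixed 9-byte header that precedes every HTTP/2 frame (24-bit payload length, 8-bit type, 8-bit flags, 1 reserved bit plus a 31-bit stream identifier). The helper names are made up for this illustration, and the code only packs and parses headers; it is not a working HTTP/2 implementation.

```python
import struct

# Frame type codes from the HTTP/2 specification (RFC 9113, section 6).
DATA, HEADERS, PUSH_PROMISE = 0x0, 0x1, 0x5


def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Build the fixed 9-byte header that precedes every HTTP/2 frame."""
    if length >= 1 << 24 or stream_id >= 1 << 31:
        raise ValueError("length is 24 bits, stream id is 31 bits")
    # The 24-bit length is the low 3 bytes of a 32-bit big-endian integer.
    return struct.pack(">I", length)[1:] + struct.pack(">BBI", frame_type, flags, stream_id)


def parse_frame_header(header: bytes) -> tuple[int, int, int, int]:
    """Return (length, type, flags, stream_id) from a 9-byte frame header."""
    length = int.from_bytes(header[:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", header[3:9])
    return length, frame_type, flags, stream_id & 0x7FFFFFFF  # drop the reserved bit


# Headers of frames for two responses interleaved on one connection: the
# stream id tells the receiver which response each chunk belongs to.
wire = [
    pack_frame_header(1024, DATA, 0x0, 1),
    pack_frame_header(512, DATA, 0x0, 3),
    pack_frame_header(256, DATA, 0x1, 1),  # flag 0x1 = END_STREAM
]
streams = [parse_frame_header(h)[3] for h in wire]
print(streams)
```

Because every frame carries its own stream identifier, chunks of different responses can be interleaved freely and reassembled on the other side.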

This matters in Python because frameworks like FastAPI, Django, and Flask all benefit from HTTP/2 when deployed behind an HTTP/2-capable server (Nginx, Caddy, or Hypercorn). Applications don’t need to change: the protocol is handled by the server, below the application code.

What server push was designed to solve

Every web page triggers a waterfall of requests. The browser fetches HTML, parses it, discovers CSS and JavaScript references, then fetches those. Each discovery-and-fetch cycle adds a round trip. On a connection with a 200 ms round-trip time, three sequential round trips cost 600 ms before the page can even start rendering.
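
The arithmetic can be written out as a small helper. This is a deliberately simplified model: it assumes each discovery-and-fetch cycle costs exactly one round trip and ignores transfer time, connection setup, and server processing.

```python
def waterfall_delay_ms(rtt_ms: float, sequential_round_trips: int) -> float:
    """Network wait when fetches happen strictly one after another."""
    return rtt_ms * sequential_round_trips


print(waterfall_delay_ms(200, 3))  # the three-step waterfall described above
print(waterfall_delay_ms(200, 1))  # the ideal case: everything arrives in one round trip
```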

Server push aimed to eliminate this waterfall. When the server sends HTML, it can also push the CSS and JavaScript files on the same connection without waiting for the browser to request them. The browser receives these pushed resources and stores them in a local push cache, ready for immediate use when the HTML parser discovers the references.

How push works technically

  1. Client sends a request for /index.html
  2. Server responds with the HTML and also sends PUSH_PROMISE frames for /style.css and /app.js
  3. Server sends the content of those resources on separate streams within the same connection
  4. Browser stores pushed resources in a push-specific cache
  5. When the HTML parser encounters <link href="/style.css">, the browser finds it already available — zero additional latency
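
From the Python side, the ASGI specification defines an optional `http.response.push` extension through which an application can ask the server to push a path. The sketch below assumes a server that advertises that extension; the asset paths and the fake server used to exercise the app are hypothetical, and a real server would then send the PUSH_PROMISE frames and request the pushed paths from the application itself.

```python
import asyncio

PUSH_PATHS = ["/style.css", "/app.js"]  # hypothetical assets for this sketch


async def app(scope, receive, send):
    """Minimal ASGI app that asks the server to push assets alongside the HTML."""
    assert scope["type"] == "http"
    # Servers advertise push support through the ASGI extensions dict.
    if "http.response.push" in scope.get("extensions", {}):
        for path in PUSH_PATHS:
            await send({"type": "http.response.push", "path": path, "headers": []})
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/html")],
    })
    await send({"type": "http.response.body", "body": b'<link rel="stylesheet" href="/style.css">'})


async def run_with_fake_server(extensions):
    """Drive the app with a stand-in for the server and record what it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/index.html", "extensions": extensions}
    await app(scope, receive, send)
    return [message["type"] for message in sent]


with_push = asyncio.run(run_with_fake_server({"http.response.push": {}}))
without_push = asyncio.run(run_with_fake_server({}))
print(with_push)
print(without_push)
```

Checking for the extension first keeps the same application working behind servers and proxies that do not offer push.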

The rise and fall of server push

Server push came out of Google’s SPDY protocol and was standardized as a headline feature of HTTP/2. In practice, it had serious problems:

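Cache awareness. The server doesn’t know what the browser has cached. Pushing /style.css when the browser already has it cached wastes bandwidth. The browser can cancel an unwanted push with RST_STREAM, but by then much of the data is usually already in flight, and there’s no reliable way for the server to check the client’s cache before pushing.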

Priority conflicts. Pushed resources compete with the main response for bandwidth. Pushing a large JavaScript bundle could delay the critical HTML response.

Complexity. Implementing push correctly required careful analysis of which resources to push, when to push them, and for which clients. Most deployments either pushed too much (wasting bandwidth) or too little (negligible benefit).

Chrome disabled server push by default in version 106 (September 2022), and Firefox removed support in version 132 (October 2024). Whatever support remains in other browsers, the ecosystem has moved on.

What replaced it

103 Early Hints. Instead of pushing the actual files, the server sends a 103 status code with Link headers hinting at which resources the browser should preload:

HTTP/1.1 103 Early Hints
Link: </style.css>; rel=preload; as=style
Link: </app.js>; rel=preload; as=script

Then the server sends the actual 200 response with the HTML. The browser starts fetching the hinted resources in parallel with processing the HTML. This is simpler than push because the browser decides whether to fetch based on its own cache state.
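
In a Python application, one way to emit these hints is the optional `http.response.early_hint` extension from the ASGI specification. The following is a sketch under that assumption: server support varies, the link values simply mirror the example above, and the `collect` helper is a hypothetical stand-in for a real server.

```python
import asyncio

# Link values mirror the example above; the paths are placeholders.
EARLY_LINKS = [
    b"</style.css>; rel=preload; as=style",
    b"</app.js>; rel=preload; as=script",
]


async def app(scope, receive, send):
    """Send 103 Early Hints (when the server offers it) before the real response."""
    if "http.response.early_hint" in scope.get("extensions", {}):
        await send({"type": "http.response.early_hint", "links": EARLY_LINKS})
    # Stand-in for slow work such as a database query or template rendering.
    await asyncio.sleep(0.01)
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/html")],
    })
    await send({"type": "http.response.body", "body": b"<html>...</html>"})


async def collect(extensions):
    """Run the app against a fake server and return every message it sent."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "path": "/", "extensions": extensions}, receive, send)
    return sent


messages = asyncio.run(collect({"http.response.early_hint": {}}))
print([message["type"] for message in messages])
```

The hint goes out before the slow work starts, which is where the benefit comes from: the browser can fetch assets while the server is still preparing the HTML.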

Preload headers. Even without 103, sending Link: </style.css>; rel=preload; as=style as a header on the final response lets the browser start fetching as soon as the headers arrive, before the HTML body is parsed.
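
A small helper can build that header from a mapping of assets. The function name and the asset mapping are invented for this sketch; it returns a (name, value) byte pair in the shape ASGI response headers use.

```python
def preload_link_header(assets: dict[str, str]) -> tuple[bytes, bytes]:
    """Build one Link header that preloads every asset in `assets`.

    `assets` maps a URL path to its `as` type, e.g. {"/style.css": "style"}.
    """
    values = [f"<{path}>; rel=preload; as={kind}" for path, kind in assets.items()]
    # Several link values may share one header field, separated by commas.
    return b"link", ", ".join(values).encode("ascii")


header = preload_link_header({"/style.css": "style", "/app.js": "script"})
print(header[1].decode())
```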

Resource bundling. Build tools like Vite and webpack bundle multiple files together, reducing the number of requests the browser needs to make.

Python and HTTP/2

Python ASGI servers handle HTTP/2 at the server level:

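  • Hypercorn: native HTTP/2 support, including server push through the ASGI http.response.push extension
  • Uvicorn: HTTP/1.1 only; relies on a reverse proxy for HTTP/2
  • Daphne: the Django Channels server; can terminate HTTP/2 when TLS is enabled and Twisted’s http2 and tls extras are installed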

Most production Python deployments use Nginx or Caddy as a reverse proxy that handles HTTP/2 with the client, then speaks HTTP/1.1 to the Python backend. This means the Python application code doesn’t need to be HTTP/2-aware.

Common misconception

“HTTP/2 makes my Python API faster.” HTTP/2 helps with latency — multiple requests in parallel over one connection. It doesn’t make your Python code execute faster. If your endpoint takes 500ms due to a slow database query, HTTP/2 won’t change that. It helps most when pages have many resources (images, CSS, JS) that benefit from multiplexed loading.

One thing to remember: HTTP/2 server push was an elegant idea that failed in practice because servers can’t know the client’s cache state — modern alternatives like 103 Early Hints achieve similar benefits with less waste.
