ETag Caching — Core Concepts

How Python APIs use ETags and conditional requests to reduce bandwidth, prevent lost updates, and improve response times.

What ETags solve

Every time a client requests a resource, the server does the same work: query the database, serialize the response, send it over the network. If the data hasn’t changed since the last request, all that network transfer is wasted. ETags provide a mechanism for the server to say “nothing changed” with a minimal response.

ETags also solve a second problem: lost updates. When two users edit the same resource simultaneously, the last save wins and the first person’s changes are silently overwritten. ETags enable optimistic concurrency control by detecting when the resource changed between read and write.

How ETag caching works

The flow has two phases:

First request (no cache):

Client sends GET /users/5
Server responds with the user data plus an ETag header: ETag: "v1a2b3"
Client stores both the data and the ETag

Subsequent request (conditional):

Client sends GET /users/5 with If-None-Match: "v1a2b3"
Server generates the current ETag for user 5
If it matches → respond with 304 Not Modified (no body)
If it doesn’t match → respond with 200 OK, new data, and new ETag

The 304 response is tiny (just headers, no body), saving bandwidth and often reducing server processing time if the ETag check can happen before full serialization.
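The two phases above can be sketched as a framework-agnostic handler. This is a minimal illustration rather than a production implementation: the `USERS` dictionary, the `get_user` function, and the 16-character truncated SHA-256 tag are all assumptions made for the example, and a real API would read the `If-None-Match` header from its web framework's request object.

```python
import hashlib
import json
from typing import Optional

# Hypothetical in-memory "database" standing in for a real data store.
USERS = {5: {"id": 5, "name": "Ada"}}


def get_user(user_id: int, if_none_match: Optional[str] = None):
    """Return (status, headers, body) for a possibly conditional GET."""
    # Deterministic serialization so the same data always hashes the same way.
    body = json.dumps(USERS[user_id], sort_keys=True, separators=(",", ":")).encode()
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        normalized = [tag[2:] if tag.startswith("W/") else tag for tag in candidates]
        if "*" in candidates or etag in normalized:
            # Nothing changed: headers only, no body.
            return 304, {"ETag": etag}, b""

    return 200, {"ETag": etag, "Content-Type": "application/json"}, body
```

A client would store the `ETag` from the first `200` response and pass it back as `if_none_match` on the next call. Because this sketch hashes the serialized body, it saves bandwidth but still pays the serialization cost on every request.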

Strong vs weak ETags

Strong ETags guarantee byte-for-byte equivalence. If two responses have the same strong ETag, they are identical down to every byte.

ETag: "abc123"

Weak ETags indicate semantic equivalence — the content is meaningfully the same but may differ in whitespace, formatting, or metadata:

ETag: W/"abc123"

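Weak ETags are useful when your API serialization might produce slightly different JSON output for the same data (field ordering, floating-point representation). Strong ETags require deterministic serialization. Weak ETags are sufficient for caching with If-None-Match, which uses weak comparison, but If-Match requires strong comparison, so a weak ETag never satisfies it.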
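The difference between the two kinds of tag shows up in how they are compared. The sketch below illustrates the strong and weak comparison rules from the HTTP specification; the function names `parse_etag`, `strong_match`, and `weak_match` are hypothetical helpers for this example, not part of any library.

```python
from typing import Tuple


def parse_etag(raw: str) -> Tuple[bool, str]:
    """Split an ETag value into (is_weak, opaque_tag_including_quotes)."""
    raw = raw.strip()
    if raw.startswith("W/"):
        return True, raw[2:]
    return False, raw


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: both tags must be strong and character-identical."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a: str, b: str) -> bool:
    """Weak comparison: opaque tags are equal, regardless of any W/ prefix."""
    return parse_etag(a)[1] == parse_etag(b)[1]
```

For example, `weak_match('W/"abc123"', '"abc123"')` is true while `strong_match` on the same pair is false, because one of the tags is weak.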

ETags for optimistic concurrency

Beyond caching, ETags prevent lost updates:

User A fetches the resource, gets ETag: "v1"
User B fetches the same resource, gets ETag: "v1"
User A updates it with If-Match: "v1" → succeeds, new ETag is "v2"
User B tries to update with If-Match: "v1" → server returns 412 Precondition Failed

User B must re-fetch the resource (now at "v2"), review the changes, and retry. This prevents silent data loss in collaborative applications.
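The sequence above can be sketched with a small in-memory store. Everything here is an assumption made for illustration: the `DocumentStore` class, its method names, and the `"v<n>"` tag format are hypothetical, and a real service would perform the check and the write atomically (for example, inside a database transaction).

```python
class DocumentStore:
    """Toy store that guards updates with an If-Match style precondition."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, data)

    def _etag(self, doc_id):
        version, _ = self._docs[doc_id]
        return f'"v{version}"'

    def create(self, doc_id, data):
        self._docs[doc_id] = (1, dict(data))
        return self._etag(doc_id)

    def get(self, doc_id):
        """Return (data, etag) as a GET response would."""
        _, data = self._docs[doc_id]
        return dict(data), self._etag(doc_id)

    def update(self, doc_id, data, if_match):
        """Return (status, etag): 200 on success, 412 if the tag is stale."""
        current = self._etag(doc_id)
        if if_match != "*" and if_match != current:
            # Resource changed since the caller read it: reject the write.
            return 412, current
        version, _ = self._docs[doc_id]
        self._docs[doc_id] = (version + 1, dict(data))
        return 200, self._etag(doc_id)


store = DocumentStore()
store.create("doc", {"title": "Draft"})
_, etag_a = store.get("doc")  # User A reads "v1"
_, etag_b = store.get("doc")  # User B reads "v1"
result_a = store.update("doc", {"title": "A's edit"}, if_match=etag_a)
result_b = store.update("doc", {"title": "B's edit"}, if_match=etag_b)
```

In this run, User A's update succeeds and bumps the version, while User B's update is rejected with 412 and User A's data stays in place.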

ETag generation strategies

The ETag value needs to change whenever the resource changes. Common approaches:

Hash of content: MD5 or SHA-256 of the response body. Accurate but requires full serialization before the ETag check.
Database timestamp: Use the updated_at column. Fast to check but misses changes from related tables.
Version counter: Increment a version field on every update. Simple and fast.
Composite: Combine timestamp + related object versions for complex resources.
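Each of the four strategies can be sketched as a small function. These are illustrative helpers with hypothetical names; the timestamp variants assume `updated_at` is a timezone-aware `datetime`, and the composite variant assumes the related objects expose integer versions.

```python
import hashlib
from datetime import datetime, timezone
from typing import Iterable


def etag_from_content(body: bytes) -> str:
    """Hash of content: accurate, but needs the fully serialized body."""
    return f'"{hashlib.sha256(body).hexdigest()}"'


def etag_from_timestamp(updated_at: datetime) -> str:
    """Database timestamp: cheap, but blind to changes in related tables."""
    utc = updated_at.astimezone(timezone.utc)
    # Include microseconds so rapid consecutive writes get distinct tags.
    return f'"{utc.strftime("%Y%m%dT%H%M%S%f")}"'


def etag_from_version(version: int) -> str:
    """Version counter: incremented by the application on every update."""
    return f'"v{version}"'


def etag_composite(updated_at: datetime, related_versions: Iterable[int]) -> str:
    """Composite: own timestamp combined with related object versions."""
    parts = [updated_at.astimezone(timezone.utc).isoformat()]
    parts += [str(version) for version in related_versions]
    digest = hashlib.sha256("|".join(parts).encode()).hexdigest()[:16]
    return f'"{digest}"'
```

The composite tag changes whenever either the resource's own timestamp or any related version changes, which addresses the blind spot of the plain timestamp approach.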

Last-Modified vs ETag

HTTP offers two conditional request mechanisms:

Feature	ETag	Last-Modified
Precision	Exact (hash or version)	1-second resolution
Multiple changes per second	Handles correctly	May miss updates
Format	Opaque string	Date timestamp
Conditional request header	`If-None-Match`	`If-Modified-Since`

ETags are more precise and flexible. Last-Modified is simpler and works well for static files. Many APIs send both; when a request carries both If-None-Match and If-Modified-Since, the server evaluates If-None-Match and ignores If-Modified-Since.
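A server that supports both validators might decide whether to answer 304 along these lines. This is a sketch under assumptions: `is_not_modified` is a hypothetical helper, `last_modified` is assumed to be a timezone-aware `datetime`, and only the standard library's `email.utils` date helpers are used for HTTP date handling.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def is_not_modified(
    etag: str,
    last_modified: datetime,
    if_none_match: Optional[str] = None,
    if_modified_since: Optional[str] = None,
) -> bool:
    """Return True when the server may answer 304 Not Modified."""
    if if_none_match is not None:
        # When If-None-Match is present, If-Modified-Since is ignored.
        tags = [tag.strip() for tag in if_none_match.split(",")]
        tags = [tag[2:] if tag.startswith("W/") else tag for tag in tags]
        current = etag[2:] if etag.startswith("W/") else etag
        return "*" in tags or current in tags
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return False  # Unparseable date: treat the request as unconditional.
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates carry whole seconds, so sub-second changes are invisible.
        return last_modified.replace(microsecond=0) <= since
    return False


modified = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
http_date = format_datetime(modified, usegmt=True)  # value for Last-Modified
```

Note the one-second resolution in the date branch: a second write within the same second produces the same `Last-Modified` value, so only the ETag branch would detect it.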

Common misconception

“ETags eliminate server processing.” Not necessarily. If the ETag is a content hash, the server still needs to build the full response to compute it. The savings come from not transmitting that response. To avoid computation too, use a version counter or timestamp that can be checked against the database without building the response object.
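To make the distinction concrete, here is a minimal sketch of a version-based check that runs before any serialization. The `conditional_response` function and the `load_and_serialize` callback are hypothetical names; the assumption is that the version number can be read cheaply (for example, from a single column) while building the full body is expensive. For brevity it only handles exact tag matches.

```python
from typing import Callable, Optional


def conditional_response(
    version: int,
    load_and_serialize: Callable[[], bytes],
    if_none_match: Optional[str] = None,
):
    """Return (status, headers, body), building the body only when needed."""
    etag = f'"v{version}"'
    if if_none_match is not None:
        tags = [tag.strip() for tag in if_none_match.split(",")]
        if etag in tags:
            # Cheap path: the expensive callback is never invoked.
            return 304, {"ETag": etag}, b""
    # Expensive path: load related data and serialize the full response.
    return 200, {"ETag": etag}, load_and_serialize()
```

With a content-hash ETag the callback would have to run on every request, since the hash cannot be known without the body. With a version counter, a matching request skips it entirely.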

One thing to remember: ETags let the server skip sending unchanged data (saving bandwidth) and detect concurrent edits (preventing lost updates) — two problems solved by one HTTP header.
