HTTP Caching Strategies — Core Concepts
Why HTTP caching matters for API clients
Every API call has a cost: network latency, server processing, rate limit consumption, and bandwidth. Many API responses don’t change between calls — a user’s profile, a product catalog, configuration data. Caching these responses locally avoids redundant work on both sides.
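For Python services that call external APIs thousands of times per hour, caching can reduce average latency by as much as 90% and cut rate limit usage by 50-80% when most requests repeat. The actual savings depend on the cache hit rate: only requests answered from the cache avoid the round-trip and the rate limit charge.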
Cache-Control: the server’s caching instructions
The Cache-Control response header tells clients how to cache:
- max-age=300: response is valid for 300 seconds; don’t ask again until then
- no-cache: you may store it, but must revalidate before each use
- no-store: never store this response (sensitive data)
- private: only the end client may cache (not shared proxies)
- public: any cache (including CDNs) may store this
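A client has to split the header into directives before it can act on them. The following is a minimal sketch using only the standard library; the function name `parse_cache_control` is hypothetical, and the parser assumes no directive value contains a comma inside quotes.

```python
from __future__ import annotations


def parse_cache_control(header_value: str) -> dict[str, str | None]:
    """Split a Cache-Control header into a {directive: value} mapping.

    Simplification (assumption): splitting on commas is not safe for
    quoted values that themselves contain commas.
    """
    directives: dict[str, str | None] = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directive names are case-insensitive; values may be quoted.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


cc = parse_cache_control("private, max-age=300")
print(cc)  # {'private': None, 'max-age': '300'}
```

Directives without a value, such as `private`, map to `None`, so a check like `"no-store" in cc` is enough to test for their presence.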
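When max-age is present and the stored response is younger than that limit, your client can serve it from local storage without any network call. This is the fastest form of caching. Once the response is older than max-age, it is stale and must be refetched or revalidated.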
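The freshness decision itself is a small piece of arithmetic: compare the age of the stored response with max-age. The sketch below assumes the client recorded when it stored the response; `is_fresh` and its parameters are illustrative names, and the age calculation is a simplification of what a full HTTP cache does.

```python
import time
from typing import Optional


def is_fresh(stored_at: float, max_age: int, age_header: int = 0,
             now: Optional[float] = None) -> bool:
    """Return True while a stored response may be served without a network call."""
    if now is None:
        now = time.time()
    # Approximate current age: the Age value the response arrived with
    # (0 if absent) plus the time it has spent in our cache.
    current_age = age_header + (now - stored_at)
    return current_age < max_age


# Stored at t=1000 with max-age=300: usable until just before t=1300.
print(is_fresh(stored_at=1000.0, max_age=300, now=1299.0))  # True
print(is_fresh(stored_at=1000.0, max_age=300, now=1300.0))  # False
```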
ETags: fingerprints for responses
An ETag is a unique identifier for a specific version of a response, like a fingerprint. The server sends it with the response:
HTTP/1.1 200 OK
ETag: "abc123def456"
On the next request, your client sends the ETag back:
GET /users/42
If-None-Match: "abc123def456"
If nothing changed, the server responds with 304 Not Modified — no body, minimal data. Your client uses the cached version. If the data changed, the server sends the full new response with a new ETag.
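The exchange above can be sketched as a small client-side wrapper. To keep the example runnable without a network, the HTTP call is injected as a `send` callable and a stand-in server (`fake_send`) plays the API; both names, and the `ETagCache` class, are assumptions made for illustration rather than part of any library.

```python
from typing import Callable, Dict, Tuple

# (status, response headers, body)
Response = Tuple[int, Dict[str, str], bytes]
Send = Callable[[str, Dict[str, str]], Response]


class ETagCache:
    """Stores (ETag, body) per URL and revalidates with If-None-Match."""

    def __init__(self, send: Send) -> None:
        self._send = send
        self._entries: Dict[str, Tuple[str, bytes]] = {}

    def get(self, url: str) -> bytes:
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached is not None:
            headers["If-None-Match"] = cached[0]
        status, resp_headers, body = self._send(url, headers)
        if status == 304 and cached is not None:
            return cached[1]  # unchanged: reuse the stored body
        # Simplification: real header lookups should be case-insensitive.
        etag = resp_headers.get("ETag")
        if status == 200 and etag:
            self._entries[url] = (etag, body)
        return body


def fake_send(url: str, headers: Dict[str, str]) -> Response:
    """Stand-in for a server whose resource never changes."""
    current_etag = '"abc123def456"'
    if headers.get("If-None-Match") == current_etag:
        return 304, {"ETag": current_etag}, b""
    return 200, {"ETag": current_etag}, b'{"id": 42}'


client = ETagCache(fake_send)
first = client.get("/users/42")   # full 200 response, stored with its ETag
second = client.get("/users/42")  # 304 from the server, body from the cache
```

In a real client, `send` would wrap whatever HTTP library the service already uses.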
Last-Modified: timestamp-based validation
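Last-Modified works like ETags but uses timestamps. The server sends Last-Modified: Wed, 15 Jan 2025 10:00:00 GMT, and the client sends that value back in If-Modified-Since on the next request. The 304 mechanism works the same way.
ETags are more reliable than timestamps: Last-Modified has one-second resolution, so two changes within the same second look identical, and a resource rewritten with identical content still gets a new timestamp. Timestamps are simpler to generate, however, and are supported by more servers.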
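Both headers carry dates in the HTTP date format, which Python's standard `email.utils` module can produce and parse. The helper names `http_date` and `not_modified` below are hypothetical; the sketch shows the comparison a server would make to decide on a 304.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def http_date(dt: datetime) -> str:
    """Format an aware datetime as an HTTP date, e.g. for If-Modified-Since."""
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)


def not_modified(last_modified: str, if_modified_since: str) -> bool:
    """True if the resource has not changed since the client's copy (-> 304)."""
    return parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since)


stamp = http_date(datetime(2025, 1, 15, 10, 0, 0, tzinfo=timezone.utc))
print(stamp)  # Wed, 15 Jan 2025 10:00:00 GMT
print(not_modified(stamp, stamp))  # True
```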
Caching strategies for Python clients
Strategy 1: In-memory cache — fastest, but lost when the process restarts. Good for short-lived data in web servers.
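As a rough illustration of Strategy 1, an in-memory cache can be a dictionary whose entries carry an expiry time. `TTLCache` is a hypothetical class written for this sketch; the injectable `clock` parameter is an assumption added to make the behavior easy to test.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Process-local cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[Hashable, Tuple[float, Any]] = {}

    def set(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self._clock() + self._ttl, value)

    def get(self, key: Hashable, default: Any = None) -> Any:
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: drop it and report a miss
            return default
        return value


cache = TTLCache(ttl_seconds=300)
cache.set("/users/42", {"id": 42})
print(cache.get("/users/42"))  # {'id': 42}
```

This sketch never evicts entries that are not read again, so a long-running service would also want a size limit.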
Strategy 2: File-based cache — survives restarts, good for CLI tools and scripts. Libraries like requests-cache use SQLite by default.
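To show why a file-backed cache survives restarts, here is a minimal sketch built directly on the standard `sqlite3` module. It is not how requests-cache is implemented; the class name, table layout, and `now` parameter are assumptions for illustration.

```python
import sqlite3
import time
from typing import Optional


class SQLiteCache:
    """Stores response bodies with an expiry time in a SQLite file."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS responses ("
            "url TEXT PRIMARY KEY, body BLOB NOT NULL, expires_at REAL NOT NULL)"
        )
        self._db.commit()

    def set(self, url: str, body: bytes, ttl_seconds: float,
            now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self._db.execute(
            "INSERT OR REPLACE INTO responses (url, body, expires_at) VALUES (?, ?, ?)",
            (url, body, now + ttl_seconds),
        )
        self._db.commit()

    def get(self, url: str, now: Optional[float] = None) -> Optional[bytes]:
        now = time.time() if now is None else now
        # Only rows that have not yet expired count as hits.
        row = self._db.execute(
            "SELECT body FROM responses WHERE url = ? AND expires_at > ?",
            (url, now),
        ).fetchone()
        return row[0] if row else None
```

A script would create `SQLiteCache("http_cache.sqlite")` once, call `get` before each request, and call `set` after each successful fetch; a second run of the script sees the entries written by the first.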
Strategy 3: Redis/Memcached cache — shared across multiple processes and servers. Best for distributed services.
Strategy 4: Conditional requests only — no local storage, just send ETags/timestamps. Saves bandwidth but still requires a network round-trip. Useful when data changes frequently but responses are large.
The cache invalidation problem
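A famous quote, usually attributed to Phil Karlton, says: “There are only two hard things in computer science: cache invalidation and naming things.” Invalidation is hard because the cache has no built-in way to learn that the origin data changed; until the entry expires or is revalidated, users see outdated information.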
Mitigation approaches:
- Short TTLs — cache for seconds, not hours, for frequently changing data
- Event-driven invalidation — when data changes, explicitly delete the cache entry
- Stale-while-revalidate — serve the cached version immediately while fetching a fresh copy in the background
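The stale-while-revalidate approach from the list above could look like the following sketch. `SWRCache` and the injected `fetch` callable are hypothetical; the example uses one background thread per stale key and makes no attempt to handle fetch errors beyond leaving the old value in place.

```python
import threading
import time
from typing import Callable, Dict, Tuple


class SWRCache:
    """Serves stale entries immediately and refreshes them in the background."""

    def __init__(self, fetch: Callable[[str], bytes], ttl_seconds: float) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._data: Dict[str, Tuple[float, bytes]] = {}
        self._refreshing: Dict[str, threading.Thread] = {}

    def _refresh(self, key: str) -> None:
        try:
            value = self._fetch(key)
            with self._lock:
                self._data[key] = (time.monotonic() + self._ttl, value)
        finally:
            with self._lock:
                self._refreshing.pop(key, None)

    def get(self, key: str) -> bytes:
        with self._lock:
            entry = self._data.get(key)
        if entry is None:
            # Nothing cached yet: the first caller has to wait for the fetch.
            self._refresh(key)
            with self._lock:
                return self._data[key][1]
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            with self._lock:
                # Start at most one background refresh per key.
                if key not in self._refreshing:
                    worker = threading.Thread(
                        target=self._refresh, args=(key,), daemon=True
                    )
                    self._refreshing[key] = worker
                    worker.start()
        return value  # possibly stale, but returned without waiting
```

The trade-off is that one caller per expiry sees data older than the TTL, in exchange for never blocking on the network after the first fetch.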
What to cache and what not to
Good candidates: API responses with Cache-Control headers, reference data (country lists, config), paginated list pages, user profiles in read-heavy apps.
Bad candidates: real-time data (stock prices, chat messages), POST/PUT/DELETE responses, authentication tokens (use dedicated token storage), and personalized data that varies per user when the cache is shared between users (cache it only per user, keyed by user identity).
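Some of these rules can be checked mechanically before a response is stored. The guard below is a deliberately conservative sketch: `should_store` is a hypothetical helper, and limiting storage to 200 responses for GET and HEAD is an assumption chosen for simplicity, not a complete statement of what HTTP allows.

```python
SAFE_METHODS = {"GET", "HEAD"}


def should_store(method: str, status: int, cache_control: str = "") -> bool:
    """Conservative check: store only successful reads the server did not forbid."""
    if method.upper() not in SAFE_METHODS:
        return False  # POST/PUT/DELETE responses are not stored
    if status != 200:
        return False
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    return "no-store" not in directives


print(should_store("GET", 200, "max-age=300"))  # True
print(should_store("POST", 200, "max-age=300"))  # False
print(should_store("GET", 200, "no-store"))  # False
```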
Common misconception
Developers often implement their own caching with dictionaries, ignoring HTTP cache headers entirely. This misses the server’s guidance on cache duration and freshness. The HTTP caching protocol already solves most caching problems — use a library that respects these headers instead of reinventing cache logic from scratch.
The one thing to remember: HTTP caching combines time-based freshness (Cache-Control) with validation-based freshness (ETags) — respect both, and your Python client becomes faster while placing less load on the APIs it calls.