Python SSE Client Consumption — Core Concepts
Why this matters
Server-Sent Events (SSE) are becoming the default streaming protocol for AI applications. When you interact with ChatGPT, Claude, or another LLM API that streams responses token by token, the server typically uses SSE. Beyond AI, SSE powers real-time dashboards, live notifications, and event-driven architectures. Python developers consuming these streams need to understand the protocol, handle reconnections, and process events efficiently.
How SSE works
SSE is a one-directional protocol built on HTTP. The client makes a regular HTTP GET request, and the server responds with Content-Type: text/event-stream and keeps the connection open, sending events as plain text.
Each event is a block of text with specific fields:
event: price_update
data: {"symbol": "AAPL", "price": 178.52}
id: 1234

event: price_update
data: {"symbol": "GOOGL", "price": 141.80}
id: 1235

Key fields:
| Field | Purpose |
|---|---|
| data | The event payload (required) |
| event | Event type name (optional, defaults to “message”) |
| id | Unique event ID for reconnection |
| retry | Server-suggested reconnection delay in milliseconds |
Events are separated by blank lines. Multi-line data uses multiple data: lines that get concatenated with newlines.
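The field rules above can be sketched as a small parser. This is a minimal illustration, not a complete implementation of the specification: `SSEEvent` and `parse_sse` are hypothetical names, and attaching `retry` to each event is a simplification made for this example.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class SSEEvent:
    event: str = "message"      # default type when no event: field is sent
    data: str = ""
    id: Optional[str] = None
    retry: Optional[int] = None  # milliseconds, if the server sent one


def parse_sse(lines: Iterable[str]) -> Iterator[SSEEvent]:
    """Yield one SSEEvent per blank-line-terminated block of lines."""
    event_type, data_lines, event_id, retry = "message", [], None, None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line ends the block; blocks without data are dropped.
            if data_lines:
                yield SSEEvent(event_type, "\n".join(data_lines), event_id, retry)
            # The last seen id is kept on purpose: it persists across events.
            event_type, data_lines, retry = "message", [], None
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is dropped
        if field == "data":
            data_lines.append(value)
        elif field == "event":
            event_type = value or "message"
        elif field == "id":
            event_id = value
        elif field == "retry" and value.isdigit():
            retry = int(value)


# Usage with an in-memory stream: one typed event, one multi-line event.
stream = [
    "event: price_update",
    'data: {"symbol": "AAPL", "price": 178.52}',
    "id: 1234",
    "",
    "data: line one",
    "data: line two",
    "",
]
events = list(parse_sse(stream))
price = json.loads(events[0].data)["price"]
```

The second event has no `event:` field, so it falls back to the "message" type, and its two `data:` lines are joined with a newline.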
SSE vs WebSockets vs polling
| Feature | SSE | WebSockets | Polling |
|---|---|---|---|
| Direction | Server → Client only | Bidirectional | Client-initiated request/response |
| Protocol | HTTP | WebSocket (ws://, wss://) after an HTTP upgrade | HTTP |
| Reconnection | Built-in with Last-Event-ID | Manual | Not applicable |
| Proxy/firewall friendly | Yes (regular HTTP) | Sometimes blocked | Yes |
| Complexity | Low | High | Low |
| Best for | Live feeds, AI streaming | Chat, gaming, collaboration | Low-frequency checks |
Choose SSE when you need server-to-client streaming over standard HTTP. Choose WebSockets when you need bidirectional communication. Choose polling when updates are infrequent (less than once per minute).
Consuming SSE in Python
The basic pattern for consuming SSE uses an HTTP library that supports streaming responses. Python’s httpx and requests both support this, and dedicated libraries like sseclient-py and httpx-sse parse the event format automatically.
The client:
- Opens a GET request with Accept: text/event-stream.
- Reads the response as a stream (not buffered to completion).
- Parses each text block into event objects.
- Processes events as they arrive.
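The four steps above can be sketched with only the standard library. This is an illustrative sketch, not a production client: `stream_events` is a hypothetical name, the URL in the usage comment is a placeholder, and the inline parsing handles only the `data`, `event`, and `id` fields. With requests or httpx the structure would be the same, iterating over the lines of a streamed response (`stream=True` plus `iter_lines()` in requests, `httpx.stream(...)` plus `iter_lines()` in httpx).

```python
import urllib.request
from typing import Iterator


def stream_events(url: str, timeout: float = 30.0) -> Iterator[dict]:
    """Open an SSE stream and yield each event as a dict as it arrives."""
    # Step 1: GET request that asks for an event stream.
    request = urllib.request.Request(url, headers={"Accept": "text/event-stream"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        event = {"event": "message", "data": []}
        # Step 2: iterate line by line instead of reading the whole body.
        for raw in response:
            line = raw.decode("utf-8").rstrip("\r\n")
            if not line:
                # Step 3: a blank line closes the block; build the event.
                if event["data"]:
                    event["data"] = "\n".join(event["data"])
                    yield event  # Step 4: hand it to the caller immediately.
                event = {"event": "message", "data": []}
            elif not line.startswith(":"):  # skip comment/keep-alive lines
                field, _, value = line.partition(":")
                value = value[1:] if value.startswith(" ") else value
                if field == "data":
                    event["data"].append(value)
                elif field in ("event", "id"):
                    event[field] = value


# Usage (placeholder URL, not a real endpoint):
# for event in stream_events("https://example.com/stream"):
#     print(event["event"], event["data"])
```

Because `stream_events` is a generator, the caller sees each event as soon as its terminating blank line has been read, and the connection stays open for as long as the loop keeps iterating.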
Reconnection handling
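The SSE specification defines reconnection behavior. Browsers implement it automatically through EventSource; a Python client built on a general-purpose HTTP library typically has to follow the same steps itself. When the connection drops: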
- The client waits for the retry interval (default is usually 3 seconds).
- The client reconnects, sending a Last-Event-ID header with the ID of the last received event.
- The server resumes streaming from after that event.
This means properly implemented SSE streams do not lose events during brief network interruptions — the server fills in the gap on reconnection.
However, not all servers support Last-Event-ID. For servers that do not, the client must handle potential gaps in the event stream at the application level.
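The reconnection steps can be sketched as a loop that remembers the last event ID. This is a sketch under stated assumptions: `consume_with_reconnect` is a hypothetical helper, and `connect` stands for any callable you supply that opens the stream (sending a Last-Event-ID header when given an ID) and yields event dicts with optional "id" and "retry" keys. Returning on a clean close and capping the number of reconnects are choices made for this example.

```python
import time
from typing import Callable, Iterable, Optional


def consume_with_reconnect(
    connect: Callable[[Optional[str]], Iterable[dict]],
    handle: Callable[[dict], None],
    retry_seconds: float = 3.0,
    max_reconnects: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Consume events, reconnecting with the last seen ID after a drop."""
    last_event_id: Optional[str] = None
    reconnects = 0
    while True:
        try:
            for event in connect(last_event_id):
                if event.get("id") is not None:
                    last_event_id = event["id"]
                if event.get("retry") is not None:
                    retry_seconds = event["retry"] / 1000  # field is in ms
                handle(event)
                reconnects = 0  # a delivered event shows the link works
            return last_event_id  # the server closed the stream cleanly
        except OSError:  # covers ConnectionError and TimeoutError
            reconnects += 1
            if reconnects > max_reconnects:
                raise
            sleep(retry_seconds)


# Usage with a fake connection that drops once after two events.
calls = []

def fake_connect(last_id):
    calls.append(last_id)
    if last_id is None:
        yield {"id": "1", "data": "a"}
        yield {"id": "2", "data": "b", "retry": 500}
        raise ConnectionError("dropped")
    yield {"id": "3", "data": "c"}

seen, waits = [], []
result = consume_with_reconnect(fake_connect, seen.append, sleep=waits.append)
```

In this run the second call to `fake_connect` receives "2", which is the value a real client would send in the Last-Event-ID header. When a server ignores that header, the IDs recorded here are also what the application can use to detect a gap.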
Processing patterns
Events can be processed in two ways:
Inline processing — handle each event as it arrives. Good for simple transformations and low-latency responses.
Buffered processing — collect events and process in batches. Good for database writes and analytics where individual event latency is less important than throughput.
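The two patterns can be compared side by side. This is a minimal sketch: `process_inline`, `process_buffered`, and the `flush` callback are hypothetical names, and the buffer is flushed by size only. A real consumer would probably also flush on a timer so a quiet stream does not hold events back indefinitely.

```python
from typing import Callable, Iterable, List


def process_inline(events: Iterable[dict], handle: Callable[[dict], None]) -> None:
    """Handle each event the moment it arrives."""
    for event in events:
        handle(event)


def process_buffered(
    events: Iterable[dict],
    flush: Callable[[List[dict]], None],
    batch_size: int = 100,
) -> None:
    """Collect events and pass them to flush() in batches."""
    batch: List[dict] = []
    for event in events:
        batch.append(event)
        if len(batch) >= batch_size:
            flush(batch)
            batch = []  # start a new list so flushed batches stay intact
    if batch:
        flush(batch)  # write whatever is left when the stream ends


# Usage: five events, batches of two.
sample = [{"data": str(n)} for n in range(5)]

handled: List[dict] = []
process_inline(sample, handled.append)

batches: List[List[dict]] = []
process_buffered(sample, batches.append, batch_size=2)
batch_sizes = [len(b) for b in batches]
```

With a database as the sink, `flush` would be the place for a single bulk insert per batch instead of one write per event.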
Common misconception
“SSE connections waste resources because they stay open forever.” An idle SSE connection is cheap on an event-driven server: it holds an open socket and a small amount of buffer memory, typically a few kilobytes, on both client and server. That usually costs less than polling every few seconds, where every poll pays for a full request and response. Asynchronous servers can hold very large numbers of concurrent SSE connections; the cost becomes significant mainly on servers that dedicate a thread or process to each connection.
When SSE breaks down
SSE has limitations:
- Client-to-server communication requires separate HTTP requests.
- Binary data is not supported natively (use Base64 encoding or switch to WebSockets).
- Some corporate proxies buffer responses, which delays event delivery. Setting the X-Accel-Buffering: no response header on the server disables buffering in nginx.
- Browsers limit SSE connections to 6 per domain over HTTP/1.1 (not relevant for Python clients).
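The Base64 workaround for binary payloads can be sketched as follows. This is an illustration only: the helper names and the "binary" event type are hypothetical, and both sides of the stream have to agree on whatever convention is used.

```python
import base64


def encode_binary_event(payload: bytes, event_type: str = "binary") -> str:
    """Server side: wrap raw bytes in a text-only SSE event block."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"event: {event_type}\ndata: {encoded}\n\n"


def decode_binary_data(data: str) -> bytes:
    """Client side: recover the raw bytes from an event's data field."""
    return base64.b64decode(data)


# Usage: round-trip a few bytes that are not valid UTF-8 text.
original = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
block = encode_binary_event(original)
data_field = block.splitlines()[1][len("data: "):]
restored = decode_binary_data(data_field)
```

Base64 makes the payload roughly a third larger, which is one reason large or frequent binary transfers are usually a better fit for WebSockets.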
One thing to remember: SSE is HTTP streaming with built-in reconnection and event parsing. For server-to-client data flows — especially AI response streaming — it is simpler and more reliable than WebSockets. Your Python client opens one connection and receives a continuous stream of structured events.