Python Work Queue Patterns — Core Concepts
The Producer-Consumer Model
Work queues implement the producer-consumer pattern. Producers create work items and put them into a queue. Consumers (workers) pull items from the queue and process them. The queue decouples producers from consumers — they don’t need to know about each other and can run at different speeds.
This decoupling gives you three benefits:
- Buffering: absorb traffic spikes without losing work
- Parallelism: multiple workers process items concurrently
- Resilience: if a worker crashes, another worker can reprocess the item, provided the queue still tracks unfinished items (see Acknowledgment Patterns)
In-Process Work Queues
queue.Queue with Threads
Python’s built-in thread-safe queue is the simplest work queue:
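import queue
import threading

work_queue = queue.Queue(maxsize=100)

def worker():
    while True:
        item = work_queue.get()
        if item is None:  # sentinel: no more work
            break
        try:
            process(item)  # process() is application-defined
        finally:
            # Always mark the item done, even if process() raised;
            # otherwise work_queue.join() below would block forever.
            work_queue.task_done()

# Start 4 worker threads
threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

# Producer (generate_tasks() is application-defined)
for task in generate_tasks():
    work_queue.put(task)

# Wait for all tasks to complete
work_queue.join()

# Signal workers to stop, then wait for the threads to exit
for _ in threads:
    work_queue.put(None)
for t in threads:
    t.join()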
task_done() and join() coordinate completion: join() blocks until every item that was put() has been matched by a task_done() call.
asyncio.Queue with Coroutines
The async equivalent for I/O-bound work:
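import asyncio

async def worker(name, queue):
    while True:
        item = await queue.get()
        try:
            await process_async(item)  # application-defined coroutine
        finally:
            # Mark the item done even on failure so queue.join() can return
            queue.task_done()

async def main():
    queue = asyncio.Queue(maxsize=100)
    workers = [asyncio.create_task(worker(f"w-{i}", queue))
               for i in range(10)]

    for task in generate_tasks():  # application-defined iterable
        await queue.put(task)

    await queue.join()

    # Workers loop forever, so cancel them and wait for the cancellations
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)

asyncio.run(main())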
Distributed Work Queues
When work needs to span multiple processes or machines, you need an external broker.
Redis Lists as Simple Queues
Redis LPUSH/BRPOP creates a basic work queue:
- Producer: LPUSH queue_name task_data
- Worker: BRPOP queue_name timeout (blocking pop)
This is simple but not reliable — if a worker crashes after popping an item, that item is lost.
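The LPUSH/BRPOP pairing can be sketched in Python roughly as follows. This assumes the redis-py client (`redis.Redis`), whose `lpush` and `brpop` methods mirror the commands above. The helper names `enqueue` and `dequeue` and the JSON encoding are illustrative choices, not part of Redis.

```python
import json

# Hypothetical helpers. `client` is assumed to be a redis-py `redis.Redis`
# instance (or anything exposing the same lpush/brpop methods).


def enqueue(client, queue_name, task):
    """Producer side: push a JSON-encoded task onto the head of the list."""
    client.lpush(queue_name, json.dumps(task))


def dequeue(client, queue_name, timeout=5):
    """Worker side: block up to `timeout` seconds for a task from the tail.

    Returns the decoded task, or None if the timeout expired.
    """
    popped = client.brpop(queue_name, timeout=timeout)
    if popped is None:
        return None
    _list_name, raw = popped  # brpop returns a (list_name, value) pair
    # If the worker crashes after this point, the task is gone for good.
    return json.loads(raw)
```

With a server running locally, `client = redis.Redis()` followed by `enqueue(client, "tasks", {"id": 1})` and `dequeue(client, "tasks")` should round-trip the task. Pushing on the left and popping on the right gives first-in, first-out order.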
Redis Reliable Queue Pattern
Use BLMOVE (Redis 6.2+) or BRPOPLPUSH (older versions; deprecated since 6.2) to atomically pop from the work queue and push to a processing list. On success, the worker removes the item from the processing list, for example with LREM. On failure, a monitor process can move items from the processing list back to the work queue.
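A rough sketch of the reserve, acknowledge, and requeue steps, again assuming a redis-py client and its `blmove`, `lmove`, and `lrem` methods. The function names and the "requeue everything" monitor policy are simplifications made for illustration.

```python
# Hypothetical helpers. `client` is assumed to be a redis-py `redis.Redis`
# instance talking to Redis 6.2 or newer (needed for BLMOVE/LMOVE).


def reserve(client, queue_name, processing_name, timeout=5):
    """Atomically move one item from the work queue to the processing list.

    Returns the raw item, or None if nothing arrived before the timeout.
    """
    # Pop from the tail of the queue and push onto the head of the
    # processing list: the same direction pair that BRPOPLPUSH uses.
    return client.blmove(queue_name, processing_name, timeout,
                         src="RIGHT", dest="LEFT")


def acknowledge(client, processing_name, raw_item):
    """On success, drop one copy of the item from the processing list."""
    client.lrem(processing_name, 1, raw_item)


def requeue_all(client, queue_name, processing_name):
    """Monitor side: move every in-flight item back to the work queue.

    Simplification: a real monitor would only requeue items whose worker
    is known to be dead or whose deadline has passed.
    """
    moved = 0
    # Requeued items rejoin the back of the line (head of the queue list).
    while client.lmove(processing_name, queue_name,
                       src="RIGHT", dest="LEFT") is not None:
        moved += 1
    return moved
```

A worker would call `reserve`, process the item, and call `acknowledge` only after processing succeeds. Anything still in the processing list after a crash is what the monitor recovers.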
Celery/RQ
For production systems, frameworks like Celery and RQ handle the queue infrastructure, retries, result storage, and monitoring. Celery typically runs on a broker such as RabbitMQ or Redis, and RQ is built on Redis. Both save you from reimplementing reliability patterns.
Queue Sizing
Should your queue be bounded or unbounded?
- Unbounded: Never blocks the producer but can consume unlimited memory. Suitable when you trust the production rate won’t outpace consumption long-term.
- Bounded: Blocks the producer when full, providing natural backpressure. Prevents memory issues but can cause producer slowdowns.
For most systems, bounded queues with monitoring on queue depth are the right default. Set the bound large enough for normal burst absorption but small enough to prevent memory exhaustion.
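The difference is easy to see with the standard library's `queue.Queue`. The sketch below uses a deliberately tiny bound of 2 and a short timeout so the "queue full" case shows up immediately. The variable names are illustrative.

```python
import queue

bounded = queue.Queue(maxsize=2)   # deliberately tiny to show the limit
bounded.put("a")
bounded.put("b")

# A plain blocking put() would wait here until a consumer frees a slot.
# With a timeout (or put_nowait) the producer gets queue.Full instead
# and can decide to shed load, retry later, or slow down.
try:
    bounded.put("c", timeout=0.1)
    accepted = True
except queue.Full:
    accepted = False

depth = bounded.qsize()            # approximate depth, useful as a metric

unbounded = queue.Queue()          # maxsize=0 (the default) means no limit
for i in range(1000):
    unbounded.put(i)               # never blocks; memory grows instead
```

Here the third `put` is rejected because no consumer is draining the bounded queue, while the unbounded queue accepts everything it is given.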
Acknowledgment Patterns
How does the system know a task was completed successfully?
- Auto-ack: The item is considered done as soon as it is pulled from the queue. Simple but unsafe: if the worker crashes, the work is lost.
- Manual ack: The worker explicitly confirms completion after processing, and unacknowledged items can be requeued. RabbitMQ's basic.ack works this way. Python's queue.Queue offers a weaker analogue in task_done(): it reports completion so that join() can return, but it does not requeue items that were never marked done.
For anything beyond toy projects, use manual acknowledgment.
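To make the manual-ack idea concrete, here is a toy in-process sketch. `AckQueue` is a hypothetical class written for this example, not a standard-library or broker API. It keeps each delivered item in an in-flight table until the worker acknowledges it, and puts it back on the queue on a negative acknowledgment.

```python
import itertools
import queue
import threading


class AckQueue:
    """Toy queue with explicit ack/nack (hypothetical, for illustration)."""

    def __init__(self):
        self._ready = queue.Queue()
        self._in_flight = {}              # delivery tag -> item
        self._tags = itertools.count(1)
        self._lock = threading.Lock()

    def put(self, item):
        self._ready.put(item)

    def get(self, timeout=None):
        """Hand out an item plus a delivery tag; the item stays tracked."""
        item = self._ready.get(timeout=timeout)
        with self._lock:
            tag = next(self._tags)
            self._in_flight[tag] = item
        return tag, item

    def ack(self, tag):
        """Worker finished successfully: forget the item."""
        with self._lock:
            self._in_flight.pop(tag)

    def nack(self, tag):
        """Worker failed: put the item back for another attempt."""
        with self._lock:
            item = self._in_flight.pop(tag)
        self._ready.put(item)

    def in_flight(self):
        with self._lock:
            return len(self._in_flight)


q = AckQueue()
q.put("resize-image-17")

tag, item = q.get()
try:
    raise RuntimeError("worker failed mid-task")   # simulated failure
except RuntimeError:
    q.nack(tag)                                    # item goes back on the queue

tag, item = q.get()                                # second attempt
q.ack(tag)                                         # success: item is forgotten
```

With auto-ack, the first `get()` would have been the end of the item. Here it survives the simulated failure because nothing removes it from tracking until `ack` is called.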
Common Misconception
“More workers always means faster processing.” Adding workers beyond the number of available CPU cores (for CPU work) or beyond the I/O concurrency of your bottleneck (for I/O work) doesn’t help. It adds overhead from context switching, lock contention, and resource competition. Profile your system to find the optimal worker count.
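One way to see the effect for I/O work is a small timing harness. In this sketch a semaphore stands in for a downstream bottleneck that allows only four concurrent calls, and `time.sleep` stands in for the I/O itself. Both are assumptions chosen for illustration, as are the names `io_task` and `run_batch`.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

BOTTLENECK = threading.Semaphore(4)   # assumed: downstream allows 4 at once


def io_task(_):
    with BOTTLENECK:
        time.sleep(0.05)              # stands in for a network or disk call


def run_batch(worker_count, task_count=8):
    """Return the wall-clock seconds needed to finish the batch."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=worker_count) as pool:
        list(pool.map(io_task, range(task_count)))
    return time.perf_counter() - start


# Compare 1, 4, and 16 workers against the same 8-task batch.
timings = {n: run_batch(n) for n in (1, 4, 16)}
```

Going from 1 to 4 workers should cut the batch time substantially. Going from 4 to 16 cannot help, because only four tasks fit through the bottleneck at a time and the extra threads simply wait on the semaphore.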
One thing to remember: The three things every production work queue needs are bounded capacity (backpressure), manual acknowledgment (reliability), and worker health monitoring (resilience). Start with these and add complexity only when needed.
See Also
- Python Dead Letter Queues: What happens to messages that can't be delivered, and why Python systems need a lost-and-found box.
- Python Delayed Task Execution: How Python programs schedule tasks to run later, like setting an alarm for your code.
- Python Distributed Locks: How Python programs take turns with shared resources, like a bathroom door lock, but for computers.
- Python Fan Out Fan In Pattern: How Python splits big jobs into small pieces, runs them all at once, then puts the results back together.
- Python Message Deduplication: Why computer messages sometimes get delivered twice, and how Python stops them from doing double damage.