Python Select and Polling — Core Concepts

The Problem: Watching Many Sockets

A chat server with 1,000 connected clients needs to know which sockets have data to read. Three naive approaches and why they fail:

  1. Blocking read on each socket — stuck waiting on Socket 1 while Socket 500 has data.
  2. One thread per socket — 1,000 threads consume memory and cause context-switching overhead.
  3. Non-blocking polling in a loop — wastes CPU spinning through sockets that have nothing.

I/O multiplexing solves this: ask the OS to watch all sockets at once and report which are ready.
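
To make the idea concrete, here is a minimal sketch that uses five local socket pairs as stand-ins for clients. The helper name ready_indexes and the pair setup are illustrative assumptions, not part of any server above; the point is only that a single call reports which sockets have data.

```python
import select
import socket

def ready_indexes(pairs, timeout=1.0):
    """Return the indexes of the pairs whose read end has data waiting."""
    read_ends = [r for r, _ in pairs]
    readable, _, _ = select.select(read_ends, [], [], timeout)
    return sorted(read_ends.index(s) for s in readable)

# Five connected socket pairs stand in for five clients (illustrative setup).
pairs = [socket.socketpair() for _ in range(5)]
pairs[1][1].sendall(b"hi")      # only "clients" 1 and 3 send anything
pairs[3][1].sendall(b"there")

ready = ready_indexes(pairs)    # one OS call covers all five sockets
print(ready)

for a, b in pairs:
    a.close()
    b.close()
```

No socket is read until the OS has said it is ready, so the thread never blocks on an idle client and never spins.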

The select Module

The oldest and most portable approach. select.select() takes three lists of file descriptors and returns which ones are ready:

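import select
import socket

def process(data):
    # Placeholder for application logic; replace with real handling.
    print(f"received {len(data)} bytes")

server = socket.socket()
server.bind(("0.0.0.0", 8080))
server.listen()
server.setblocking(False)

sockets = [server]  # every socket select() should watch for reading

while True:
    # Wait up to 1 second for any watched socket to become readable.
    readable, writable, exceptional = select.select(sockets, [], [], 1.0)

    for sock in readable:
        if sock is server:
            # A readable listening socket has a client waiting to be accepted.
            client, addr = server.accept()
            client.setblocking(False)
            sockets.append(client)
        else:
            try:
                data = sock.recv(4096)
            except ConnectionError:
                data = b""  # treat a reset connection like a clean close
            if data:
                process(data)
            else:
                # An empty read means the peer closed the connection.
                sockets.remove(sock)
                sock.close()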

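The first three arguments are the sockets to watch for readability, for writability, and for exceptional conditions (such as out-of-band data). The fourth is an optional timeout in seconds. select.select() blocks until at least one socket is ready or the timeout expires; on timeout it returns three empty lists.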
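
The timeout and the separate read and write lists can be seen in a small experiment. This is a sketch using a local socket pair; the variable names are illustrative only.

```python
import select
import socket

a, b = socket.socketpair()

# Nothing has been sent yet, so waiting for readability on `a` times out
# and all three result lists come back empty.
result = select.select([a], [], [], 0.2)
timed_out = result == ([], [], [])

# A fresh socket has room in its send buffer, so it is reported as
# writable straight away.
_, writable_now, _ = select.select([], [a], [], 0.2)

# Once the peer sends something, `a` shows up in the readable list.
b.sendall(b"ping")
readable_after_send, _, _ = select.select([a], [], [], 0.2)

print(timed_out, writable_now == [a], readable_after_send == [a])

a.close()
b.close()
```

Because an idle socket is almost always writable, servers usually put a socket in the write list only while they have queued data to send.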

Limitation: select scans every file descriptor it is given on each call, so the cost is O(n). On many systems it also cannot handle descriptors numbered at or above FD_SETSIZE, which is commonly 1,024.

The poll Alternative

select.poll() (available on Unix, not on Windows) removes the FD_SETSIZE limit and uses a cleaner registration API:

import select

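# `server` is the non-blocking listening socket from the select example.
poller = select.poll()
poller.register(server, select.POLLIN)

# poll() reports raw file descriptors, so keep a map back to socket objects.
fd_to_socket = {server.fileno(): server}

while True:
    events = poller.poll(1000)  # timeout in milliseconds

    for fd, event in events:
        sock = fd_to_socket[fd]
        if event & select.POLLIN:
            if sock is server:
                client, addr = server.accept()
                client.setblocking(False)
                poller.register(client, select.POLLIN)
                fd_to_socket[client.fileno()] = client
            else:
                data = sock.recv(4096)
                if data:
                    process(data)  # same application hook as in the select example
                else:
                    # Empty read: the peer closed, so stop watching and clean up.
                    poller.unregister(fd)
                    del fd_to_socket[fd]
                    sock.close()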

poll is still O(n) per call (the kernel scans all registered FDs), but it handles more connections and has a cleaner interface than select.
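
The event value that poll() returns is a bitmask, which is why the loop tests it with &. The sketch below decodes it on a local socket pair; the describe helper is a hypothetical convenience, not part of the select module, and the example assumes a Unix system where select.poll() exists.

```python
import select
import socket

def describe(event):
    """Translate a poll event bitmask into a list of flag names."""
    names = ["POLLIN", "POLLOUT", "POLLERR", "POLLHUP", "POLLNVAL"]
    return [n for n in names if event & getattr(select, n)]

a, b = socket.socketpair()
a_fd = a.fileno()

poller = select.poll()
poller.register(a, select.POLLIN)

idle = poller.poll(50)          # nothing sent yet: empty list after 50 ms

b.sendall(b"x")
events = poller.poll(200)       # list of (fd, event bitmask) tuples
fd, event = events[0]
flags = describe(event)
print(fd == a_fd, flags)

poller.unregister(a)
a.close()
b.close()
```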

The selectors Module

Python 3.4+ includes selectors, which automatically uses the best available mechanism for your OS:

import selectors
import socket

sel = selectors.DefaultSelector()
# DefaultSelector picks:
# - epoll on Linux
# - kqueue on macOS/BSD
# - select on Windows

server = socket.socket()
server.bind(("0.0.0.0", 8080))
server.listen()
server.setblocking(False)

def accept_connection(server_sock, mask):
    client, addr = server_sock.accept()
    client.setblocking(False)
    sel.register(client, selectors.EVENT_READ, data=handle_client)

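def handle_client(client_sock, mask):
    data = client_sock.recv(4096)
    if data:
        # Echo. On a non-blocking socket, sendall() can raise BlockingIOError
        # if the send buffer is full; a production server would queue the
        # data and wait for EVENT_WRITE instead.
        client_sock.sendall(data)
    else:
        sel.unregister(client_sock)
        client_sock.close()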

sel.register(server, selectors.EVENT_READ, data=accept_connection)

while True:
    events = sel.select(timeout=1)
    for key, mask in events:
        callback = key.data
        callback(key.fileobj, mask)

The data parameter on register() lets you attach a callback or any context to each socket. DefaultSelector abstracts away the differences between epoll, kqueue, and select.
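
The data slot is not limited to callbacks. As a sketch of the "any context" case, the example below attaches a small per-connection state object; the Connection class and its fields are hypothetical names chosen for illustration.

```python
import selectors
import socket
from dataclasses import dataclass

@dataclass
class Connection:
    """Hypothetical per-socket context stored in the selector key."""
    name: str
    bytes_seen: int = 0

sel = selectors.DefaultSelector()
a, b = socket.socketpair()
a.setblocking(False)
sel.register(a, selectors.EVENT_READ, data=Connection(name="client-a"))

b.sendall(b"hello")

for key, mask in sel.select(timeout=1):
    conn = key.data                    # the object passed to register()
    chunk = key.fileobj.recv(4096)
    conn.bytes_seen += len(chunk)

state = sel.get_key(a).data            # same object, now updated
print(state)

sel.unregister(a)
sel.close()
a.close()
b.close()
```

Keeping state on the key avoids a separate dictionary from socket to context, which the poll example had to maintain by hand.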

Comparing the Mechanisms

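Mechanism   Complexity   Max FDs                               Platform
select      O(n)         ~1,024 (FD_SETSIZE)                   All
poll        O(n)         No fixed limit (OS FD limits apply)   Unix
epoll       O(ready)     No fixed limit (OS FD limits apply)   Linux
kqueue      O(ready)     No fixed limit (OS FD limits apply)   macOS/BSD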

The critical difference: epoll and kqueue are O(number of ready FDs), not O(total FDs). With 10,000 connections where 5 have data, select/poll scan all 10,000 while epoll reports just the 5.
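
The following sketch mirrors that scenario at a smaller scale: 100 registered sockets, 3 of which have data. It shows only what the caller sees (a result list containing just the ready sockets); it does not measure kernel cost, and the counts are arbitrary choices for illustration.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

# Register the read end of 100 socket pairs, tagging each with its index.
pairs = [socket.socketpair() for _ in range(100)]
for i, (read_end, _) in enumerate(pairs):
    sel.register(read_end, selectors.EVENT_READ, data=i)

# Only three of the hundred "clients" send anything.
for i in (3, 41, 97):
    pairs[i][1].sendall(b"x")

events = sel.select(timeout=1)
ready = sorted(key.data for key, _ in events)
print(len(pairs), "registered,", len(ready), "reported ready:", ready)

sel.close()
for read_end, write_end in pairs:
    read_end.close()
    write_end.close()
```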

How This Relates to asyncio

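On Unix, asyncio's default event loop uses selectors.DefaultSelector internally (on Windows, the default loop since Python 3.8 is built on I/O completion ports instead). When you write: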

data = await reader.read(4096)

under the hood, asyncio has already registered the socket with the selector. If no data is available yet, the await suspends the coroutine and the event loop returns to selector.select(). When the socket has data, the selector reports it and the event loop resumes your coroutine.
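
A rough sketch of that cycle can be written by hand with loop.add_reader(), which registers a callback for readability on the loop's selector. The read_when_ready helper is a hypothetical name, and the example assumes a selector-based event loop (the default on Unix); the Windows proactor loop does not support add_reader().

```python
import asyncio
import socket

async def read_when_ready(sock):
    """Suspend until `sock` is readable, then return what recv() yields."""
    loop = asyncio.get_running_loop()
    fut = loop.create_future()

    def on_readable():
        # Called by the event loop once the selector reports the socket.
        loop.remove_reader(sock)
        fut.set_result(sock.recv(4096))

    loop.add_reader(sock, on_readable)   # register with the loop's selector
    return await fut                     # coroutine suspends here

async def main():
    a, b = socket.socketpair()
    a.setblocking(False)
    try:
        b.sendall(b"ping")
        return await read_when_ready(a)
    finally:
        a.close()
        b.close()

result = asyncio.run(main())
print(result)
```

asyncio's stream and transport classes do this bookkeeping for you, along with buffering and error handling that the sketch leaves out.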

Understanding select/poll is understanding what asyncio does for you automatically.

Common Misconception

“You need to choose between select, poll, and epoll.” Not anymore. Use selectors.DefaultSelector() and let Python pick the best option for your platform. You only need the low-level select module for legacy code or very specific requirements.
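
To see which mechanism Python picked on a given machine, you can inspect the class behind DefaultSelector. This is a small sketch; the printed name depends on the platform.

```python
import selectors

sel = selectors.DefaultSelector()
chosen = type(sel).__name__   # e.g. "EpollSelector" on Linux
print(chosen)
sel.close()
```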

One thing to remember: I/O multiplexing lets one thread watch thousands of sockets efficiently by asking the OS “who’s ready?” — use the selectors module for the right abstraction, or let asyncio handle it entirely.
