Python Select and Polling — Core Concepts

I/O multiplexing with Python's select and selectors modules (including select.poll): watching multiple sockets without threads.

The Problem: Watching Many Sockets

A chat server with 1,000 connected clients needs to know which sockets have data to read. Three naive approaches and why they fail:

Blocking read on each socket — stuck waiting on Socket 1 while Socket 500 has data.
One thread per socket — 1,000 threads consume memory and cause context-switching overhead.
Non-blocking polling in a loop — wastes CPU spinning through sockets that have nothing.

I/O multiplexing solves this: ask the OS to watch all sockets at once and report which are ready.

The `select` Module

The oldest and most portable approach. select.select() takes three lists of file descriptors and returns which ones are ready:

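import select
import socket

def process(data):
    # Placeholder handler; replace with real application logic.
    print(f"received {len(data)} bytes")

server = socket.socket()
server.bind(("0.0.0.0", 8080))
server.listen()
server.setblocking(False)

sockets = [server]  # every socket select should watch for reads

while True:
    # Wait up to 1 second for any watched socket to become readable.
    readable, writable, exceptional = select.select(sockets, [], [], 1.0)

    for sock in readable:
        if sock is server:
            # A readable listening socket means a connection is waiting.
            client, addr = server.accept()
            client.setblocking(False)
            sockets.append(client)
        else:
            data = sock.recv(4096)
            if data:
                process(data)
            else:
                # An empty read means the peer closed the connection.
                sockets.remove(sock)
                sock.close()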

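The first three arguments are the sockets to watch for readability, writability, and exceptional conditions (such as out-of-band data, not ordinary errors). The fourth is an optional timeout in seconds. select.select() blocks until at least one socket is ready or the timeout expires, and returns three lists holding the ready subsets of the inputs. If the timeout expires first, all three lists are empty.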
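The return values and the timeout can be seen in isolation with a connected socket pair. This is a minimal sketch, not part of the server above: demo_select is a hypothetical name, and socket.socketpair() is used only to avoid binding a real port.

```python
import select
import socket
import time


def demo_select():
    # socketpair() returns two connected sockets; no network port is needed.
    a, b = socket.socketpair()
    try:
        # Nothing has been sent yet, so select waits out the timeout
        # and returns three empty lists.
        start = time.monotonic()
        idle = select.select([a], [], [], 0.2)
        waited = time.monotonic() - start

        # Once the peer sends data, `a` appears in the readable list.
        b.sendall(b"hello")
        readable, _, _ = select.select([a], [], [], 1.0)

        # A socket with free send-buffer space is reported as writable.
        _, writable, _ = select.select([], [b], [], 1.0)

        return {
            "idle": idle,
            "waited": waited,
            "readable_after_send": len(readable) == 1 and readable[0] is a,
            "writable": len(writable) == 1 and writable[0] is b,
            "data": a.recv(4096),
        }
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo_select())
```

The same socket can appear in more than one input list, which is how a server waits for "readable or writable" on a connection that has pending output.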

Limitation: select is O(n) per call, because every file descriptor passed in is scanned. On many Unix systems it also cannot handle file descriptors numbered at or above FD_SETSIZE, which is commonly 1,024.

The `poll` Alternative

select.poll() removes the FD_SETSIZE limit and uses a cleaner registration API:

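import select

# `server` is the non-blocking listening socket created in the select example.
poller = select.poll()
poller.register(server, select.POLLIN)

# poll() reports raw file descriptors, so keep a map back to socket objects.
fd_to_socket = {server.fileno(): server}

while True:
    events = poller.poll(1000)  # timeout in milliseconds

    for fd, event in events:
        sock = fd_to_socket[fd]
        if event & select.POLLIN:
            if sock is server:
                client, addr = server.accept()
                client.setblocking(False)
                poller.register(client, select.POLLIN)
                fd_to_socket[client.fileno()] = client
            else:
                data = sock.recv(4096)
                if data:
                    process(data)  # application-defined handler, as in the select example
                else:
                    # Empty read: the peer closed the connection.
                    poller.unregister(fd)
                    del fd_to_socket[fd]
                    sock.close()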

poll is still O(n) per call (the kernel scans all registered FDs), but it handles more connections and has a cleaner interface than select.
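The shape of what poll() returns, a list of (fd, eventmask) pairs, is easier to see without the server loop around it. The sketch below is an illustration under assumptions: demo_poll is a hypothetical name, a socket pair stands in for a client connection, and the function returns None on platforms where select.poll does not exist (notably Windows).

```python
import select
import socket


def demo_poll():
    # select.poll is not available on every platform.
    if not hasattr(select, "poll"):
        return None

    a, b = socket.socketpair()
    poller = select.poll()
    # Interest is expressed as a bitmask; POLLIN means "readable".
    poller.register(a, select.POLLIN)
    try:
        # Timeout is in milliseconds. Nothing is pending, so this returns [].
        idle = poller.poll(100)

        b.sendall(b"ping")
        events = poller.poll(1000)  # list of (fd, eventmask) tuples
        fd, mask = events[0]
        return {
            "idle": idle,
            "fd_matches": fd == a.fileno(),
            "readable": bool(mask & select.POLLIN),
        }
    finally:
        poller.unregister(a)
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo_poll())
```

Because the mask is a bitfield, a single event can carry several flags at once, so it should be tested with `&` rather than compared with `==`.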

The `selectors` Module (Recommended)

Python 3.4+ includes selectors, which automatically uses the best available mechanism for your OS:

import selectors
import socket

sel = selectors.DefaultSelector()
# DefaultSelector picks:
# - epoll on Linux
# - kqueue on macOS/BSD
# - select on Windows

server = socket.socket()
server.bind(("0.0.0.0", 8080))
server.listen()
server.setblocking(False)

def accept_connection(server_sock, mask):
    client, addr = server_sock.accept()
    client.setblocking(False)
    sel.register(client, selectors.EVENT_READ, data=handle_client)

def handle_client(client_sock, mask):
    data = client_sock.recv(4096)
    if data:
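        # Echo. Note: sendall() on a non-blocking socket can raise BlockingIOError
        # if the send buffer fills; robust servers buffer output and wait for EVENT_WRITE.
        client_sock.sendall(data)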
    else:
        sel.unregister(client_sock)
        client_sock.close()

sel.register(server, selectors.EVENT_READ, data=accept_connection)

while True:
    events = sel.select(timeout=1)
    for key, mask in events:
        callback = key.data
        callback(key.fileobj, mask)

The data parameter on register() lets you attach a callback or any context to each socket. DefaultSelector abstracts away the differences between epoll, kqueue, and select.
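The data argument does not have to be a callback. As a sketch of the "any context" case, the example below attaches a small per-connection state object instead. The names demo_selector_data, state, and "client-a" are illustrative assumptions, and a socket pair stands in for a real client.

```python
import selectors
import socket
from types import SimpleNamespace


def demo_selector_data():
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    a.setblocking(False)

    # Attach arbitrary per-socket context instead of a callback.
    state = SimpleNamespace(name="client-a", received=b"")
    sel.register(a, selectors.EVENT_READ, data=state)
    try:
        b.sendall(b"hi")
        for key, mask in sel.select(timeout=1):
            # key.fileobj is the registered socket; key.data is the
            # object that was passed to register().
            if mask & selectors.EVENT_READ:
                key.data.received += key.fileobj.recv(4096)
        return state.name, state.received
    finally:
        sel.unregister(a)
        sel.close()
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo_selector_data())
```

Keeping buffers and identifiers on the attached object avoids a separate dictionary keyed by socket, which is the bookkeeping the poll example had to do by hand.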

Comparing the Mechanisms

Mechanism	Complexity	Max FDs	Platform
`select`	O(n)	FD_SETSIZE (often 1,024)	All
`poll`	O(n)	OS FD limit	Unix
`epoll`	O(ready)	OS FD limit	Linux
`kqueue`	O(ready)	OS FD limit	macOS/BSD

The critical difference: epoll and kqueue are O(number of ready FDs), not O(total FDs). With 10,000 connections where 5 have data, select/poll scan all 10,000 while epoll reports just the 5.
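A scaled-down version of that scenario can be run locally. The sketch below registers 50 idle sockets, makes 5 of them readable, and checks how many events come back and which selector class was chosen. It does not measure scanning cost; it only shows that the caller receives the ready sockets. count_ready and its parameters are hypothetical, and the socket pairs are stand-ins for client connections.

```python
import selectors
import socket


def count_ready(total=50, active=5):
    sel = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(total)]
    try:
        # Watch the read side of every pair.
        for reader, _writer in pairs:
            sel.register(reader, selectors.EVENT_READ)

        # Make only the first `active` readers readable.
        for _reader, writer in pairs[:active]:
            writer.sendall(b"x")

        events = sel.select(timeout=1)
        # The class name reveals the backend, e.g. EpollSelector on Linux.
        return type(sel).__name__, len(events)
    finally:
        sel.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()


if __name__ == "__main__":
    print(count_ready())
```

The loop that handles the result iterates over the returned events only, so application-side work also scales with the number of ready sockets rather than the number registered.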

How This Relates to asyncio

On Unix, asyncio’s default event loop is built on selectors.DefaultSelector. (On Windows, the default loop since Python 3.8 uses I/O completion ports instead.) When you write:

data = await reader.read(4096)

Under the hood, asyncio has registered the socket with the selector. If no data is available at the await, the coroutine suspends and the event loop returns to selector.select(). When the socket has data, the selector reports it and the event loop resumes your coroutine.
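The suspend-and-resume cycle can be imitated in a few lines with generators. This is a toy model under loud assumptions, not how asyncio is actually implemented (asyncio uses futures, callbacks, and transports): reader_task, run, and demo are hypothetical names, `yield sock` plays the role of `await`, and each task is assumed to wait on the same socket for its whole life.

```python
import selectors
import socket


def reader_task(sock, log):
    # A generator standing in for a coroutine. `yield sock` hands control
    # back to the loop until `sock` is reported readable.
    while True:
        yield sock
        data = sock.recv(4096)
        if not data:
            return  # peer closed the connection
        log.append(data)


def run(tasks):
    sel = selectors.DefaultSelector()
    # Start each task; it runs to its first yield and names its socket.
    for task in tasks:
        sel.register(next(task), selectors.EVENT_READ, data=task)

    # Keep going while any task is still registered.
    while sel.get_map():
        for key, _mask in sel.select():
            try:
                next(key.data)  # resume the task until its next yield
            except StopIteration:
                sel.unregister(key.fileobj)  # task finished
    sel.close()


def demo():
    a, b = socket.socketpair()
    log = []
    b.sendall(b"one")
    b.close()  # after the data, recv() on `a` returns b""
    run([reader_task(a, log)])
    a.close()
    return log


if __name__ == "__main__":
    print(demo())
```

The loop never blocks inside a task; it only blocks in sel.select(), which is the property that lets a single thread serve many connections.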

Understanding select/poll is understanding what asyncio does for you automatically.

Common Misconception

“You need to choose between select, poll, and epoll.” Not anymore. Use selectors.DefaultSelector() and let Python pick the best option for your platform. You only need the low-level select module for legacy code or very specific requirements.

One thing to remember: I/O multiplexing lets one thread watch thousands of sockets efficiently by asking the OS “who’s ready?” — use the selectors module for the right abstraction, or let asyncio handle it entirely.
