Python Sounddevice Recording — Core Concepts

Understand how sounddevice connects Python to audio hardware for recording, playback, and real-time streaming using PortAudio bindings and NumPy arrays.

What sounddevice does

Sounddevice provides Python bindings to PortAudio, a cross-platform audio I/O library used by applications such as Audacity. It lets you record from microphones, play through speakers, and stream audio in real time, all using NumPy arrays as the data format.

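Install with pip install sounddevice. NumPy is needed for the array-based functions used here (sd.rec, sd.play, and the Stream classes); only the Raw* stream classes work without it. The wheels for Windows and macOS bundle PortAudio; on Linux, install it through the system package manager (for example, the libportaudio2 package on Debian/Ubuntu).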

Three modes of operation

Blocking (simple)

Record a fixed duration and wait:

import sounddevice as sd
recording = sd.rec(int(3 * 44100), samplerate=44100, channels=1)
sd.wait()  # blocks until recording finishes

Play back an array:

sd.play(recording, samplerate=44100)
sd.wait()

Non-blocking (callback-based)

For real-time processing, you provide a callback function that runs every time a new chunk of audio arrives or is needed:

def callback(indata, outdata, frames, time, status):
    if status:
        print(status)  # report input overflows / output underflows
    outdata[:] = indata  # pass-through: mic → speaker

This is the heart of live audio applications — effects processors, voice changers, audio monitors.
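A callback only does something once it is attached to a stream. The following is a minimal sketch of that wiring, assuming a default input and output device are available; the names passthrough and run_passthrough are illustrative, not part of sounddevice. The import is done inside the function so the callback itself can be exercised without audio hardware.

```python
def passthrough(indata, outdata, frames, time, status):
    """Copy each input block straight to the output block."""
    if status:
        # status is truthy when PortAudio reports an overflow or underflow
        print(status)
    outdata[:] = indata


def run_passthrough(seconds=5.0, samplerate=44100, channels=1):
    """Open a full-duplex stream and let the callback run for `seconds`.

    Hypothetical helper: assumes working default input/output devices.
    """
    import sounddevice as sd  # imported here so passthrough() stays testable

    with sd.Stream(samplerate=samplerate, channels=channels,
                   callback=passthrough):
        # The callback runs on a separate audio thread; keep the main
        # thread alive while it works.
        sd.sleep(int(seconds * 1000))
```

Calling run_passthrough() routes the microphone to the speakers for five seconds. Use headphones when trying it, since a speaker feeding back into the microphone will howl.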

Stream API

sd.InputStream, sd.OutputStream, and sd.Stream (full-duplex) give you explicit control over starting, stopping, and configuring audio streams. They support both callback and read/write interfaces.
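The read/write interface can be sketched as a loop over the stream's blocking read() method, which returns a block of samples plus a flag saying whether input was lost. The helper names read_blocks and record_with_stream below are illustrative assumptions, and the second function assumes a working default input device.

```python
def read_blocks(stream, n_blocks, blocksize):
    """Read n_blocks blocks from an open input stream via blocking read().

    `stream` can be any object whose read(frames) returns
    (data, overflowed), as sd.InputStream.read() does.
    """
    blocks, overflows = [], 0
    for _ in range(n_blocks):
        data, overflowed = stream.read(blocksize)
        blocks.append(data)
        overflows += bool(overflowed)  # count blocks where input was dropped
    return blocks, overflows


def record_with_stream(seconds=2.0, samplerate=44100, blocksize=1024):
    """Record roughly `seconds` of mono audio without using a callback."""
    import numpy as np
    import sounddevice as sd

    n_blocks = int(seconds * samplerate / blocksize)
    # Entering the with-block starts the stream; leaving it stops and closes it.
    with sd.InputStream(samplerate=samplerate, channels=1) as stream:
        blocks, overflows = read_blocks(stream, n_blocks, blocksize)
    return np.concatenate(blocks), overflows
```

The blocking style is easier to reason about than a callback, at the cost of having to call read() often enough that the input buffer does not overflow.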

Device selection

sd.query_devices() lists available audio hardware. Select a device by index or name:

sd.default.device = (input_device_index, output_device_index)

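Each device entry reports its maximum input and output channel counts, default sample rate, and default low and high latencies. Settings the device cannot support raise sd.PortAudioError when the stream is opened; sd.check_input_settings() and sd.check_output_settings() let you test a configuration beforehand.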
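Picking a device by a fragment of its name can be sketched as a search over the dictionaries that sd.query_devices() returns. The helpers find_input_device and select_input are hypothetical names for this illustration; the dictionary keys "name" and "max_input_channels" are the ones sounddevice reports for each device.

```python
def find_input_device(devices, name_part):
    """Return the index of the first input-capable device whose name
    contains name_part (case-insensitive), or None if nothing matches."""
    wanted = name_part.lower()
    for index, info in enumerate(devices):
        if info["max_input_channels"] > 0 and wanted in info["name"].lower():
            return index
    return None


def select_input(name_part, samplerate=44100, channels=1):
    """Look up an input device and verify it accepts the given settings."""
    import sounddevice as sd

    index = find_input_device(sd.query_devices(), name_part)
    if index is None:
        raise ValueError(f"no input device matching {name_part!r}")
    # Raises if the device rejects this sample rate or channel count.
    sd.check_input_settings(device=index, samplerate=samplerate,
                            channels=channels)
    return index
```

The returned index can be passed as device= to sd.rec() or sd.InputStream, which avoids changing the global default.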

Sample rates and formats

Sample rate	Typical use
8 000 Hz	Telephone, low-quality voice
16 000 Hz	Speech recognition input
44 100 Hz	CD quality
48 000 Hz	Professional audio, video

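Sounddevice works with float32 by default (samples in the -1.0 to 1.0 range). The dtype parameter also accepts int32, int16, int8, and uint8. float64 is accepted by the convenience functions sd.play(), sd.rec(), and sd.playrec(), but not by the stream classes.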
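When a downstream tool expects 16-bit integers rather than floats, the conversion can be done in NumPy. This is a minimal sketch with illustrative function names; scaling by 32767 on the way out and 32768 on the way back is one common convention, not the only one.

```python
import numpy as np


def float_to_int16(samples):
    """Convert float samples in [-1.0, 1.0] to int16, clipping out-of-range values."""
    clipped = np.clip(samples, -1.0, 1.0)
    return np.round(clipped * 32767).astype(np.int16)


def int16_to_float(samples):
    """Convert int16 samples back to float32 in roughly [-1.0, 1.0)."""
    return samples.astype(np.float32) / 32768.0
```

Recording directly with dtype='int16' avoids the conversion entirely when integer samples are all that is needed.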

Latency

Latency is the delay between a sound entering the mic and your callback receiving it (input latency), or between your code producing samples and the speaker playing them (output latency). Sounddevice exposes latency settings per stream: 'low', 'high', or an explicit value in seconds. Lower latency means smaller buffers, which increases CPU overhead and the risk of glitches (buffer underruns and overruns).

For live monitoring, aim for under 20 ms. For recording-only tasks, higher latency is fine and more stable.
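The link between buffer size and delay is simple arithmetic: a block of N frames at sample rate R lasts N / R seconds. The sketch below uses illustrative helper names and covers only the buffering component; driver and hardware delays add to it, and an open stream's latency attribute reports the value PortAudio actually settled on.

```python
def block_duration_ms(frames, samplerate):
    """Duration of one block of `frames` samples, in milliseconds."""
    return 1000.0 * frames / samplerate


def max_block_frames(budget_ms, samplerate):
    """Largest block size (in frames) that fits inside a latency budget."""
    return int(budget_ms * samplerate // 1000)


for frames in (128, 256, 512, 1024):
    print(f"{frames:5d} frames at 48000 Hz = "
          f"{block_duration_ms(frames, 48000):.2f} ms")
```

At 48 000 Hz, a 20 ms budget allows blocks of up to 960 frames, so a blocksize of 512 fits while 1024 does not.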

Common misconception

Sounddevice records and plays raw PCM samples — it does not read or write audio files (MP3, WAV, etc.). To save a recording to disk, use soundfile.write(). To load an existing file, use soundfile.read() or librosa.load(), then pass the array to sounddevice.
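The division of labour can be sketched as follows: sounddevice moves samples to and from the hardware, and soundfile handles the file. The function names are illustrative, and the sketch assumes the soundfile package is installed and default audio devices are available.

```python
def frames_for(seconds, samplerate):
    """Number of frames needed to hold `seconds` of audio."""
    return int(seconds * samplerate)


def record_to_file(path, seconds=3.0, samplerate=44100, channels=1):
    """Record from the default input device and write the result to `path`."""
    import sounddevice as sd
    import soundfile as sf

    recording = sd.rec(frames_for(seconds, samplerate),
                       samplerate=samplerate, channels=channels)
    sd.wait()  # block until the recording is complete
    # soundfile infers the container format from the file extension.
    sf.write(path, recording, samplerate)


def play_file(path):
    """Load an audio file with soundfile and play it through sounddevice."""
    import sounddevice as sd
    import soundfile as sf

    data, samplerate = sf.read(path, dtype="float32")
    sd.play(data, samplerate)
    sd.wait()  # block until playback finishes
```

For example, record_to_file("take1.wav") followed by play_file("take1.wav") records three seconds and plays them back from disk.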

How it fits with other tools

Sounddevice handles the hardware interface. Combine it with Librosa for analysis, Pydub for editing, NumPy/SciPy for DSP, and soundfile for reading/writing audio formats. For GUI applications, pair it with PyQt or Tkinter for level meters and waveform displays.

One thing to remember: Sounddevice is your Python-to-hardware audio bridge — it captures mic input and sends speaker output as NumPy arrays, enabling everything from simple recordings to real-time audio processing.
