Python Sounddevice Recording — Core Concepts
What sounddevice does
Sounddevice provides Python bindings to PortAudio, the cross-platform audio I/O library used by applications such as Audacity. It lets you record from microphones, play through speakers, and stream audio in real time, all using NumPy arrays as the data format.
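Install with pip install sounddevice. The array-based functions shown here require NumPy. The wheels for Windows and macOS bundle PortAudio; on Linux it is usually installed separately through the system package manager (for example, the libportaudio2 package on Debian/Ubuntu).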
Three modes of operation
Blocking (simple)
Record a fixed duration and wait:
import sounddevice as sd
recording = sd.rec(int(3 * 44100), samplerate=44100, channels=1)
sd.wait() # blocks until recording finishes
Play back an array:
sd.play(recording, samplerate=44100)
sd.wait()
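To try playback without a microphone, one option is to synthesize a test tone with NumPy and hand it to sd.play(). This is a minimal sketch assuming a mono float32 signal; the helper names sine_tone and play_tone are hypothetical and not part of sounddevice.

```python
import numpy as np


def sine_tone(frequency, duration, samplerate=44100, amplitude=0.2):
    """Return a mono float32 sine wave with shape (frames,)."""
    frames = int(duration * samplerate)
    t = np.arange(frames) / samplerate  # time of each sample in seconds
    return (amplitude * np.sin(2 * np.pi * frequency * t)).astype(np.float32)


def play_tone(frequency=440.0, duration=1.0, samplerate=44100):
    """Play the tone on the default output device and wait for it to end."""
    # Imported here so sine_tone() stays usable on machines without PortAudio.
    import sounddevice as sd

    sd.play(sine_tone(frequency, duration, samplerate), samplerate=samplerate)
    sd.wait()
```

Calling play_tone() should produce one second of a 440 Hz tone on the default output device. The low default amplitude is a deliberate choice to keep the volume moderate.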
Non-blocking (callback-based)
For real-time processing, you provide a callback function that runs every time a new chunk of audio arrives or is needed:
def callback(indata, outdata, frames, time, status):
    outdata[:] = indata  # pass-through: mic → speaker
This is the heart of live audio applications — effects processors, voice changers, audio monitors.
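As a rough illustration of how such a callback might be wired into a running stream, the sketch below attenuates the input before copying it to the output. GAIN and run_passthrough are hypothetical names chosen for this example; the five-second duration is arbitrary.

```python
import numpy as np

GAIN = 0.5  # hypothetical scale factor applied to the pass-through signal


def callback(indata, outdata, frames, time, status):
    """Copy input to output, attenuated by GAIN."""
    if status:
        # PortAudio reports overflows/underflows here; printing is enough for a demo.
        print(status)
    outdata[:] = indata * GAIN


def run_passthrough(seconds=5.0, samplerate=44100, channels=1):
    """Open a full-duplex stream and let the callback run for a while."""
    import sounddevice as sd  # imported lazily; requires audio hardware

    with sd.Stream(samplerate=samplerate, channels=channels, callback=callback):
        sd.sleep(int(seconds * 1000))  # sd.sleep() takes milliseconds
```

The callback runs on a separate audio thread, so it should do as little work as possible. Heavy processing or blocking calls inside it tend to cause audible dropouts.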
Stream API
sd.InputStream, sd.OutputStream, and sd.Stream (full-duplex) give you explicit control over starting, stopping, and configuring audio streams. They support both callback and read/write interfaces.
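The read/write interface can be sketched as follows: instead of supplying a callback, the code pulls fixed-size blocks from an open sd.InputStream. The helper names read_blocks and record_with_stream, and the block size of 1024 frames, are assumptions made for this example.

```python
import numpy as np


def read_blocks(stream, blocks, blocksize=1024):
    """Read `blocks` chunks from an open input stream and join them."""
    chunks = []
    for _ in range(blocks):
        data, overflowed = stream.read(blocksize)  # waits until data is ready
        if overflowed:
            print("warning: input overflow, some samples were dropped")
        chunks.append(data.copy())
    return np.concatenate(chunks)


def record_with_stream(blocks=10, samplerate=44100, channels=1):
    """Record blocks * 1024 frames using sd.InputStream without a callback."""
    import sounddevice as sd  # imported lazily; requires audio hardware

    # The context manager starts the stream on entry and closes it on exit.
    with sd.InputStream(samplerate=samplerate, channels=channels) as stream:
        return read_blocks(stream, blocks)
```

Because read_blocks only relies on a read() method, it can be exercised with a stand-in object when no audio hardware is available.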
Device selection
sd.query_devices() lists available audio hardware. Select a device by index or name:
sd.default.device = (input_device_index, output_device_index)
Each device has supported sample rates, channel counts, and latency ranges. Mismatched settings cause PortAudio errors at stream open time.
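One way to avoid mismatched settings is to filter the device list for inputs and validate the chosen settings before opening a stream. The sketch below assumes the device entries behave like dictionaries with "name" and "max_input_channels" keys, as returned by sd.query_devices(); input_devices and find_input are hypothetical helper names.

```python
def input_devices(devices):
    """Return (index, name) pairs for devices that can capture audio."""
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0
    ]


def find_input(name_fragment, samplerate=44100, channels=1):
    """Return the index of the first matching input device, after validation."""
    import sounddevice as sd  # imported lazily; requires PortAudio

    for index, name in input_devices(sd.query_devices()):
        if name_fragment.lower() in name.lower():
            # Raises an exception if the device rejects these settings.
            sd.check_input_settings(
                device=index, samplerate=samplerate, channels=channels
            )
            return index
    raise LookupError(f"no input device matching {name_fragment!r}")
```

The returned index can then be passed as device=... to sd.rec() or sd.InputStream, or assigned through sd.default.device.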
Sample rates and formats
| Sample rate | Typical use |
|---|---|
| 8 000 Hz | Telephone, low-quality voice |
| 16 000 Hz | Speech recognition input |
| 44 100 Hz | CD quality |
| 48 000 Hz | Professional audio, video |
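Sounddevice uses float32 by default (samples in the -1.0 to 1.0 range). Streams also accept int32, int16, int8, and uint8 via the dtype parameter. float64 is accepted only by the convenience functions sd.play(), sd.rec(), and sd.playrec(), which convert to float32 internally because PortAudio has no native 64-bit float format.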
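When a downstream tool expects a different sample format than the one recorded, the conversion can be done in NumPy. The following is a minimal sketch of one common scaling convention between float32 and int16; the function names are hypothetical, and other conventions (for example, scaling by 32768 in both directions) are also in use.

```python
import numpy as np


def float_to_int16(samples):
    """Scale float samples in [-1.0, 1.0] to int16, clipping out-of-range values."""
    clipped = np.clip(samples, -1.0, 1.0)
    return (clipped * 32767).astype(np.int16)  # astype truncates toward zero


def int16_to_float(samples):
    """Scale int16 samples to float32 in the range [-1.0, 1.0)."""
    return samples.astype(np.float32) / 32768.0
```

Clipping before the cast matters: without it, values outside the -1.0 to 1.0 range would not fit in int16 and the result would be distorted.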
Latency
Latency is the delay between a sound entering the mic and your callback receiving it, or, on the output side, between your callback writing samples and the speaker playing them. Sounddevice exposes latency settings per stream: 'low', 'high', or an explicit value in seconds. Lower latency means smaller buffers, which increases CPU load and the risk of glitches (input overflows and output underruns).
For live monitoring, aim for under 20 ms. For recording-only tasks, higher latency is fine and more stable.
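The link between buffer size and latency is simple arithmetic: one block of N frames at sample rate R lasts N / R seconds. The helpers below (hypothetical names) sketch that calculation. The result is only the duration of a single block, so the total latency reported by the driver will typically be higher.

```python
def block_duration_ms(blocksize, samplerate):
    """Duration of one audio block in milliseconds."""
    return 1000.0 * blocksize / samplerate


def max_blocksize(target_ms, samplerate):
    """Largest block size (in frames) whose duration fits within target_ms."""
    return int(target_ms * samplerate / 1000)
```

For example, a 1024-frame block at 44 100 Hz lasts about 23.2 ms, which already exceeds a 20 ms budget, while 480 frames at 48 000 Hz lasts exactly 10 ms.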
Common misconception
Sounddevice records and plays raw PCM samples — it does not read or write audio files (MP3, WAV, etc.). To save a recording to disk, use soundfile.write(). To load an existing file, use soundfile.read() or librosa.load(), then pass the array to sounddevice.
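To show what saving to disk involves, here is a dependency-free sketch that writes a recording as a 16-bit WAV file using Python's standard wave module instead of soundfile. It assumes the samples are already int16, for example from sd.rec(..., dtype='int16'); write_wav_int16 is a hypothetical helper name.

```python
import wave

import numpy as np


def write_wav_int16(path, samples, samplerate):
    """Write an int16 array of shape (frames,) or (frames, channels) as WAV."""
    samples = np.asarray(samples)
    if samples.dtype != np.int16:
        raise TypeError("expected int16 samples, e.g. sd.rec(..., dtype='int16')")
    channels = 1 if samples.ndim == 1 else samples.shape[1]
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(samplerate)
        # WAV stores little-endian samples with channels interleaved per frame.
        wav.writeframes(samples.astype("<i2").tobytes())
```

With soundfile installed, a single soundfile.write() call covers the same ground and also handles float data and other formats.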
How it fits with other tools
Sounddevice handles the hardware interface. Combine it with Librosa for analysis, Pydub for editing, NumPy/SciPy for DSP, and soundfile for reading/writing audio formats. For GUI applications, pair it with PyQt or Tkinter for level meters and waveform displays.
One thing to remember: Sounddevice is your Python-to-hardware audio bridge — it captures mic input and sends speaker output as NumPy arrays, enabling everything from simple recordings to real-time audio processing.