Python Buffer Protocol — Core Concepts

The buffer protocol is a CPython-level interface that allows objects to expose their internal memory to other objects without copying. It is the foundation for efficient data interchange between Python’s built-in types, NumPy, and C extensions.

The Problem: Data Copying

Consider converting a NumPy array to bytes:

import numpy as np
arr = np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float64)

# Copy: bytes() duplicates all 32 bytes into a new object
raw_bytes = bytes(arr)  # New, independent copy in memory

# Zero-copy: memoryview exposes arr's existing memory
view = memoryview(arr)  # Points to arr's actual memory

For a 1 GB array, copying means allocating and filling a second gigabyte of memory, while creating a view takes roughly constant time regardless of the array's size.
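The same contrast can be reproduced with the standard library alone. The sketch below swaps NumPy for array.array (an assumption made only to keep the example dependency-free) and shows that a copy is frozen at the moment it is made, while a view tracks later changes to the original.

```python
from array import array

values = array("d", [1.0, 2.0, 3.0, 4.0])  # 4 doubles = 32 bytes

copied = bytes(values)      # independent 32-byte copy
view = memoryview(values)   # shares memory with `values`

values[0] = 99.0            # mutate the original afterwards

first_in_copy = array("d", copied)[0]  # decoded from the copy: old value
first_in_view = view[0]                # read through the view: new value

print(len(copied), view.nbytes)
print(first_in_copy, first_in_view)
```

The copy still holds the old first element, while the view reports the updated one.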

How It Works

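An object that supports the buffer protocol implements up to two C-level slots on its type (in the tp_as_buffer struct):

  1. bf_getbuffer: Fills a Py_buffer struct with information about the memory.
  2. bf_releasebuffer: Called when the consumer is done with the buffer. It is optional; exporters that hold no per-export resources can leave it unset.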
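Since Python 3.12 (PEP 688), the same two hooks can also be written in pure Python as __buffer__ and __release_buffer__. The sketch below is a minimal illustration rather than a production exporter: the class name ByteBox and its exports counter are hypothetical, and the class simply delegates to an internal bytearray.

```python
import sys


class ByteBox:
    """Hypothetical exporter that delegates to an internal bytearray."""

    def __init__(self, payload):
        self._data = bytearray(payload)
        self.exports = 0  # illustrative bookkeeping only

    def __buffer__(self, flags):
        # Python-level counterpart of bf_getbuffer (Python 3.12+).
        self.exports += 1
        return memoryview(self._data)

    def __release_buffer__(self, view):
        # Python-level counterpart of bf_releasebuffer.
        self.exports -= 1


box = ByteBox(b"abc")
if sys.version_info >= (3, 12):
    # Consumers such as memoryview() accept ByteBox directly.
    exported = bytes(memoryview(box))
else:
    # Older interpreters ignore __buffer__, so read the raw data instead.
    exported = bytes(box._data)
print(exported)
```

On interpreters older than 3.12 the two methods are ordinary attributes with no special meaning, which is why the example guards on the version.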

The Py_buffer struct describes:

Field      Meaning
buf        Pointer to the actual memory
len        Total size in bytes
itemsize   Size of each element
format     Element type (e.g., "d" for double, "i" for int)
ndim       Number of dimensions
shape      Size along each dimension
strides    Bytes to skip to reach the next element in each dimension
readonly   Whether the memory can be modified
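Most of these fields can be inspected from Python, because memoryview mirrors them as attributes. The sketch below uses array.array as an example exporter (an arbitrary choice); the raw buf pointer has no direct Python attribute, and len is exposed as nbytes.

```python
from array import array

doubles = array("d", [1.0, 2.0, 3.0, 4.0])
view = memoryview(doubles)

# Map Py_buffer field names to the matching memoryview attributes.
info = {
    "len (nbytes)": view.nbytes,
    "itemsize": view.itemsize,
    "format": view.format,
    "ndim": view.ndim,
    "shape": view.shape,
    "strides": view.strides,
    "readonly": view.readonly,
}
for field, value in info.items():
    print(f"{field:>13}: {value}")
```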

memoryview: The Python Interface

memoryview is the standard way to use the buffer protocol from Python code:

data = bytearray(b"Hello, World!")
view = memoryview(data)

# Access individual bytes
print(view[0])  # 72 (ASCII 'H')

# Slice without copying
sub = view[7:12]  # Points to "World" in original memory
sub[0] = ord('E')
print(data)  # bytearray(b'Hello, Eorld!')

Changes through the memoryview affect the original object. No data is duplicated.
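Two consequences of sharing memory are worth seeing in code. The sketch below shows that a view over an immutable object rejects writes, and that a bytearray cannot be resized while a view is still exported from it; the variable names are illustrative only.

```python
# A view over bytes is read-only, so writing through it fails.
frozen = memoryview(b"immutable")
try:
    frozen[0] = 0
    write_error = None
except TypeError as exc:
    write_error = exc

# While a view is alive, the exporter refuses to resize its memory.
data = bytearray(b"abc")
with memoryview(data) as view:
    try:
        data.append(0x64)
        resize_error = None
    except BufferError as exc:
        resize_error = exc

# Leaving the with-block released the view, so resizing works again.
data.append(0x64)

print(frozen.readonly, type(write_error).__name__)
print(type(resize_error).__name__, data)
```

Using memoryview as a context manager (or calling release() explicitly) makes the moment of release predictable instead of leaving it to garbage collection.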

Typed Views

You can cast memoryviews to interpret memory as different types:

raw = bytearray(16)
int_view = memoryview(raw).cast('i')  # View as native C ints (4 bytes on most platforms)
int_view[0] = 42
int_view[1] = 100

# The start of raw now holds 42 and 100 in the machine's native byte order
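One way to check what a cast actually wrote is to compare the bytes against struct, which uses the same native format codes. This is a small sketch; the use of struct and the 4x4 reshape are additions for illustration.

```python
import struct

raw = bytearray(16)
int_view = memoryview(raw).cast("i")  # native C int
int_view[0] = 42
int_view[1] = 100

int_size = struct.calcsize("i")               # size of a native int here
expected_prefix = struct.pack("ii", 42, 100)  # same values, native layout
matches = bytes(raw[: 2 * int_size]) == expected_prefix

# cast() can also reshape: view the same 16 bytes as a 4x4 grid of bytes.
grid = memoryview(raw).cast("B", shape=[4, 4])

print(int_size, matches, grid.shape)
```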

Which Objects Support It?

Object          Read   Write
bytes           ✅     ❌ (immutable)
bytearray       ✅     ✅
array.array     ✅     ✅
numpy.ndarray   ✅     ✅
memoryview      ✅     Depends on source
str             ❌     ❌
list            ❌     ❌
Note: str and list do not support the buffer protocol. Lists contain pointers to objects, not contiguous data. Strings use a variable-width internal representation (1, 2, or 4 bytes per character, depending on content), so there is no single stable byte layout to expose.
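Support can be probed at runtime by asking memoryview for a view and seeing whether it succeeds. The helper below, buffer_support, is a hypothetical name introduced for this sketch; it covers only the standard-library types.

```python
from array import array


def buffer_support(obj):
    """Return (readable, writable) for obj, as seen through memoryview."""
    try:
        with memoryview(obj) as view:
            return True, not view.readonly
    except TypeError:
        # Raised when obj does not implement the buffer protocol.
        return False, False


candidates = [b"abc", bytearray(b"abc"), array("i", [1, 2, 3]), "abc", [1, 2, 3]]
report = {type(obj).__name__: buffer_support(obj) for obj in candidates}

for name, (readable, writable) in report.items():
    print(f"{name:>9}: read={readable} write={writable}")
```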

Strides and Multi-Dimensional Arrays

Strides explain how to navigate multi-dimensional data in flat memory:

import numpy as np
arr = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.int32)

view = memoryview(arr)
print(view.shape)    # (2, 3)
print(view.strides)  # (12, 4) — 12 bytes per row, 4 bytes per element

Strides enable views like transpositions without moving data:

transposed = arr.T
t_view = memoryview(transposed)
print(t_view.strides)  # (4, 12) — strides swapped, same memory
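The stride arithmetic can be checked by hand: the byte offset of element [row][col] is row * strides[0] + col * strides[1]. The sketch below does this without NumPy, building a 2x3 view over an array.array via memoryview.cast; the helper name element_at is hypothetical.

```python
import struct
from array import array

flat = array("i", [1, 2, 3, 4, 5, 6])
# Reshape through a byte view: cast() needs a byte format on one side.
grid = memoryview(flat).cast("B").cast("i", shape=[2, 3])


def element_at(view, buffer, row, col):
    """Read one element by computing its byte offset from the strides."""
    offset = row * view.strides[0] + col * view.strides[1]
    return struct.unpack_from(view.format, buffer, offset)[0]


by_strides = element_at(grid, flat, 1, 2)
by_index = grid[1, 2]

print(grid.shape, grid.strides)
print(by_strides, by_index)
```

Both lookups return the same element, which is all that indexing a strided view amounts to.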

Real-World Usage Patterns

Sending Data Over a Network

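import socket

# Placeholder endpoint for illustration; substitute a real host and port
sock = socket.create_connection(("localhost", 9000))

data = bytearray(1024 * 1024)  # 1 MiB buffer
view = memoryview(data)
sent = 0
while sent < len(data):
    # send() may accept only part of the data; slicing the view copies nothing
    sent += sock.send(view[sent:])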
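The receiving side can avoid copies in the same way with recv_into, which writes directly into a preallocated buffer. The sketch below is self-contained by using socket.socketpair() in place of a real network peer; recv_exactly is a hypothetical helper name, and the payload is kept small so a single sendall fits in the socket buffer.

```python
import socket


def recv_exactly(sock, nbytes):
    """Fill a preallocated buffer from sock without intermediate copies."""
    buf = bytearray(nbytes)
    view = memoryview(buf)
    received = 0
    while received < nbytes:
        # recv_into writes straight into the unfilled tail of the buffer.
        n = sock.recv_into(view[received:])
        if n == 0:
            raise ConnectionError("socket closed before all data arrived")
        received += n
    return buf


left, right = socket.socketpair()
with left, right:
    payload = bytes(range(256)) * 16  # 4 KiB of test data
    left.sendall(payload)
    result = recv_exactly(right, len(payload))

print(len(result), bytes(result) == payload)
```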

Sharing Between Libraries

import numpy as np
from PIL import Image

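# NumPy array → PIL Image through the buffer protocol.
# Note: Pillow shares memory only for some modes (such as "L" and "RGBA");
# for "RGB" it may copy the pixels into its own internal layout.
arr = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)  # upper bound is exclusive
img = Image.frombuffer("RGB", (640, 480), arr, "raw", "RGB", 0, 1)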

Common Misconception

“Using memoryview is always faster than slicing bytes.” Not so. For small data (on the order of a kilobyte or less; the exact crossover depends on the machine and Python version), the overhead of creating a memoryview object can outweigh the copy savings. The buffer protocol pays off with large data, megabytes and above, where avoiding copies makes a dramatic difference.
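The crossover point is easy to measure rather than guess. The sketch below times slicing half of a bytes object against creating a memoryview and slicing that; the sizes, repeat count, and the helper name time_slices are arbitrary choices, and the numbers will differ between machines.

```python
import timeit


def time_slices(size, repeats=500):
    """Return (copy_seconds, view_seconds) for slicing half of `size` bytes."""
    data = bytes(size)
    half = size // 2
    # Slicing bytes copies `half` bytes each time.
    copy_time = timeit.timeit(lambda: data[:half], number=repeats)
    # Creating a view and slicing it copies nothing, but has fixed overhead.
    view_time = timeit.timeit(lambda: memoryview(data)[:half], number=repeats)
    return copy_time, view_time


results = {size: time_slices(size) for size in (64, 1024, 1024 * 1024)}

for size, (copy_time, view_time) in results.items():
    print(f"{size:>8} bytes: copy={copy_time:.6f}s view={view_time:.6f}s")
```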

One Thing to Remember

The buffer protocol is Python’s mechanism for zero-copy data sharing. It lets objects expose their raw memory through a standard interface, enabling NumPy, Pandas, PIL, and dozens of other libraries to pass large datasets around without duplicating a single byte.

