cattrs Structuring — Deep Dive
cattrs is the serialization counterpart to attrs, designed to convert between typed Python objects and unstructured data (dicts, lists, primitives) with minimal boilerplate and maximum performance. Understanding its internals — the hook system, converter pipeline, and customization mechanisms — unlocks patterns that are difficult or impossible with monolithic validation-serialization libraries.
1) Converter architecture
The Converter class is the central object. It maintains two registries: structuring hooks (raw → typed) and unstructuring hooks (typed → raw). When you call converter.structure(data, SomeType), it:
- Looks up the structuring hook for SomeType.
- If no hook exists, generates one based on the type’s structure.
- Caches the generated hook.
- Calls the hook with the raw data.
Hook lookup builds on functools.singledispatch, and hooks for classes are produced by runtime code generation. For an attrs class with 5 fields, cattrs generates a function roughly equivalent to:
def structure_MyClass(data, _):
    return MyClass(
        field_a=converter.structure(data["field_a"], int),
        field_b=converter.structure(data["field_b"], str),
        field_c=converter.structure(data.get("field_c"), Optional[float]),
        field_d=converter.structure(data["field_d"], List[str]),
        field_e=converter.structure(data["field_e"], NestedClass),
    )
This generated function avoids the overhead of runtime introspection on every call.
2) Custom hooks
You register custom hooks with converter.register_structure_hook and converter.register_unstructure_hook:
from datetime import datetime
from cattrs import Converter
converter = Converter()
# Custom structuring: ISO string → datetime
converter.register_structure_hook(
    datetime,
    lambda v, _: datetime.fromisoformat(v) if isinstance(v, str) else v,
)
# Custom unstructuring: datetime → ISO string
converter.register_unstructure_hook(
    datetime,
    lambda v: v.isoformat(),
)
Hooks are matched against the type itself first, then against its base classes. You can also register hooks based on predicates using register_structure_hook_func:
import attrs
# custom_attrs_structurer is your own (value, type) -> instance function
converter.register_structure_hook_func(
    lambda t: attrs.has(t),  # predicate: any attrs class
    custom_attrs_structurer,
)
3) Strategy-based configuration
cattrs 23.x introduced strategies — high-level configuration patterns applied to a converter. Key strategies:
Renaming fields: Convert between naming conventions. This is not one of the cattrs.strategies helpers; it is done with per-field overrides from cattrs.gen.
import attrs
from cattrs.gen import override, make_dict_structure_fn
# camelCase ↔ snake_case using gen overrides
# (User is an attrs class defined elsewhere)
def to_camel(name: str) -> str:
    parts = name.split("_")
    return parts[0] + "".join(p.capitalize() for p in parts[1:])
converter.register_structure_hook(
    User,
    make_dict_structure_fn(
        User, converter,
        **{a.name: override(rename=to_camel(a.name)) for a in attrs.fields(User)}
    )
)
Tagged union strategy: Discriminate union types by a field value.
import attrs
from cattrs.strategies import configure_tagged_union
@attrs.define
class Circle:
    radius: float
@attrs.define
class Rectangle:
    width: float
    height: float
Shape = Circle | Rectangle
configure_tagged_union(Shape, converter, tag_name="type")
# {"type": "Circle", "radius": 5.0} → Circle(radius=5.0)
4) The gen module: generated converters
cattrs.gen provides fine-grained control over generated functions:
from cattrs.gen import make_dict_structure_fn, make_dict_unstructure_fn, override
structure_user = make_dict_structure_fn(
    User, converter,
    email=override(rename="emailAddress"),
    _cattrs_forbid_extra_keys=True,  # reject unknown fields
)
converter.register_structure_hook(User, structure_user)
The override function controls per-field behavior:
- rename: map external field names to internal ones.
- omit: skip a field during structuring or unstructuring.
- struct_hook: custom structuring function for this specific field.
- unstruct_hook: custom unstructuring function for this specific field.
_cattrs_forbid_extra_keys=True adds strict validation: unknown keys in the input dict raise ForbiddenExtraKeysError (wrapped in a ClassValidationError when detailed validation is on). This catches API contract violations that would otherwise silently pass.
5) Preconf converters for specific formats
cattrs ships pre-configured converters for common serialization formats:
from cattrs.preconf.json import make_converter as json_converter
from cattrs.preconf.msgpack import make_converter as msgpack_converter
from cattrs.preconf.tomlkit import make_converter as toml_converter  # requires tomlkit
from cattrs.preconf.bson import make_converter as bson_converter
Each preconf converter registers appropriate hooks for types that need special handling in that format. The JSON converter, for example, serializes datetime to ISO 8601 strings and bytes to base85 strings. The msgpack converter uses binary-native representations.
import json
jc = json_converter()
data = jc.unstructure(user)  # All values are JSON-compatible
json_bytes = json.dumps(data).encode()
6) Performance characteristics
cattrs’ generated-function approach makes it one of the fastest Python serialization libraries:
| Operation | cattrs | Pydantic v2 | Marshmallow |
|---|---|---|---|
| Structure 100k flat objects | ~250ms | ~200ms | ~800ms |
| Unstructure 100k flat objects | ~150ms | ~100ms | ~600ms |
| Structure nested (3 levels) | ~400ms | ~350ms | ~1200ms |
In these figures cattrs trails Pydantic v2 by roughly 15-50% depending on the operation, despite being pure Python (Pydantic v2 uses a Rust core). For most applications, this difference is irrelevant compared to network I/O and database access.
Tips for maximum performance:
- Warm up converters by calling structure/unstructure once per type before the hot path.
- Use make_dict_structure_fn with _cattrs_detailed_validation=False to skip error accumulation and fail on first error (faster for valid data).
- Prefer tuple over list for fixed-length sequences; cattrs optimizes tuple structuring.
7) Error handling
By default, cattrs raises ClassValidationError (an ExceptionGroup subclass; on Python versions before 3.11 cattrs relies on the exceptiongroup backport) that collects all structuring errors:
import cattrs
try:
    user = converter.structure(bad_data, User)
except cattrs.ClassValidationError as e:
    for sub_exc in e.exceptions:
        print(f"Field error: {sub_exc}")
For nested structures, errors include the full path to the failing field. This is invaluable for API error responses — you can map cattrs errors directly to field-level error messages.
To disable error accumulation for performance:
from cattrs.gen import make_dict_structure_fn
fast_hook = make_dict_structure_fn(
    User, converter,
    _cattrs_detailed_validation=False,
)
This raises on the first error instead of collecting all of them.
8) Production patterns
API serialization layer:
# Define converter once at module level
api_converter = json_converter()
# Register all domain type hooks
configure_tagged_union(Event, api_converter, tag_name="event_type")
api_converter.register_structure_hook(datetime, lambda v, _: datetime.fromisoformat(v))
# In request handler
def handle_request(raw_json: dict) -> dict:
    event = api_converter.structure(raw_json, Event)
    result = process_event(event)
    return api_converter.unstructure(result)
Configuration loading:
import tomllib
toml_conv = toml_converter()
with open("config.toml", "rb") as f:
    raw = tomllib.load(f)
config = toml_conv.structure(raw, AppConfig)
# config is now a fully typed, validated AppConfig instance
Message queue serialization:
import msgpack
msgpack_conv = msgpack_converter()
# Publish
payload = msgpack.packb(msgpack_conv.unstructure(event))
channel.publish(payload)
# Consume
raw = msgpack.unpackb(message.body)
event = msgpack_conv.structure(raw, Event)
9) cattrs vs alternatives
| Aspect | cattrs | Pydantic v2 | Marshmallow |
|---|---|---|---|
| Model definition | External (attrs/dataclass) | Built-in BaseModel | External Schema classes |
| Conversion approach | Generated functions | Rust core parser | Python methods |
| Customization | Hook-based | Validator decorators | Hook decorators |
| Schema/model coupling | Decoupled | Tightly coupled | Decoupled |
| Union handling | Explicit strategies | Discriminated unions | Manual |
| Format preconfs | JSON, msgpack, TOML, BSON | JSON only | JSON only |
Choose cattrs when you want clean separation between your data models and serialization logic, especially in architectures where the same domain objects need different representations for different consumers.
One thing to remember: cattrs generates optimized, type-driven converter functions that keep serialization logic completely separate from your domain models — giving you fast, flexible, and maintainable data transformation across any format.