attrs Library — Deep Dive
attrs has been a foundational library in the Python ecosystem since 2015, used by projects like Twisted, pytest, Hypothesis, and thousands of production systems. Its design philosophy — “write classes without boilerplate, with correctness guarantees” — makes it the tool of choice when dataclasses are not enough. This deep dive covers its internals, advanced features, and real-world usage patterns.
1) How attrs generates classes
When you apply @attrs.define, a metaclass-free transformation happens:
- The decorator scans the class body for annotated attributes.
- For each attribute, it creates an Attribute object storing the name, type, default, validator, converter, and metadata.
- It generates __init__, __repr__, __eq__, and optionally __hash__ and __lt__/__gt__ (if order=True) as plain Python functions.
- If slots=True (the default in @define), it creates a new class with __slots__ set to the attribute names and copies the generated methods over.
The slot class creation is why @define classes are not the “same” class object as the one you wrote — attrs creates a new class dynamically. This matters when using super() in complex inheritance hierarchies, though attrs handles the common cases correctly.
2) The attribute pipeline: converters → validators
Each attribute goes through a defined pipeline during __init__:
raw value → converter (if any) → validator (if any) → stored on instance
This ordering is intentional. Converters normalize input types so validators see a consistent type:
import attrs
@attrs.define
class Temperature:
    celsius: float = attrs.field(
        converter=float,
        validator=[
            attrs.validators.instance_of(float),
            attrs.validators.ge(-273.15),
        ],
    )
Here, passing Temperature("36.6") works: the string is converted to a float, then validated. Without the converter, instance_of(float) would reject the string.
3) Advanced validators
Beyond built-in validators, attrs supports several advanced patterns:
Conditional validation:
@attrs.define
class Shipment:
    status: str = attrs.field(validator=attrs.validators.in_(["pending", "shipped", "delivered"]))
    tracking_number: str | None = attrs.field(default=None)
    def __attrs_post_init__(self):
        if self.status == "shipped" and not self.tracking_number:
            raise ValueError("Shipped items must have a tracking number")
Cross-field validation with __attrs_post_init__:
This method runs after all fields are set, making it the right place for validators that depend on multiple fields.
Composing validators with and_:
name_validator = attrs.validators.and_(
    attrs.validators.instance_of(str),
    attrs.validators.min_len(1),
    attrs.validators.max_len(200),
    attrs.validators.matches_re(r"^[a-zA-Z\s\-']+$"),
)
On-setattr validation:
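Classes created with @attrs.define already run converters and validators on attribute assignment: for non-frozen classes, on_setattr defaults to [attrs.setters.convert, attrs.setters.validate]. Only the older attr.s API validates in __init__ alone. You can still set on_setattr explicitly, for example to run validators but not converters on assignment: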
@attrs.define(on_setattr=attrs.setters.validate)
class StrictConfig:
    port: int = attrs.field(validator=attrs.validators.and_(
        attrs.validators.instance_of(int),
        attrs.validators.ge(1),
        attrs.validators.le(65535),
    ))
Now config.port = -1 raises immediately.
4) Slots, memory, and performance
@attrs.define uses __slots__ by default. The impact:
- Memory: A slotted class with 5 attributes uses roughly 50% less memory per instance than a dict-backed class. For a million instances, this can mean hundreds of megabytes saved.
- Attribute access: Slot-based access is slightly faster than dict-based because CPython uses a fixed offset rather than a hash lookup.
- Limitation: No dynamic attributes. You cannot assign obj.new_field = value at runtime. This is usually a feature (prevents typos) but can surprise developers used to ad-hoc monkey-patching.
Benchmark comparison for creating 1 million instances (Python 3.12):
| Class type | Time | Memory per instance |
|---|---|---|
| Regular class (manual __init__) | ~1.8s | ~400 bytes |
| dataclass | ~1.6s | ~400 bytes |
| dataclass(slots=True) | ~1.3s | ~200 bytes |
| attrs.define (slots default) | ~1.2s | ~200 bytes |
| attrs.define(frozen=True) | ~1.4s | ~200 bytes |
attrs and slotted dataclasses are essentially identical in performance. The difference is in the feature set above the baseline.
5) Frozen classes and evolve
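Frozen attrs classes reject mutation through the normal Python interface: __setattr__ and __delattr__ raise attrs.exceptions.FrozenInstanceError. Combined with slots, there is no instance __dict__ to write to either. This guards against accidental mutation but is not a hard guarantee: object.__setattr__(instance, name, value) still bypasses it.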
To create modified copies, use attrs.evolve():
@attrs.define(frozen=True)
class Point:
    x: float
    y: float
p1 = Point(1.0, 2.0)
p2 = attrs.evolve(p1, x=3.0)  # Point(x=3.0, y=2.0)
evolve() creates a new instance by passing the current values, with the specified ones overridden, to the class's __init__. Converters and validators therefore run again, on the unchanged values as well as the new ones.
6) Integration with cattrs
attrs defines the structure; cattrs handles conversion to and from external formats (dicts, JSON, YAML, msgpack). Together they form a powerful serialization stack:
import attrs
import cattrs
@attrs.define
class User:
    name: str
    email: str
    age: int
data = {"name": "Alice", "email": "alice@example.com", "age": 30}
user = cattrs.structure(data, User)  # dict → User instance
output = cattrs.unstructure(user)  # User instance → dict
cattrs generates optimized structuring/unstructuring functions at first call, then caches them. This makes it significantly faster than generic serialization approaches on repeated calls.
7) Factory defaults and dependency patterns
attrs supports factory defaults that run per-instance:
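from datetime import datetime, timezone
from uuid import uuid4
import attrs
@attrs.define
class Session:
    id: str = attrs.field(factory=lambda: uuid4().hex)
    # datetime.utcnow is deprecated since Python 3.12; use an aware UTC timestamp
    created_at: datetime = attrs.field(factory=lambda: datetime.now(timezone.utc))
    metadata: dict = attrs.field(factory=dict)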
Each instance gets its own UUID, timestamp, and dict — avoiding the shared-mutable-default trap.
For fields that depend on other fields, use __attrs_post_init__:
@attrs.define
class Rectangle:
    width: float
    height: float
    area: float = attrs.field(init=False)
    def __attrs_post_init__(self):
        self.area = self.width * self.height
8) Testing attrs classes
attrs integrates well with property-based testing. Hypothesis can auto-generate instances using builds():
from hypothesis import given, strategies as st
@given(st.builds(Temperature, celsius=st.floats(min_value=-273.15, max_value=1e6)))
def test_temperature_round_trip(temp):
    assert temp.celsius >= -273.15
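For unit tests, the generated __eq__ already compares instances by value, so two instances with equal fields compare equal. attrs.asdict() and attrs.astuple() are useful when a test should assert against plain dicts or tuples instead.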
9) Migration path from dataclasses
If a project starts with dataclasses and outgrows them, migration to attrs is straightforward:
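- Replace @dataclass with @attrs.define.
- Replace field(default_factory=...) with attrs.field(factory=...).
- Rename __post_init__ to __attrs_post_init__.
- Add validators where needed; this is usually the reason for migrating.
- Validators already run on attribute assignment under @attrs.define. Pass on_setattr explicitly only to change that behavior, for example on_setattr=attrs.setters.validate to skip converters on assignment.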
The two can coexist in the same codebase. attrs does not conflict with dataclasses at the import or runtime level.
One thing to remember: attrs is the industrial-strength version of dataclasses — same ergonomics, but with validators, converters, and slot-based performance that make it safe for production data modeling where correctness matters.