attrs Library — Deep Dive

attrs has been a foundational library in the Python ecosystem since 2015, used by projects like Twisted, pytest, Hypothesis, and thousands of production systems. Its design philosophy — “write classes without boilerplate, with correctness guarantees” — makes it the tool of choice when dataclasses are not enough. This deep dive covers its internals, advanced features, and real-world usage patterns.

1) How attrs generates classes

When you apply @attrs.define, a metaclass-free transformation happens:

  1. The decorator scans the class body for annotated attributes.
  2. For each attribute, it creates an attrs.Attribute object (read-only metadata, not a descriptor) storing the name, type, default, validator, converter, and metadata.
  3. It generates __init__, __repr__, __eq__, and optionally __hash__, __lt__/__gt__ (if order=True) as plain Python functions.
  4. If slots=True (the default in @define), it creates a new class with __slots__ set to the attribute names and copies the generated methods over.

The slot class creation is why @define classes are not the “same” class object as the one you wrote — attrs creates a new class dynamically. This matters when using super() in complex inheritance hierarchies, though attrs handles the common cases correctly.

2) The attribute pipeline: converters → validators

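Each attribute goes through a defined pipeline during __init__:

raw value → converter (if any) → stored on instance → validator (if any, run once all fields are set) → __attrs_post_init__

This ordering is intentional. Because validators run only after every field has been assigned, a validator may also read other attributes of the instance. Converters run first and normalize input types so validators see a consistent type: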

import attrs

@attrs.define
class Temperature:
    celsius: float = attrs.field(
        converter=float,
        validator=[
            attrs.validators.instance_of(float),
            attrs.validators.ge(-273.15),
        ],
    )

Here, passing Temperature("36.6") works: the string is converted to a float, then validated. Without the converter, instance_of(float) would reject the string.

3) Advanced validators

Beyond built-in validators, attrs supports several advanced patterns:

Conditional validation:

@attrs.define
class Shipment:
    status: str = attrs.field(validator=attrs.validators.in_(["pending", "shipped", "delivered"]))
    tracking_number: str | None = attrs.field(default=None)

    def __attrs_post_init__(self):
        if self.status == "shipped" and not self.tracking_number:
            raise ValueError("Shipped items must have a tracking number")

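Cross-field validation with __attrs_post_init__: This method runs after all fields are set, making it a natural place for checks that depend on multiple fields. It runs only at construction time, so a later assignment such as shipment.status = "shipped" is not re-checked.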

Composing validators with and_:

name_validator = attrs.validators.and_(
    attrs.validators.instance_of(str),
    attrs.validators.min_len(1),
    attrs.validators.max_len(200),
    attrs.validators.matches_re(r"^[a-zA-Z\s\-']+$"),
)

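On-setattr validation: Classes created with @attrs.define already run converters and validators on every attribute assignment (unless frozen=True); only the legacy @attr.s decorator validates in __init__ alone by default. The on_setattr parameter makes this behavior explicit or changes it. Passing attrs.setters.validate, as below, keeps validation on assignment but drops the default conversion step: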

@attrs.define(on_setattr=attrs.setters.validate)
class StrictConfig:
    port: int = attrs.field(validator=attrs.validators.and_(
        attrs.validators.instance_of(int),
        attrs.validators.ge(1),
        attrs.validators.le(65535),
    ))

Given config = StrictConfig(port=8080), the assignment config.port = -1 raises ValueError immediately and leaves the old value in place.

4) Slots, memory, and performance

@attrs.define uses __slots__ by default. The impact:

  • Memory: In the benchmark below, a slotted class with 5 attributes uses roughly half the memory per instance of a dict-backed class. For a million instances, that gap is on the order of 200 MB. Exact figures depend on the Python version and on the attribute values.
  • Attribute access: Slot-based access is slightly faster than dict-based because CPython uses a fixed offset rather than a hash lookup.
  • Limitation: No dynamic attributes. You cannot add obj.new_field = value at runtime. This is usually a feature (prevents typos) but can surprise developers used to ad-hoc monkey-patching.

Benchmark comparison for creating 1 million instances (Python 3.12):

Class type                         Time     Memory per instance
Regular class (manual __init__)    ~1.8s    ~400 bytes
dataclass                          ~1.6s    ~400 bytes
dataclass(slots=True)              ~1.3s    ~200 bytes
attrs.define (slots default)       ~1.2s    ~200 bytes
attrs.define(frozen=True)          ~1.4s    ~200 bytes

attrs and slotted dataclasses are essentially identical in performance. The difference is in the feature set above the baseline.

5) Frozen classes and evolve

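Frozen attrs classes reject ordinary mutation: __setattr__ and __delattr__ raise attrs.exceptions.FrozenInstanceError, a subclass of AttributeError. Combined with slots, there is no instance __dict__ to write to. This is not absolute immutability, since object.__setattr__(obj, name, value) still bypasses the check, but it reliably stops accidental mutation.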

To create modified copies, use attrs.evolve():

@attrs.define(frozen=True)
class Point:
    x: float
    y: float

p1 = Point(1.0, 2.0)
p2 = attrs.evolve(p1, x=3.0)  # Point(x=3.0, y=2.0)

evolve() creates a new instance by passing the current field values, with the specified overrides applied, to the class's __init__. Converters and validators therefore run again on every field, not only on the changed ones.

6) Integration with cattrs

attrs defines the structure; cattrs handles conversion to and from external formats (dicts, JSON, YAML, msgpack). Together they form a powerful serialization stack:

import attrs
import cattrs

@attrs.define
class User:
    name: str
    email: str
    age: int

data = {"name": "Alice", "email": "alice@example.com", "age": 30}
user = cattrs.structure(data, User)  # dict → User instance
output = cattrs.unstructure(user)    # User instance → dict

cattrs generates specialized structuring and unstructuring functions for each attrs class the first time they are needed, then caches them. Repeated calls therefore avoid re-inspecting the class each time, which makes them considerably faster than a generic, reflection-based approach.

7) Factory defaults and dependency patterns

attrs supports factory defaults that run per-instance:

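from datetime import datetime, timezone
from uuid import uuid4

@attrs.define
class Session:
    id: str = attrs.field(factory=lambda: uuid4().hex)
    # datetime.utcnow is deprecated since Python 3.12; use an aware UTC timestamp.
    created_at: datetime = attrs.field(factory=lambda: datetime.now(timezone.utc))
    metadata: dict = attrs.field(factory=dict)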

Each instance gets its own UUID, timestamp, and dict — avoiding the shared-mutable-default trap.

For fields that depend on other fields, use __attrs_post_init__:

@attrs.define
class Rectangle:
    width: float
    height: float
    area: float = attrs.field(init=False)

    def __attrs_post_init__(self):
        self.area = self.width * self.height

8) Testing attrs classes

attrs integrates well with property-based testing. Hypothesis can auto-generate instances using builds():

from hypothesis import given, strategies as st

@given(st.builds(Temperature, celsius=st.floats(min_value=-273.15, max_value=1e6)))
def test_temperature_respects_absolute_zero(temp):
    assert temp.celsius >= -273.15

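For unit tests, the generated __eq__ already compares instances field by field rather than by identity, so plain == assertions work. attrs.asdict() and attrs.astuple() are useful when you want to compare against plain data or get more readable failure diffs.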

9) Migration path from dataclasses

If a project starts with dataclasses and outgrows them, migration to attrs is straightforward:

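  1. Replace @dataclass with @attrs.define.
  2. Replace field(default_factory=...) with attrs.field(factory=...), and rename __post_init__ to __attrs_post_init__.
  3. Add validators where needed, since that was the reason for migrating.
  4. Review code that assigns to attributes after construction: @attrs.define runs validators and converters on assignment by default, and its default slots=True forbids attributes that are not declared on the class.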

The two can coexist in the same codebase. attrs does not conflict with dataclasses at the import or runtime level.

One thing to remember: attrs is the industrial-strength version of dataclasses — same ergonomics, but with validators, converters, and slot-based performance that make it safe for production data modeling where correctness matters.
