Code Generation Patterns — Core Concepts

Why Generate Code?

Code generation solves a specific problem: you need many similar pieces of code that differ in predictable ways. Writing them by hand is tedious and error-prone, and it creates a maintenance burden. Generating them from a single source of truth keeps them consistent and makes changes cheap: update the generator, then regenerate everything.

Common scenarios include API client libraries (generated from OpenAPI schemas), ORM models (generated from database schemas), serialization code, protocol buffers, and configuration-driven business logic.

Pattern 1: Template-Based Generation

The simplest approach — render code as text using a template engine:

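from jinja2 import Template

# trim_blocks drops the newline after each {% ... %} tag, so the loop
# does not leave blank lines between the generated assignments.
model_template = Template("""\
class {{ name }}:
    def __init__(self, {{ fields | join(', ') }}):
{% for field in fields %}
        self.{{ field }} = {{ field }}
{% endfor %}

    def __repr__(self):
        return f"{{ name }}({% for field in fields %}{{ field }}={self.{{ field }}!r}{{ ', ' if not loop.last }}{% endfor %})"
""", trim_blocks=True)

# The rendered __repr__ interpolates each attribute, e.g. name={self.name!r}.
output = model_template.render(name="User", fields=["name", "email", "age"])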

Strengths: Easy to read, familiar to anyone who knows template engines, produces human-readable output.

Weaknesses: No syntax validation until you run the output. Indentation errors in templates create broken code. No type checking of the generated code.
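One way to narrow the first weakness is to parse the rendered text before it is written anywhere. The sketch below uses only the standard library and works on any string, whichever template engine produced it. The helper name check_generated and the two sample strings are made up for illustration.

```python
import ast

def check_generated(source, label="<generated>"):
    """Return None if source parses as Python, else a short error message."""
    try:
        ast.parse(source, filename=label)
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return f"{label}:{exc.lineno}: {exc.msg}"
    return None

# A well-formed rendering and one where the template lost a level of indentation.
good = "class User:\n    def __init__(self, name):\n        self.name = name\n"
bad = "class User:\n    def __init__(self, name):\n    self.name = name\n"

good_error = check_generated(good)  # None: the text parses
bad_error = check_generated(bad)    # a message pointing at the broken line
```

Parsing catches syntax and indentation mistakes only. It says nothing about whether the generated code behaves correctly, so tests on the output are still needed.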

Pattern 2: AST Construction

Build code as an Abstract Syntax Tree, then compile or unparse it:

import ast

def make_dataclass(name, fields):
    # One positional parameter per field, after self.
    init_args = [ast.arg(arg=f) for f in fields]
    # self.<field> = <field> for every field.
    init_body = [
        ast.Assign(
            targets=[ast.Attribute(
                value=ast.Name(id='self', ctx=ast.Load()),
                attr=f, ctx=ast.Store()
            )],
            value=ast.Name(id=f, ctx=ast.Load())
        ) for f in fields
    ]
    init_func = ast.FunctionDef(
        name='__init__',
        args=ast.arguments(
            posonlyargs=[], args=[ast.arg(arg='self')] + init_args,
            vararg=None, kwonlyargs=[], kw_defaults=[],
            kwarg=None, defaults=[]
        ),
        body=init_body, decorator_list=[], returns=None
    )
    cls = ast.ClassDef(
        name=name, bases=[], keywords=[],
        body=[init_func], decorator_list=[]
    )
    module = ast.Module(body=[cls], type_ignores=[])
    # Fill in the line and column numbers that compile() requires.
    ast.fix_missing_locations(module)
    return module

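Strengths: There is no text to mis-indent or mis-quote, so that whole class of errors cannot occur. A structurally invalid tree (a missing required field, an empty function body) is rejected by compile() instead of producing broken output. The tree can be analyzed and transformed further before compilation.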

Weaknesses: Verbose and harder to read. Not ideal for generating large amounts of code.
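The sketch below shows the two ways a finished tree is typically consumed: unparsed back to source text, or compiled and executed to get a live class. To keep it short, it starts from a parsed skeleton and edits the nodes instead of constructing every node by hand, which is a variation on the approach above. The function name build_class_module is an assumption for this example, and the sketch assumes at least one field.

```python
import ast

def build_class_module(name, fields):
    """Build a module AST for a class whose __init__ stores each field."""
    # Parsing a skeleton lets the parser fill in every node field it requires.
    module = ast.parse("class Placeholder:\n    def __init__(self):\n        pass\n")
    cls = module.body[0]
    cls.name = name
    init = cls.body[0]
    init.args.args += [ast.arg(arg=f) for f in fields]
    # Replace the placeholder body with self.<field> = <field> assignments.
    init.body = [
        ast.Assign(
            targets=[ast.Attribute(value=ast.Name(id="self", ctx=ast.Load()),
                                   attr=f, ctx=ast.Store())],
            value=ast.Name(id=f, ctx=ast.Load()),
        )
        for f in fields
    ]
    return ast.fix_missing_locations(module)

module = build_class_module("User", ["name", "email"])

# Option 1: turn the tree back into source text (ast.unparse needs Python 3.9+).
source = ast.unparse(module)

# Option 2: compile the tree and execute it to obtain the class object.
namespace = {}
exec(compile(module, filename="<generated>", mode="exec"), namespace)
User = namespace["User"]
user = User("Ada", "ada@example.com")
```

Unparsing suits generators that write files for people to read. Compiling directly suits import-time generation where no source file is wanted.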

Pattern 3: Runtime Class Creation with type()

Python’s type() function creates classes dynamically without generating source code at all:

def make_init(fields):
    def __init__(self, **kwargs):
        for field in fields:
            setattr(self, field, kwargs.get(field))
    return __init__

User = type("User", (), {
    "__init__": make_init(["name", "email", "age"]),
    "__repr__": lambda self: f"User(name={self.name!r})"
})

Strengths: No text manipulation, no parsing. The result is a real class immediately.

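Weaknesses: Generated methods are harder to trace while debugging. A lambda appears in stack traces as <lambda>, and a method built by a factory reports a __qualname__ such as make_init.<locals>.__init__ instead of User.__init__. Static tools such as type checkers and IDE autocompletion cannot see the generated attributes.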
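Part of the naming problem can be reduced by patching __qualname__ on each generated function before the class is created. The sketch below is one way to do that. make_class is a hypothetical helper, and the check for unknown keyword arguments is an addition that the example above does not have.

```python
def make_class(name, fields):
    """Create a class with type() and give its methods readable qualified names."""
    def __init__(self, **kwargs):
        unknown = set(kwargs) - set(fields)
        if unknown:
            raise TypeError(f"unexpected fields: {sorted(unknown)}")
        for field in fields:
            setattr(self, field, kwargs.get(field))

    def __repr__(self):
        pairs = ", ".join(f"{f}={getattr(self, f)!r}" for f in fields)
        return f"{name}({pairs})"

    namespace = {"__init__": __init__, "__repr__": __repr__}
    # Without this, __qualname__ would be "make_class.<locals>.__init__" and so on.
    for method_name, method in namespace.items():
        method.__qualname__ = f"{name}.{method_name}"
    return type(name, (), namespace)

Point = make_class("Point", ["x", "y"])
p = Point(x=1, y=2)
text = repr(p)  # "Point(x=1, y=2)"
```

This improves what introspection reports, but the source lines still live inside the factory, so a debugger will step into make_class and not into a Point definition.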

Pattern 4: Decorator and Metaclass Generation

Decorators and metaclasses can inspect a class definition and add generated methods:

def auto_repr(cls):
    fields = list(cls.__annotations__.keys())
    def __repr__(self):
        pairs = [f"{f}={getattr(self, f)!r}" for f in fields]
        return f"{cls.__name__}({', '.join(pairs)})"
    cls.__repr__ = __repr__
    return cls

@auto_repr
class User:
    name: str
    email: str

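This is the pattern behind dataclasses, attrs, and Pydantic. dataclasses and attrs apply it through a class decorator; Pydantic applies it through the metaclass of its BaseModel. Each inspects the declared fields at class creation time and generates or configures methods such as __init__, __repr__, and __eq__.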
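The standard library's dataclasses module makes the effect easy to observe. The short example below declares only annotations and then checks which methods the decorator placed on the class. The User class and its sample values are illustrative.

```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str

user = User("Ada", "ada@example.com")

# None of these methods were written by hand; the decorator added them to the class.
generated = [m for m in ("__init__", "__repr__", "__eq__") if m in User.__dict__]

text = repr(user)                                # "User(name='Ada', email='ada@example.com')"
same = user == User("Ada", "ada@example.com")    # True: the generated __eq__ compares fields
```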

Pattern 5: Script-Based Code Generation

Some projects run a generation script as a build step, producing .py files that are committed to the repository:

python generate_models.py --schema api.yaml --output src/models/

The generated files are regular Python modules. Developers can read and debug them like any other code. The generator script is the source of truth; the generated files are artifacts.
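A generator script of this kind can be quite small. The following is a minimal sketch, not the generate_models.py referred to above: the schema is a hard-coded dictionary instead of a parsed api.yaml, the names render_model and generate are assumptions, and the output goes to a temporary directory.

```python
from pathlib import Path
import tempfile

HEADER = "# Generated file. Do not edit by hand; re-run the generator instead.\n"

def render_model(name, fields):
    """Return Python source for one model class."""
    params = ", ".join(fields)
    lines = [f"class {name}:", f"    def __init__(self, {params}):"]
    lines += [f"        self.{field} = {field}" for field in fields]
    return "\n".join(lines) + "\n"

def generate(schema, output_dir):
    """Write one module per model and return the paths that were written."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, fields in schema.items():
        source = HEADER + "\n" + render_model(name, fields)
        # Fail the build early if the generated text is not valid Python.
        compile(source, f"{name.lower()}.py", "exec")
        path = output_dir / f"{name.lower()}.py"
        path.write_text(source)
        written.append(path)
    return written

# Hypothetical schema; a real script would load it from a file such as api.yaml.
schema = {"User": ["name", "email"], "Order": ["id", "total"]}
paths = generate(schema, tempfile.mkdtemp())
```

The header comment is a common convention for telling readers that edits belong in the generator, not in the output file.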

Strengths: Full IDE support (autocompletion, type checking). No runtime generation overhead.

Weaknesses: Generated files can drift from their source if the generator is not re-run. Requires build process discipline.
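Drift can be caught mechanically by regenerating in memory and comparing the result with what is on disk, for instance as a CI step that fails when anything differs. The sketch below assumes the generator can return its output as a mapping of file name to source text. find_stale and the sample files are hypothetical.

```python
from pathlib import Path
import tempfile

def find_stale(expected, output_dir):
    """Return the file names whose on-disk content differs from fresh output."""
    output_dir = Path(output_dir)
    stale = []
    for filename, source in expected.items():
        path = output_dir / filename
        if not path.exists() or path.read_text() != source:
            stale.append(filename)
    return sorted(stale)

# Simulate a repository where one generated file was not refreshed.
repo_dir = Path(tempfile.mkdtemp())
expected = {
    "user.py": "class User:\n    pass\n",
    "order.py": "class Order:\n    pass\n",
}
(repo_dir / "user.py").write_text(expected["user.py"])
(repo_dir / "order.py").write_text("class Order:\n    legacy = True\n")

stale = find_stale(expected, repo_dir)  # ["order.py"]
```

A build script could exit with a non-zero status whenever the returned list is not empty.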

Choosing the Right Pattern

Pattern                  Best for                                Avoid when
Template-based           Large files with complex structure      Output must be syntactically guaranteed
AST construction         Small, precise code modifications       Large amounts of code (too verbose)
type() / runtime         Simple classes and functions            Debugging matters (poor stack traces)
Decorators/metaclasses   Adding methods to user-defined classes  Complete class generation from scratch
Script-based generation  API clients, protocol code, ORMs        Schema changes frequently without rebuilds

A Common Misconception

Code generation does not mean using eval() or exec() on user input. Well-designed code generators take trusted input (schemas, configurations, type definitions) and produce code that is then imported normally. The generation step should happen at build time or import time with trusted data — never with arbitrary user-provided strings.
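Even with trusted input, a cheap guard is to check every name before it reaches a template or an AST node, so that a malformed schema entry fails loudly instead of turning into code. This is a sketch of one such check and not a complete defense. validate_identifier is a hypothetical helper.

```python
import keyword

def validate_identifier(name):
    """Return name unchanged if it is usable as a Python identifier, else raise."""
    if not isinstance(name, str) or not name.isidentifier() or keyword.iskeyword(name):
        raise ValueError(f"not a valid Python identifier: {name!r}")
    return name

fields = [validate_identifier(f) for f in ["name", "email", "age"]]

# A value that would smuggle a statement into generated source is rejected.
try:
    validate_identifier("age; import os")
    rejected = False
except ValueError:
    rejected = True
```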

One thing to remember: Code generation is a legitimate engineering technique for eliminating repetitive coding. Python offers multiple approaches — from simple templates to AST manipulation to metaclass magic — each with different tradeoffs in readability, safety, and debuggability. Pick the simplest one that meets your needs.


See Also

  • Python Source To Source Transformers Programs that rewrite your Python code for you — like a spelling checker that also fixes your grammar and updates old words.
  • Ci Cd Why big apps can ship updates every day without turning your phone into a glitchy mess — CI/CD is the behind-the-scenes quality gate and delivery truck.
  • Containerization Why does software that works on your computer break on everyone else's? Containers fix that — and they're why Netflix can deploy 100 updates a day without the site going down.
  • Python 310 New Features Python 3.10 gave programmers a shape-sorting machine, friendlier error messages, and cleaner ways to say 'this or that' in type hints.
  • Python 311 New Features Python 3.11 made everything faster, error messages smarter, and let you catch several mistakes at once instead of stopping at the first one.