Pulumi Infrastructure with Python — Deep Dive

Advanced Pulumi patterns for Python teams: component resources, testing strategies, state management, and multi-cloud architectures

Pulumi’s execution model

When you run pulumi up, the Pulumi CLI spawns your Python program as a subprocess. Your code communicates with the Pulumi engine via gRPC. Each resource instantiation sends a registration request to the engine, which tracks dependencies, computes diffs against stored state, and orchestrates cloud API calls.

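This architecture means your Python program runs from top to bottom once per operation: every pulumi preview and every pulumi up executes it again. Side effects in your code (like writing files or calling external APIs) therefore happen whenever the program runs, including during previews. The engine handles the actual cloud operations.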
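To make the register-then-diff flow concrete, here is a toy model in plain Python. It is only a sketch: ToyEngine and its methods are invented for illustration, and the real engine does far more (dependency ordering, replacements, provider calls).

```python
from dataclasses import dataclass, field


@dataclass
class ToyEngine:
    """Toy stand-in for the engine: collects registrations, then diffs them against state."""

    state: dict[str, dict]  # inputs from the last deployment, keyed by resource name
    registered: dict[str, dict] = field(default_factory=dict)

    def register(self, name: str, inputs: dict) -> None:
        # In real Pulumi, each resource constructor sends this as a gRPC request.
        self.registered[name] = inputs

    def plan(self) -> dict[str, str]:
        steps = {}
        for name, inputs in self.registered.items():
            if name not in self.state:
                steps[name] = "create"
            elif self.state[name] != inputs:
                steps[name] = "update"
            else:
                steps[name] = "same"
        # Anything in state that the program no longer registers gets deleted.
        for name in self.state:
            if name not in self.registered:
                steps[name] = "delete"
        return steps


engine = ToyEngine(state={"bucket": {"acl": "private"}, "old-queue": {}})
engine.register("bucket", {"acl": "public-read"})
engine.register("topic", {})
print(engine.plan())
# {'bucket': 'update', 'topic': 'create', 'old-queue': 'delete'}
```

The point of the sketch is that the program only declares what should exist. Deciding what to create, update, or delete is the engine's job.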

Component resources

Component resources are the primary abstraction mechanism — Python classes that group related infrastructure:

import pulumi
from pulumi_aws import ec2, rds, elasticloadbalancingv2 as alb

class WebService(pulumi.ComponentResource):
    def __init__(self, name: str, opts=None, **kwargs):
        # Type tokens conventionally take the form <package>:<module>:<type>
        super().__init__("custom:index:WebService", name, {}, opts)

        vpc_id = kwargs["vpc_id"]
        subnet_ids = kwargs["subnet_ids"]
        instance_type = kwargs.get("instance_type", "t3.medium")
        db_engine = kwargs.get("db_engine", "postgres")

        # Application load balancer
        self.lb = alb.LoadBalancer(
            f"{name}-lb",
            subnets=subnet_ids,
            load_balancer_type="application",
            opts=pulumi.ResourceOptions(parent=self),
        )

        # RDS database
        self.db = rds.Instance(
            f"{name}-db",
            engine=db_engine,
            instance_class="db.t3.micro",
            allocated_storage=20,
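            skip_final_snapshot=True,
            username="app",
            # RDS generates the master password and stores it in Secrets Manager;
            # the "secret" in the user data script is only a placeholder
            manage_master_user_password=True,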
            opts=pulumi.ResourceOptions(parent=self),
        )

        # Launch template for the Python app servers
        # (what an Auto Scaling group would launch instances from)
        self.asg = ec2.LaunchTemplate(
            f"{name}-lt",
            instance_type=instance_type,
            image_id="ami-0abcdef1234567890",  # placeholder AMI ID
            # Use db.address (hostname only); db.endpoint already includes ":port"
            user_data=pulumi.Output.all(
                self.db.address, self.db.port
            ).apply(
                lambda args: self._render_userdata(args[0], args[1])
            ),
            opts=pulumi.ResourceOptions(parent=self),
        )

        self.register_outputs({
            "lb_dns": self.lb.dns_name,
            "db_endpoint": self.db.endpoint,
        })

    @staticmethod
    def _render_userdata(db_host: str, db_port: int) -> str:
        import base64
        script = f"""#!/bin/bash
export DATABASE_URL=postgresql://app:secret@{db_host}:{db_port}/app
cd /opt/myapp
gunicorn app:create_app --bind 0.0.0.0:8000 --workers 4
"""
        return base64.b64encode(script.encode()).decode()

Usage becomes clean and reusable:

service = WebService("api",
    vpc_id=vpc.id,
    subnet_ids=[subnet.id for subnet in subnets],
    instance_type="t3.large",
)
pulumi.export("api_url", service.lb.dns_name)
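One possible refinement, sketched here as an assumption rather than something the component above does, is to replace the loose **kwargs with a typed arguments class. WebServiceArgs and its allowed engine list are hypothetical names invented for this example. A misspelled or missing argument then fails at construction time instead of surfacing as a KeyError inside the component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WebServiceArgs:
    """Hypothetical typed replacement for WebService's **kwargs."""

    vpc_id: object      # a plain str or a pulumi.Output[str]
    subnet_ids: object  # a list of IDs or an Output wrapping one
    instance_type: str = "t3.medium"
    db_engine: str = "postgres"

    def __post_init__(self):
        # Only plain (non-Output) fields can be validated eagerly like this.
        # The allowed set is an illustrative assumption.
        if self.db_engine not in ("postgres", "mysql"):
            raise ValueError(f"unsupported db_engine: {self.db_engine}")


args = WebServiceArgs(
    vpc_id="vpc-123",
    subnet_ids=["subnet-a", "subnet-b"],
    instance_type="t3.large",
)
print(args.instance_type, args.db_engine)  # t3.large postgres
```

The component's __init__ would then take a single args parameter and read args.instance_type and friends instead of calling kwargs.get().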

Working with Outputs

Outputs are Pulumi’s solution to values that don’t exist until deployment. They behave like futures:

import pulumi
import pulumi_aws as aws

bucket = aws.s3.Bucket("data")

# Can't do this — bucket.id is an Output, not a string:
# url = f"s3://{bucket.id}/files"  # WRONG

# Use apply() to transform outputs
url = bucket.id.apply(lambda id: f"s3://{id}/files")

# Combine multiple outputs
combined = pulumi.Output.all(bucket.id, bucket.arn).apply(
    lambda args: {"name": args[0], "arn": args[1]}
)

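# Conditional on output values (create_policy is a placeholder helper).
# Prefer returning plain values here: resources created inside apply()
# may not appear in previews.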
bucket_policy = bucket.id.apply(
    lambda id: create_policy(id) if id.startswith("prod") else None
)

A common Python pitfall is using an Output in ordinary string operations such as f-strings or + concatenation. Use .apply(), pulumi.Output.concat(), or pulumi.Output.format() instead.
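The toy class below illustrates why the f-string fails while apply() works. ToyOutput is a deliberately simplified stand-in written for this example. It is not Pulumi's Output class, which is asynchronous and also tracks dependencies and secrets.

```python
from typing import Callable, Generic, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class ToyOutput(Generic[T]):
    """Toy model of a deferred value; NOT Pulumi's real Output class."""

    def __init__(self, resolve: Callable[[], T]):
        self._resolve = resolve  # called only once the value is known

    def apply(self, fn: Callable[[T], U]) -> "ToyOutput[U]":
        # Returns a new deferred value; fn runs later, not now.
        return ToyOutput(lambda: fn(self._resolve()))

    def get(self) -> T:
        # Stands in for the engine resolving the value during deployment.
        return self._resolve()


deployed = {}  # filled in "during deployment"
bucket_id = ToyOutput(lambda: deployed["bucket_id"])

url = bucket_id.apply(lambda name: f"s3://{name}/files")  # nothing resolved yet
broken = f"s3://{bucket_id}/files"  # formats the wrapper object, not the ID

deployed["bucket_id"] = "data-1a2b3c"
print(url.get())                 # s3://data-1a2b3c/files
print("data-1a2b3c" in broken)   # False
```

The f-string ran before the ID existed, so it could only embed the wrapper object. The apply() callback ran after the value was available.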

Testing infrastructure code

Unit testing with mocks

Pulumi provides a mocking framework that intercepts resource creation:

import pulumi
import unittest

class MyMocks(pulumi.runtime.Mocks):
    def new_resource(self, args):
        return [args.name + "_id", args.inputs]

    def call(self, args):
        return {}

pulumi.runtime.set_mocks(MyMocks())

# Now import your infrastructure code
from my_infra import bucket, service

class TestInfra(unittest.TestCase):
    @pulumi.runtime.test
    def test_bucket_versioning_enabled(self):
        def check_versioning(versioning):
            self.assertTrue(versioning["enabled"])
        return bucket.versioning.apply(check_versioning)

    @pulumi.runtime.test
    def test_service_uses_correct_instance_type(self):
        def check_instance(instance_type):
            self.assertEqual(instance_type, "t3.medium")
        return service.asg.instance_type.apply(check_instance)
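Logic that never touches the engine can be tested without mocks at all. As a small sketch, the function below is a standalone copy of the component's _render_userdata helper, and the hostname db.internal is an illustrative value. Decoding the result shows what the instance would actually receive.

```python
import base64


def render_userdata(db_host: str, db_port: int) -> str:
    """Standalone copy of WebService._render_userdata, for illustration."""
    script = f"""#!/bin/bash
export DATABASE_URL=postgresql://app:secret@{db_host}:{db_port}/app
cd /opt/myapp
gunicorn app:create_app --bind 0.0.0.0:8000 --workers 4
"""
    return base64.b64encode(script.encode()).decode()


# Decode the base64 payload back into the shell script and inspect it.
decoded = base64.b64decode(render_userdata("db.internal", 5432)).decode()
print(decoded.splitlines()[1])
# export DATABASE_URL=postgresql://app:secret@db.internal:5432/app
```

A check like this catches a doubled port such as host:5432:5432, which is what you get if a value that already contains the port is passed as the host.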

Policy as code

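Pulumi CrossGuard checks resources against policy rules before they are deployed. A policy pack can be applied across an organization through Pulumi Cloud or run locally with pulumi preview --policy-pack: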

from pulumi_policy import (
    EnforcementLevel,
    PolicyPack,
    ResourceValidationPolicy,
)

def no_public_s3(args, report_violation):
    if args.resource_type == "aws:s3/bucket:Bucket":
        acl = args.props.get("acl", "private")
        if acl in ("public-read", "public-read-write"):
            report_violation("S3 buckets must not be publicly accessible")

PolicyPack("security", policies=[
    ResourceValidationPolicy(
        name="no-public-s3",
        description="Prevents public S3 buckets",
        validate=no_public_s3,
        enforcement_level=EnforcementLevel.MANDATORY,
    ),
])
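Because a validator is a plain function, it can be exercised without the policy engine. The sketch below repeats the validator so it runs on its own and feeds it a SimpleNamespace as a stand-in for the real arguments object. The violations_for helper is a hypothetical name invented for this example.

```python
from types import SimpleNamespace


def no_public_s3(args, report_violation):
    # Same rule as the policy above, repeated so this sketch is self-contained.
    if args.resource_type == "aws:s3/bucket:Bucket":
        acl = args.props.get("acl", "private")
        if acl in ("public-read", "public-read-write"):
            report_violation("S3 buckets must not be publicly accessible")


def violations_for(resource_type: str, props: dict) -> list[str]:
    """Run the validator against fake args and collect the reported messages."""
    found: list[str] = []
    fake_args = SimpleNamespace(resource_type=resource_type, props=props)
    no_public_s3(fake_args, found.append)
    return found


print(violations_for("aws:s3/bucket:Bucket", {"acl": "public-read"}))
# ['S3 buckets must not be publicly accessible']
print(violations_for("aws:s3/bucket:Bucket", {}))
# []
```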

Multi-stack patterns

Production setups often split infrastructure across stacks:

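# network_stack/__main__.py
import pulumi
import pulumi_aws as aws

vpc = aws.ec2.Vpc("main", cidr_block="10.0.0.0/16")

# Illustrative subnets; the CIDR ranges are placeholders. A real setup would
# also spread them across availability zones.
subnets = [
    aws.ec2.Subnet(f"main-{i}", vpc_id=vpc.id, cidr_block=f"10.0.{i}.0/24")
    for i in range(2)
]

pulumi.export("vpc_id", vpc.id)
pulumi.export("subnet_ids", [s.id for s in subnets])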

# app_stack/__main__.py
import pulumi

network = pulumi.StackReference("myorg/network/production")
vpc_id = network.get_output("vpc_id")
subnet_ids = network.get_output("subnet_ids")

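# Use these outputs to deploy application resources.
# WebService is the component class defined earlier; import it from
# wherever your project keeps it.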
service = WebService("api", vpc_id=vpc_id, subnet_ids=subnet_ids)

Stack references create explicit dependencies between infrastructure layers without tight coupling.

State management strategies

Backend	Best for	Tradeoffs
Pulumi Cloud	Teams, collaboration	Free tier limits, vendor dependency
S3 (DIY backend)	AWS-native teams	Self-managed bucket, access control, and secrets provider; more setup
Azure Blob	Azure-native teams	Self-managed
Local file	Experiments only	No collaboration, easy to lose

Configure with:

pulumi login s3://my-state-bucket
pulumi login --local
pulumi login  # Pulumi Cloud (default)

Secrets management

Pulumi encrypts values marked as secret before writing them to state:

import pulumi

config = pulumi.Config()
db_password = config.require_secret("db_password")
# db_password is an Output[str] — encrypted at rest in state

# Set secrets via CLI:
# pulumi config set --secret db_password "hunter2"

You can also use external secret providers:

import pulumi
import pulumi_aws as aws

# db_password is the secret Output read from config in the previous snippet

secret = aws.secretsmanager.Secret("db-creds")
version = aws.secretsmanager.SecretVersion("db-creds-v1",
    secret_id=secret.id,
    secret_string=pulumi.Output.json_dumps({
        "username": "admin",
        "password": db_password,
    }),
)

Dynamic providers

When Pulumi doesn’t have a native provider for a service, you can create custom resources:

from pulumi.dynamic import Resource, ResourceProvider, CreateResult

class SlackChannelProvider(ResourceProvider):
    def create(self, inputs):
        import requests
        resp = requests.post(
            "https://slack.com/api/conversations.create",
            headers={"Authorization": f"Bearer {inputs['token']}"},
            json={"name": inputs["name"]},
        )
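        resp.raise_for_status()
        data = resp.json()
        # Slack usually reports failures as HTTP 200 with {"ok": false, "error": ...}
        if not data.get("ok"):
            raise Exception(f"Slack error: {data.get('error')}")
        return CreateResult(
            id_=data["channel"]["id"],
            # Keep the inputs in outs: delete() reads the token from these props
            outs={**inputs, "channel_id": data["channel"]["id"]},
        )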

    def delete(self, id, props):
        import requests
        requests.post(
            "https://slack.com/api/conversations.archive",
            headers={"Authorization": f"Bearer {props['token']}"},
            json={"channel": id},
        )

class SlackChannel(Resource):
    def __init__(self, name, token, channel_name, opts=None):
        super().__init__(
            SlackChannelProvider(),
            name,
            {"token": token, "name": channel_name, "channel_id": None},
            opts,
        )

This extends Pulumi to manage anything with an API — including internal tools and third-party SaaS.
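A provider that wraps a third-party API should surface that API's errors instead of silently recording a half-created resource. The helper below is a minimal sketch of one way to do that for Slack-style responses. slack_call, SlackApiError, and fake_post are hypothetical names written for this example, and the transport is injected so the logic can be tested without network access.

```python
from typing import Callable


class SlackApiError(RuntimeError):
    """Raised when Slack reports ok=false (hypothetical helper, not part of any SDK)."""


def slack_call(post: Callable[..., dict], method: str, token: str, payload: dict) -> dict:
    """Call a Slack Web API method and fail loudly on an error response.

    `post` must return the decoded JSON body; inside a provider it would
    wrap requests.post(...).json().
    """
    data = post(
        f"https://slack.com/api/{method}",
        headers={"Authorization": f"Bearer {token}"},
        json=payload,
    )
    if not data.get("ok"):
        raise SlackApiError(f"{method} failed: {data.get('error', 'unknown error')}")
    return data


def fake_post(url, headers, json):
    # Stand-in transport: rejects one channel name to simulate a Slack error.
    if json.get("name") == "taken":
        return {"ok": False, "error": "name_taken"}
    return {"ok": True, "channel": {"id": "C123"}}


result = slack_call(fake_post, "conversations.create", "xoxb-test", {"name": "deploys"})
print(result["channel"]["id"])  # C123
```

An exception raised from create() fails the deployment step, which is usually what you want when the remote API rejected the request.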

CI/CD integration

# .github/workflows/deploy.yml
name: Deploy Infrastructure
on:
  push:
    branches: [main]
    paths: ["infra/**"]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: pip install -r infra/requirements.txt

      - uses: pulumi/actions@v5
        with:
          command: up
          stack-name: myorg/production
          work-dir: infra
        env:
          PULUMI_ACCESS_TOKEN: ${{ secrets.PULUMI_ACCESS_TOKEN }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

Tradeoffs vs alternatives

Pulumi’s Python SDK excels at complex, logic-heavy infrastructure. The tradeoffs are that you need a Python runtime during deployment, your team must understand both infrastructure and Python patterns, and the community module ecosystem is smaller than Terraform’s. For straightforward infrastructure with minimal logic, Terraform’s declarative model may be simpler.

The one thing to remember: Pulumi’s power for Python teams comes from treating infrastructure as a real software engineering problem — with abstractions, tests, packages, and code review — not just configuration files to be managed.
