Python Terraform CDK — Deep Dive
CDKTF synthesis internals
When you run cdktf synth, the following happens:
- Python interpreter executes your code, building an in-memory tree of constructs
- Each construct resolves its properties, including references to other resources
- The tree is serialized to Terraform-compatible JSON in cdktf.out/stacks/<stack-name>/
- Each stack gets its own cdk.tf.json file
The generated JSON is valid Terraform configuration. You can inspect it, run terraform plan directly against it, or import existing resources into it. Understanding this output helps debug issues where Python code does not produce the expected infrastructure.
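Because the output is plain JSON, a few lines of Python are enough to see what a stack will create. The sketch below is illustrative only: `list_resource_addresses` and `load_stack_config` are hypothetical helpers, not part of CDKTF, and they assume the usual Terraform JSON layout in which resources sit under `resource.<type>.<name>`.

```python
import json
from pathlib import Path


def list_resource_addresses(config: dict) -> list[str]:
    """Return sorted "<type>.<name>" addresses from a parsed cdk.tf.json document."""
    addresses = []
    for resource_type, instances in config.get("resource", {}).items():
        for name in instances:
            addresses.append(f"{resource_type}.{name}")
    return sorted(addresses)


def load_stack_config(stack_name: str, out_dir: str = "cdktf.out") -> dict:
    """Read the synthesized JSON for one stack, using the path layout described above."""
    path = Path(out_dir) / "stacks" / stack_name / "cdk.tf.json"
    return json.loads(path.read_text())


# Small inline sample standing in for real synthesized output
sample = {
    "resource": {
        "aws_vpc": {"network_vpc": {"cidr_block": "10.0.0.0/16"}},
        "aws_subnet": {"network_public-0": {"cidr_block": "10.0.0.0/24"}},
    }
}
print(list_resource_addresses(sample))
```

Against a real project, `list_resource_addresses(load_stack_config("<stack-name>"))` would give a quick inventory to compare with what the Python code was expected to produce.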
Custom constructs
Constructs are the building blocks of CDKTF. You compose resources into reusable abstractions:
from constructs import Construct
from cdktf_cdktf_provider_aws.vpc import Vpc
from cdktf_cdktf_provider_aws.subnet import Subnet
from cdktf_cdktf_provider_aws.internet_gateway import InternetGateway

class StandardVpc(Construct):
    def __init__(self, scope, id, *, cidr="10.0.0.0/16", az_count=3, environment="dev"):
        super().__init__(scope, id)
        self.vpc = Vpc(self, "vpc",
            cidr_block=cidr,
            enable_dns_hostnames=True,
            enable_dns_support=True,
            tags={"Environment": environment, "Name": f"{environment}-vpc"},
        )
        self.igw = InternetGateway(self, "igw",
            vpc_id=self.vpc.id,
            tags={"Name": f"{environment}-igw"},
        )
        self.public_subnets = []
        for i in range(az_count):
            subnet = Subnet(self, f"public-{i}",
                vpc_id=self.vpc.id,
                cidr_block=f"10.0.{i}.0/24",
                availability_zone=f"us-east-1{'abcdef'[i]}",
                map_public_ip_on_launch=True,
                tags={"Name": f"{environment}-public-{i}"},
            )
            self.public_subnets.append(subnet)
Usage becomes a single line:
network = StandardVpc(self, "network", environment="production", az_count=3)
# Access: network.vpc.id, network.public_subnets[0].id
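Each public subnet needs its own block inside the VPC range, and that block has to change if a caller passes a different `cidr`. One way to compute the blocks in plain Python is the standard library's `ipaddress` module. This is a sketch under the assumption that /24 subnets are wanted; `public_subnet_cidrs` is a hypothetical helper name.

```python
import ipaddress


def public_subnet_cidrs(vpc_cidr: str, count: int, new_prefix: int = 24) -> list[str]:
    """Return the first `count` subnets of size /new_prefix inside vpc_cidr."""
    network = ipaddress.ip_network(vpc_cidr)
    subnets = network.subnets(new_prefix=new_prefix)  # lazy generator
    result = []
    for _ in range(count):
        try:
            result.append(str(next(subnets)))
        except StopIteration:
            raise ValueError(f"{vpc_cidr} cannot hold {count} /{new_prefix} subnets")
    return result


print(public_subnet_cidrs("10.0.0.0/16", 3))
# → ['10.0.0.0/24', '10.0.1.0/24', '10.0.2.0/24']
```

Inside `StandardVpc`, the returned list could feed `cidr_block` in the subnet loop, so the subnets always fall inside whatever range the VPC was given.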
Publishing constructs as packages
Package your constructs as a pip-installable library:
my-infra-constructs/
├── pyproject.toml
└── src/
    └── infra_constructs/
        ├── __init__.py
        ├── vpc.py
        ├── database.py
        └── cdn.py
Teams share a standard library of constructs — VPC, database cluster, CDN — ensuring consistency across all projects.
Multi-stack architectures
Stack dependencies
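from cdktf import App, TerraformStack, TerraformOutput
from cdktf_cdktf_provider_aws.provider import AwsProvider
from cdktf_cdktf_provider_aws.vpc import Vpc
from cdktf_cdktf_provider_aws.subnet import Subnet
from cdktf_cdktf_provider_aws.instance import Instance

class NetworkStack(TerraformStack):
    def __init__(self, scope, id):
        super().__init__(scope, id)
        AwsProvider(self, "aws", region="us-east-1")
        self.vpc = Vpc(self, "vpc", cidr_block="10.0.0.0/16")
        TerraformOutput(self, "vpc_id", value=self.vpc.id)

class AppStack(TerraformStack):
    def __init__(self, scope, id, *, vpc_id):
        super().__init__(scope, id)
        AwsProvider(self, "aws", region="us-east-1")
        # vpc_id is the cross-stack reference to NetworkStack's VPC
        subnet = Subnet(self, "app-subnet", vpc_id=vpc_id, cidr_block="10.0.1.0/24")
        Instance(self, "app",
            ami="ami-0123456789abcdef0",  # placeholder; substitute a real AMI ID
            subnet_id=subnet.id,
            instance_type="t3.micro",
        )

app = App()
network = NetworkStack(app, "network")
AppStack(app, "app", vpc_id=network.vpc.id)
app.synth()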
CDKTF resolves cross-stack references through Terraform remote state data sources. The app stack reads the network stack’s output to get the VPC ID.
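A cross-stack reference implies a deployment order: the stack that produces a value has to be applied before the stack that reads it. CDKTF tracks this itself. The sketch below only illustrates the idea with the standard library's `graphlib`, using a hypothetical dependency map (including an invented `monitoring` stack) rather than anything read from CDKTF.

```python
from graphlib import TopologicalSorter

# Hypothetical map: stack name -> stacks whose outputs it reads
stack_dependencies = {
    "network": set(),
    "app": {"network"},
    "monitoring": {"app", "network"},
}


def deploy_order(dependencies: dict[str, set[str]]) -> list[str]:
    """Return an order in which every stack comes after the stacks it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


print(deploy_order(stack_dependencies))
# → ['network', 'app', 'monitoring']
```

Destroying stacks follows the reverse order, which is why a network stack cannot be removed while an app stack still reads its outputs.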
Environment-per-stack
environments = {
    "dev": {"instance_type": "t3.micro", "count": 1, "region": "us-east-1"},
    "staging": {"instance_type": "t3.small", "count": 2, "region": "us-east-1"},
    "production": {"instance_type": "m5.large", "count": 4, "region": "us-east-1"},
}

app = App()
for env_name, config in environments.items():
    WebStack(app, f"web-{env_name}", environment=env_name, **config)
app.synth()
Each environment gets its own stack with independent state, but they share the same Python code. Configuration differences are just parameters.
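Because each dictionary is unpacked straight into the stack constructor, a misspelled key or a nonsensical value is caught late, or not at all. One option is to validate the settings first with a frozen dataclass. This is a sketch that assumes `WebStack` accepts `instance_type`, `count`, and `region` keyword arguments, as the loop above implies; `EnvConfig` is a hypothetical name.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EnvConfig:
    """Per-environment settings; field names mirror the dictionary keys above."""
    instance_type: str
    count: int
    region: str = "us-east-1"

    def __post_init__(self):
        # Reject obviously invalid values before any stack is constructed
        if self.count < 1:
            raise ValueError(f"count must be at least 1, got {self.count}")


environments = {
    "dev": EnvConfig(instance_type="t3.micro", count=1),
    "staging": EnvConfig(instance_type="t3.small", count=2),
    "production": EnvConfig(instance_type="m5.large", count=4),
}

for env_name, config in environments.items():
    # asdict(config) can be unpacked exactly like the plain dictionaries:
    # WebStack(app, f"web-{env_name}", environment=env_name, **asdict(config))
    print(env_name, asdict(config))
```

A typo such as `instance_typ=` now raises `TypeError` where the configuration is defined, and editors can autocomplete the field names.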
Remote state configuration
from cdktf import S3Backend, TerraformStack

class ProductionStack(TerraformStack):
    def __init__(self, scope, id):
        super().__init__(scope, id)
        S3Backend(self,
            bucket="my-terraform-state",
            key=f"stacks/{id}/terraform.tfstate",
            region="us-east-1",
            dynamodb_table="terraform-locks",
            encrypt=True,
        )
The S3 backend with DynamoDB locking is the standard for team environments. Every team member and CI/CD pipeline reads and writes state to the same place, with locking preventing concurrent modifications.
Testing infrastructure code
Unit testing with pytest
import json
import pytest
from cdktf import Testing
from my_stacks import WebStack

class TestWebStack:
    def test_creates_ec2_instance(self):
        app = Testing.app()
        stack = WebStack(app, "test", environment="test")
        synthesized = Testing.synth(stack)
        assert Testing.to_have_resource(synthesized, "aws_instance")

    def test_instance_has_correct_type(self):
        app = Testing.app()
        stack = WebStack(app, "test", environment="production")
        synthesized = Testing.synth(stack)
        assert Testing.to_have_resource_with_properties(
            synthesized,
            "aws_instance",
            {"instance_type": "m5.large"},
        )

    def test_all_resources_tagged(self):
        app = Testing.app()
        stack = WebStack(app, "test", environment="staging")
        # Testing.synth returns the stack's Terraform JSON as a string;
        # Testing.full_synth returns an output directory path, not a dict
        config = json.loads(Testing.synth(stack))
        # Check that every resource has an Environment tag
        for resource in config.get("resource", {}).values():
            for instance in resource.values():
                tags = instance.get("tags", {})
                assert "Environment" in tags
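A blanket tag assertion can fail on resource types that do not accept tags at all, such as `aws_route_table_association`. A slightly more forgiving check collects the offending addresses and lets the test skip known untaggable types. The helper below is a hypothetical sketch that works on the synthesized JSON string and needs no CDKTF imports.

```python
import json


def find_untagged(synth_json: str, required_tag: str,
                  skip_types: frozenset = frozenset()) -> list[str]:
    """Return "<type>.<name>" for every resource missing required_tag."""
    config = json.loads(synth_json)
    missing = []
    for resource_type, instances in config.get("resource", {}).items():
        if resource_type in skip_types:
            continue  # resource type is known not to support tags
        for name, body in instances.items():
            if required_tag not in body.get("tags", {}):
                missing.append(f"{resource_type}.{name}")
    return sorted(missing)


# Inline sample standing in for the string returned by Testing.synth(stack)
sample = json.dumps({
    "resource": {
        "aws_instance": {"app": {"tags": {"Environment": "dev"}}},
        "aws_s3_bucket": {"logs": {"bucket": "example-logs"}},
        "aws_route_table_association": {"public": {"subnet_id": "subnet-123"}},
    }
})
untaggable = frozenset({"aws_route_table_association"})
print(find_untagged(sample, "Environment", skip_types=untaggable))
# → ['aws_s3_bucket.logs']
```

In a test, `assert find_untagged(...) == []` reports every untagged resource in one failure message instead of stopping at the first.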
Snapshot testing
# `snapshot` is a fixture supplied by a pytest snapshot plugin (for example syrupy)
def test_stack_snapshot(snapshot):
    app = Testing.app()
    stack = WebStack(app, "test", environment="dev")
    synthesized = Testing.synth(stack)
    assert synthesized == snapshot
Snapshot tests catch unintended changes. If the generated Terraform JSON changes, the test fails and shows you a diff. Update the snapshot only after reviewing the change.
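The comparison a snapshot plugin performs can be approximated with the standard library, which is handy when reviewing a change outside of pytest. The sketch below is not how any particular plugin works internally; `normalize` and `snapshot_diff` are hypothetical helpers that sort keys so ordering differences never show up as changes.

```python
import difflib
import json


def normalize(synth_json: str) -> str:
    """Re-serialize JSON with sorted keys so key order never causes a diff."""
    return json.dumps(json.loads(synth_json), indent=2, sort_keys=True)


def snapshot_diff(expected_json: str, actual_json: str) -> str:
    """Return a unified diff between two synthesized documents ('' if identical)."""
    expected = normalize(expected_json).splitlines()
    actual = normalize(actual_json).splitlines()
    return "\n".join(
        difflib.unified_diff(expected, actual, "snapshot", "current", lineterm="")
    )


old = '{"resource": {"aws_instance": {"app": {"instance_type": "t3.micro"}}}}'
new = '{"resource": {"aws_instance": {"app": {"instance_type": "t3.small"}}}}'
print(snapshot_diff(old, new))
```

The printed diff shows the `instance_type` line removed with `t3.micro` and added with `t3.small`, which is the kind of change worth reviewing before accepting a new snapshot.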
Secrets management
Using Terraform variables for secrets
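from cdktf import TerraformStack, TerraformVariable
from cdktf_cdktf_provider_aws.db_instance import DbInstance

class DatabaseStack(TerraformStack):
    def __init__(self, scope, id):
        super().__init__(scope, id)
        db_password = TerraformVariable(self, "db_password",
            type="string",
            sensitive=True,
            description="Database master password",
        )
        # aws_db_instance is exposed as DbInstance; its password argument is `password`
        DbInstance(self, "db",
            engine="postgres",
            instance_class="db.t3.micro",
            allocated_storage=20,   # example value
            username="dbadmin",     # example value
            password=db_password.string_value,
        )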
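Pass the value at deploy time through the TF_VAR_db_password environment variable, which Terraform reads automatically: TF_VAR_db_password=secret123 cdktf deploy. In CI, populate it from the pipeline's secret store so the value stays out of shell history and logs.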
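Marking the variable `sensitive` is what keeps its value out of plan output, so it is worth guarding against a variable that holds a secret but lacks the flag. The check below is a sketch that scans the `variable` block of synthesized JSON. The name hints are an arbitrary assumption and `unprotected_secret_variables` is a hypothetical helper.

```python
# Substrings that suggest a variable holds a secret (arbitrary, adjust to taste)
SECRET_HINTS = ("password", "secret", "token", "key")


def unprotected_secret_variables(config: dict) -> list[str]:
    """Return names of secret-looking variables not declared sensitive."""
    flagged = []
    for name, body in config.get("variable", {}).items():
        looks_secret = any(hint in name.lower() for hint in SECRET_HINTS)
        if looks_secret and not body.get("sensitive", False):
            flagged.append(name)
    return sorted(flagged)


# Inline sample standing in for a parsed cdk.tf.json document
sample_config = {
    "variable": {
        "db_password": {"type": "string", "sensitive": True},
        "api_token": {"type": "string"},
        "region": {"type": "string"},
    }
}
print(unprotected_secret_variables(sample_config))
# → ['api_token']
```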
Integration with AWS Secrets Manager
from cdktf_cdktf_provider_aws.data_aws_secretsmanager_secret_version import (
    DataAwsSecretsmanagerSecretVersion,
)
from cdktf_cdktf_provider_aws.db_instance import DbInstance

# Inside a stack's __init__
secret = DataAwsSecretsmanagerSecretVersion(self, "db-creds",
    secret_id="production/database/credentials",
)

# Use in resources (other required arguments omitted)
DbInstance(self, "db",
    password=secret.secret_string,
)
This reads the secret from AWS at plan/apply time, so it never appears in your Python source or on the command line. The resolved value is still written to Terraform state, so encrypt your state backend and restrict who can read it.
CI/CD integration
GitHub Actions workflow
name: Infrastructure Deploy
on:
  push:
    branches: [main]
    paths: ['infra/**']
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write # For OIDC
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/cdktf-deploy
          aws-region: us-east-1
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install dependencies
        run: |
          pip install -r infra/requirements.txt
          npm install -g cdktf-cli
      - name: Plan
        run: cdktf plan
        working-directory: infra
      - name: Deploy
        if: github.ref == 'refs/heads/main'
        run: cdktf deploy --auto-approve
        working-directory: infra
Drift detection
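Schedule a periodic job that plans against the live infrastructure and alerts if the plan shows changes. This detects manual modifications made outside of CDKTF. The example runs Terraform directly against the synthesized stack directory, because -detailed-exitcode is a terraform plan flag:
# In a scheduled CI job, after `cdktf synth` and `terraform init` in the stack directory
import subprocess
result = subprocess.run(["terraform", "plan", "-detailed-exitcode"],
    cwd="cdktf.out/stacks/<stack-name>", capture_output=True)
if result.returncode == 2:
    # Exit code 2: the plan succeeded and found changes, so infrastructure has drifted
    send_alert("Infrastructure drift detected", result.stdout.decode())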
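Terraform's `plan -detailed-exitcode` distinguishes three outcomes: 0 means no changes, 1 means the plan itself failed, and 2 means the plan succeeded and found changes. A scheduled job should treat a failed plan differently from drift, since both are nonzero. A minimal sketch of that mapping, with `PlanStatus` and `classify_plan_exit` as hypothetical names:

```python
from enum import Enum


class PlanStatus(Enum):
    NO_CHANGES = "no_changes"
    ERROR = "error"
    DRIFT = "drift"


def classify_plan_exit(returncode: int) -> PlanStatus:
    """Map a `terraform plan -detailed-exitcode` return code to a status."""
    if returncode == 0:
        return PlanStatus.NO_CHANGES
    if returncode == 2:
        return PlanStatus.DRIFT
    # 1 is Terraform's error code; treat anything unexpected as an error too
    return PlanStatus.ERROR


for code in (0, 1, 2):
    print(code, classify_plan_exit(code).value)
```

With this in place, the job can page on `ERROR` (the check itself is broken) and open a ticket on `DRIFT`, rather than sending the same alert for both.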
Performance considerations
- Synthesis time grows with resource count, though not strictly in proportion: roughly 2 seconds for 100 resources and roughly 30 seconds for 10,000
- Provider packages can be large. cdktf-cdktf-provider-aws is ~200 MB installed. Use pre-built providers to avoid generating bindings locally
- State operations are Terraform-native and scale well. CDKTF adds no overhead to plan/apply
Tradeoffs
| Consideration | CDKTF advantage | CDKTF disadvantage |
|---|---|---|
| Expressiveness | Full Python (loops, types, tests) | More code than HCL for simple cases |
| Ecosystem | All Terraform providers | Generated types can lag provider updates |
| Debugging | Python stack traces | Extra synthesis layer to understand |
| Team adoption | Python devs productive immediately | Ops teams may prefer HCL |
| Tooling | Full IDE support, pytest | Fewer community examples than HCL |
CDKTF makes sense when your infrastructure has patterns (repeated resources, environment variations, complex conditionals) that benefit from a real programming language. For a single static stack of 10 resources, plain HCL is simpler.
The one thing to remember: CDKTF’s power lies in treating infrastructure as real software — testable, composable, and versionable Python code — with Terraform’s proven engine handling the actual cloud provisioning underneath.