Python YAML Processing — Core Concepts

YAML (YAML Ain’t Markup Language) is a widely used configuration format for DevOps tools such as Kubernetes, Ansible, and Docker Compose. Python's standard library has no YAML module, so YAML is processed through third-party libraries, with PyYAML being the most widely used.

Installation

pip install pyyaml

For advanced use cases (preserving comments, round-trip editing):

pip install ruamel.yaml

Reading YAML

Basic Loading

import yaml

with open("config.yml", encoding="utf-8") as f:
    config = yaml.safe_load(f)
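In practice a loader usually has to cope with two edge cases: an empty file (safe_load returns None) and malformed YAML (PyYAML raises yaml.YAMLError). The following is a minimal sketch, not a definitive implementation. The helper name load_config and the rule that the top level must be a mapping are assumptions made for illustration; PyYAML itself does not require either.

```python
import tempfile
from pathlib import Path

import yaml


def load_config(path):
    """Hypothetical helper: return the YAML file at `path` as a dict."""
    try:
        with open(path, encoding="utf-8") as f:
            data = yaml.safe_load(f)
    except yaml.YAMLError as exc:
        # Wrap parser errors so callers see which file was at fault.
        raise ValueError(f"{path}: invalid YAML: {exc}") from exc
    if data is None:
        # An empty file (or one holding only comments) loads as None.
        return {}
    if not isinstance(data, dict):
        # Assumption: this application expects a mapping at the top level.
        raise ValueError(f"{path}: expected a mapping at the top level")
    return data


# Usage with temporary files so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "config.yml"
    good.write_text("database:\n  port: 5432\n", encoding="utf-8")
    empty = Path(tmp) / "empty.yml"
    empty.write_text("", encoding="utf-8")

    loaded = load_config(good)
    loaded_empty = load_config(empty)
```

Returning an empty dict for an empty file is a design choice; raising an error instead may suit tools where a missing configuration should stop the program.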

safe_load parses YAML into Python types:

YAML             Python
key: value       {"key": "value"}
- item (list)    ["item"]
42               int
3.14             float
true / yes       True
null / ~         None
2026-03-28       datetime.date
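One way to confirm this mapping on your own installation is to load a small document and inspect the resulting Python types. The key names in the sample below are made up for illustration.

```python
import datetime

import yaml

sample = """
name: value
items:
  - item
count: 42
ratio: 3.14
enabled: yes
missing: ~
released: 2026-03-28
"""

data = yaml.safe_load(sample)

# Print each value together with the Python type safe_load chose for it.
for key, value in data.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```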

From a String

data = yaml.safe_load("""
database:
  host: localhost
  port: 5432
  name: myapp
""")

Multiple Documents

YAML files can contain multiple documents separated by ---:

with open("multi.yml", encoding="utf-8") as f:
    docs = list(yaml.safe_load_all(f))
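safe_load_all returns a lazy generator, which is why the example wraps it in list(). The sketch below loads two documents from a string and writes them back with safe_dump_all; the document contents are invented for illustration.

```python
import yaml

stream = """\
kind: Service
name: web
---
kind: Deployment
name: web
replicas: 3
"""

# Materialize the generator so the documents can be indexed and reused.
docs = list(yaml.safe_load_all(stream))
kinds = [doc["kind"] for doc in docs]

# safe_dump_all writes the documents back, separated by '---'.
# sort_keys=False keeps the original key order (available since PyYAML 5.1).
round_trip = yaml.safe_dump_all(docs, sort_keys=False)
reloaded = list(yaml.safe_load_all(round_trip))
```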

Writing YAML

import yaml

config = {
    "database": {
        "host": "localhost",
        "port": 5432,
    },
    "features": ["auth", "logging", "caching"],
}

with open("output.yml", "w", encoding="utf-8") as f:
    yaml.dump(config, f, default_flow_style=False, allow_unicode=True)

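default_flow_style=False ensures block style (one item per line) instead of compact inline format. It has been the default since PyYAML 5.1, so passing it explicitly mainly documents intent and protects against older versions.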
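To see the difference between the two styles, it can help to dump the same data both ways into strings. This sketch uses yaml.safe_dump, the counterpart of safe_load, and passes sort_keys=False (available since PyYAML 5.1) because the dumper otherwise sorts mapping keys alphabetically.

```python
import yaml

config = {
    "database": {"host": "localhost", "port": 5432},
    "features": ["auth", "logging"],
}

# Block style: nested mappings and lists are spread over several lines.
block_text = yaml.safe_dump(config, default_flow_style=False, sort_keys=False)

# Flow style: the whole structure is written inline, similar to JSON.
flow_text = yaml.safe_dump(config, default_flow_style=True, sort_keys=False)

print(block_text)
print(flow_text)
```

Both strings load back to the same Python data; the style only affects how the file looks.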

Safety: safe_load vs. load

This is critical. Never use yaml.load() with untrusted input.

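With an unsafe loader (yaml.Loader or yaml.UnsafeLoader), yaml.load() can execute arbitrary Python code through YAML tags. Since PyYAML 6.0, yaml.load() requires an explicit Loader argument, so the choice is at least visible in the code:

# DANGEROUS YAML (runs a shell command when loaded with an unsafe loader):
# !!python/object/apply:os.system ['rm -rf /']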

yaml.safe_load() only creates basic Python types (dicts, lists, strings, numbers). Always use it unless you explicitly need custom type construction with trusted input.
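The difference is easy to check without running anything dangerous: safe_load refuses Python-specific tags and raises an error instead of constructing the object. The sketch below uses a harmless echo command as the payload, and the command is never executed because safe_load rejects the tag first.

```python
import yaml

# A payload in the style used for code-execution attacks.
untrusted = "!!python/object/apply:os.system ['echo pwned']"

try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as exc:
    # safe_load has no constructor for python/* tags, so it raises.
    rejected = True
    print(f"safe_load refused the document: {type(exc).__name__}")
```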

YAML Gotchas

The Norway Problem

YAML interprets certain strings as booleans:

# This YAML:
country: NO

# Becomes this Python:
{"country": False}  # "NO" is treated as boolean False!

The same happens with yes, on, off, true, false, and their capitalized variants, because PyYAML follows the YAML 1.1 rules for booleans. Always quote strings that could be misinterpreted:

country: "NO"
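A short check of this behavior under PyYAML, comparing unquoted and quoted values (the key names are illustrative):

```python
import yaml

doc = """
unquoted: NO
quoted: "NO"
others: [yes, On, off, TRUE]
"""

data = yaml.safe_load(doc)

# Unquoted boolean-like words become bool; the quoted value stays a string.
print(data["unquoted"], data["quoted"], data["others"])
```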

Accidental Numbers and Dates

# This YAML:
version: 1.10

# Becomes this Python:
{"version": 1.1}  # Trailing zero lost!

And 2026-03-28 becomes a datetime.date object, not a string. Quote values when you need exact string preservation.
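The effect of quoting can be verified side by side. In this sketch the key names are made up; the values are the ones discussed above.

```python
import datetime

import yaml

doc = """
version: 1.10
version_quoted: "1.10"
release: 2026-03-28
release_quoted: "2026-03-28"
"""

data = yaml.safe_load(doc)

# Unquoted values are converted; quoted values stay exactly as written.
for key, value in data.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```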

Indentation Sensitivity

YAML indents with spaces only (never tabs), and structure follows from indentation and dash placement. A small difference changes meaning:

# One list item: a dict with two keys
items:
  - name: Widget
    price: 10

# Two list items: each a dict with one key
items:
  - name: Widget
  - price: 10
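Loading both variants shows the structural difference directly. The last part of this sketch also feeds PyYAML a tab-indented document, which it is expected to reject with a yaml.YAMLError.

```python
import yaml

one_item = """
items:
  - name: Widget
    price: 10
"""

two_items = """
items:
  - name: Widget
  - price: 10
"""

first = yaml.safe_load(one_item)
second = yaml.safe_load(two_items)
print(len(first["items"]), len(second["items"]))

# Tabs are not valid indentation in YAML.
tabbed = "items:\n\t- name: Widget\n"
try:
    yaml.safe_load(tabbed)
    tab_rejected = False
except yaml.YAMLError:
    tab_rejected = True
```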

ruamel.yaml for Round-Trip Editing

PyYAML discards comments when loading, so they are gone when you write the file back. ruamel.yaml preserves them:

from ruamel.yaml import YAML

yaml_rt = YAML()  # round-trip mode is the default
yaml_rt.preserve_quotes = True

with open("config.yml", encoding="utf-8") as f:
    config = yaml_rt.load(f)

config["database"]["port"] = 5433

with open("config.yml", "w", encoding="utf-8") as f:
    yaml_rt.dump(config, f)
# Comments, key order, and most formatting are preserved

This is essential for tools that modify user configuration files, because users do not forgive a tool that deletes their comments.

YAML vs. Other Config Formats

Feature               YAML        JSON   TOML
Comments              Yes         No     Yes
Human readability     Excellent   Good   Excellent
Complexity            High        Low    Medium
Surprising behaviors  Many        Few    Few
Python stdlib         No          Yes    Yes (3.11+, read-only)

Common Misconception

“YAML is just JSON with better syntax.” YAML 1.2 is a superset of JSON (valid JSON is valid YAML; YAML 1.1, which PyYAML implements, is nearly so), but it adds significant complexity: anchors, aliases, tags, multi-line strings, multiple document support, and implicit type conversions. This complexity is one reason some projects prefer TOML for configuration.
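A few of these features can be seen in one small document. The sketch below uses an anchor and alias with a merge key, a literal block scalar, and a folded block scalar, then loads a JSON string with the same parser. The keys and values are invented for illustration.

```python
import json

import yaml

doc = """
defaults: &defaults
  adapter: postgres
  host: localhost
development:
  <<: *defaults
  database: dev_db
folded: >
  these lines
  are joined
description: |
  line one
  line two
"""

data = yaml.safe_load(doc)

# The merge key copies the anchored mapping into 'development'.
print(data["development"])

# A JSON document parses to the same data with either library.
json_text = '{"a": [1, 2, {"b": null}], "c": true}'
same = yaml.safe_load(json_text) == json.loads(json_text)
```

None of these constructs exist in JSON, which is where most of the extra parsing rules and surprises come from.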

One Thing to Remember

Always use safe_load for security, quote ambiguous strings to avoid the Norway problem, and reach for ruamel.yaml when you need to edit YAML files without losing comments.

