pathlib Filesystem — Core Concepts
Why pathlib Over os.path
Before Python 3.4, file path operations required importing os.path and chaining function calls:
import os
path = os.path.join(os.path.expanduser("~"), "documents", "report.pdf")
name = os.path.splitext(os.path.basename(path))[0]
With pathlib, the same becomes:
from pathlib import Path
path = Path.home() / "documents" / "report.pdf"
name = path.stem
Pathlib paths are objects, not strings. They carry methods for every common file operation, making code more readable and less error-prone.
Creating Paths
from pathlib import Path
# From string
p = Path("/home/alice/data.csv")
# Current directory
cwd = Path.cwd()
# Home directory
home = Path.home()
# Relative path
rel = Path("src") / "main.py"
# From parts
p = Path("usr", "local", "bin") # PosixPath('usr/local/bin')
Path Properties
Every path has attributes for its components:
p = Path("/home/alice/project/report.final.pdf")
p.name # 'report.final.pdf' — full filename
p.stem # 'report.final' — filename without last extension
p.suffix # '.pdf' — last extension
p.suffixes # ['.final', '.pdf'] — all extensions
p.parent # PosixPath('/home/alice/project')
p.parents # ['/home/alice/project', '/home/alice', '/home', '/']
p.parts # ('/', 'home', 'alice', 'project', 'report.final.pdf')
p.anchor # '/'
File Operations
Reading and writing
# Read entire file
content = Path("config.json").read_text(encoding="utf-8")
data = Path("image.png").read_bytes()
# Write
Path("output.txt").write_text("Hello, world!", encoding="utf-8")
Path("data.bin").write_bytes(b"\x00\x01\x02")
Checking existence and type
p = Path("some/path")
p.exists() # Does it exist?
p.is_file() # Is it a regular file?
p.is_dir() # Is it a directory?
p.is_symlink() # Is it a symbolic link?
Creating and removing
# Create directory (and parents)
Path("output/reports/2026").mkdir(parents=True, exist_ok=True)
# Remove file
Path("temp.txt").unlink(missing_ok=True) # missing_ok since Python 3.8
# Remove empty directory
Path("empty_dir").rmdir()
Renaming and moving
# Rename (works as move within same filesystem)
Path("old_name.txt").rename("new_name.txt")
# Replace (overwrites if target exists)
Path("draft.txt").replace("final.txt")
Pattern Matching and Traversal
glob — find files by pattern
# All Python files in a directory
list(Path("src").glob("*.py"))
# Recursive: all Python files in subdirectories too
list(Path("src").rglob("*.py"))
# Complex patterns
list(Path(".").glob("**/*.test.js"))
iterdir — list directory contents
for item in Path("src").iterdir():
if item.is_file():
print(f"File: {item.name} ({item.stat().st_size} bytes)")
Path Manipulation
p = Path("/home/alice/report.txt")
# Change extension
p.with_suffix(".pdf") # /home/alice/report.pdf
# Change filename
p.with_name("summary.txt") # /home/alice/summary.txt
# Change stem (keep extension)
p.with_stem("final") # /home/alice/final.txt (Python 3.9+)
# Resolve to absolute
Path("./relative").resolve() # /full/absolute/path/relative
# Relative path between two paths
p.relative_to("/home/alice") # PosixPath('report.txt')
Cross-Platform Behavior
Path() returns the right type for your OS:
- Linux/macOS:
PosixPath - Windows:
WindowsPath
For creating paths for a different OS (parsing, not I/O):
PurePosixPath— always uses/PureWindowsPath— always uses\
Common Misconception
“pathlib is slower than os.path.” While individual operations have slight overhead (object creation vs. string manipulation), the difference is negligible in practice. File I/O (disk access) dominates the time, not path construction. The readability gains far outweigh any micro-performance difference.
pathlib vs os.path Quick Reference
| Task | os.path | pathlib |
|---|---|---|
| Join paths | os.path.join(a, b) | a / b |
| Get filename | os.path.basename(p) | p.name |
| Get extension | os.path.splitext(p)[1] | p.suffix |
| Get parent dir | os.path.dirname(p) | p.parent |
| Check exists | os.path.exists(p) | p.exists() |
| Read file | open(p).read() | p.read_text() |
| Home directory | os.path.expanduser("~") | Path.home() |
One thing to remember: pathlib makes file paths into first-class objects with methods for every common operation. Use / to build paths, .read_text() / .write_text() for file content, .glob() for finding files, and forget about os.path.join forever.
See Also
- Python Atexit How Python's atexit module lets your program clean up after itself right before it shuts down.
- Python Bisect Sorted Lists How Python's bisect module finds things in sorted lists the way you'd find a word in a dictionary — by jumping to the middle.
- Python Contextlib How Python's contextlib module makes the 'with' statement work for anything, not just files.
- Python Copy Module Why copying data in Python isn't as simple as it sounds, and how the copy module prevents sneaky bugs.
- Python Dataclass Field Metadata How Python dataclass fields can carry hidden notes — like sticky notes on a filing cabinet that tools read automatically.