Python PyOxidizer Distribution — Deep Dive
Python distribution mechanics
PyOxidizer does not use your system Python. It downloads (or builds) a special standalone Python distribution optimized for embedding. These distributions come from the python-build-standalone project (also by Gregory Szorc) and differ from regular Python in key ways:
- Statically linked — libpython and many dependencies (OpenSSL, zlib, etc.) are compiled as static libraries.
- Relocatable — no hardcoded paths, works from any filesystem location.
- Stripped of unnecessary components: no tkinter, no IDLE, and a reduced standard library.
# pyoxidizer.bzl: selecting a distribution
def make_exe():
    # Default: latest CPython for the current platform
    dist = default_python_distribution()
    # Or pin an exact distribution (use one of the two assignments, not both)
    dist = PythonDistribution(
        url="https://github.com/indygreg/python-build-standalone/releases/...",
        sha256="abc123...",
    )
The distribution is cached locally after first download. Subsequent builds reuse it.
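The pin-and-cache behavior described above can be sketched in plain Python. This is an illustration only, not PyOxidizer's code: the cache layout (one directory per digest) and the helper names `verify_sha256`, `cached_archive_path`, and `load_from_cache` are assumptions made for the example.

```python
import hashlib
from pathlib import Path
from typing import Optional


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True when the SHA-256 digest of data matches the pinned value."""
    return hashlib.sha256(data).hexdigest() == expected_hex.lower()


def cached_archive_path(cache_dir: Path, url: str, sha256_hex: str) -> Path:
    """Hypothetical layout: one directory per pinned digest, keeping the URL's file name."""
    file_name = url.rstrip("/").rsplit("/", 1)[-1]
    return cache_dir / sha256_hex.lower() / file_name


def load_from_cache(cache_dir: Path, url: str, sha256_hex: str) -> Optional[bytes]:
    """Return the cached archive bytes, or None if missing or not matching the pin."""
    path = cached_archive_path(cache_dir, url, sha256_hex)
    if not path.is_file():
        return None
    data = path.read_bytes()
    # A cached file that no longer matches its digest is treated as a cache miss
    return data if verify_sha256(data, sha256_hex) else None
```

Keying the cache on the digest rather than the URL means a changed pin never reuses a stale archive, which is the property the `sha256` argument is there to guarantee.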
Oxidized importer deep dive
The oxidized importer (oxidized_importer module) is a Rust-based Python meta path finder that replaces the standard PathFinder. It stores module data in a custom packed format:
Data storage format
[Module Index]
module_name → (bytecode_offset, bytecode_length, source_offset, source_length)
[Bytecode Section]
Concatenated .pyc data for all modules
[Source Section] (optional)
Concatenated .py source for all modules
All sections are stored as raw bytes in the Rust binary’s .data or .rodata segment. At startup, the importer builds an in-memory hash map from the index.
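To make the index-plus-sections idea concrete, here is a toy version in Python. The byte layout below (little-endian lengths, a length-prefixed index) is invented for the example and is not the real oxidized_importer format; `pack_modules`, `read_index`, and `load_code` are hypothetical helpers.

```python
import marshal
import struct
from typing import Dict, Tuple


def pack_modules(modules: Dict[str, str]) -> bytes:
    """Pack {module name: source} into [index length][index][bytecode section]."""
    index = bytearray(struct.pack("<I", len(modules)))  # entry count
    bytecode = bytearray()
    for name, source in modules.items():
        code = marshal.dumps(compile(source, name, "exec"))
        encoded = name.encode("utf-8")
        index += struct.pack("<H", len(encoded)) + encoded
        index += struct.pack("<II", len(bytecode), len(code))  # offset, length
        bytecode += code
    return struct.pack("<I", len(index)) + bytes(index) + bytes(bytecode)


def read_index(blob: bytes) -> Tuple[Dict[str, Tuple[int, int]], int]:
    """Build the name -> (offset, length) map and return it with the bytecode base."""
    (index_len,) = struct.unpack_from("<I", blob, 0)
    pos = 4
    (count,) = struct.unpack_from("<I", blob, pos)
    pos += 4
    table = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        name = blob[pos:pos + name_len].decode("utf-8")
        pos += name_len
        offset, length = struct.unpack_from("<II", blob, pos)
        pos += 8
        table[name] = (offset, length)
    return table, 4 + index_len


def load_code(blob: bytes, table: Dict[str, Tuple[int, int]], base: int, name: str):
    """Slice the module's bytecode out of the blob without copying, then unmarshal it."""
    offset, length = table[name]
    view = memoryview(blob)[base + offset:base + offset + length]
    return marshal.loads(view)
```

The index is parsed once into a dictionary, after which each lookup is a hash-map access followed by a slice, mirroring the startup behavior described above.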
Import resolution
# When Python encounters: import myapp.utils
# The oxidized importer:
# 1. Looks up "myapp.utils" in the hash map (O(1))
# 2. Gets (bytecode_offset, bytecode_length)
# 3. Creates a memoryview into the binary's data section
# 4. Calls marshal.loads() on the bytecode
# 5. Returns the code object — no disk I/O at any point
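The same flow can be imitated in pure Python with a meta path finder that serves modules from a dictionary of marshalled code objects. This is a minimal sketch of the technique using the standard importlib hooks, not the Rust implementation; the class name `InMemoryFinder` is hypothetical.

```python
import importlib.abc
import importlib.util
import marshal
from typing import Dict


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader that serves modules from marshalled bytecode held in memory."""

    def __init__(self, sources: Dict[str, str]) -> None:
        # Compile once up front; imports later only unmarshal and execute
        self._bytecode = {
            name: marshal.dumps(compile(src, f"<memory:{name}>", "exec"))
            for name, src in sources.items()
        }

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self._bytecode:
            return None  # let the next finder on sys.meta_path try
        # A name is a package if any stored module lives underneath it
        is_package = any(n.startswith(fullname + ".") for n in self._bytecode)
        return importlib.util.spec_from_loader(fullname, self, is_package=is_package)

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module) -> None:
        code = marshal.loads(self._bytecode[module.__name__])
        exec(code, module.__dict__)
```

Inserting an instance at the front of `sys.meta_path` makes it the first finder consulted. Modules loaded this way have no `__file__` attribute, which is the root of the compatibility issues covered later in this article.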
For extension modules (.so/.pyd), the importer has three strategies:
- In-memory loading: on platforms that support it, load the shared library from memory without writing it to disk. PyOxidizer's documentation describes this for Windows; treat support on other platforms as something to verify.
- Temporary file: extract to a temp directory and load with dlopen.
- Filesystem relative: store alongside the binary and load normally.
# Configure extension module handling
policy = dist.make_python_packaging_policy()
policy.extension_module_filter = "minimal" # Only essential extensions
policy.resources_location = "in-memory" # Prefer in-memory loading
policy.resources_location_fallback = "filesystem-relative:lib" # Fallback; the value takes a ":<prefix>" directory
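How a preferred location and its fallback interact can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions, not PyOxidizer's resolution logic: `resolve_location` and its `loadable_from_memory` flag are hypothetical names for the example.

```python
from typing import Optional


def resolve_location(
    preferred: str,
    fallback: Optional[str],
    loadable_from_memory: bool,
) -> str:
    """Pick where a resource ends up, given the policy and the resource's capability."""
    # The preferred location wins unless it is in-memory and the resource cannot use it
    if preferred != "in-memory" or loadable_from_memory:
        return preferred
    if fallback is None:
        raise ValueError("resource cannot be loaded from memory and no fallback is set")
    return fallback
```

For example, a pure-Python module resolves to `"in-memory"`, while an extension module that must live on disk resolves to the fallback, such as `"filesystem-relative:lib"`.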
Starlark configuration in depth
The pyoxidizer.bzl configuration file uses Starlark, a Python-like language. Key patterns:
Dependency management
def make_exe():
    dist = default_python_distribution()
    policy = dist.make_python_packaging_policy()
    python_config = dist.make_python_interpreter_config()
    python_config.run_module = "myapp"
    python_config.allocator_backend = "rust"  # Use Rust allocator
    python_config.oxidized_importer = True  # Enable oxidized importer
    exe = dist.to_python_executable(
        name="myapp",
        packaging_policy=policy,
        config=python_config,
    )
    # Install from PyPI
    exe.add_python_resources(exe.pip_install(["flask==3.0.0", "gunicorn"]))
    # Install from local path (a regular install; an editable "-e" install leaves the code outside the packaged resources)
    exe.add_python_resources(exe.pip_install(["."]))
    # Install from requirements file
    exe.add_python_resources(exe.pip_install(["-r", "requirements.txt"]))
    # Add packages from a local source directory
    exe.add_python_resources(exe.read_package_root(
        path="src",
        packages=["myapp"],
    ))
    return exe
Resource filtering
# Custom filter function, defined at module level (some Starlark dialects reject nested defs)
def resource_filter(policy, resource):
    # Exclude specific packages
    if resource.name.startswith("tests."):
        resource.add_include = False
    # Match the top-level package so matplotlib's submodules are relocated too
    if resource.name.split(".")[0] == "matplotlib":
        resource.add_location = "filesystem-relative:lib"
    return True

def make_exe():
    dist = default_python_distribution()
    policy = dist.make_python_packaging_policy()
    # Exclude resources classified as tests to reduce size
    policy.include_test = False
    # Set where resources go
    policy.resources_location = "in-memory"
    policy.resources_location_fallback = "filesystem-relative:lib"
    # Register the custom filter
    policy.register_resource_callback(resource_filter)
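Callback logic like this is easy to get subtly wrong, so it can help to exercise it in plain Python before a full build. The sketch below uses a stand-in `Resource` dataclass (hypothetical, carrying only the three attributes used here) and a variant of the filter that matches `tests` at any depth in the dotted name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    """Stand-in for a packaging resource; not a PyOxidizer type."""
    name: str
    add_include: bool = True
    add_location: Optional[str] = None


def resource_filter(resource: Resource) -> None:
    parts = resource.name.split(".")
    # Catches "tests.x" and "myapp.tests.x"; a "tests." prefix check sees only the first
    if "tests" in parts:
        resource.add_include = False
    # Relocate the whole matplotlib package, not just its top-level module
    if parts[0] == "matplotlib":
        resource.add_location = "filesystem-relative:lib"


def apply_filter(resources: List[Resource]) -> List[Resource]:
    """Run the callback over every resource and keep the ones still included."""
    for resource in resources:
        resource_filter(resource)
    return [r for r in resources if r.add_include]
```

Running `apply_filter` over a list of representative module names shows which ones survive and where they land, without waiting for a Rust compile.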
Multi-target builds
def make_exe_linux():
    dist = default_python_distribution()
    exe = dist.to_python_executable(name="myapp-linux", ...)
    return exe

def make_exe_macos():
    dist = PythonDistribution(url="...", sha256="...")
    exe = dist.to_python_executable(name="myapp-macos", ...)
    return exe

register_target("linux", make_exe_linux)
register_target("macos", make_exe_macos)
resolve_targets()
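A small driver script can pick the registered target that matches the host. The sketch below assumes target names can be passed as positional arguments to `pyoxidizer build`; the mapping `TARGET_BY_SYSTEM` and the helper `build_command` are hypothetical names for the example.

```python
from typing import List

# Maps platform.system() values to the target names registered above
TARGET_BY_SYSTEM = {"Linux": "linux", "Darwin": "macos"}


def build_command(system: str, release: bool = True) -> List[str]:
    """Return the argv for building the target that matches the given platform."""
    try:
        target = TARGET_BY_SYSTEM[system]
    except KeyError:
        raise ValueError(f"no registered target for platform {system!r}") from None
    command = ["pyoxidizer", "build"]
    if release:
        command.append("--release")
    command.append(target)
    return command
```

A caller would typically pass `platform.system()` and hand the result to `subprocess.run`.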
Rust integration patterns
PyOxidizer’s Rust underpinnings enable deeper integration:
Rust + Python hybrid application
// src/main.rs
use pyembed::MainPythonInterpreter;

// default_python_config() is generated by PyOxidizer at build time and included here
include!(env!("DEFAULT_PYTHON_CONFIG_RS"));

fn main() {
    // Do Rust-native work (load_config() is an application-defined helper, not shown)
    let config = load_config();
    // Initialize embedded Python
    let interp = MainPythonInterpreter::new(default_python_config()).unwrap();
    interp.with_gil(|py| {
        // Call Python code from Rust
        let myapp = py.import("myapp").unwrap();
        let result = myapp.call_method1("process", (config.input_path,)).unwrap();
        println!("Python returned: {}", result);
    });
}
This pattern lets you write performance-critical startup code in Rust while keeping business logic in Python.
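On the Python side, the Rust code above only needs a module named `myapp` exposing a callable `process` that takes one positional argument. What that function does is not specified in this article; the body below is a hypothetical placeholder that counts lines in the input file.

```python
# myapp/__init__.py (hypothetical contents)
from pathlib import Path


def process(input_path: str) -> str:
    """Count the lines in the input file and return a short summary string."""
    path = Path(input_path)
    text = path.read_text(encoding="utf-8")
    return f"{len(text.splitlines())} lines in {path.name}"
```

Returning a plain string keeps the boundary simple: the Rust side can print the result directly without converting a richer Python object.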
Custom memory allocator
# pyoxidizer.bzl
python_config.allocator_backend = "rust" # Route allocations through Rust's global allocator
python_config.allocator_raw = True # Install the custom allocator for Python's raw memory domain
python_config.allocator_debug = False # No debug hooks, so no debug overhead
Swapping the allocator behind Python's raw memory domain can change memory usage patterns, especially for long-running applications. The effect depends on the workload, so measure before and after rather than assuming an improvement.
Handling problematic packages
Some Python packages resist embedding:
Packages that inspect __file__
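# Many packages do this:
data_dir = os.path.dirname(__file__) # Raises NameError when loaded from memory, because __file__ is not set
# Fix: place that package on the filesystem so __file__ exists.
# The fallback below only applies to resources that cannot be loaded from memory;
# to relocate a specific pure-Python package, set resource.add_location in a resource callback.
policy.resources_location_fallback = "filesystem-relative:lib"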
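When you control the package, a defensive alternative is to stop assuming `__file__` exists. The helper below is a minimal sketch (`module_directory` is a hypothetical name) that returns `None` for modules without a filesystem location, so callers can choose another way to find their data.

```python
import os
import types
from typing import Optional


def module_directory(module: types.ModuleType) -> Optional[str]:
    """Return the directory containing the module, or None if it has no __file__."""
    path = getattr(module, "__file__", None)
    if not path:
        return None  # for example, a module imported from memory
    return os.path.dirname(os.path.abspath(path))
```

A module object created with `types.ModuleType("name")` has no `__file__`, which makes it a convenient stand-in for an in-memory module in tests.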
Packages with native extensions
# Some extensions can't be loaded from memory
# Place them on the filesystem
def resource_filter(policy, resource):
    # Match the top-level package so submodules and extensions are relocated too
    if resource.name.split(".")[0] in ("numpy", "pandas", "scipy"):
        resource.add_location = "filesystem-relative:lib"
    return True
Packages using pkg_resources or importlib.metadata
# These need special handling: the package's .dist-info metadata must be packaged with it.
# Note: "--no-binary :none:" only empties pip's no-binary set, so a plain install is equivalent.
exe.add_python_resources(exe.pip_install(["my-package"]))
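Application code can also be written to tolerate missing metadata instead of crashing at startup. The sketch below uses the standard library's `importlib.metadata`; the helper name `installed_version` and the `"unknown"` default are choices made for this example.

```python
from importlib import metadata


def installed_version(distribution: str, default: str = "unknown") -> str:
    """Return the installed version of a distribution, or a default if metadata is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # Metadata was not packaged, or the distribution is not installed
        return default
```

Calling this during startup logging gives a quick signal about whether the packaged binary actually carries the metadata a dependency expects.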
Build optimization
Reducing binary size
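# Trim what the packaging policy includes (pyoxidizer.bzl); set this before creating the executable
policy.include_distribution_sources = False # Ship standard library bytecode without .py sources
policy.include_distribution_resources = False # Drop non-module resource files from the distribution
# Caution: include_classified_resources = False turns off default inclusion of all
# classified resources, modules included, so leave it alone unless a callback re-adds what you need
exe = dist.to_python_executable(
    name="myapp",
    packaging_policy=policy,
    config=python_config,
)
# Post-build stripping of debug symbols (Linux)
# strip build/x86_64-unknown-linux-musl/release/install/myapp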
Typical binary sizes:
| Content | Approximate size |
|---|---|
| Python interpreter only | 15-20 MB |
| + standard library | 25-35 MB |
| + small app (Flask) | 35-50 MB |
| + heavy deps (numpy, pandas) | 80-150 MB |
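To compare your own build against these ranges, measure the whole install directory rather than only the executable, since filesystem-relative resources sit next to it. A minimal sketch, with `directory_size_mb` as a hypothetical helper name:

```python
from pathlib import Path


def directory_size_mb(root: str) -> float:
    """Sum the sizes of all regular files under root, in mebibytes."""
    total = sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())
    return total / (1024 * 1024)
```

Pointing it at the `install` directory of a release build before and after a policy change shows what each setting actually saves.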
Build caching
PyOxidizer caches the Rust compilation and Python distribution. First builds take 2-5 minutes; incremental builds (code changes only) take 30-60 seconds.
# CI caching strategy
# Cache these directories:
# - ~/.cache/pyoxidizer/ (Python distributions)
# - target/ (Rust compilation cache)
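A CI cache is only useful if its key changes when the build inputs change. One common approach, sketched here under the assumption that the configuration file and requirements file are the relevant inputs, is to hash their contents; `cache_key` is a hypothetical helper.

```python
import hashlib
from pathlib import Path
from typing import Iterable


def cache_key(prefix: str, files: Iterable[Path]) -> str:
    """Derive a cache key from the names and contents of the given files."""
    digest = hashlib.sha256()
    # Sort so the key does not depend on the order the files were listed in
    for path in sorted(Path(f) for f in files):
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

Editing either input file produces a new key, so the CI system restores the cache only when it still matches the build.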
Production CI/CD pipeline
# GitHub Actions
name: Build PyOxidizer
on: [push, pull_request]
jobs:
  build:
    strategy:
      matrix:
        include:
          - os: ubuntu-latest
            target: x86_64-unknown-linux-musl
          - os: macos-latest
            target: x86_64-apple-darwin
          - os: windows-latest
            target: x86_64-pc-windows-msvc
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Install PyOxidizer
        run: pip install pyoxidizer
      - name: Build
        # Pass the matrix target so each job builds the triple it is named after
        run: pyoxidizer build --release --target-triple ${{ matrix.target }}
      - name: Test binary
        # Use bash on every runner so the glob paths below expand the same way
        shell: bash
        run: |
          ./build/*/release/install/myapp --version
          ./build/*/release/install/myapp --self-test
      - name: Upload
        uses: actions/upload-artifact@v4
        with:
          name: myapp-${{ matrix.target }}
          path: build/*/release/install/
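The "Test binary" step can also be driven from a small Python script, which makes it easier to capture output and apply a timeout. This is a sketch: `smoke_test` is a hypothetical helper, and the demo at the bottom uses the Python interpreter as a stand-in for the built binary.

```python
import subprocess
import sys
from typing import List, Tuple


def smoke_test(command: List[str], timeout: float = 30.0) -> Tuple[bool, str]:
    """Run a command and report whether it exited cleanly, along with its stdout."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    return result.returncode == 0, result.stdout.strip()


if __name__ == "__main__":
    # Stand-in command; in CI this would be the built binary's path plus "--version"
    ok, output = smoke_test([sys.executable, "--version"])
    print(ok, output)
```

A hung binary raises `subprocess.TimeoutExpired` instead of stalling the job, which is usually the failure mode you want in CI.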
Alternatives landscape
PyOxidizer pioneered the “embed Python in a native binary” approach. The ecosystem has evolved:
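- PyApp: a simpler Rust-based tool inspired by PyOxidizer, with easier configuration. By default it bootstraps a Python installation on first run rather than embedding the interpreter in the binary.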
- Nuitka — compiles Python to C, producing similar single-file binaries.
- cx_Freeze — mature packaging tool, less innovative but very reliable.
- Briefcase — BeeWare’s packaging tool, focused on mobile and desktop apps.
If PyOxidizer’s complexity is a barrier, PyApp is worth evaluating as a lighter Rust-based alternative.
One thing to remember: PyOxidizer’s architecture — embedding Python in a Rust binary with an in-memory importer — represents the most technically ambitious approach to Python distribution, trading build complexity for the fastest possible startup and the cleanest possible deployment artifact.