Python Conda Environments — Core Concepts
Why this topic matters
Python virtual environments (venv) isolate Python packages, but many data science and scientific computing workflows also depend on non-Python libraries: CUDA for GPU computing, GDAL for geospatial work, or MKL for optimized linear algebra. Conda manages both Python packages and these non-Python dependencies in one system, which is why it is a common environment manager on data science teams.
How it works
Creating and using environments
# Create environment with specific Python version
conda create -n myproject python=3.11
# Activate it
conda activate myproject
# Install packages
conda install numpy pandas scikit-learn
# See what's installed
conda list
# Deactivate
conda deactivate
Each environment is a self-contained directory with its own Python interpreter, libraries, and binaries. Activating an environment adjusts your PATH so commands use that environment’s tools.
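To see what activation changes, a short script can report which interpreter and environment are in use. This is a minimal sketch: the function name `describe_active_environment` is made up for illustration, and it assumes `conda activate` has set the `CONDA_PREFIX` and `CONDA_DEFAULT_ENV` variables (both are None when no conda environment is active).

```python
import os
import shutil
import sys


def describe_active_environment() -> dict:
    """Collect the values that change when a conda environment is activated."""
    return {
        # Root directory of the interpreter running this script.
        "sys_prefix": sys.prefix,
        # Set by `conda activate`; None when no conda environment is active.
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        # Name of the active environment, also set by `conda activate`.
        "conda_env_name": os.environ.get("CONDA_DEFAULT_ENV"),
        # First `python` found on PATH, i.e. what a bare `python` command runs.
        "python_on_path": shutil.which("python"),
    }


if __name__ == "__main__":
    for key, value in describe_active_environment().items():
        print(f"{key}: {value}")
```

Running it before and after `conda activate myproject` should show `python_on_path` and `conda_prefix` moving to the environment's directory.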
Environment files
For reproducibility, define environments in YAML:
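# environment.yml
name: ml-project
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - numpy=1.26
  - pandas>=2.0
  - scikit-learn
  - jupyter
  - cudatoolkit=11.8  # the cudatoolkit package stops at 11.8; CUDA 12 ships under different package names
  - pip               # list pip explicitly so the pip: section below has an installer
  - pip:
      - transformers
      - wandb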
Create from the file:
conda env create -f environment.yml
Update an existing environment:
conda env update -f environment.yml --prune
The --prune flag removes packages that were dropped from the YAML.
Key concepts
Channels
Channels are repositories where conda finds packages. The most important ones:
- defaults: Anaconda’s curated channel (comes with Anaconda/Miniconda)
- conda-forge: Community-maintained, largest selection, most up-to-date
- pytorch: Official PyTorch builds with CUDA support
- nvidia: NVIDIA’s GPU tools
Channel priority matters. Conda ranks channels in the order they are listed, and under strict priority it takes a package from the highest-priority channel that provides it:
# Set conda-forge as highest priority
conda config --prepend channels conda-forge
conda config --set channel_priority strict
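With strict channel priority, once a package is found in a higher-priority channel, builds of that package from lower-priority channels are ignored. Packages that exist only in a lower-priority channel are still installed from there. This limits mixing builds of the same library across channels, which avoids subtle binary incompatibilities.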
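The difference between strict and non-strict priority can be illustrated with a toy model. This is a simplified sketch, not conda's actual implementation: the function `candidate_builds`, the `index` dictionary, and the version numbers in it are all hypothetical.

```python
def candidate_builds(package, channels, index, strict=True):
    """Return (channel, version) candidates for `package`.

    `channels` is ordered highest priority first. `index` maps
    channel name -> {package name -> list of versions}.
    """
    candidates = []
    for channel in channels:
        versions = index.get(channel, {}).get(package, [])
        candidates.extend((channel, version) for version in versions)
        if strict and versions:
            # Strict: stop at the first channel that provides the package.
            break
    return candidates


# Hypothetical index contents, for illustration only.
index = {
    "conda-forge": {"numpy": ["1.26.4"], "gdal": ["3.8.4"]},
    "defaults": {"numpy": ["1.26.0", "1.26.4"], "mkl": ["2023.1.0"]},
}
channels = ["conda-forge", "defaults"]

print(candidate_builds("numpy", channels, index, strict=True))
print(candidate_builds("numpy", channels, index, strict=False))
print(candidate_builds("mkl", channels, index, strict=True))
```

In this model, strict mode only ever offers the conda-forge build of numpy, while mkl still resolves from defaults because no higher-priority channel carries it.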
Conda vs pip
| Aspect | Conda | Pip |
|---|---|---|
| Package scope | Any language (Python, C, R, CUDA) | Python only |
| Dependency solver | SAT solver, checks all constraints | Backtracking resolver (since pip 20.3) |
| Environment management | Built-in | Separate tool (venv) |
| Package format | .conda / .tar.bz2 | .whl / .tar.gz |
| Package source | Conda channels | PyPI |
| Binary packages | Pre-built for each platform | Wheels (some need compilation) |
They can coexist: install conda packages first, then use pip for packages not available on conda channels. The environment.yml format supports this with the pip: section.
Solving and dependency resolution
Conda’s solver examines the entire dependency graph before installing anything. If package A needs NumPy 1.24 and package B needs NumPy 1.26, conda tells you about the conflict upfront rather than installing one and breaking the other.
This thoroughness comes at a cost: solving can be slow for large environments. The libmamba solver, the default since conda 23.10, is significantly faster than the original (classic) solver.
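The idea of checking every constraint before installing anything can be sketched with a toy conflict check. This is a simplified illustration under stated assumptions, not how conda's SAT-based solver works internally: `find_conflicts` is a hypothetical helper, and each requirement is modeled as a plain set of acceptable versions.

```python
def find_conflicts(requirements):
    """Intersect the versions every requester accepts for each dependency.

    `requirements` maps requester -> {dependency -> set of acceptable versions}.
    Returns (resolved, conflicts): `resolved` holds the non-empty intersections,
    `conflicts` lists the requesters of each dependency with no common version.
    """
    allowed = {}
    requesters = {}
    for requester, deps in requirements.items():
        for dep, versions in deps.items():
            requesters.setdefault(dep, []).append(requester)
            if dep in allowed:
                allowed[dep] = allowed[dep] & set(versions)
            else:
                allowed[dep] = set(versions)

    resolved = {dep: v for dep, v in allowed.items() if v}
    conflicts = {dep: requesters[dep] for dep, v in allowed.items() if not v}
    return resolved, conflicts


# The NumPy example from the text: A needs 1.24, B needs 1.26.
requirements = {
    "package-a": {"numpy": {"1.24"}},
    "package-b": {"numpy": {"1.26"}},
}
resolved, conflicts = find_conflicts(requirements)
print(conflicts)  # {'numpy': ['package-a', 'package-b']}
```

Because the whole set of requirements is examined first, the conflict is reported before any package would be installed.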
Exporting and sharing
# Full export (platform-specific, exact versions)
conda env export > environment-lock.yml
# More portable export (versions pinned, build strings omitted)
conda env export --no-builds > environment.yml
# Minimal export (only explicitly installed)
conda env export --from-history > environment-minimal.yml
The --from-history export is the most portable: it lists only the packages you explicitly asked for, letting conda resolve platform-appropriate versions on the target machine. It omits packages installed with pip, so add those back by hand.
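To make the difference between the full and --no-builds exports concrete, the sketch below strips build strings from dependency specs of the form name=version=build. The helper `strip_build_string` and the build strings in the sample list are hypothetical, chosen only to show the shape of the data.

```python
def strip_build_string(spec: str) -> str:
    """Turn 'name=version=build' into 'name=version'; leave other specs alone."""
    parts = spec.split("=")
    if len(parts) == 3:
        return "=".join(parts[:2])
    return spec


# Hypothetical entries as they might appear in a full export.
full_export = [
    "python=3.11.7=h955ad1f_0",
    "numpy=1.26.4=py311h64a7726_0",
    "pip",
]

for spec in full_export:
    print(strip_build_string(spec))
```

Build strings encode platform- and compiler-specific details, which is why dropping them makes a file easier to reuse on another machine.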
Miniconda vs Anaconda
Miniconda: Minimal installer — just conda, Python, and essential packages (~80 MB). Install what you need.
Anaconda: Full distribution with 250+ pre-installed scientific packages (~3 GB). Ready to use immediately but heavy.
For most workflows, Miniconda with conda-forge is the recommended approach — you get exactly what you need without bloat.
Common misconception
“Conda replaces pip entirely.” Many Python packages exist only on PyPI, not on conda channels. The practical approach is conda-first for packages available there (especially those with C dependencies), then pip for the rest. The key rule: install conda packages first, pip packages second — pip installations don’t register with conda’s solver.
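One way to audit a mixed environment is to group installed distributions by the tool that installed them. The sketch below reads each distribution's INSTALLER metadata file through the standard library. pip writes `pip` there. Conda-built packages commonly record `conda`, but that is an assumption about how a given package was built, so treat the grouping as a hint rather than a guarantee. The function name `packages_by_installer` is made up for illustration.

```python
from collections import defaultdict
from importlib import metadata


def packages_by_installer() -> dict:
    """Group installed distribution names by the content of their INSTALLER file."""
    grouped = defaultdict(list)
    for dist in metadata.distributions():
        # INSTALLER is optional metadata; read_text returns None when it is missing.
        installer = (dist.read_text("INSTALLER") or "").strip() or "unknown"
        try:
            name = dist.metadata["Name"]
        except Exception:
            # Broken or partial metadata: keep going with a placeholder name.
            name = None
        grouped[installer].append(name or "unknown")
    return {installer: sorted(names) for installer, names in grouped.items()}


if __name__ == "__main__":
    for installer, names in packages_by_installer().items():
        print(f"{installer}: {len(names)} packages")
```

Run inside an activated environment, this gives a rough count of how many packages came from pip, which is the set conda's solver does not track.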
One thing to remember
Conda environments isolate entire software stacks — Python, C libraries, CUDA, and more — making them essential for data science workflows where pip and venv can’t manage the full dependency chain.