Rasterio Geospatial — Core Concepts
Rasterio wraps the GDAL library in a Pythonic interface for reading and writing geospatial raster data such as satellite imagery, digital elevation models (DEMs), and land-cover grids. It exposes each raster file as NumPy arrays plus geographic metadata.
The dataset model
import rasterio
with rasterio.open("landsat_band4.tif") as src:
    print(src.width, src.height)  # pixel dimensions
    print(src.count)              # number of bands
    print(src.dtypes)             # per-band data type
    print(src.crs)                # coordinate reference system
    print(src.bounds)             # geographic bounding box
    band1 = src.read(1)           # read band 1 as a NumPy array
Every dataset carries an Affine transform that maps pixel coordinates to coordinates in the dataset's CRS. For a north-up raster, row 0, column 0 corresponds to the upper-left corner of the bounding box, and each step of one row or one column moves by the pixel size.
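To make the mapping concrete, here is a minimal pure-Python sketch of the arithmetic an affine transform performs. The coefficient values (30 m pixels, upper-left corner at x=500000, y=4100000) and the helper name `pixel_to_map` are made up for illustration; a real dataset supplies its own six coefficients through `src.transform`.

```python
# Six affine coefficients in the order (a, b, c, d, e, f):
#   x = a * col + b * row + c
#   y = d * col + e * row + f
# Hypothetical north-up raster: 30 m pixels, upper-left corner at (500000, 4100000).
COEFFS = (30.0, 0.0, 500000.0, 0.0, -30.0, 4100000.0)


def pixel_to_map(coeffs, row, col):
    """Return the (x, y) map coordinate of the upper-left corner of pixel (row, col)."""
    a, b, c, d, e, f = coeffs
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Row 0, column 0 is the upper-left corner of the raster.
print(pixel_to_map(COEFFS, 0, 0))
# One row down and two columns right: x grows, y shrinks because e is negative.
print(pixel_to_map(COEFFS, 1, 2))
```

Passing `row + 0.5` and `col + 0.5` instead gives the centre of the pixel, which is the position `rasterio.transform.xy` reports by default.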
Affine transforms — pixel to map
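import rasterio
from rasterio.transform import xy
with rasterio.open("dem.tif") as src:
    # Convert pixel (row=100, col=200) to map coordinates in the dataset's CRS
    # (these are longitude/latitude only if the CRS is geographic, e.g. EPSG:4326)
    x, y = xy(src.transform, 100, 200)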
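The inverse operation, from map coordinate to pixel, uses ~src.transform (the inverse affine): ~src.transform * (x, y) returns fractional (col, row) values, and src.index(x, y) returns the integer (row, col) directly. This lets you look up the pixel at a specific geographic location without scanning the entire array.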
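The inverse mapping can be sketched in pure Python as solving the 2x2 affine system for the column and row. The coefficients and the helper name `map_to_pixel` are hypothetical and only illustrate the arithmetic; this is not how Rasterio implements it internally.

```python
import math

# Hypothetical north-up transform (a, b, c, d, e, f): 30 m pixels,
# upper-left corner at (500000, 4100000).
COEFFS = (30.0, 0.0, 500000.0, 0.0, -30.0, 4100000.0)


def map_to_pixel(coeffs, x, y):
    """Return the integer (row, col) of the pixel that contains map point (x, y)."""
    a, b, c, d, e, f = coeffs
    det = a * e - b * d
    if det == 0:
        raise ValueError("transform is not invertible")
    # Remove the translation, then apply the inverse of [[a, b], [d, e]].
    dx, dy = x - c, y - f
    col = (e * dx - b * dy) / det
    row = (a * dy - d * dx) / det
    # Fractional pixel positions are floored to the containing pixel.
    return math.floor(row), math.floor(col)


# A point 75 m east and 45 m south of the upper-left corner.
print(map_to_pixel(COEFFS, 500075.0, 4099955.0))
```

Flooring matters for points that fall part-way through a pixel: the fractional position (row 1.5, col 2.5) belongs to pixel (1, 2).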
Reading windows
Large rasters can run to several gigabytes. Rasterio lets you read rectangular subsets called windows, so you never have to load the full file into memory.
import rasterio
from rasterio.windows import Window
with rasterio.open("big_image.tif") as src:
    # 512 x 512 pixel window whose upper-left corner is at column 1000, row 2000
    window = Window(col_off=1000, row_off=2000, width=512, height=512)
    chip = src.read(1, window=window)
This is critical when processing continent-scale datasets tile by tile.
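One way to process a raster tile by tile is to generate the window offsets up front. The sketch below is pure Python and only computes the offsets and sizes; the helper name `tile_windows`, the raster size, and the tile size are assumptions chosen for illustration.

```python
def tile_windows(width, height, tile_size):
    """Yield (col_off, row_off, tile_width, tile_height) tuples covering the raster."""
    for row_off in range(0, height, tile_size):
        for col_off in range(0, width, tile_size):
            # Clip the last tile in each row and column to the raster edge.
            tile_width = min(tile_size, width - col_off)
            tile_height = min(tile_size, height - row_off)
            yield col_off, row_off, tile_width, tile_height


# Hypothetical 1200 x 700 pixel raster split into 512-pixel tiles.
tiles = list(tile_windows(width=1200, height=700, tile_size=512))
for tile in tiles:
    print(tile)
```

Each tuple can be unpacked into `Window(col_off, row_off, width, height)` and passed to `src.read(1, window=...)`, so only one tile is held in memory at a time.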
Writing rasters
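import numpy as np
import rasterio
with rasterio.open("input.tif") as src:
    profile = src.profile  # CRS, transform, dimensions, dtype, etc. of the source
    # Placeholder computation; substitute your own processing here
    result_array = src.read(1) * 2.0
profile.update(dtype=rasterio.float32, count=1)
with rasterio.open("output.tif", "w", **profile) as dst:
    dst.write(result_array.astype(np.float32), 1)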
The profile carries CRS, transform, dimensions, compression, and data type so your output file keeps the same georeferencing as the input.
Reprojection
Different datasets often use different coordinate systems. Rasterio's warp module resamples a raster onto a grid in another CRS so the datasets line up.
import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling
with rasterio.open("input.tif") as src:
    # Work out the output grid (transform and size) in the target CRS
    transform, width, height = calculate_default_transform(
        src.crs, "EPSG:3857", src.width, src.height, *src.bounds
    )
    kwargs = src.meta.copy()
    kwargs.update(crs="EPSG:3857", transform=transform,
                  width=width, height=height)
    with rasterio.open("reprojected.tif", "w", **kwargs) as dst:
        # Reproject each band in turn (band indexes start at 1)
        for i in range(1, src.count + 1):
            reproject(
                source=rasterio.band(src, i),
                destination=rasterio.band(dst, i),
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=transform,
                dst_crs="EPSG:3857",
                resampling=Resampling.nearest,
            )
Masking with vector geometries
You can clip a raster to an area defined by a polygon, for example to extract elevation data within a national-park boundary. The geometry must be in the same CRS as the raster.
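import json
import rasterio
from rasterio.mask import mask
with open("park_boundary.geojson") as f:
    # Take the geometry of the first feature in the collection
    geom = json.load(f)["features"][0]["geometry"]
with rasterio.open("dem.tif") as src:
    # clipped has shape (bands, rows, cols); clipped_transform georeferences the cropped array
    clipped, clipped_transform = mask(src, [geom], crop=True)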
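Pixels that fall outside the polygon are filled with the nodata value, so statistics on the clipped array should skip them. The sketch below uses a small hand-made array in place of real `mask` output; the array values and the `NODATA` fill value of -9999 are assumptions for illustration, and a real file reports its own fill value through `src.nodata`.

```python
import numpy as np

NODATA = -9999.0  # assumed fill value, not taken from any real file

# Stand-in for the (bands, rows, cols) array returned by mask():
# one band of 3 x 3 pixels, with NODATA outside the polygon.
clipped = np.array([[[NODATA, 812.0, 815.0],
                     [NODATA, 820.0, 818.0],
                     [NODATA, NODATA, 825.0]]])

band = clipped[0]
# Hide every pixel equal to the fill value from later computations.
valid = np.ma.masked_equal(band, NODATA)

print(valid.count())  # number of pixels inside the polygon
print(valid.mean())   # mean elevation over those pixels only
```

Calling `mask` with `filled=False` returns a NumPy masked array directly, which avoids the comparison against the fill value.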
Common misconception
Many beginners assume Rasterio is only for satellite imagery. In practice it handles any gridded data: weather model outputs, bathymetry, population density grids, and synthetic aperture radar (SAR) data all use the same raster model.
The one thing to remember: Rasterio gives you NumPy arrays plus geographic context — read any raster, know exactly where each pixel sits, and write the result back as a properly georeferenced file.