Rasterio Geospatial — Core Concepts

Rasterio wraps GDAL in a clean, Pythonic interface for reading and writing geospatial raster data: satellite imagery, digital elevation models (DEMs), land-cover grids, and more. It exposes every raster file as NumPy arrays plus geographic metadata.

The dataset model

import rasterio

with rasterio.open("landsat_band4.tif") as src:
    print(src.width, src.height)   # pixel dimensions
    print(src.count)               # number of bands
    print(src.dtypes)              # per-band data type
    print(src.crs)                 # coordinate reference system
    print(src.bounds)              # geographic bounding box
    band1 = src.read(1)            # read band 1 as a NumPy array

Every dataset carries an affine transform (src.transform) that maps pixel coordinates to coordinates in the dataset's CRS. For a typical north-up raster, row 0, column 0 sits at the upper-left corner of the bounding box, and each step of one column or one row moves by the fixed pixel width or height.
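
To make the mapping concrete, here is a minimal plain-Python sketch of the arithmetic behind an affine transform. It assumes the six coefficients are ordered (a, b, c, d, e, f), the order used by the Affine objects that Rasterio exposes. The function name pixel_to_map and the sample coefficients are hypothetical; in real code you would work with src.transform directly.

```python
def pixel_to_map(coeffs, col, row):
    """Apply x = a*col + b*row + c and y = d*col + e*row + f."""
    a, b, c, d, e, f = coeffs
    return a * col + b * row + c, d * col + e * row + f


# Hypothetical north-up raster: 30 m pixels, upper-left corner at (500000, 4100000).
# e is negative because y decreases as the row index increases.
coeffs = (30.0, 0.0, 500000.0, 0.0, -30.0, 4100000.0)

# Integer (col, row) values address the outer corner of a pixel.
upper_left = pixel_to_map(coeffs, 0, 0)

# Adding 0.5 addresses the pixel center, here for row=100, col=200.
center = pixel_to_map(coeffs, 200 + 0.5, 100 + 0.5)

print(upper_left, center)
```

Note the argument order: the transform takes (col, row), that is x before y, while array indexing uses (row, col).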

Affine transforms — pixel to map

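import rasterio
from rasterio.transform import xy

with rasterio.open("dem.tif") as src:
    # Convert pixel (row=100, col=200) to map coordinates in the dataset's CRS.
    # The result is the pixel center; it is longitude/latitude only if
    # src.crs is geographic (for example EPSG:4326).
    x, y = xy(src.transform, 100, 200)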

The inverse operation, map coordinate to pixel, uses ~src.transform (the inverse affine): multiplying it by a map coordinate pair returns fractional (col, row) values. This lets you look up specific geographic locations without scanning the entire array.
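
The inverse mapping can be sketched in plain Python as well. This is an illustration of the math only, assuming the same (a, b, c, d, e, f) coefficient order; the function name map_to_pixel and the sample values are hypothetical. With an open dataset, src.index(x, y) returns the (row, col) pair for you.

```python
import math


def map_to_pixel(coeffs, x, y):
    """Invert x = a*col + b*row + c, y = d*col + e*row + f, then floor to indices."""
    a, b, c, d, e, f = coeffs
    det = a * e - b * d
    if det == 0:
        raise ValueError("transform is not invertible")
    # Inverse of the 2x2 matrix [[a, b], [d, e]] applied to the offset (x - c, y - f).
    col = (e * (x - c) - b * (y - f)) / det
    row = (a * (y - f) - d * (x - c)) / det
    # Flooring turns fractional positions into the index of the containing pixel.
    return math.floor(row), math.floor(col)


# Hypothetical north-up raster: 30 m pixels, upper-left corner at (500000, 4100000).
coeffs = (30.0, 0.0, 500000.0, 0.0, -30.0, 4100000.0)

row, col = map_to_pixel(coeffs, 506015.0, 4096985.0)
print(row, col)
```

A coordinate outside the raster yields negative or too-large indices, so check the result against the dataset's width and height before indexing an array with it.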

Reading windows

Large rasters can be gigabytes. Rasterio lets you read rectangular subsets called windows so you never have to load the full file into memory.

import rasterio
from rasterio.windows import Window

with rasterio.open("big_image.tif") as src:
    window = Window(col_off=1000, row_off=2000, width=512, height=512)
    chip = src.read(1, window=window)

This is critical when processing continent-scale datasets tile by tile.
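
A tile-by-tile pass needs the offsets and sizes of every window, including the smaller ones at the right and bottom edges. The following is a minimal sketch of that bookkeeping in plain Python; the helper name tile_offsets and the raster dimensions are hypothetical.

```python
def tile_offsets(width, height, tile=512):
    """Yield (col_off, row_off, tile_width, tile_height), clipping at the edges."""
    for row_off in range(0, height, tile):
        for col_off in range(0, width, tile):
            yield (
                col_off,
                row_off,
                min(tile, width - col_off),    # narrower tile at the right edge
                min(tile, height - row_off),   # shorter tile at the bottom edge
            )


# Hypothetical raster of 1200 x 600 pixels, cut into 512-pixel tiles.
tiles = list(tile_offsets(width=1200, height=600, tile=512))
print(len(tiles), tiles[-1])
```

Each tuple is in the same order as the Window arguments shown above, so inside a with rasterio.open(...) block it could be passed as Window(*t) to src.read. Rasterio datasets also offer src.block_windows(), which iterates over the file's own internal blocks instead of a fixed tile size.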

Writing rasters

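import numpy as np
import rasterio

# Take the profile while the source is open ("input.tif" is a placeholder name).
with rasterio.open("input.tif") as src:
    profile = src.profile
    result_array = src.read(1).astype(np.float32)  # stand-in for your computed result

profile.update(dtype=rasterio.float32, count=1)

with rasterio.open("output.tif", "w", **profile) as dst:
    dst.write(result_array.astype(np.float32), 1)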

The profile carries CRS, transform, dimensions, compression, and data type so your output file keeps the same georeferencing as the input.

Reprojection

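Different datasets often use different coordinate systems. The rasterio.warp module reprojects a raster into another CRS by resampling its pixel values onto a new grid. Nearest-neighbor resampling preserves categorical values such as land-cover classes; Resampling.bilinear is usually a better fit for continuous data such as elevation.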

import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling

with rasterio.open("input.tif") as src:
    transform, width, height = calculate_default_transform(
        src.crs, "EPSG:3857", src.width, src.height, *src.bounds
    )
    kwargs = src.meta.copy()
    kwargs.update(crs="EPSG:3857", transform=transform,
                  width=width, height=height)

    with rasterio.open("reprojected.tif", "w", **kwargs) as dst:
        for i in range(1, src.count + 1):
            reproject(
                source=rasterio.band(src, i),
                destination=rasterio.band(dst, i),
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=transform,
                dst_crs="EPSG:3857",
                resampling=Resampling.nearest,
            )
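
To give a sense of what changing the CRS does to a single coordinate, here is a small sketch of the spherical Web Mercator formula behind EPSG:3857, assuming the input is longitude/latitude in degrees (EPSG:4326). It covers only this one pair of coordinate systems; rasterio.warp handles the general case through GDAL. The function name lonlat_to_web_mercator is hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, the sphere radius used by EPSG:3857


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to EPSG:3857 x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# The antimeridian on the equator maps to the right edge of the projection.
x_edge, y_equator = lonlat_to_web_mercator(180.0, 0.0)

# Near 85.05 degrees north, y reaches the same magnitude, which makes the map square.
_, y_top = lonlat_to_web_mercator(0.0, 85.0511287798)

print(round(x_edge, 2), round(y_top, 2))
```

Because y grows without bound toward the poles, a raster that extends beyond roughly 85 degrees of latitude cannot be represented fully in EPSG:3857.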

Masking with vector geometries

You can clip a raster to an area defined by a polygon, for example to extract elevation data within a national-park boundary. The geometry must be in the same CRS as the raster.

import json

import rasterio
from rasterio.mask import mask

with open("park_boundary.geojson") as f:
    geom = json.load(f)["features"][0]["geometry"]

with rasterio.open("dem.tif") as src:
    clipped, clipped_transform = mask(src, [geom], crop=True)
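
The clipped array keeps its rectangular shape, so pixels outside the polygon are filled with a nodata value and have to be excluded from any statistics. The sketch below shows one way to do that with NumPy masked arrays. The small array is a synthetic stand-in for clipped, and the nodata value of -9999 is an assumption; a real script would read it from src.nodata.

```python
import numpy as np

NODATA = -9999.0  # assumed fill value; in practice taken from src.nodata

# Synthetic stand-in for `clipped`: shape (bands, rows, cols) = (1, 3, 3),
# with the four corner pixels lying outside the polygon.
clipped = np.array(
    [[[NODATA, 1210.0, NODATA],
      [1195.0, 1230.0, 1250.0],
      [NODATA, 1240.0, NODATA]]],
    dtype=np.float32,
)

# Hide the fill pixels so they do not distort the statistics.
inside = np.ma.masked_equal(clipped, NODATA)

valid_pixels = int(inside.count())       # number of pixels inside the polygon
mean_elevation = float(inside.mean())    # mean over those pixels only

print(valid_pixels, mean_elevation)
```

If the dataset defines no nodata value, the fill defaults to 0, which can collide with real data; passing an explicit nodata argument to mask avoids that ambiguity.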

Common misconception

Many beginners assume Rasterio is only for satellite imagery. In practice it handles any gridded data: weather model outputs, bathymetry, population density grids, and synthetic aperture radar (SAR) data all use the same raster model.

The one thing to remember: Rasterio gives you NumPy arrays plus geographic context — read any raster, know exactly where each pixel sits, and write the result back as a properly georeferenced file.
