Image Inpainting in Python — Core Concepts

Understand traditional and diffusion-based inpainting approaches in Python, from OpenCV's classical boundary-propagation algorithms to Stable Diffusion text-guided fill.

Image inpainting reconstructs missing or masked regions of an image by synthesizing plausible content that blends seamlessly with the surrounding pixels. Techniques range from classical propagation and patch-matching algorithms to modern diffusion models that generate contextually appropriate content guided by text prompts.

Classical inpainting with OpenCV

OpenCV provides two traditional algorithms that work well for small defects such as scratches or thin watermarks:

import cv2
import numpy as np

image = cv2.imread("photo.jpg")
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
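# mask: 8-bit, single channel, same height and width as the image;
# non-zero (white) pixels mark the region to fill, zero (black) pixels are kept

# Navier-Stokes method: continues edges into the hole; suits thin regions (scratches, text)
result_ns = cv2.inpaint(image, mask, inpaintRadius=3, flags=cv2.INPAINT_NS)

# Telea method: fast marching from the boundary inward; usually quicker, still meant for small areas
result_telea = cv2.inpaint(image, mask, inpaintRadius=5, flags=cv2.INPAINT_TELEA)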

These algorithms propagate color and gradient information from the boundary inward. They excel at small defects but fail on large regions because they have no understanding of what objects should look like — they only interpolate from edges.
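To make "propagating from the boundary inward" concrete, here is a minimal NumPy sketch. It is a toy diffusion fill written for illustration, not the Navier-Stokes or Telea algorithm that OpenCV implements, and the function name `diffuse_fill` is hypothetical. Each masked pixel is repeatedly replaced by the average of its four neighbours, so information can only flow in from the known pixels around the hole.

```python
import numpy as np

def diffuse_fill(image, mask, iterations=500):
    """Toy fill: replace each masked pixel with the mean of its 4 neighbours.

    Illustration only, not OpenCV's algorithm.
    image: 2-D float array; mask: 2-D bool array (True = pixel to fill).
    """
    out = image.astype(np.float64).copy()
    out[mask] = 0.0  # discard whatever was under the mask
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="edge")
        neighbours = (
            padded[:-2, 1:-1] + padded[2:, 1:-1]
            + padded[1:-1, :-2] + padded[1:-1, 2:]
        ) / 4.0
        out[mask] = neighbours[mask]  # known pixels are never modified
    return out

# Synthetic image: a smooth left-to-right ramp with a square hole in it.
ramp = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))
hole = np.zeros_like(ramp, dtype=bool)
hole[12:20, 12:20] = True

filled = diffuse_fill(ramp, hole)
max_error = np.abs(filled - ramp)[hole].max()
print(f"largest error inside the hole: {max_error:.2e}")
```

A smooth ramp is recovered almost exactly because it can be rebuilt from its boundary values alone. Swap the ramp for a texture such as a checkerboard and the same code should return a smooth blur inside the hole, which is the failure mode described above.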

Deep learning inpainting

Neural network-based approaches learn from millions of images what real-world content looks like, enabling them to fill large missing regions with plausible objects, textures, and structures:

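# Using a pretrained LaMa model (Large Mask Inpainting)
from PIL import Image
from simple_lama_inpainting import SimpleLama

# Load inputs as PIL images; the mask is single channel, white = region to fill
image = Image.open("photo.jpg").convert("RGB")
mask = Image.open("mask.png").convert("L")

lama = SimpleLama()  # loads the pretrained weights
result = lama(image, mask)  # returns the inpainted image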

LaMa (Large Mask Inpainting) uses fast Fourier convolutions to capture both local texture patterns and global structure, making it effective even when half the image is missing.
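The reason a Fourier-domain operation captures global structure can be shown with a small NumPy experiment. This is a simplified stand-in, not LaMa's actual layer: `spectral_mix` and `local_mix` are hypothetical names, and the random `weights` merely play the role of learned per-frequency parameters. The sketch changes one input pixel and counts how many output pixels react.

```python
import numpy as np

def spectral_mix(x, weights):
    """Toy spectral transform: FFT -> pointwise multiply -> inverse FFT."""
    spectrum = np.fft.rfft2(x)
    return np.fft.irfft2(spectrum * weights, s=x.shape)

def local_mix(x):
    """3x3 box filter with wrap-around edges: each output sees only 9 inputs."""
    shifted = [
        np.roll(np.roll(x, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ]
    return sum(shifted) / 9.0

rng = np.random.default_rng(0)
h, w = 16, 16
# Hypothetical "learned" weights, one per frequency bin of the real FFT.
weights = rng.normal(size=(h, w // 2 + 1))

x = np.zeros((h, w))
bumped = x.copy()
bumped[0, 0] = 1.0  # change a single input pixel

spectral_delta = spectral_mix(bumped, weights) - spectral_mix(x, weights)
local_delta = local_mix(bumped) - local_mix(x)

pixels_affected_spectral = int(np.count_nonzero(np.abs(spectral_delta) > 1e-12))
pixels_affected_local = int(np.count_nonzero(np.abs(local_delta) > 1e-12))
print(pixels_affected_local, "pixels react to the 3x3 filter")
print(pixels_affected_spectral, "pixels react to the spectral transform")
```

The 3x3 filter spreads the change to its immediate neighbourhood only, while a single pointwise multiplication in the frequency domain touches essentially the whole image in one step. That image-wide receptive field is what lets a network relate the two sides of a large hole.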

Diffusion-based inpainting

Stable Diffusion’s inpainting model takes a different approach — it generates the masked region from scratch, guided by both the surrounding image context and a text prompt:

from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image
import torch

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

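# The v1 inpainting checkpoint was trained at 512x512, so both inputs are resized to match
image = load_image("room.png").resize((512, 512))
# White mask pixels are repainted; black pixels are preserved
mask = load_image("mask.png").resize((512, 512))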

result = pipe(
    prompt="a modern wooden bookshelf filled with colorful books",
    image=image,
    mask_image=mask,
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]

The model sees the unmasked pixels as context and generates new content for the masked area that matches both the text prompt and the visual surroundings.
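How the model "sees" the unmasked pixels can be sketched at the tensor level. To my understanding, the Stable Diffusion v1 inpainting UNet takes 9 input channels at every denoising step: 4 for the noisy latents, 1 for the downsampled mask, and 4 for the VAE encoding of the image with the masked area blanked out. The NumPy arrays below are random stand-ins for those tensors (no model is loaded), and `build_unet_input` is a hypothetical helper name.

```python
import numpy as np

LATENT_CHANNELS = 4  # latent channels in Stable Diffusion v1
VAE_SCALE = 8        # the VAE shrinks each spatial side by a factor of 8

def build_unet_input(noisy_latents, mask, masked_image_latents):
    """Concatenate along the channel axis: 4 + 1 + 4 = 9 channels."""
    return np.concatenate([noisy_latents, mask, masked_image_latents], axis=0)

side = 512 // VAE_SCALE  # a 512x512 image becomes a 64x64 latent grid
rng = np.random.default_rng(0)

# Stand-in for the current (noisy) state of the region being generated.
noisy_latents = rng.normal(size=(LATENT_CHANNELS, side, side))

# Mask at latent resolution: 1 = generate here, 0 = keep.
mask = np.zeros((1, side, side))
mask[:, 16:48, 16:48] = 1.0

# Stand-in for the VAE encoding of the image with the masked area blanked out.
masked_image_latents = rng.normal(size=(LATENT_CHANNELS, side, side))

unet_input = build_unet_input(noisy_latents, mask, masked_image_latents)
print(unet_input.shape)
```

Because the mask and the masked-image latents are fed in at every step, the surrounding context conditions the whole generation rather than being pasted in afterwards.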

The mask matters

The quality of inpainting depends heavily on mask design:

Tight masks that closely follow the object boundary leave the model the most surrounding context, but any sliver of the original object left outside the mask (an edge or a shadow) can show up in the result.

Padded masks with 10–30 extra pixels around the target area help the model blend the generated region more naturally, avoiding visible seams at mask edges.

Feathered masks with soft gradient edges create the smoothest transitions:

from PIL import Image, ImageFilter

mask = Image.open("mask.png").convert("L")
feathered = mask.filter(ImageFilter.GaussianBlur(radius=10))
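Padding and soft blending can also be written out by hand, which helps when checking what a mask will do before running a model. The NumPy sketch below is one possible approach under stated assumptions: `pad_mask` and `composite` are hypothetical helper names, padding uses simple 4-neighbour dilation, and the blend treats the mask as a per-pixel alpha value between 0 and 1.

```python
import numpy as np

def pad_mask(mask, pixels):
    """Grow a boolean mask by `pixels` steps of 4-neighbour dilation."""
    grown = mask.copy()
    for _ in range(pixels):
        p = np.pad(grown, 1, mode="constant", constant_values=False)
        grown = (
            p[1:-1, 1:-1]
            | p[:-2, 1:-1] | p[2:, 1:-1]
            | p[1:-1, :-2] | p[1:-1, 2:]
        )
    return grown

def composite(original, generated, alpha):
    """Blend two H x W x 3 images; alpha=1 takes generated, alpha=0 keeps original."""
    alpha = alpha[..., None]  # broadcast over the colour channels
    return alpha * generated + (1.0 - alpha) * original

# A square target region, padded by 3 pixels.
mask = np.zeros((32, 32), dtype=bool)
mask[12:20, 12:20] = True
padded = pad_mask(mask, pixels=3)

# Stand-in images: black original, white "generated" content.
original = np.zeros((32, 32, 3))
generated = np.ones((32, 32, 3))

# A feathered mask scaled to the 0..1 range would be used here instead.
alpha = padded.astype(np.float64)
blended = composite(original, generated, alpha)
print(int(mask.sum()), "masked pixels before padding,", int(padded.sum()), "after")
```

With a feathered mask, `alpha` takes intermediate values near the edge, so the output there is a weighted mix of original and generated pixels rather than a hard cut.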

Strength parameter

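When inpainting in image-to-image fashion, the strength parameter (0 to 1) controls how much noise is added to the masked region before denoising, and therefore how far the result can drift from the original pixels: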

0.5–0.6: Gentle modification, keeps much of the original texture
0.7–0.8: Balanced — generates new content while respecting context
0.9–1.0: Full regeneration of the masked area
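One practical consequence of strength is that it also changes how many denoising steps actually run. The sketch below assumes the scheme that, as far as I know, the diffusers image-to-image and inpainting pipelines follow: the schedule is truncated so that roughly `num_inference_steps * strength` steps are executed. The helper name `denoising_schedule` is hypothetical, and the exact rounding may differ between library versions.

```python
def denoising_schedule(num_inference_steps, strength):
    """Return (steps_run, steps_skipped) for a given strength.

    Assumption: the pipeline skips the earliest part of the schedule and
    runs int(num_inference_steps * strength) steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_run = min(int(num_inference_steps * strength), num_inference_steps)
    steps_skipped = num_inference_steps - steps_run
    return steps_run, steps_skipped

for strength in (0.5, 0.8, 1.0):
    run, skipped = denoising_schedule(30, strength)
    print(f"strength={strength}: {run} steps run, {skipped} skipped")
```

Under this assumption a low strength is both gentler and faster, and raising `num_inference_steps` compensates when a low-strength result looks under-refined.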

Common misconception

Inpainting is not the same as object removal followed by background fill. Simple removal tools (like Photoshop’s content-aware fill) try to extend the background. Diffusion-based inpainting can generate entirely new objects in the masked region — replacing a chair with a plant, or an empty wall with a window. It is a generative process, not just an interpolation one.

When to use which approach

Approach	Best for	Limitations
OpenCV (classical)	Small scratches, thin text removal	Cannot handle large areas
LaMa (deep learning)	Large masks, texture continuation	No text guidance
SD Inpainting (diffusion)	Object replacement, creative fill	Requires GPU, slower
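The table can be turned into a rough first-pass rule of thumb. The sketch below is illustrative only: `suggest_approach` is a hypothetical helper, and the 2% mask-coverage threshold is an arbitrary assumption rather than a measured cutoff, so treat its output as a starting point to test against your own images.

```python
import numpy as np

def mask_fraction(mask):
    """Fraction of pixels marked for filling (non-zero mask values)."""
    return float(np.count_nonzero(mask)) / mask.size

def suggest_approach(mask, wants_new_content, small_fraction=0.02):
    """Map the table to a suggestion; small_fraction is an illustrative guess."""
    if wants_new_content:
        return "sd_inpainting"  # only the diffusion model accepts a text prompt
    if mask_fraction(mask) <= small_fraction:
        return "opencv"         # small defect: classical fill is usually enough
    return "lama"               # large hole, continue the existing scene

# A thin scratch and a large block on a 256x256 canvas.
scratch = np.zeros((256, 256), dtype=np.uint8)
scratch[100:102, 50:150] = 255
block = np.zeros((256, 256), dtype=np.uint8)
block[50:150, 50:150] = 255

print(suggest_approach(scratch, wants_new_content=False))
print(suggest_approach(block, wants_new_content=False))
print(suggest_approach(block, wants_new_content=True))
```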

One thing to remember: Inpainting has evolved from simple edge interpolation (OpenCV) to texture-aware neural filling (LaMa) to fully generative text-guided synthesis (Stable Diffusion) — and your choice depends on whether you need to remove defects, extend textures, or create entirely new content in the masked area.
