Python difflib Text Comparison — ELI5

Remember when a teacher handed back your essay with red marks showing what had changed since your draft? Crossed-out words, arrows pointing to moved sentences, and notes saying “this part is new.”

Python’s difflib does the same thing, but for any text. Give it two versions of something — an old file and a new file, yesterday’s report and today’s — and it highlights exactly what changed: what was added, removed, or modified.
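Here is a small sketch of that idea using difflib.unified_diff, one of several diff functions in the module. The shopping lists and the show_changes helper are made up for illustration; only unified_diff itself comes from difflib.

```python
import difflib

# Two versions of the same (made-up) shopping list, one item per line.
old = ["milk", "eggs", "bread"]
new = ["milk", "butter", "bread", "jam"]

def show_changes(old_lines, new_lines):
    """Return the differences as a list of unified-diff lines."""
    return list(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="old",
            tofile="new",
            lineterm="",  # the inputs have no trailing newlines
        )
    )

diff_lines = show_changes(old, new)
for line in diff_lines:
    print(line)
```

Lines starting with "-" were only in the old version, lines starting with "+" are only in the new one, and lines starting with a space are unchanged context.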

You’ve probably seen this without knowing it. Every time you look at changes in Google Docs (“suggesting mode”) or see green and red lines on GitHub, something like difflib is working behind the scenes. Green means added, red means deleted.
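To get a feel for the green-and-red idea, here is a rough sketch using difflib.ndiff, which marks each line with a "+ " or "- " prefix. The two example texts are invented, and sorting the lines into "added" and "removed" lists is just one possible way a tool might decide what to color; it is not how GitHub or Google Docs actually work.

```python
import difflib

# Invented example: a draft and a final version, one sentence per line.
draft = ["The cat sat on the mat.", "It was sunny."]
final = ["The cat sat on the mat.", "It was raining."]

marked = list(difflib.ndiff(draft, final))

# "+ " lines exist only in the new text (think green),
# "- " lines exist only in the old text (think red).
added = [line[2:] for line in marked if line.startswith("+ ")]
removed = [line[2:] for line in marked if line.startswith("- ")]

print("added:", added)
print("removed:", removed)
```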

But difflib does more than just “spot the difference.” It can also tell you how similar two texts are. Is “Python programming” close to “Pyhton programing”? difflib says yes: it scores them at about 0.91 on a scale from 0 to 1, so roughly 91% the same. This is useful for spell-checking, finding near-duplicate files, or matching messy user input to clean options.
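A minimal sketch of the similarity score, using difflib.SequenceMatcher and difflib.get_close_matches. The similarity helper and the fruit list are made up for the example.

```python
import difflib

def similarity(a, b):
    """Return a score from 0.0 (nothing alike) to 1.0 (identical)."""
    return difflib.SequenceMatcher(None, a, b).ratio()

score = similarity("Python programming", "Pyhton programing")
print(round(score, 2))  # about 0.91

# Matching messy input to a list of clean options (hypothetical data).
options = ["apple", "banana", "cherry"]
guess = difflib.get_close_matches("aple", options, n=1)
print(guess)  # ['apple']
```

get_close_matches returns an empty list when nothing is similar enough, so it is worth checking for that case before using the result.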

Think of it as two superpowers in one tool: a red pen that marks changes between two documents, and a similarity score that tells you how close two pieces of text are.

You don’t need to install anything — it comes with Python, ready to use.

The one thing to remember: difflib compares two texts and shows you exactly what changed — like a teacher’s red pen — plus it can score how similar any two strings are.

Tags: python, standard-library, text-processing

See Also

  • Python Atexit: How Python's atexit module lets your program clean up after itself right before it shuts down.
  • Python Bisect Sorted Lists: How Python's bisect module finds things in sorted lists the way you'd find a word in a dictionary, by jumping to the middle.
  • Python Contextlib: How Python's contextlib module makes the 'with' statement work for anything, not just files.
  • Python Copy Module: Why copying data in Python isn't as simple as it sounds, and how the copy module prevents sneaky bugs.
  • Python Dataclass Field Metadata: How Python dataclass fields can carry hidden notes, like sticky notes on a filing cabinet that tools read automatically.