gh-142939: difflib.get_close_matches performance by dg-pb · Pull Request #142940 · python/cpython

dg-pb · 2025-12-18T14:39:31Z

S="
from difflib import get_close_matches
possibilities = [f'word{i}' for i in range(1000)]
"

$PYEXE -m timeit -s "$S" 'get_close_matches("word", possibilities)'
# current - 8.0 ms
# after `ratio()` - 7.2 ms
# after `__contains__` - 6.9 ms
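
For readers without a `$PYEXE` variable set up, roughly the same measurement can be taken from Python with the standard `timeit` module. This is only a sketch: the helper name `best_ms` and the `number`/`repeat` defaults are assumptions chosen for illustration, and the figures it prints will differ from the ones quoted above depending on machine and build.

```python
import timeit

# Same setup and statement as the shell benchmark above.
SETUP = """
from difflib import get_close_matches
possibilities = [f'word{i}' for i in range(1000)]
"""
STMT = 'get_close_matches("word", possibilities)'


def best_ms(number=20, repeat=5):
    """Return the best per-call time in milliseconds (hypothetical helper)."""
    times = timeit.repeat(STMT, setup=SETUP, number=number, repeat=repeat)
    # Each entry is the total time for `number` calls; take the fastest run.
    return min(times) / number * 1000


if __name__ == "__main__":
    print(f"{best_ms():.2f} ms per call")
```

Running the script once on the base commit and once on the patched build gives a before/after pair comparable to the numbers in the description.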

Issue: difflib.get_close_matches performance #142939

johnslavik

The code is now simpler, nice.

aisk · 2025-12-29T15:54:28Z

Hi, can you provide some microbenchmarks to ensure that there is really a performance boost with this change? It doesn’t need to be overly complex or highly precise.

dg-pb · 2025-12-29T22:14:37Z

> Hi, can you provide some microbenchmarks to ensure that there is really a performance boost with this change?

Doesn't my small benchmark cover it?

Both changes are straightforward amendments that take place in a linear loop:

- Avoid calling a function twice when possible
- Replace one way of doing containment with another

I don't think there is anything else that would provide additional information.
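
The first of the two amendments can be illustrated with a minimal sketch. This is not the actual diff: `close_matches_sketch` is a hypothetical name, and the loop only mirrors the general shape of `get_close_matches()` using the public `SequenceMatcher` API, computing `ratio()` once and reusing the value instead of calling it a second time when building the result.

```python
from difflib import SequenceMatcher, get_close_matches


def close_matches_sketch(word, possibilities, cutoff=0.6):
    """Return (score, candidate) pairs scoring at least `cutoff` (illustrative only)."""
    matcher = SequenceMatcher()
    matcher.set_seq2(word)
    result = []
    for candidate in possibilities:
        matcher.set_seq1(candidate)
        # Cheap upper bounds first; skip candidates that cannot reach the cutoff.
        if matcher.real_quick_ratio() < cutoff or matcher.quick_ratio() < cutoff:
            continue
        score = matcher.ratio()  # computed once, then reused below
        if score >= cutoff:
            result.append((score, candidate))
    return result


if __name__ == "__main__":
    words = ["ape", "apple", "peach", "puppy"]
    pairs = sorted(close_matches_sketch("appel", words), reverse=True)
    print([candidate for _, candidate in pairs])
    print(get_close_matches("appel", words, n=len(words)))
```

Sorting the pairs in descending order should give the same candidates, in the same order, as the library function called with `n` large enough to return every match.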

aisk · 2025-12-30T02:28:07Z

Ah, sorry, I missed the message in the main thread.

hauntsaninja

Looks good, thank you! I added a news entry

tim-one

Have to say I don't care about the speed of get_close_matches(). As the docs say,

> See also function get_close_matches() in this module, which shows how simple code building on SequenceMatcher can be used to do useful work.

If it were written today, it would be a mere "recipe" instead (but the docs didn't have such things way back when).

That said, the code changes are good. Does no harm 😉. But I like switching to "in" more for clarity than for speed. There's no reason I can imagine for why "in" will always be faster. Capturing a bound method object (dict.__contains__) avoids the runtime expense of finding it anew each time membership is tested. The greater speed now for "in" comes from avoiding the Python-level function call to invoke the membership testing via the bound method object.

Tomorrow that advantage may vanish, or even reverse. That is one reason why, in general, micro-optimizations cost more in human time than they're worth.
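
The two membership idioms discussed above can be compared directly with a small sketch. The function names, the `table` dict, and the `keys` list are hypothetical and not taken from difflib; no timings are quoted here because, as noted, which form wins may change between interpreter versions.

```python
import timeit

# Hypothetical data: half of the probed keys are present in the dict.
table = {i: i for i in range(100)}
keys = list(range(200))


def count_with_bound_method(d, probe_keys):
    """Membership via a captured bound method (the older idiom)."""
    has = d.__contains__  # looked up once, called once per key
    return sum(1 for k in probe_keys if has(k))


def count_with_in(d, probe_keys):
    """Membership via the `in` operator."""
    return sum(1 for k in probe_keys if k in d)


if __name__ == "__main__":
    for fn in (count_with_bound_method, count_with_in):
        seconds = timeit.timeit(lambda: fn(table, keys), number=10_000)
        print(f"{fn.__name__}: {seconds:.3f} s")
```

Both functions return the same count, so any difference the script reports is purely the cost of how the membership test is invoked.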

minor difflib perf

0d03e64

bedevere-app Bot mentioned this pull request Dec 18, 2025

difflib.get_close_matches performance #142939

Closed

bedevere-app Bot added the awaiting review label Dec 18, 2025

johnslavik approved these changes Dec 18, 2025

bedevere-app Bot added awaiting core review and removed awaiting review labels Dec 18, 2025

blurb

fd8d393

hauntsaninja approved these changes Dec 30, 2025

bedevere-app Bot added awaiting merge and removed awaiting core review labels Dec 30, 2025

hauntsaninja enabled auto-merge (squash) December 30, 2025 05:12

tim-one approved these changes Dec 30, 2025

hauntsaninja merged commit 23ad9c5 into python:main Dec 30, 2025
87 of 89 checks passed

bedevere-app Bot removed the awaiting merge label Dec 30, 2025

dg-pb deleted the get_close_matches_perf branch December 30, 2025 15:45

thunder-coding pushed a commit to thunder-coding/cpython that referenced this pull request Feb 15, 2026

pythongh-142939: difflib.get_close_matches performance (python#142940)

ac3780b

hauntsaninja merged 2 commits into python:main from dg-pb:get_close_matches_perf
