Skip to content

GH-137759:Limit _PyObject_HashFast to dicts keys, rename it, and mark it as Py_ALWAYS_INLINE#137828

Open
JasonMendoza2008 wants to merge 3 commits intopython:mainfrom
JasonMendoza2008:fix-issue-137759
Open

GH-137759:Limit _PyObject_HashFast to dicts keys, rename it, and mark it as Py_ALWAYS_INLINE#137828
JasonMendoza2008 wants to merge 3 commits intopython:mainfrom
JasonMendoza2008:fix-issue-137759

Conversation

@JasonMendoza2008
Copy link
Copy Markdown

@JasonMendoza2008 JasonMendoza2008 commented Aug 15, 2025

Context:

setobject.c used a mix of _PyObject_HashFast and PyObject_Hash. The consensus1 was that _PyObject_HashFast should only really be used for dict keys (string-only sets are not so omnipresent). This PR uses the standard PyObject_Hash without the fast path (making set[Any] faster and set[str] slower2), renames _PyObject_HashFast since its name was misleading and marked it as Py_ALWAYS_INLINE to force inlining.

Pretty sure this doesn't require a NEWS entry?

Footnotes

  1. There were discussions in the same issue around removing _PyObject_HashFast completely and adding the fast path for strings (and even potentially integers) to PyObject_Hash but there was no consensus so I'm not including it for now.

  2. In this comment , I run pyperformance on two versions of setobject.c, one using HashFast everywhere and one using the standard Hash everywhere. None of these correspond to the current implementation of setobject.c since the current implementation uses a mix of both. Obviously in all cases, dictionaries use HashFast.

@python-cla-bot
Copy link
Copy Markdown

python-cla-bot Bot commented Aug 15, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@bedevere-app

This comment was marked as resolved.

@bedevere-app

This comment was marked as resolved.

@ZeroIntensity
Copy link
Copy Markdown
Member

making set[Any] faster and set[str] slower

I'm not totally convinced that this is true. Detecting the fast-path for strings is just one pointer comparison, which really won't have a noticeable performance impact. I agree that we want _PyObject_HashFast/_PyObject_HashDictKey for inlining purposes, but I don't see why the fast path has to be limited to it.

@JasonMendoza2008
Copy link
Copy Markdown
Author

JasonMendoza2008 commented Aug 15, 2025

making set[Any] faster and set[str] slower

I'm not totally convinced that this is true. Detecting the fast-path for strings is just one pointer comparison, which really won't have a noticeable performance impact. I agree that we want _PyObject_HashFast/_PyObject_HashDictKey for inlining purposes, but I don't see why the fast path has to be limited to it.

I'm fine with adding a fast-path for strings and integers (but then do we even need _PyObject_HashFast/_PyObject_HashDictKey , we could just put everything in PyObject_Hash1). I didn't since there was no consensus.

Footnotes

  1. However, we probably can't inline that function since it's bigger? I'm not sure... I'm no expert in what's too big to be inlined.

@ZeroIntensity
Copy link
Copy Markdown
Member

but then do we even need HashFast, we could just put everything in PyObject_Hash

Yeah, because even if there's no speedup there, it's just simpler maintenance (in the future, we might want to make other assumptions about a dictionary key's hash that improve performance).

@JasonMendoza2008
Copy link
Copy Markdown
Author

JasonMendoza2008 commented Aug 15, 2025

I'm not totally convinced that this is true. Detecting the fast-path for strings is just one pointer comparison, which really won't have a noticeable performance impact

Playing devil's advocate: if we don't detect it, we're a a function pointer dereference and call away from getting the hash anyway, right? The hash is not recomputed without that fast-path since it's stored (for strings, but for integers it's just an AND operation if I remember correctly)?

@ZeroIntensity
Copy link
Copy Markdown
Member

A function pointer call has a lot more to do than just some reads, but maybe the overhead isn't noticeable?

@rhettinger rhettinger removed their request for review August 16, 2025 04:17
@JasonMendoza2008
Copy link
Copy Markdown
Author

JasonMendoza2008 commented Aug 16, 2025

A function pointer call has a lot more to do than just some reads, but maybe the overhead isn't noticeable?

The best thing to answer this without guessing is probably to compile both (or all three, if we also include a fast path for integers) and run pyperformance.

@JasonMendoza2008
Copy link
Copy Markdown
Author

JasonMendoza2008 commented Aug 25, 2025

making set[Any] faster and set[str] slower

I'm not totally convinced that this is true. Detecting the fast-path for strings is just one pointer comparison, which really won't have a noticeable performance impact. I agree that we want _PyObject_HashFast/_PyObject_HashDictKey for inlining purposes, but I don't see why the fast path has to be limited to it.

Here is a comparison using pyperformance for what it's worth.

Pyperformance version: 1.11.0
Report on Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.39
Number of logical CPUs: 24
Run 1 image
Run 2 (using the -r option which is supposed to get more accurate results by running everything more) image

Both versions were compiled with ./configure --enable-optimizations --with-lto

@vstinner
Copy link
Copy Markdown
Member

vstinner commented Dec 8, 2025

The set type has a fast-path for str hash for 20 years, since commit 9f1a679:

commit 9f1a6796eb83a2884df5fd93487634e46d8830a7
Author: Raymond Hettinger <python@rcn.com>
Date:   Sun Jul 31 01:16:36 2005 +0000

    Revised the set() and frozenset() implementaion to use its own internal
    data structure instead of using dictionaries.  Reduces memory consumption
    by 1/3 and provides modest speed-ups for most set operations.

(...)
+       if (PyString_CheckExact(key)) {
+               hash = ((PyStringObject *)key)->ob_shash;
+               if (hash == -1)
+                       hash = PyObject_Hash(key);
+       } else {
+               hash = PyObject_Hash(key);
+               if (hash == -1)
+                       return -1;
+       }
(...)

The consensus was that _PyObject_HashFast should only really be used for dict keys (string-only sets are not so omnipresent).

set implementation is very close to the dict implementation. So similar dict optimizations can be used in set.

marked it as Py_ALWAYS_INLINE to force inlining.

I dislike this macro, I prefer to rely on the compiler to decide if a function should be inlined or not. The compiler is smart enough to decide if a function should be inlined depending on the pressure on register allocation and other stuffs.

@vstinner
Copy link
Copy Markdown
Member

vstinner commented Dec 8, 2025

I tried to remove the optimization for str:

diff --git a/Objects/setobject.c b/Objects/setobject.c
index 85f4d7d4031..5c8139a7c5f 100644
--- a/Objects/setobject.c
+++ b/Objects/setobject.c
@@ -409,7 +409,7 @@ set_discard_entry(PySetObject *so, PyObject *key, Py_hash_t hash)
static int
set_add_key(PySetObject *so, PyObject *key)
{
-    Py_hash_t hash = _PyObject_HashFast(key);
+    Py_hash_t hash = PyObject_Hash(key);
    if (hash == -1) {
        set_unhashable_type(key);
        return -1;

Results on Linux with CPU isolation and Python built with gcc -O3:

Benchmark _PyObject_HashFast PyObject_Hash
set(range(256)) 5.58 us 5.16 us: 1.08x faster
set(map(chr, range(256))) 6.33 us 6.74 us: 1.06x slower
Geometric mean (ref) 1.01x faster

So yes, set(integers) is 8% faster, but set(strings) becomes 6% slower. The question is how common it is to create a set of strings? In case of doubt, I prefer to leave the code as it is.


Benchmark script:

import pyperf
runner = pyperf.Runner()
runner.timeit('set(range(256))',
    setup='data = tuple(range(256)); Set=set',
    stmt='Set(data)')
runner.timeit('set(map(chr, range(256)))',
    setup='data = tuple(map(chr, range(256))); Set=set',
    stmt='Set(data)')

@JasonMendoza2008
Copy link
Copy Markdown
Author

JasonMendoza2008 commented Dec 9, 2025

I tried to remove the optimization for str:

diff --git a/Objects/setobject.c b/Objects/setobject.c

index 85f4d7d4031..5c8139a7c5f 100644

--- a/Objects/setobject.c

+++ b/Objects/setobject.c

@@ -409,7 +409,7 @@ set_discard_entry(PySetObject *so, PyObject *key, Py_hash_t hash)

static int

set_add_key(PySetObject *so, PyObject *key)

{

-    Py_hash_t hash = _PyObject_HashFast(key);

+    Py_hash_t hash = PyObject_Hash(key);

    if (hash == -1) {

        set_unhashable_type(key);

        return -1;

Results on Linux with CPU isolation and Python built with gcc -O3:

| Benchmark | _PyObject_HashFast | PyObject_Hash |

|---------------------------|:------------------:|:---------------------:|

| set(range(256)) | 5.58 us | 5.16 us: 1.08x faster |

| set(map(chr, range(256))) | 6.33 us | 6.74 us: 1.06x slower |

| Geometric mean | (ref) | 1.01x faster |

So yes, set(integers) is 8% faster, but set(strings) becomes 6% slower. The question is how common it is to create a set of strings? In case of doubt, I prefer to leave the code as it is.


Benchmark script:

import pyperf

runner = pyperf.Runner()

runner.timeit('set(range(256))',

    setup='data = tuple(range(256)); Set=set',

    stmt='Set(data)')

runner.timeit('set(map(chr, range(256)))',

    setup='data = tuple(map(chr, range(256))); Set=set',

    stmt='Set(data)')

The problem is that set sometimes has the fast path and sometimes doesn't have it. So we can't really leave the code as it is, we have to either include it everywhere or nowhere right?

Nevermind I now realize I misunderstood what you meant by leaving the code as it is. Leaving the fast path everywhere and fixing where it was forgotten seem fine to me but other people (cf. issue) seemed to disagree so I'll let you guys decide.

@vstinner
Copy link
Copy Markdown
Member

vstinner commented Dec 9, 2025

Using _PyObject_HashFast() in set_intersection() sounds reasonable to me.

@github-actions
Copy link
Copy Markdown

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions Bot added the stale Stale PR or inactive for long period of time. label Apr 28, 2026
@JasonMendoza2008
Copy link
Copy Markdown
Author

Using _PyObject_HashFast() in set_intersection() sounds reasonable to me.

Should someone that has the authority to do so do that? Not sure how to move forward this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting review skip news stale Stale PR or inactive for long period of time.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants