Release the GIL across libmaxminddb lookup calls #362
efroemling wants to merge 1 commit into maxmind:main from
Conversation
get_record() previously held the GIL across MMDB_lookup_sockaddr and MMDB_get_entry_data_list. These walk the read-only mmap and touch no Python state, but if the relevant pages aren't resident (cold cache, slow disk, memory pressure) a single page-fault inside the walk stalls every Python thread for the duration of the disk wait, not just the calling thread. Wrap both calls in Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS. The pthread_rwlock_t read lock is held throughout, and the module already declares Py_MOD_GIL_NOT_USED, so this just brings the GIL-build fast path in line with the existing free-threaded guarantees.
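As a sketch of the ordering described above (the rwlock helper and field names here are hypothetical, not the extension's actual identifiers):

```c
/* Hypothetical sketch of the wrapped lookup; lock/field names are illustrative. */
rwlock_rdlock(&self->lock);       /* read lock taken while still holding the GIL */
Py_BEGIN_ALLOW_THREADS            /* drop the GIL: the walk touches no Python state */
result = MMDB_lookup_sockaddr(self->mmdb, ip_address, &mmdb_error);
Py_END_ALLOW_THREADS              /* reacquire before any Python C-API use */
/* ...decode entry data and build Python objects under the GIL... */
rwlock_unlock(&self->lock);
```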
Code Review
This pull request aims to improve concurrency by releasing the Python GIL during MaxMind DB lookups and data retrieval, preventing the interpreter from stalling during I/O or page-ins. However, the review identifies a critical safety issue: in 'GIL-only' mode (the default for standard Python builds), the internal rwlock is disabled, making the GIL the only protection against concurrent resource deallocation. Releasing the GIL in this state creates a race condition with Reader.close() that could lead to a use-after-free crash. The reviewer recommends wrapping the GIL release blocks in #ifndef MAXMINDDB_USE_GIL_ONLY to ensure safety across different build configurations.
Py_BEGIN_ALLOW_THREADS
result = MMDB_lookup_sockaddr(mmdb, ip_address, &mmdb_error);
Py_END_ALLOW_THREADS
Releasing the GIL here is unsafe when the extension is compiled in 'GIL-only' mode (the default for standard Python builds, as defined on lines 24-26). In this mode, the internal rwlock is a no-op, and the GIL is the only mechanism protecting the mmdb structure from being freed by a concurrent Reader.close() call in another thread. If the GIL is released, a race condition occurs that can lead to a use-after-free crash.
To fix this safely, wrap the block in #ifndef MAXMINDDB_USE_GIL_ONLY. To enable this optimization for standard Python, you should also update the locking logic at the top of the file to use real locks (pthreads/Windows) even when the GIL is present.
#ifndef MAXMINDDB_USE_GIL_ONLY
Py_BEGIN_ALLOW_THREADS
#endif
result = MMDB_lookup_sockaddr(mmdb, ip_address, &mmdb_error);
#ifndef MAXMINDDB_USE_GIL_ONLY
Py_END_ALLOW_THREADS
#endif

Py_BEGIN_ALLOW_THREADS
status = MMDB_get_entry_data_list(&result.entry, &entry_data_list);
Py_END_ALLOW_THREADS
Similar to the previous lookup call, releasing the GIL here is unsafe in 'GIL-only' mode because the rwlock does not provide actual synchronization. A concurrent call to Reader.close() could free the mmdb structure while MMDB_get_entry_data_list is accessing the mmap.
You should wrap this block in #ifndef MAXMINDDB_USE_GIL_ONLY or ensure that real locks are used even in standard Python builds.
#ifndef MAXMINDDB_USE_GIL_ONLY
Py_BEGIN_ALLOW_THREADS
#endif
status = MMDB_get_entry_data_list(&result.entry, &entry_data_list);
#ifndef MAXMINDDB_USE_GIL_ONLY
Py_END_ALLOW_THREADS
#endif
@gemini-code-assist was right, and the CI failure on … bore that out. The simple fix (…):

Slow-path parallelism is preserved (~10x), and the single-thread warm-path cost is ~7% (split roughly 4% GIL handoff + 3% rwlock). But 8-thread warm-path throughput drops from 168k to 99k ops/s - GIL ping-pong plus rwlock cacheline contention. Asking every user to pay that for a benefit that only kicks in during disk stalls is a hard sell.

Going to close this for now. Might revisit from a different angle (opt-in build flag, or just …).
Thank you for the detailed follow-up and for sharing those performance benchmarks. It is completely understandable to prioritize the stability and performance of the common case over the edge-case stalls, especially given the non-trivial overhead introduced by the rwlock contention in multi-threaded scenarios. Closing the PR is a reasonable decision given these findings. I appreciate the thorough investigation into the trade-offs here; it provides valuable context for anyone who might revisit this problem in the future.
Summary
I've been running into some occasional long Python stalls across all threads and tracked them down to geoip lookups. I've found that a few `Py_BEGIN_ALLOW_THREADS`/`Py_END_ALLOW_THREADS` additions can prevent such stalls and greatly increase parallelism.

The lookup path - `Reader.get()` and `Reader.get_with_prefix_len()`, which back geoip2's `city()`/`country()`/etc. - holds the GIL across `MMDB_lookup_sockaddr` and `MMDB_get_entry_data_list`. Both walk the read-only mmap and touch no Python state, but if the relevant pages aren't resident - cold cache after start-up, slow EBS, memory pressure on a long-running process - a single page fault inside the walk stalls every Python thread for the duration of the disk wait, not just the calling thread.

This patch wraps both calls in `Py_BEGIN_ALLOW_THREADS`/`Py_END_ALLOW_THREADS`. No change to fast-path behavior; on the slow path, only the calling thread waits.

Why it's safe

- `Py_MOD_GIL_NOT_USED` is already declared (free-threaded build), so the maintainers have committed that this code is safe to run with no GIL.
- The `pthread_rwlock_t` read lock is acquired before the wrap and released after, so a concurrent close/destroy can't pull `mmdb` out from under us.
- The wrapped calls touch only the read-only mmap, the `MMDB_s` (immutable post-open), and stack locals. No Python C-API calls inside.

Measured effect
Microbenchmark with a 10ms `usleep` inserted into `get_record` (a stand-in for a page-fault stall), 10 threads × 100 lookups, comparing the baseline against the `Py_BEGIN_ALLOW_THREADS` variant.

For warm-cache lookups the per-call cost is dominated by tree-walk work, so the GIL release/acquire overhead is in the noise - bench numbers without the artificial sleep are identical between variants.
Notes
Another `MMDB_get_entry_data_list` call exists inside the iterator's `__iter__` path (around line 856 in 3.1.1). The same reasoning would apply, but it's a separate code path and I left it out of this PR to keep the change tightly scoped to the hot lookup path. Happy to follow up if you'd like it covered.