fix: Handle thread/process affinity when freeing objects #285
Merged
scouten-adobe
approved these changes
Jun 15, 2026
andrewhalle
approved these changes
Jun 15, 2026
andrewhalle
left a comment
Contributor
I thought about this a lot and discussed it with Tania, and I think this is a good solution. Calling fork() when there are multiple active threads triggers a DeprecationWarning on Python 3.12+ (see discussion for details), so callers will probably be forced to deal with this correctly at some point. In the meantime, this change makes these functions fork-safe, and in the usual case the child process's memory space will quickly be replaced by an exec anyway, or else the child will exit.
Context
The C2PA native library that the Python SDK relies on (through its C FFI) uses a process-global pointer registry to track and free allocated pointers. That registry is guarded by a lock. When a Python process forks, the child can inherit that lock in a locked state with no surviving thread to release it, so the next free in the child deadlocks. Some Python stacks fork in exactly this way, which sets up this case.
Changes
This PR stamps each wrapper with the PID of the process that created it. Only the owning process can therefore free the underlying native memory when the garbage collector runs (whenever the GC decides to run). A forked child skips the native free entirely: those objects belong to the parent, so the child should not even try to free them. Objects the child creates itself are freed normally, since they are "owned" by the right process/PID.
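A minimal sketch of the PID-stamping idea, for illustration only. `NativeHandle`, `close`, and the `free_fn` callback are hypothetical names standing in for the SDK's wrappers and the native free call; the actual implementation in this PR may differ.

```python
import os


class NativeHandle:
    """Hypothetical wrapper around a native pointer (not the SDK's real class)."""

    def __init__(self, ptr, free_fn):
        self._ptr = ptr
        self._free_fn = free_fn        # stand-in for the native free call
        self._owner_pid = os.getpid()  # stamp the process that created the wrapper

    def close(self):
        """Free the pointer if this process owns it; return True if a free happened."""
        ptr, self._ptr = self._ptr, None
        if ptr is None:
            return False               # already released or disowned
        if os.getpid() != self._owner_pid:
            return False               # inherited across fork(): skip the native free
        self._free_fn(ptr)
        return True

    def __del__(self):
        # Deleters run whenever the garbage collector decides; never raise from here.
        try:
            self.close()
        except Exception:
            pass
```

In a forked child, `os.getpid()` no longer matches the stamped PID, so `close()` drops the reference without ever reaching the native registry or its lock. A wrapper created in the child is stamped with the child's PID and is freed as usual.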
Additional details
Some Python stacks call os.fork() under the hood. When a process calls fork(), the child starts as a near-exact copy of the parent's memory. Threads are the exception: only the thread that called fork() exists in the child. Every other thread is simply gone and can no longer do anything in the child process.
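The thread behaviour described above can be observed with a small POSIX-only experiment. This is an illustrative sketch, not code from the PR; `threads_in_child` is a hypothetical helper, and the `DeprecationWarning` that Python 3.12+ emits for forking with live threads is silenced on purpose.

```python
import os
import threading
import warnings


def threads_in_child():
    """Fork and return the thread count that the child process reports."""
    read_fd, write_fd = os.pipe()
    with warnings.catch_warnings():
        # Python 3.12+ warns about fork() in a multi-threaded process.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: report how many threads exist here, then exit immediately.
        try:
            os.close(read_fd)
            os.write(write_fd, str(threading.active_count()).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        count = int(reader.read())
    os.waitpid(pid, 0)
    return count


stop = threading.Event()
worker = threading.Thread(target=stop.wait, daemon=True)
worker.start()                       # the parent now has at least two threads
parent_count = threading.active_count()
child_count = threads_in_child()     # the child should report a single thread
stop.set()
worker.join()
```

The parent counts at least two threads, while the child should see only the thread that called `os.fork()`.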
Recall that the native library keeps the pointer registry behind a mutex (c2pa_c_ffi, get_registry, oncelock). If a thread was holding that mutex at the moment of the fork, the child inherits it locked, with no surviving thread to release it, because the thread that held it does not exist in the child. The next time the child wants to clean up a native object/handle/pointer, which can happen whenever Python's garbage collector runs and calls the deleters, it blocks on that mutex forever.
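The same failure mode can be reproduced in pure Python, with a `threading.Lock` standing in for the native registry mutex. This is an analogy only (the real lock lives in the native library), it is POSIX-only, and `registry_lock` and `child_can_lock` are hypothetical names. A timeout is used so that the child reports the problem instead of hanging.

```python
import os
import threading
import warnings

registry_lock = threading.Lock()  # stand-in for the native registry mutex


def child_can_lock(timeout=0.5):
    """Fork while another thread holds registry_lock; report if the child got it."""
    held = threading.Event()
    release = threading.Event()

    def holder():
        with registry_lock:
            held.set()      # tell the main thread the lock is now held
            release.wait()  # keep holding it across the fork

    thread = threading.Thread(target=holder)
    thread.start()
    held.wait()
    read_fd, write_fd = os.pipe()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: the holder thread does not exist here, yet the lock is still locked.
        try:
            os.close(read_fd)
            got = registry_lock.acquire(timeout=timeout)
            os.write(write_fd, b"1" if got else b"0")
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        answer = reader.read()
    os.waitpid(pid, 0)
    release.set()           # let the parent-side holder finish normally
    thread.join()
    return answer == b"1"


child_acquired = child_can_lock()
```

The child should fail to acquire the lock within the timeout. Without a timeout, which is the situation a deleter calling into the native free is in, it would block forever. The parent is unaffected, because its holder thread still exists and releases the lock.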
A native registry-side fix in the C FFI would mean a pthread_atfork child handler that correctly resets the lock, but that is POSIX-only. It would also only clear the deadlock, not decide ownership: which inherited objects the child may touch is something only the language binding has the context to know, and that is what the PID stamping in this change decides.
Verifications
Benchmark results for scenarios capable of encountering the issue (all new scenarios added by this PR):
Note: A scenario that has a report ran to completion and did not deadlock.
Checklist
TO DO items (or similar) have been entered as GitHub issues and the link to that issue has been included in a comment.