gh-140267: Fixed thread leak in multiprocessing.Manager accepter thread#141734
gh-140267: Fixed thread leak in multiprocessing.Manager accepter thread#141734prithviraj-chaudhuri wants to merge 9 commits intopython:mainfrom
Conversation
…od has the stop_event.is_set() == True.
|
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
ashm-dev
left a comment
There was a problem hiding this comment.
The changes to test_manager.py are incorrect: we shouldn't mask a process timeout (SIGTERM) as a success. The join() call is causing a deadlock because the thread is likely still blocked. Please fix the hang instead of changing the test expectations.
Can you tell me which one of the above would be a better solution or what drawbacks you see for the above options? |
|
@ashm-dev thanks for the approval. Some tests are still failing since the tests are keeping around files and that causes env changes. Trying to ensure all tests pass before merging. Would you be able to point me to where these files are being generated from? Or if there is a method to call that would clean them when the acceptor thread stops accepting. |
|
|
||
| def accepter(self): | ||
| while True: | ||
| while True and not self.stop_event.is_set(): |
There was a problem hiding this comment.
the leftover True and is a no-op and should be removed.
| process.current_process()._manager_server = self | ||
| accepter = None | ||
| try: | ||
| accepter = threading.Thread(target=self.accepter) |
There was a problem hiding this comment.
move these 2 lines outside of the try: to replace your = None initialization.
| sys.stdout = sys.__stdout__ | ||
| sys.stderr = sys.__stderr__ | ||
| for t in self._handler_threads: | ||
| t.join() |
There was a problem hiding this comment.
what prevents the handle_request threads from blocking indefinitely?
| t.join() | ||
| accepter.join() | ||
| self.listener.close() | ||
| sys.exit(0) |
There was a problem hiding this comment.
Fundamentally, the accepter and handle_request threads were all being set .daemon = True as they were never intended to be controlled and cleaned up. they'll all die as soon as the server process exits. attempting to join all of the threads is perhaps ideal but only works if they can be guaranteed not to be blocked on anything. Handler threads do blocking IO to a socket connection which is not guaranteed to have been closed because it could come from an uncooperative peer. without rewriting them to do non-blocking IO and check a stop_event themselves, joining them can't be reliable and could lead to anything making a connection to the socket being able to block cleanup.
This is likely why you see test failures for leftover env files with pymp- prefixed tmp directories or Windows \\pipe paths still laying around such as Warning -- test.test_concurrent_futures.test_process_pool leaked temporary files (1): pymp-cdk1jdmh with this change. Those are created (by Server for its self.listener via the Listener use of default_family and call to arbitrary_address(family)) to hold the socket that the manager process uses. If the manager process is blocked instead of exiting cleanly it may still exist.
|
Closing per #140267 (comment) as I don't think there is a real issue worth working on here. If you want to turn the stand alone multiprocessing.manager process into something using event driven non-blocking IO in managers.Server.serve_forever (perhaps it becomes modern and |
|
@gpshead Thanks for the feedback. Will look into using event driven non-blocking IO in managers.Server.serve_forever. |
accepter.join()when before the serve_forever() method exits.0or-signal.SIGTERM