Prevent tight loop if Plex app failed to launch#1164
Prevent tight loop if Plex app failed to launch#1164mdeneen wants to merge 1 commit intohome-assistant-libs:masterfrom
Conversation
I've been tracking this one down for a while in Home Assistant. Occasionally, when playing Plex media on a Google Home Mini, Home Assistant would get into a very tight loop (see home-assistant/core#161641) if there was a failure launching the Plex app on the device. This patch fixes that behavior by logging that it has happened, setting the event, and returning instead of raising an exception. Prior to this patch, once this exception happened, Home Assistant would get into a tight loop and continuously log errors until all disk space was consumed. I'm not super familiar with how pychromecast works, but understood enough to make it not fail. :-) Please review and apply if appropriate. Signed-off-by: Mark Deneen <mdeneen@gmail.com>
elupus
left a comment
There was a problem hiding this comment.
This does not seem right. Hone assistant should not catch that exception and retry indefinitely.
|
I can assure you that it does. |
|
Here is a log of it happening. The unhandled exception happens and Home Assistant logs until the device runs out of disk space. |
|
This is the tight loop, and _run_once() throws an exception: _check_connection() also throws an exception. initialize_connection() also throws an exception. Neither true nor false will be returned, since RequestFailed is not a ChromecastConnectionError. Once you get deeper into the code, it gets a lot more complex since I don't really understand how the controller subclass setup works yet. But I can tell that we never get to a point where there is any sort of delay or select/epoll/whatever python uses. Perhaps my pull request corrects this behavior in an accidental manner, but there is 100% something wrong with the way it works today in regards to exception handling. It is also worth noting that the connection retry will never succeed. The state is such that we will never escape this loop. Here is the behavior with my change: And then it continued to function. Again, I'm not saying that what I did was right. But I am saying that raising a RequestFailed gets things in an inoperable state. |
|
There probably is something wrong. We just need to fix it in a way that doesnt hide the error. It should bubble up. |
|
I looked into this a little bit more, and I think that I've finally figured it out. When the exception is raised, the callback is still present in SocketClient's _request_callbacks dictionary. Here's one of the first things that it does: The callback function is this: So it immediately raises an exception since msg_sent is False from SocketClient trying to clean things up. Would clearing the dictionary in the exception handler for run() be acceptable? Surely the code with the comment of "Make sure nobody is blocking" was intentional, but that's a bad path to take with Plex. |
I've been tracking this one down for a while in Home Assistant.
Occasionally, when playing Plex media on a Google Home Mini, Home Assistant would get into a very tight loop (see home-assistant/core#161641) if there was a failure launching the Plex app on the device. This patch fixes that behavior by logging that it has happened, setting the event, and returning instead of raising an exception.
Prior to this patch, once this exception happened, Home Assistant would get into a tight loop and continuously log errors until all disk space was consumed.
I'm not super familiar with how pychromecast works, but understood enough to make it not fail. :-)
Please review and apply if appropriate.