Skip to content

Prevent tight loop if Plex app failed to launch#1164

Open
mdeneen wants to merge 1 commit intohome-assistant-libs:masterfrom
mdeneen:plex_assertion
Open

Prevent tight loop if Plex app failed to launch#1164
mdeneen wants to merge 1 commit intohome-assistant-libs:masterfrom
mdeneen:plex_assertion

Conversation

@mdeneen
Copy link
Copy Markdown

@mdeneen mdeneen commented Feb 21, 2026

I've been tracking this one down for a while in Home Assistant.

Occasionally, when playing Plex media on a Google Home Mini, Home Assistant would get into a very tight loop (see home-assistant/core#161641) if there was a failure launching the Plex app on the device. This patch fixes that behavior by logging that it has happened, setting the event, and returning instead of raising an exception.

Prior to this patch, once this exception happened, Home Assistant would get into a tight loop and continuously log errors until all disk space was consumed.

I'm not super familiar with how pychromecast works, but understood enough to make it not fail. :-)

Please review and apply if appropriate.

I've been tracking this one down for a while in Home Assistant.

Occasionally, when playing Plex media on a Google Home Mini, Home Assistant would
get into a very tight loop (see home-assistant/core#161641)
if there was a failure launching the Plex app on the device.  This patch fixes that
behavior by logging that it has happened, setting the event, and returning instead
of raising an exception.

Prior to this patch, once this exception happened, Home Assistant would get into a
tight loop and continuously log errors until all disk space was consumed.

I'm not super familiar with how pychromecast works, but understood enough to make
it not fail.  :-)

Please review and apply if appropriate.

Signed-off-by: Mark Deneen <mdeneen@gmail.com>
Copy link
Copy Markdown
Collaborator

@elupus elupus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem right. Hone assistant should not catch that exception and retry indefinitely.

@mdeneen
Copy link
Copy Markdown
Author

mdeneen commented Feb 21, 2026

I can assure you that it does.

@mdeneen
Copy link
Copy Markdown
Author

mdeneen commented Feb 21, 2026

log.gz

Here is a log of it happening. The unhandled exception happens and Home Assistant logs until the device runs out of disk space.

@mdeneen
Copy link
Copy Markdown
Author

mdeneen commented Feb 21, 2026

This is the tight loop, and _run_once() throws an exception:

            try:
                if self._run_once() == 1:
                    break
            except Exception:  # pylint: disable=broad-except
                self._force_recon = True
                self.logger.exception(
                    "[%s(%s):%s] Unhandled exception in worker thread, attempting reconnect",
                    self.fn or "",
                    self.host,
                    self.port,
                )
        """Receive from the socket and handle data."""
        # pylint: disable=too-many-branches, too-many-statements, too-many-return-statements
                            
        try:            
            if not self._check_connection():
                return 0
        except ChromecastConnectionError:
            return 1

_check_connection() also throws an exception.

        if reset:
            self.receiver_controller.disconnected()
            for channel in self._open_channels:
                self.disconnect_channel(channel)
            self._report_connection_status(
                ConnectionStatus(
                    CONNECTION_STATUS_LOST, NetworkAddress(self.host, self.port), None
                )
            )
            try:
                self.initialize_connection()
            except ChromecastConnectionError:
                self.stop.set()
            return False
        return True

initialize_connection() also throws an exception. Neither true nor false will be returned, since RequestFailed is not a ChromecastConnectionError.

Once you get deeper into the code, it gets a lot more complex since I don't really understand how the controller subclass setup works yet. But I can tell that we never get to a point where there is any sort of delay or select/epoll/whatever python uses.

Perhaps my pull request corrects this behavior in an accidental manner, but there is 100% something wrong with the way it works today in regards to exception handling.

It is also worth noting that the connection retry will never succeed. The state is such that we will never escape this loop.

Here is the behavior with my change:

2026-02-20 16:05:00.643 WARNING (Thread-10) [pychromecast.controllers] PlexController.play_media: unable to launch app, device unavailable?
2026-02-20 16:05:07.803 ERROR (Thread-10) [pychromecast.socket_client] [Craft Room speaker(192.168.9.87):8009] Failed to connect to service HostServiceInfo(host='192.168.9.87', port=8009), retrying in 5.0s

And then it continued to function.

Again, I'm not saying that what I did was right. But I am saying that raising a RequestFailed gets things in an inoperable state.

@elupus
Copy link
Copy Markdown
Collaborator

elupus commented Feb 21, 2026

There probably is something wrong. We just need to fix it in a way that doesnt hide the error. It should bubble up.

@mdeneen
Copy link
Copy Markdown
Author

mdeneen commented Apr 3, 2026

I looked into this a little bit more, and I think that I've finally figured it out.

When the exception is raised, the callback is still present in SocketClient's _request_callbacks dictionary. Here's one of the first things that it does:

        for callback_function in self._request_callbacks.values():
            callback_function(False, None)

The callback function is this:

            if not msg_sent:
                raise RequestFailed("PlexController.play_media")
            try:
                self._send_start_play(media, **kwargs)
            finally:
                self.play_media_event.set()

So it immediately raises an exception since msg_sent is False from SocketClient trying to clean things up.

Would clearing the dictionary in the exception handler for run() be acceptable?

            try:
                if self._run_once() == 1:
                    break
            except Exception:  # pylint: disable=broad-except
                self._force_recon = True
                self._request_callbacks = {}
                self.logger.exception(
                    "[%s(%s):%s] Unhandled exception in worker thread, attempting reconnect",
                    self.fn or "",
                    self.host,
                    self.port,
                )

Surely the code with the comment of "Make sure nobody is blocking" was intentional, but that's a bad path to take with Plex.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants