Prevent tight loop if Plex app failed to launch by mdeneen · Pull Request #1164 · home-assistant-libs/pychromecast

mdeneen · 2026-02-21T16:40:32Z

I've been tracking this one down for a while in Home Assistant.

Occasionally, when playing Plex media on a Google Home Mini, Home Assistant would get into a very tight loop (see home-assistant/core#161641) if there was a failure launching the Plex app on the device. This patch fixes that behavior by logging that it has happened, setting the event, and returning instead of raising an exception.

Prior to this patch, once this exception happened, Home Assistant would get into a tight loop and continuously log errors until all disk space was consumed.

I'm not super familiar with how pychromecast works, but understood enough to make it not fail. :-)

Please review and apply if appropriate.

I've been tracking this one down for a while in Home Assistant. Occasionally, when playing Plex media on a Google Home Mini, Home Assistant would get into a very tight loop (see home-assistant/core#161641) if there was a failure launching the Plex app on the device. This patch fixes that behavior by logging that it has happened, setting the event, and returning instead of raising an exception. Prior to this patch, once this exception happened, Home Assistant would get into a tight loop and continuously log errors until all disk space was consumed. I'm not super familiar with how pychromecast works, but understood enough to make it not fail. :-) Please review and apply if appropriate. Signed-off-by: Mark Deneen <mdeneen@gmail.com>

elupus

This does not seem right. Hone assistant should not catch that exception and retry indefinitely.

mdeneen · 2026-02-21T16:44:49Z

I can assure you that it does.

mdeneen · 2026-02-21T16:51:45Z

log.gz

Here is a log of it happening. The unhandled exception happens and Home Assistant logs until the device runs out of disk space.

mdeneen · 2026-02-21T17:39:51Z

This is the tight loop, and _run_once() throws an exception:

            try:
                if self._run_once() == 1:
                    break
            except Exception:  # pylint: disable=broad-except
                self._force_recon = True
                self.logger.exception(
                    "[%s(%s):%s] Unhandled exception in worker thread, attempting reconnect",
                    self.fn or "",
                    self.host,
                    self.port,
                )

        """Receive from the socket and handle data."""
        # pylint: disable=too-many-branches, too-many-statements, too-many-return-statements
                            
        try:            
            if not self._check_connection():
                return 0
        except ChromecastConnectionError:
            return 1

_check_connection() also throws an exception.

        if reset:
            self.receiver_controller.disconnected()
            for channel in self._open_channels:
                self.disconnect_channel(channel)
            self._report_connection_status(
                ConnectionStatus(
                    CONNECTION_STATUS_LOST, NetworkAddress(self.host, self.port), None
                )
            )
            try:
                self.initialize_connection()
            except ChromecastConnectionError:
                self.stop.set()
            return False
        return True

initialize_connection() also throws an exception. Neither true nor false will be returned, since RequestFailed is not a ChromecastConnectionError.

Once you get deeper into the code, it gets a lot more complex since I don't really understand how the controller subclass setup works yet. But I can tell that we never get to a point where there is any sort of delay or select/epoll/whatever python uses.

Perhaps my pull request corrects this behavior in an accidental manner, but there is 100% something wrong with the way it works today in regards to exception handling.

It is also worth noting that the connection retry will never succeed. The state is such that we will never escape this loop.

Here is the behavior with my change:

2026-02-20 16:05:00.643 WARNING (Thread-10) [pychromecast.controllers] PlexController.play_media: unable to launch app, device unavailable?
2026-02-20 16:05:07.803 ERROR (Thread-10) [pychromecast.socket_client] [Craft Room speaker(192.168.9.87):8009] Failed to connect to service HostServiceInfo(host='192.168.9.87', port=8009), retrying in 5.0s

And then it continued to function.

Again, I'm not saying that what I did was right. But I am saying that raising a RequestFailed gets things in an inoperable state.

elupus · 2026-02-21T18:28:22Z

There probably is something wrong. We just need to fix it in a way that doesnt hide the error. It should bubble up.

mdeneen · 2026-04-03T20:06:36Z

I looked into this a little bit more, and I think that I've finally figured it out.

When the exception is raised, the callback is still present in SocketClient's _request_callbacks dictionary. Here's one of the first things that it does:

        for callback_function in self._request_callbacks.values():
            callback_function(False, None)

The callback function is this:

            if not msg_sent:
                raise RequestFailed("PlexController.play_media")
            try:
                self._send_start_play(media, **kwargs)
            finally:
                self.play_media_event.set()

So it immediately raises an exception since msg_sent is False from SocketClient trying to clean things up.

Would clearing the dictionary in the exception handler for run() be acceptable?

            try:
                if self._run_once() == 1:
                    break
            except Exception:  # pylint: disable=broad-except
                self._force_recon = True
                self._request_callbacks = {}
                self.logger.exception(
                    "[%s(%s):%s] Unhandled exception in worker thread, attempting reconnect",
                    self.fn or "",
                    self.host,
                    self.port,
                )

Surely the code with the comment of "Make sure nobody is blocking" was intentional, but that's a bad path to take with Plex.

elupus requested changes Feb 21, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prevent tight loop if Plex app failed to launch#1164

Prevent tight loop if Plex app failed to launch#1164
mdeneen wants to merge 1 commit intohome-assistant-libs:masterfrom
mdeneen:plex_assertion

mdeneen commented Feb 21, 2026

Uh oh!

elupus left a comment

Uh oh!

mdeneen commented Feb 21, 2026

Uh oh!

mdeneen commented Feb 21, 2026

Uh oh!

mdeneen commented Feb 21, 2026 •

edited

Loading

Uh oh!

elupus commented Feb 21, 2026

Uh oh!

mdeneen commented Apr 3, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mdeneen commented Feb 21, 2026

Uh oh!

elupus left a comment

Choose a reason for hiding this comment

Uh oh!

mdeneen commented Feb 21, 2026

Uh oh!

mdeneen commented Feb 21, 2026

Uh oh!

mdeneen commented Feb 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

elupus commented Feb 21, 2026

Uh oh!

mdeneen commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mdeneen commented Feb 21, 2026 •

edited

Loading

mdeneen commented Apr 3, 2026 •

edited

Loading