Adds cmd_queue.monitor_manifest, which lets a queue's run state be serialized to disk and reloaded by an out-of-process monitor. Each queue subclass now has _build_monitor_manifest, _write_monitor_manifest, and _from_manifest hooks so monitor() and kill() can be invoked on a queue rebuilt from the manifest alone (no jobs resubmitted). This is groundwork for letting the monitor live in its own tmux session that survives the parent shell. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
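The round trip described above can be sketched as follows, assuming a JSON manifest. The helpers `write_manifest` and `load_manifest` and the field names (`backend`, `name`, `dpath`, `workers`) are hypothetical illustrations, not the actual `cmd_queue.monitor_manifest` schema.

```python
import json
import os
import tempfile
from pathlib import Path


def write_manifest(manifest: dict, fpath) -> Path:
    """Atomically write a manifest so a reader never sees a partial file."""
    fpath = Path(fpath)
    fpath.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp = tempfile.mkstemp(dir=fpath.parent, suffix='.tmp')
    with os.fdopen(fd, 'w') as file:
        json.dump(manifest, file, indent=2)
    os.replace(tmp, fpath)
    return fpath


def load_manifest(fpath) -> dict:
    """Read the manifest back; only state is restored, nothing is resubmitted."""
    return json.loads(Path(fpath).read_text())


# Hypothetical manifest content, for illustration only.
demo_manifest = {
    'backend': 'tmux',
    'name': 'demo-queue',
    'dpath': '/tmp/demo-queue',
    'workers': [{'session_id': 'demo-queue-0'}],
}
```

Writing through a temp file and `os.replace` is a design choice of this sketch: an out-of-process monitor that polls the manifest path would otherwise risk reading a half-written file.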
Each backend now persists its monitor manifest at the start of run(), which makes the queue reattachable from a separate process. Also preserves the user-supplied SlurmQueue name on self.name (previously dropped after queue_id was constructed) so that name-based monitor lookup works for both queue_id and the friendly name. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reattaches to a running queue by name (via the active-index that run() populates), by manifest path (--manifest), or by dpath. This is the entry point that step 3's tmux monitor backend will execute inside its own tmux session, and is also useful on its own when the original run() shell has been closed but workers are still active. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Both TMUXMultiQueue.monitor and SlurmQueue.monitor now accept onfail/onexit kwargs and perform the corresponding kill()/capture() themselves. run() simply forwards the args. This way the same finalization happens whether the monitor runs inline, in a separate tmux session (step 3), or via `cmd_queue monitor` from another shell. The semantics are preserved (onfail='kill' tears down idle tmux sessions only on a clean exit; on slurm it fires only on failure). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The new ``monitor`` kwarg on TMUXMultiQueue.run and SlurmQueue.run controls where the live status UI runs while the queue is executing. Default is ``'inline'`` (current behavior). With ``'tmux'``, the monitor is spawned in a detached tmux session via the new ``util_tmux.tmux.spawn_monitor_session`` helper, which invokes ``cmd_queue monitor --manifest=<path>`` under sys.executable. The parent process still blocks on a headless state poll, so block=True keeps its meaning even when the visible UI lives elsewhere — closing or detaching the tmux UI does not return control early. The tmux monitor session intentionally outlives the workers: workers self-clean on success, so the monitor session is what holds the final status table open for the user to read. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
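As a rough sketch of how such a spawn could be assembled: `tmux new-session -d -s <name> <command>` creates a detached session running a shell command. The function `build_monitor_spawn_argv` is hypothetical, and invoking the CLI as `<python> -m cmd_queue` is an assumption about how "under sys.executable" is realized; the real logic lives in `spawn_monitor_session`.

```python
import shlex
import sys


def build_monitor_spawn_argv(session_name, manifest_fpath, executable=sys.executable):
    """Build an argv that starts a detached tmux session running the monitor.

    Hypothetical helper: the `-m cmd_queue` invocation is an assumption.
    """
    inner = [executable, '-m', 'cmd_queue', 'monitor', f'--manifest={manifest_fpath}']
    # tmux runs its final argument through a shell, so quote the inner command.
    shell_command = shlex.join(inner)
    return ['tmux', 'new-session', '-d', '-s', session_name, shell_command]
```

The returned list could be passed to `subprocess.run(argv, check=True)`; building the argv separately keeps the quoting logic testable without a tmux server.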
Single-file example covering monitor='inline', 'tmux', and 'none'. Useful as both a hands-on demo for users and a smoke test that the new monitor backend works against a small real DAG. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The parent shell no longer pulls the user's tty into the spawned
monitor session. Instead, after spawning the monitor it prints a
prompt explaining how to attach (or switch-client when already
inside tmux) and a manual reattach hint, then enters a cbreak
keypress loop:
  [a]      attach (or switch-client) to the monitor session; the user can
           detach with the usual binding and we re-enter the loop.
  [q]/[d]  stop watching from this shell (queue keeps running).
Non-TTY stdin falls back to a silent polling loop, so the path
remains usable in scripts and CI.
Also drops the synthetic "press enter to close" prompt at the end of
the monitor pane in favour of `exec bash`, so the pane stays open
without needing user input but doesn't trap the user behind a read.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
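A minimal sketch of a cbreak keypress loop with a non-TTY fallback, assuming a POSIX terminal. The names `handle_key` and `watch_loop` and the `is_done` / `on_attach` callbacks are hypothetical; this is not the code behind `block_with_attach_prompt`.

```python
import os
import select
import sys
import termios
import time
import tty


def handle_key(key):
    """Map a keypress to an action name, or None for unbound keys."""
    if key == 'a':
        return 'attach'
    if key in ('q', 'd'):
        return 'stop'
    return None


def watch_loop(is_done, on_attach, poll_interval=0.5, stdin=sys.stdin):
    """Block until is_done() is true or the user stops watching."""
    if not stdin.isatty():
        # Non-TTY stdin (scripts, CI): poll silently, no key handling.
        while not is_done():
            time.sleep(poll_interval)
        return 'done'
    fd = stdin.fileno()
    saved = termios.tcgetattr(fd)
    try:
        tty.setcbreak(fd)
        while not is_done():
            ready, _, _ = select.select([fd], [], [], poll_interval)
            if not ready:
                continue
            action = handle_key(os.read(fd, 1).decode(errors='ignore'))
            if action == 'attach':
                # Restore the terminal while the user is inside tmux,
                # then re-enter cbreak mode once they detach.
                termios.tcsetattr(fd, termios.TCSADRAIN, saved)
                on_attach()
                tty.setcbreak(fd)
            elif action == 'stop':
                return 'stopped'
        return 'done'
    finally:
        # Always restore the original terminal settings.
        termios.tcsetattr(fd, termios.TCSADRAIN, saved)
```

Here `on_attach` is expected to block until the user detaches, for example by running `tmux attach-session` or `tmux switch-client` in a subprocess.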
…ple DAG
After block_with_attach_prompt and _headless_block_until_done exit, print a Rich-formatted summary line showing pass/fail/skip/total so the user gets a clear completion signal in their original shell. The tmux_example DAG is expanded to 11 jobs across 4 dependency levels (prep → proc → merge → final) with 4 workers and 2-8s sleeps, making parallel execution and dependency fan-in clearly visible in the monitor. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…bled
When the queue finishes with failures, list each failing job by name along with its log file path (if log=True was passed). If any failed job lacks a log on disk, emit a single hint that logs were not enabled, so the user knows where the gap is without seeing the same hint repeated per job. Hooked into the inline monitor path as well so all three monitor modes (inline, tmux, none) produce the same summary. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
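The "one hint, not one per job" behavior can be sketched like this. The function `summarize_failures` and the dict-based job records are hypothetical stand-ins; the real summary is Rich-formatted and reads from the queue's job objects.

```python
import os


def summarize_failures(failed_jobs):
    """Return summary lines: one per failed job, plus at most one log hint."""
    lines = []
    missing_log = False
    for job in failed_jobs:
        log_fpath = job.get('log_fpath')
        if log_fpath and os.path.exists(log_fpath):
            lines.append(f"{job['name']}: {log_fpath}")
        else:
            lines.append(f"{job['name']}: (no log)")
            missing_log = True
    if missing_log:
        # Emitted once, regardless of how many jobs lack a log.
        lines.append('hint: pass log=True when submitting jobs to capture logs')
    return lines
```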
Add --failures (default 1) to the tmux example so the failure summary and dependency-skip cascade are visible by default. The first N proc-* jobs exit non-zero, which causes their downstream merge/final jobs to be skipped. Pass --failures=0 for a clean run. Also enable log capture by default (--no-logs to disable) so the failed-job log paths printed by the new done-summary actually exist. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Render a 'Failed jobs' table directly below the per-worker status table while the queue is still running, so failures are visible the moment they happen rather than only in the post-run summary. Each row shows the job name and its log path (or '(no log)' when log capture wasn't enabled); a one-line note reminds the user to pass log=True if any failed jobs lack a log on disk. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…es failures
When the monitor was rehydrated via cmd_queue monitor --manifest=..., the reconstructed workers had empty .jobs lists, so the failed-jobs panel and post-run summary couldn't surface any failing job names, even when fail markers existed on disk. Serialize each job's name, log flag, and fail/log paths into the manifest, and rebuild lightweight SimpleNamespace stubs on each reconstructed worker. That is enough surface for the failure renderer; the full BashJob is not needed because the monitor never re-runs anything. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
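A sketch of the stub rebuild, assuming each manifest record is a plain dict. The record keys and the `rebuild_job_stubs` helper are hypothetical; only the use of `types.SimpleNamespace` is taken from the commit message.

```python
from types import SimpleNamespace


def rebuild_job_stubs(job_records):
    """Turn serialized job records into attribute-style stubs for rendering."""
    return [
        SimpleNamespace(
            name=record['name'],
            log=record.get('log', False),
            fail_fpath=record.get('fail_fpath'),
            log_fpath=record.get('log_fpath'),
        )
        for record in job_records
    ]


# Hypothetical records as they might appear in a manifest.
stubs = rebuild_job_stubs([
    {'name': 'proc-0', 'log': True,
     'fail_fpath': '/tmp/q/proc-0.fail', 'log_fpath': '/tmp/q/proc-0.log'},
    {'name': 'proc-1'},
])
```

A renderer that only reads `job.name`, `job.log`, and the two paths works the same on these stubs as on full job objects.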
… monitor
The bash boilerplate generated by BashJob.finalize_text ran an unconditional if-RC-0-on_pass-else-on_fail after the deps-check, so a skipped job (RC=126, on_skip already ran) ALSO had fail_fpath written and NUM_FAILED incremented. The status agg therefore showed a skipped+failed double-count.
Fix:
* Add a skip_fpath marker (printed by the on_skip block).
* Make the post-RC dispatch 3-way: on_pass for RC=0, no-op for RC=126, on_fail otherwise.
Monitor:
* Carry skip_fpath and dependency names in the rehydration manifest.
* Replace single failed panel with Failed + Skipped tables; skipped rows show a reason like dep X failed.
* Same split applied to the post-run summary.
Update tests/test_bash_variants.py: the prior test asserted the bug.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
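The 3-way dispatch is implemented in generated bash; the Python below only models its counting logic to show why the double-count disappears. `classify_return_code` and `tally` are hypothetical names, and RC=126 as the skip code is taken from the message above.

```python
from collections import Counter

SKIP_RC = 126  # return code used for a job skipped due to failed dependencies


def classify_return_code(rc):
    """Three-way classification: pass for 0, skip for 126, fail otherwise."""
    if rc == 0:
        return 'pass'
    if rc == SKIP_RC:
        return 'skip'
    return 'fail'


def tally(return_codes):
    """Count each job exactly once, so skipped jobs are never also failed."""
    return Counter(classify_return_code(rc) for rc in return_codes)
```

With the old two-way dispatch, every non-zero code including 126 would land in the failed bucket in addition to the skip handling.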
Asserts that Queue.submit(..., log=True) lands on BashJob.log and that the rendered command section gets the expected ``(<cmd>) 2>&1 | tee <log_fpath>`` wrapper. Also covers log=False and the current default (False). Catches a regression class where ``submit`` drops or shadows the ``log`` kwarg without other tests noticing — log files would just silently stop being written. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
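For reference, the wrapper shape being asserted can be sketched as a small string builder. `wrap_with_log` is a hypothetical helper, and quoting the log path with `shlex.quote` is an assumption of this sketch; the real rendering is done by BashJob.

```python
import shlex


def wrap_with_log(command, log_fpath=None):
    """Wrap a shell command so stdout and stderr are tee'd into a log file."""
    if log_fpath is None:
        # Logging disabled: the command is rendered unchanged.
        return command
    return f'({command}) 2>&1 | tee {shlex.quote(str(log_fpath))}'
```

A test in the spirit of this commit would check both branches, so a dropped `log` kwarg shows up as a missing `tee` in the rendered text.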
Press [a] from the live status table to attach (or switch-client) to a detached cmd_queue monitor tmux session running alongside; [q] stops watching while the queue keeps running. The side session is killed when the inline monitor exits. Reorganizes the monitor mode taxonomy so each value names a single intent: 'hybrid' (default) for inline+tmux, 'inline' for current-shell only, 'tmux' for detached-only, 'none' for headless block. 'hybrid' warns and falls back to inline when tmux is unavailable. Wires the [a] keybind into both the simple-rich (rich.Live + cbreak) and textual monitor paths, mirrors the new mode through the slurm backend, and adds plumbing-layer tests that mock the tmux helpers so the suite runs without a tmux server. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
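The mode taxonomy and the hybrid fallback can be sketched as a small resolver. `resolve_monitor_mode` and its `tmux_available` parameter are hypothetical; detecting tmux with `shutil.which` is an assumption about how availability is checked.

```python
import shutil
import warnings

VALID_MODES = ('hybrid', 'inline', 'tmux', 'none')


def resolve_monitor_mode(mode='hybrid', tmux_available=None):
    """Validate a monitor mode and apply the hybrid-to-inline fallback."""
    if mode not in VALID_MODES:
        raise ValueError(f'unknown monitor mode: {mode!r}')
    if tmux_available is None:
        # Assumption: tmux counts as available when it is on PATH.
        tmux_available = shutil.which('tmux') is not None
    if mode == 'hybrid' and not tmux_available:
        warnings.warn('tmux is unavailable; falling back to the inline monitor')
        return 'inline'
    return mode
```

Passing `tmux_available` explicitly mirrors the plumbing-layer tests mentioned above, which avoid needing a real tmux server.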
No description provided.