feat(procedure): per-stage concurrency cap (max_concurrent) + wait_for_completion for batch deploys by djodjo02130 · Pull Request #1487 · moghtech/komodo

djodjo02130 · 2026-06-19T00:21:25Z

Problem

Procedure stages run all their executions in parallel (join_all) with no limit, so a stage that fans out to many executions — a Batch* deploy across hundreds of resources, or a "run everything" procedure — launches them all at once and can saturate the host (CPU / RAM / network). There is no way to throttle this.

Two changes together give procedures a proper worker-pool / task-queue behavior.

1. `max_concurrent` per stage

Optional field on ProcedureStage:

- 0 (default) → unchanged: every execution runs at once (fully backwards compatible).
- n > 0 → the stage runs as a worker pool of size n: only n executions run at a time, the rest are queued and started as running ones finish.

Implementation swaps join_all(futures) for stream::iter(futures).buffer_unordered(limit). All executions still complete before the stage returns and the first error is still propagated.
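To make the cap semantics concrete, here is a minimal sketch of a bounded stage runner. It deliberately uses plain `std` threads and a shared queue rather than the `buffer_unordered` stream the PR actually uses, so it stays dependency-free; `Job`, `run_stage`, and `peak_running` are hypothetical names for illustration and are not part of Komodo.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for one stage execution.
type Job = Box<dyn FnOnce() -> Result<(), String> + Send>;

/// Runs every job, at most `max_concurrent` at a time (0 = no cap).
/// All jobs finish before this returns; the first error recorded is returned.
fn run_stage(jobs: Vec<Job>, max_concurrent: usize) -> Result<(), String> {
    let workers = if max_concurrent == 0 {
        jobs.len()
    } else {
        max_concurrent.min(jobs.len())
    };
    let queue = Arc::new(Mutex::new(VecDeque::from(jobs)));
    let first_err = Arc::new(Mutex::new(None::<String>));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let queue = Arc::clone(&queue);
            let first_err = Arc::clone(&first_err);
            thread::spawn(move || loop {
                // Take the next queued job; the lock is released before running it.
                let job = queue.lock().unwrap().pop_front();
                let Some(job) = job else { break };
                if let Err(e) = job() {
                    // Keep only the first error; later jobs still run.
                    first_err.lock().unwrap().get_or_insert(e);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let result = first_err.lock().unwrap().take();
    result.map_or(Ok(()), Err)
}

/// Demo helper: runs `total` short jobs and reports the highest number
/// observed running at the same moment.
fn peak_running(total: usize, max_concurrent: usize) -> usize {
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let jobs: Vec<Job> = (0..total)
        .map(|_| {
            let running = Arc::clone(&running);
            let peak = Arc::clone(&peak);
            Box::new(move || -> Result<(), String> {
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(20));
                running.fetch_sub(1, Ordering::SeqCst);
                Ok(())
            }) as Job
        })
        .collect();
    run_stage(jobs, max_concurrent).expect("no job fails in this demo");
    peak.load(Ordering::SeqCst)
}
```

Calling `peak_running(8, 3)` never reports more than 3 jobs in flight, while `peak_running(8, 0)` may report up to 8. The async version in the PR gets the same behavior without spawning threads, since `buffer_unordered(limit)` polls at most `limit` futures at a time.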

2. `wait_for_completion` on `Deploy`

max_concurrent caps concurrent executions. Most executions block until done, but a Deploy is fire-and-forget — it resolves as soon as the container is started, not when it exits. So for one-shot / batch containers the cap would throttle deploy issuance but not how many actually run.

wait_for_completion: bool (default false) makes the Deploy execution poll the container (InspectContainer) until it exits (24h safety cap) before resolving. Combined with max_concurrent, a stage of Deploy { wait_for_completion = true } becomes a real worker pool: max_concurrent = 10 ⇒ at most 10 containers running at once.

[[procedure.config.stage]]
name = "run-all-jobs"
max_concurrent = 10
executions = [
  { execution.type = "Deploy", execution.params.deployment = "job-01", execution.params.wait_for_completion = true },
  # ...
]
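The poll-until-exit behavior behind `wait_for_completion` can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `ContainerState`, `WaitOutcome`, and `wait_for_exit` are hypothetical names, the `inspect` closure stands in for the real `InspectContainer` call, and returning `TimedOut` when the safety cap is reached is an assumption, since the description does not say what happens at the 24h limit.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical, simplified view of what an inspect call might report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ContainerState {
    Running,
    Exited,
    Gone,
}

/// Hypothetical result of waiting on a container.
#[derive(Debug, PartialEq, Eq)]
enum WaitOutcome {
    Finished,
    TimedOut,
}

/// Polls `inspect` every `interval` until the container has exited or is
/// gone, giving up once `cap` has elapsed (24h in the PR description).
fn wait_for_exit<F>(mut inspect: F, interval: Duration, cap: Duration) -> WaitOutcome
where
    F: FnMut() -> ContainerState,
{
    let deadline = Instant::now() + cap;
    loop {
        match inspect() {
            ContainerState::Exited | ContainerState::Gone => return WaitOutcome::Finished,
            ContainerState::Running => {}
        }
        if Instant::now() >= deadline {
            // Assumption: stop waiting rather than block the stage forever.
            return WaitOutcome::TimedOut;
        }
        thread::sleep(interval);
    }
}
```

Because the execution does not resolve until `wait_for_exit` returns, each waiting Deploy keeps occupying one of the stage's `max_concurrent` slots, which is what turns the cap into a limit on running containers rather than on deploy issuance.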

Notes

- Both fields are opt-in and backwards compatible (serde defaults 0 / false).
- wait_for_completion targets Server deployments; it is ignored for Swarm services.
- Configurable via TOML and the procedure UI (stage "Max parallel" input + Deploy "Wait for completion" switch).
- TS types regenerated via typeshare; docs updated.

Validation

cargo build, cargo test, and cargo fmt --all -- --check all pass.

…stage

Procedure stages run all their executions in parallel (`join_all`) with no limit, so a stage that fans out to many executions (e.g. a Batch deploy across hundreds of resources, or a "run all" procedure) can saturate the host's CPU/RAM/network.

Add an optional `max_concurrent` field to `ProcedureStage`:

* `0` (default) -> unchanged: every execution runs at once (backwards compatible);
* `n > 0` -> the stage runs as a worker pool of size n: only n executions run at a time, the rest are queued and started as running ones finish.

Implementation: replace `join_all(futures)` with `stream::iter(futures).buffer_unordered(limit)` in the stage executor; all executions still complete before the stage returns and the first error is propagated.

- entity: `ProcedureStage.max_concurrent: I64` (serde default 0)
- core: bounded-concurrency stage executor + 3 built-in procedure literals
- types: regenerated TS (`types.ts`, `types.d.ts`)
- ui: "Max parallel" NumberInput on the stage editor + newStage factory
- docs: procedures.md field + example

…iner exits

A `Deploy` execution is fire-and-forget: it resolves as soon as the container is *started*, not when it *exits*. So a procedure stage's `max_concurrent` cap (added in the previous commit) throttles how fast deploys are *issued*, but not how many one-shot / batch containers actually *run* at once: they all end up started.

Add `wait_for_completion: bool` (default false) to `Deploy`. When true, after the container is started the core polls its state (`InspectContainer`) until it exits (or is gone), with a 24h safety cap, before the execution resolves.

Combined with `max_concurrent`, a stage of `Deploy { wait_for_completion = true }` executions then runs as a true worker pool: `max_concurrent = 10` => at most 10 containers running at a time, the rest queued until a slot frees.

- entity: `Deploy.wait_for_completion: bool` (#[serde(default)])
- core: poll-until-exit after deploy (Server target; ignored for Swarm services)
- types: regenerated via typeshare
- ui: "Wait for completion" switch on the Deploy execution
- docs: procedures.md note + example (max_concurrent + wait_for_completion)

djodjo02130 added 2 commits June 19, 2026 02:03

djodjo02130 changed the title ~~feat(procedure): cap parallel executions per stage with max_concurrent~~ feat(procedure): per-stage concurrency cap (max_concurrent) + wait_for_completion for batch deploys Jun 20, 2026

djodjo02130 wants to merge 2 commits into moghtech:main from djodjo02130:feat/procedure-stage-max-concurrent
