From 984228a5b809ccf5995da740a938cefa76271124 Mon Sep 17 00:00:00 2001 From: masonxhuang Date: Sat, 27 Jun 2026 08:45:15 +0800 Subject: [PATCH 1/5] Harden i18n translation recovery workflow --- docs/.i18n/translation-workflow.md | 142 +++++++++++++++++------------ 1 file changed, 86 insertions(+), 56 deletions(-) diff --git a/docs/.i18n/translation-workflow.md b/docs/.i18n/translation-workflow.md index 090736c286..dd7db0d366 100644 --- a/docs/.i18n/translation-workflow.md +++ b/docs/.i18n/translation-workflow.md @@ -5,41 +5,47 @@ Internal note for the docs publish pipeline. This file is under `docs/.i18n`, wh ## Goals - English docs deploy quickly after every source docs sync. -- Locale translation does not run for every hot `main` commit. -- Translation work is debounced so a burst of docs commits becomes one translation wave. -- Locale jobs translate only pages whose source hash changed since the last successful locale output. -- Successful locale outputs are committed together, even if one or more locale jobs fail. -- A weekly reconciliation reruns every locale/page path to repair missed or flaky translations. +- Incremental translation does not run for every hot `main` commit. +- Full reconciliation is a recovery path, not a release path. +- Full reconciliation runs automatically only on the weekly schedule, or manually when an operator starts it. +- A failed full run can be retried for one locale without rerunning every locale. +- Provider/key failures stop before locale fan-out. +- A tiny canary sample must succeed before follow-up full batches start. +- Locale translation failures are visible as failed GitHub jobs, even when diagnostic artifacts were uploaded. -## Event flow +## Event Flow 1. `openclaw/openclaw` syncs English docs into `openclaw/docs`. 2. GitHub Pages deploys English/source changes immediately from the sync commit. -3. `Translate All` is triggered by the sync commit, release dispatch, manual dispatch, or weekly schedule. -4. 
The coordinator waits a cooldown window before starting translation. -5. After the cooldown, the coordinator reads the current `origin/main` source metadata. -6. If a newer docs sync arrived during cooldown, the coordinator uses the newer source state. -7. Per-locale translation jobs run in parallel with `fail-fast: false`. -8. Each locale job uploads an artifact for the requested source SHA. -9. The finalizer downloads available artifacts, ignores stale or failed payloads, and pushes one aggregate i18n commit. -10. After the aggregate commit lands, the finalizer dispatches the Pages deploy once. -11. The Pages workflow dispatches live smoke after deployment. +3. `Translate Incremental` debounces source-doc pushes and translates stale locale pages. +4. `Translate Full` runs only from the weekly schedule or `workflow_dispatch`. +5. Both workflows read the current `origin/main` source metadata after debounce. +6. Both workflows run the shared OpenAI provider/key preflight before any locale job. +7. Full translation plans one canary locale sample and bounded follow-up batches of up to three locales. +8. If the canary fails, follow-up full batches do not start. +9. Full locale jobs validate, commit, and dispatch deploy independently after that locale succeeds. +10. Incremental locale jobs still upload artifacts for the aggregate finalizer. +11. Failed locale jobs upload failure metadata before failing the job, so artifacts and CI status agree. -## Debounce policy +## Trigger Policy -The coordinator waits 1 hour after a docs sync or release dispatch, then re-reads `origin/main`. +`Translate Full` deliberately does not listen to release dispatches or glossary pushes. Release and glossary changes converge through the weekly full run. For urgent recovery, manually run `Translate Full` with `target_locale=all` or a single locale slug. -The default cooldown is controlled by the publish repo variable `OPENCLAW_DOCS_TRANSLATION_COOLDOWN_SECONDS`, which defaults to `3600`. 
Repository dispatch callers may override it with `client_payload.cooldown_seconds`, and manual runs may set `cooldown_seconds`. +Top-level full workflow concurrency is serialized with `cancel-in-progress: false`. A new full run waits for a running full run instead of cancelling it. -If `.openclaw-sync/source.json` changed during the wait, it waits again from the newer state. If `main` keeps moving, the wait is capped by `OPENCLAW_DOCS_TRANSLATION_MAX_WAIT_SECONDS`, which defaults to the cooldown value. The newest observed state is translated after the cap. +Manual `target_locale` accepts `all` or one locale slug such as `fr`, `ja-jp`, or `zh-cn`. A single-locale rerun uses that locale for the canary sample, then schedules only that locale in the first full batch. -Manual and weekly runs do not wait by default. +## Debounce Policy -## Incremental translation +The coordinator waits after push-triggered incremental runs. The default cooldown is controlled by `OPENCLAW_DOCS_TRANSLATION_COOLDOWN_SECONDS`, which defaults to `3600`. Manual and weekly runs do not wait by default unless the manual input sets `cooldown_seconds`. + +If `.openclaw-sync/source.json` changed during a wait, the workflow waits again from the newer state. If `main` keeps moving, the wait is capped by `OPENCLAW_DOCS_TRANSLATION_MAX_WAIT_SECONDS`, which defaults to the cooldown value. + +## Incremental Translation Each translated page stores `x-i18n.source_hash`. Locale jobs compare the current English page hash with the stored locale hash. -Normal runs translate only: +Normal incremental runs translate only: - missing locale pages - locale pages with stale `x-i18n.source_hash` @@ -47,14 +53,32 @@ Normal runs translate only: Internal files under `docs/.i18n/**` are not translation inputs. Push-triggered runs that only change internal i18n files skip before the locale matrix. -If a locale job fails, its artifact is marked failed and carries no payload. The finalizer still commits successful locales. 
The failed locale remains stale and is picked up by the next incremental run because its source hashes still do not match. +Incremental translation uses the provider/key preflight before expanding the locale matrix. If the key is invalid, model access is denied, or quota is exhausted, the preflight job fails and locale jobs are not scheduled. + +## Full Translation + +Full mode forces every source page for the selected locale into the pending manifest instead of relying on changed source hashes. + +The weekly all-locale plan is: + +```text +provider/key preflight + -> canary locale sample + -> batch 1, up to 3 locales + -> batch 2, up to 3 locales + -> ... + -> status summary +``` + +The canary is a deterministic one-document sample from the first selected locale. It uploads a `canary` artifact, applies it through the same artifact validation path as locale commits, and runs the aggregate docs check without committing or publishing. If it fails translation or validation, later batches are skipped. If it succeeds, the selected locales, including the canary locale, run in normal full batches. If a later locale fails, already successful locales remain committed and published, and the failed locale can be rerun manually. -## Artifact contract +## Artifact Contract -Each locale job uploads one artifact named with locale and source SHA: +Each locale job uploads one artifact named with role, locale, shard, and source SHA: ```text -i18n-zh-cn- +i18n-zh-cn-s0of1- +i18n-canary-zh-cn-s0of1- ``` Artifact contents: @@ -67,45 +91,51 @@ payload/docs//** payload/docs/.i18n/.tm.jsonl ``` -`metadata.json` includes the locale, locale slug, source SHA, pending count, changed count, and any failure reason. The finalizer rejects artifacts whose `source_sha` does not match the current `.openclaw-sync/source.json`. +`metadata.json` includes the artifact role, locale, locale slug, source SHA, pending count, changed count, deleted count, step outcomes, and failure reason. 
A failed translation writes an empty payload contract, uploads the artifact, then fails the job. Full status summaries count canary artifacts separately and do not treat a canary artifact as a successful locale refresh. -The source repo release workflow dispatches one `translate-all-release` event. The coordinator still accepts old per-locale release events for compatibility, but those are only a fallback. +## Commit And Deploy Policy -## Aggregate commit +Full locale jobs are the commit and publish unit. After a locale succeeds, a separate write-permission commit job downloads that locale artifact, applies it to latest `main`, runs `npm run docs:check`, commits only `docs//**` and `docs/.i18n/.tm.jsonl`, pushes with rebase/retry under the shared locale finalizer concurrency, and dispatches `pages.yml`. -The finalizer owns the only locale push in the normal path. +Artifact application is intentionally conservative when source metadata has moved. The apply step uses latest `main`, copies only payload pages whose embedded `x-i18n.source_hash` still matches the current source page, and skips stale translation memory. If `main` moves again between apply/validation and push, the commit script skips that locale commit so the next manual or weekly run can re-evaluate from the new base. -Commit message: +Incremental translation keeps the aggregate finalizer. The finalizer downloads available artifacts, applies valid successful payloads, rejects stale or failed artifacts, runs `npm run docs:check`, pushes one aggregate i18n commit, dispatches `pages.yml`, and fails when required locale artifacts are missing or failed. -```text -chore(i18n): refresh translations -``` - -The commit may contain a partial locale set. The job summary lists applied locales, locales with no changes, missing or failed locales, stale artifacts, and invalid artifacts. - -## Weekly reconciliation +## Automatic Verification -The weekly run uses `full` mode. 
It forces a full reconciliation across every locale and every source page instead of relying only on changed source hashes. +The script test suite validates the recovery controls: -Glossary changes also force full reconciliation because glossary guidance can affect pages whose source hashes did not change. +- `Translate Full` has no release dispatch trigger. +- glossary pushes do not trigger `Translate Full`. +- weekly and manual triggers remain present. +- manual single-locale planning selects only that locale. +- full canary manifests keep the total pending count but translate only a bounded sample. +- provider/key preflight classifies invalid key, model access, and quota failures. +- canary success gates follow-up full batches. +- full worker fan-out stays within the small-batch budget. +- full status summaries report locale success, failure, skip reason, and artifact counts from metadata. +- failed artifact metadata produces visible GitHub output status. +- locale artifact application still rejects missing, failed, stale, and invalid artifacts. -Expected behavior: +Run locally: -- regenerate or verify every locale page -- prune stale locale pages -- refresh translation memory as needed -- still use parallel locale jobs -- still commit one aggregate result -- still tolerate individual locale failures - -The weekly run is the repair mechanism for LLM flakiness, partial failures, and missed incremental updates. - -## Deployment policy - -English deploys from source sync commits. +```bash +python -m unittest .github/scripts/i18n/tests/test_i18n_scripts.py +python .github/scripts/i18n/workflow_shell_check.py --check-bash +python .github/scripts/i18n/budget_check.py +``` -Translations deploy after the aggregate i18n commit. The finalizer dispatches GitHub Pages once because GitHub suppresses normal push-triggered workflow runs from `GITHUB_TOKEN` commits. 
The Pages workflow dispatches live smoke after deployment so the smoke test checks the deployed site instead of racing the deploy. +## Manual Verification -A hot docs day should produce many fast English deploys, but only a small number of locale deploys. +Before merging workflow recovery changes: -If external deploy providers such as Mintlify watch every push, the aggregate i18n commit is the load reducer. Avoid restoring per-locale pushes to `main`. +1. Trigger `Translate Full` with a deliberately invalid translation key in a test context and confirm the provider preflight fails before locale jobs start. +2. Trigger or simulate a canary failure and confirm follow-up full batches are skipped. +3. Trigger `Translate Full` with `target_locale=fr` and confirm only `fr` runs. +4. Trigger a small manual full run and confirm a successful locale commits independently and dispatches `pages.yml`. +5. Observe or simulate a later locale failure and confirm earlier successful locale commits remain published. +6. Rerun only the failed locale with `target_locale=` and confirm it commits independently. +7. Confirm release events do not start `Translate Full`. +8. Confirm glossary-only changes do not start `Translate Full`. +9. Check GitHub Actions summaries for selected locales, canary/batch status, artifact counts, and explicit failures. +10. Confirm the final diff from any locale commit contains only `docs//**` and `docs/.i18n/.tm.jsonl`. 
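Reviewer note on the plan described in `docs/.i18n/translation-workflow.md` above: the canary-then-batches ordering can be sketched as below. This is a minimal illustration, not the repository's `plan_full` script. The names `FullPlan`, `plan_full_run`, and `batches_to_run` are hypothetical, and matching a slug by lowercasing the locale name is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FullPlan:
    """Hypothetical plan shape: one canary locale plus bounded follow-up batches."""

    canary_locale: str
    batches: list[list[str]]


def plan_full_run(locales: list[str], target_locale: str = "all", batch_size: int = 3) -> FullPlan:
    """Select locales, pick the first as canary, and chunk into batches.

    Assumption: a slug such as ``zh-cn`` matches the locale ``zh-CN`` by
    case-insensitive comparison.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    if target_locale == "all":
        selected = list(locales)
    else:
        selected = [name for name in locales if name.lower() == target_locale.lower()]
    if not selected:
        raise ValueError(f"no locale selected for target_locale={target_locale!r}")
    # The canary locale also runs in the normal batches after the gate passes.
    batches = [selected[i : i + batch_size] for i in range(0, len(selected), batch_size)]
    return FullPlan(canary_locale=selected[0], batches=batches)


def batches_to_run(plan: FullPlan, canary_succeeded: bool) -> list[list[str]]:
    """Follow-up batches start only when the canary succeeded."""
    return plan.batches if canary_succeeded else []
```

With five locales and the default batch size, this yields two batches; a single-locale rerun yields one batch containing only that locale, which also serves as the canary.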
From cd3288542741feeaa1d5cb4c2875be5fd36da60a Mon Sep 17 00:00:00 2001 From: masonxhuang Date: Sat, 27 Jun 2026 13:33:19 +0800 Subject: [PATCH 2/5] fix(i18n): verify canary R2 publish before full batches --- .../scripts/i18n/build_pending_manifest.py | 18 +- .../scripts/i18n/commit_locale_artifact.py | 83 +++++- .github/scripts/i18n/dispatch_r2_pages.py | 248 ++++++++++++++++++ .github/scripts/i18n/package_artifact.py | 4 +- .../scripts/i18n/tests/test_i18n_scripts.py | 243 ++++++++++++++++- .github/workflows/translate-all.yml | 24 +- .../workflows/translate-finalize-reusable.yml | 22 +- .../workflows/translate-locale-reusable.yml | 90 ++++++- docs/.i18n/translation-workflow.md | 10 +- 9 files changed, 704 insertions(+), 38 deletions(-) create mode 100644 .github/scripts/i18n/dispatch_r2_pages.py diff --git a/.github/scripts/i18n/build_pending_manifest.py b/.github/scripts/i18n/build_pending_manifest.py index 31f8af71cd..a7edd98000 100644 --- a/.github/scripts/i18n/build_pending_manifest.py +++ b/.github/scripts/i18n/build_pending_manifest.py @@ -13,7 +13,7 @@ Environment: LOCALE, LOCALE_SLUG, MODE, SHARD_INDEX, SHARD_TOTAL, optional - PENDING_LIMIT, and GITHUB_OUTPUT. + PENDING_LIMIT, CANARY_SOURCE_PATH, and GITHUB_OUTPUT. Outputs: Writes docs-i18n--sof.txt under --openclaw-sync-dir. @@ -93,6 +93,7 @@ def build_pending_manifest( shard_index: int, shard_total: int, pending_limit: int = 0, + canary_source_path: str = "", ) -> PendingResult: locale_dirs = {path.name for path in docs_root.iterdir() if is_locale_dir(path)} pending_path = openclaw_sync_dir / f"docs-i18n-{locale_slug}-s{shard_index}of{shard_total}.txt" @@ -118,9 +119,15 @@ def build_pending_manifest( pending_files = sorted(pending_files) shard_files = [file for index, file in enumerate(pending_files) if index % shard_total == shard_index] if pending_limit: - # Full canary intentionally translates a tiny deterministic sample; the - # follow-up locale batch still runs the complete manifest after the gate. 
- shard_files = shard_files[:pending_limit] + canary_source = (docs_root / canary_source_path).resolve() if canary_source_path else None + if canary_source in shard_files: + # Prefer a user-visible page with known glossary coverage so the + # canary proves both translation and the deployed page content. + shard_files = [canary_source] + else: + # Full canary publishes a real one-page probe before expensive batches, + # so choose the smallest deterministic sample to cap token and review cost. + shard_files = sorted(shard_files, key=lambda file: (file.stat().st_size, file.as_posix()))[:pending_limit] pending_path.parent.mkdir(parents=True, exist_ok=True) pending_path.write_text("\n".join(str(file) for file in shard_files) + ("\n" if shard_files else ""), encoding="utf-8") @@ -156,7 +163,7 @@ def parse_args() -> argparse.Namespace: Examples: LOCALE=fr LOCALE_SLUG=fr MODE=incremental SHARD_INDEX=0 SHARD_TOTAL=1 python .github/scripts/i18n/build_pending_manifest.py - LOCALE=zh-CN LOCALE_SLUG=zh-cn MODE=full SHARD_INDEX=1 SHARD_TOTAL=4 PENDING_LIMIT=1 python .github/scripts/i18n/build_pending_manifest.py + LOCALE=zh-CN LOCALE_SLUG=zh-cn MODE=full SHARD_INDEX=0 SHARD_TOTAL=1 PENDING_LIMIT=1 CANARY_SOURCE_PATH=channels/line.md python .github/scripts/i18n/build_pending_manifest.py """, ) parser.add_argument("--docs-root", default="docs", type=Path) @@ -177,6 +184,7 @@ def main() -> None: shard_index=shard_index, shard_total=shard_total, pending_limit=pending_limit, + canary_source_path=os.environ.get("CANARY_SOURCE_PATH", ""), ) append_output(result) diff --git a/.github/scripts/i18n/commit_locale_artifact.py b/.github/scripts/i18n/commit_locale_artifact.py index 16837f6c77..7d9a5e348d 100644 --- a/.github/scripts/i18n/commit_locale_artifact.py +++ b/.github/scripts/i18n/commit_locale_artifact.py @@ -5,13 +5,15 @@ This script owns the per-locale commit/push control plane for full translation recovery. 
It expects locale files to have already been applied and validated, then commits only docs/ and that locale's translation - memory. It retries rebase/push conflicts while guarding against source - metadata moving after artifact application. + memory. Canary commits are additionally restricted to the sampled page and + locale translation memory. It retries rebase/push conflicts while guarding + against source metadata moving after artifact application. Parameters: --locale: Locale directory to commit. Default: LOCALE environment variable. --base-source-sha: Source SHA observed after artifact application. Default: BASE_SOURCE_SHA environment variable. + --artifact-dir: Downloaded locale artifact directory. Default: ARTIFACT_DIR. --attempts: Push/rebase retry count. Default: 5. Outputs: @@ -22,7 +24,8 @@ Examples: LOCALE=fr BASE_SOURCE_SHA=abc python .github/scripts/i18n/commit_locale_artifact.py - python .github/scripts/i18n/commit_locale_artifact.py --locale zh-CN --base-source-sha abc --attempts 3 + ARTIFACT_ROLE=canary ARTIFACT_DIR=.openclaw-sync/i18n-artifacts/zh-cn-s0of1 LOCALE=zh-CN BASE_SOURCE_SHA=abc python .github/scripts/i18n/commit_locale_artifact.py + python .github/scripts/i18n/commit_locale_artifact.py --locale zh-CN --base-source-sha abc --artifact-dir .openclaw-sync/i18n-artifacts/zh-cn-s0of1 --attempts 3 """ from __future__ import annotations @@ -31,6 +34,7 @@ import json import os import subprocess +import sys import time from pathlib import Path @@ -84,12 +88,69 @@ def has_locale_changes(locale: str) -> bool: return bool(result.stdout.strip()) -def commit_locale(locale: str, base_source_sha: str, attempts: int) -> bool: +def pending_allowed(locale: str, locale_slug: str, shard_index: str, shard_total: str) -> set[str]: + pending_file = Path(".openclaw-sync") / f"docs-i18n-{locale_slug}-s{shard_index}of{shard_total}.txt" + allowed = {f"docs/.i18n/{locale}.tm.jsonl"} + if not pending_file.exists(): + raise SystemExit(f"missing canary pending 
manifest: {pending_file}") + docs_root = Path("docs").resolve() + for line in pending_file.read_text(encoding="utf-8").splitlines(): + if not line.strip(): + continue + source = Path(line.strip()).resolve() + rel = source.relative_to(docs_root).as_posix() + allowed.add(f"docs/{locale}/{rel}") + return allowed + + +def artifact_allowed(locale: str, artifact_dir: str) -> set[str]: + artifact = Path(artifact_dir) + if not artifact.exists(): + raise SystemExit(f"missing canary artifact directory: {artifact}") + deleted = [line for line in (artifact / "deleted-files.txt").read_text(encoding="utf-8").splitlines() if line.strip()] + if deleted: + raise SystemExit(f"canary artifact unexpectedly included deleted paths: {', '.join(deleted)}") + allowed = set() + for line in (artifact / "changed-files.txt").read_text(encoding="utf-8").splitlines(): + if not line.strip(): + continue + if line == f"docs/.i18n/{locale}.tm.jsonl" or line.startswith(f"docs/{locale}/"): + allowed.add(line) + continue + raise SystemExit(f"canary artifact changed path outside locale scope: {line}") + return allowed + + +def enforce_canary_scope(locale: str, allowed: set[str]) -> None: + status = git_stdout(["status", "--porcelain", "--untracked-files=all", "--", f"docs/{locale}", f"docs/.i18n/{locale}.tm.jsonl"]) + changed = {line[3:] for line in status.splitlines() if line.strip()} + bad = sorted(path for path in changed if path not in allowed) + if bad: + print("Canary commit touched paths outside the sampled page contract:", file=sys.stderr) + for path in bad: + print(path, file=sys.stderr) + raise SystemExit(1) + + +def commit_locale( + locale: str, + base_source_sha: str, + attempts: int, + artifact_role: str = "", + locale_slug: str = "", + shard_index: str = "0", + shard_total: str = "1", + artifact_dir: str = "", +) -> bool: if not has_locale_changes(locale): print(f"No {locale} translation changes.") write_output("committed", "false") return False + if artifact_role == "canary": + allowed = 
artifact_allowed(locale, artifact_dir) if artifact_dir else pending_allowed(locale, locale_slug or locale, shard_index, shard_total) + enforce_canary_scope(locale, allowed) + git_stdout(["config", "user.name", "openclaw-docs-i18n[bot]"]) git_stdout(["config", "user.email", "openclaw-docs-i18n[bot]@users.noreply.github.com"]) git_stdout(["add", f"docs/{locale}", f"docs/.i18n/{locale}.tm.jsonl"]) @@ -121,11 +182,12 @@ def parse_args() -> argparse.Namespace: Examples: LOCALE=fr BASE_SOURCE_SHA=abc python .github/scripts/i18n/commit_locale_artifact.py - python .github/scripts/i18n/commit_locale_artifact.py --locale zh-CN --base-source-sha abc --attempts 3 + python .github/scripts/i18n/commit_locale_artifact.py --locale zh-CN --base-source-sha abc --artifact-dir .openclaw-sync/i18n-artifacts/zh-cn-s0of1 --attempts 3 """, ) parser.add_argument("--locale", default=os.environ.get("LOCALE", "")) parser.add_argument("--base-source-sha", default=os.environ.get("BASE_SOURCE_SHA", "")) + parser.add_argument("--artifact-dir", default=os.environ.get("ARTIFACT_DIR", "")) parser.add_argument("--attempts", default=5, type=int) return parser.parse_args() @@ -138,7 +200,16 @@ def main() -> None: raise SystemExit("missing base source sha: pass --base-source-sha or set BASE_SOURCE_SHA") if args.attempts < 1: raise SystemExit("attempts must be >= 1") - commit_locale(args.locale, args.base_source_sha, args.attempts) + commit_locale( + args.locale, + args.base_source_sha, + args.attempts, + artifact_role=os.environ.get("ARTIFACT_ROLE", ""), + locale_slug=os.environ.get("LOCALE_SLUG", args.locale), + shard_index=os.environ.get("SHARD_INDEX", "0"), + shard_total=os.environ.get("SHARD_TOTAL", "1"), + artifact_dir=args.artifact_dir, + ) if __name__ == "__main__": diff --git a/.github/scripts/i18n/dispatch_r2_pages.py b/.github/scripts/i18n/dispatch_r2_pages.py new file mode 100644 index 0000000000..77a2415810 --- /dev/null +++ b/.github/scripts/i18n/dispatch_r2_pages.py @@ -0,0 +1,248 @@ 
+#!/usr/bin/env python3 +"""Dispatch R2 Pages and wait for the upload result. + +Definition: + This script is the translation workflow deploy gate. It starts the R2 Pages + workflow through GitHub CLI and waits for the dispatched run to finish so a + translation canary proves content upload, not just workflow scheduling. + +Parameters: + --workflow: Workflow file to dispatch. Default: r2-pages.yml. + --ref: Git ref to dispatch. Default: main. + --repo: GitHub repository. Default: GITHUB_REPOSITORY. + --artifact-scope: R2 artifact scope input. Default: full. + --force-upload: Force R2 object audit/upload input. Default: true. + --live-url: Optional live URL to verify after upload. + --expect-h1: Expected h1 text for live URL verification. + --timeout-seconds: Maximum wait. Default: 3600. + --poll-seconds: Poll interval. Default: 10. + +Outputs: + Prints the dispatched run URL, final conclusion, and optional live smoke + status. Exits non-zero when the workflow cannot be dispatched, cannot be + found, times out, finishes with a non-success conclusion, or the live smoke + check does not converge. + +Examples: + GH_TOKEN=... 
GITHUB_REPOSITORY=openclaw/docs python .github/scripts/i18n/dispatch_r2_pages.py + python .github/scripts/i18n/dispatch_r2_pages.py --repo openclaw/docs --ref main --timeout-seconds 1800 +""" + +from __future__ import annotations + +import argparse +import html +import json +import os +import re +import subprocess +import time +import urllib.error +import urllib.request +from datetime import UTC, datetime + + +RUN_URL_RE = re.compile(r"/actions/runs/([0-9]+)") +H1_RE = re.compile(r"<h1[^>]*>(.*?)</h1>", re.I | re.S) + + +def run(args: list[str], check: bool = True) -> subprocess.CompletedProcess[str]: + result = subprocess.run(args, check=False, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE) + if check and result.returncode != 0: + detail = result.stderr.strip() or result.stdout.strip() or f"{' '.join(args)} failed" + raise SystemExit(detail) + return result + + +def parse_time(value: str) -> datetime: + return datetime.fromisoformat(value.replace("Z", "+00:00")) + + +def parse_run_id(output: str) -> str: + match = RUN_URL_RE.search(output) + return match.group(1) if match else "" + + +def dispatch(workflow: str, ref: str, repo: str, artifact_scope: str, force_upload: bool) -> str: + result = run( + [ + "gh", + "workflow", + "run", + workflow, + "--repo", + repo, + "--ref", + ref, + "-f", + f"artifact_scope={artifact_scope}", + "-f", + f"force_upload={'true' if force_upload else 'false'}", + ] + ) + output = "\n".join(part for part in [result.stdout.strip(), result.stderr.strip()] if part) + if output: + print(output) + return parse_run_id(output) + + +def list_workflow_dispatch_runs(workflow: str, ref: str, repo: str) -> list[dict]: + result = run( + [ + "gh", + "run", + "list", + "--repo", + repo, + "--workflow", + workflow, + "--branch", + ref, + "--event", + "workflow_dispatch", + "--json", + "databaseId,createdAt,status,url", + "--limit", + "20", + ] + ) + return json.loads(result.stdout or "[]") + + +def find_dispatched_run(workflow: str, ref: str, repo: 
str, started_at: datetime, known_run_ids: set[str]) -> str: + cutoff = started_at.replace(microsecond=0) + for _ in range(12): + runs = list_workflow_dispatch_runs(workflow, ref, repo) + recent = [ + item + for item in runs + if str(item["databaseId"]) not in known_run_ids and parse_time(item["createdAt"]) >= cutoff + ] + if len(recent) == 1: + run_id = str(recent[0]["databaseId"]) + print(f"Resolved R2 Pages run: {recent[0].get('url') or run_id}") + return run_id + if len(recent) > 1: + urls = ", ".join(str(item.get("url") or item["databaseId"]) for item in recent) + raise SystemExit(f"ambiguous dispatched R2 Pages run candidates: {urls}") + time.sleep(5) + raise SystemExit("could not resolve dispatched R2 Pages run") + + +def find_recent_run(workflow: str, ref: str, repo: str, started_at: datetime) -> str: + """Backward-compatible wrapper for tests and one-off callers.""" + return find_dispatched_run(workflow, ref, repo, started_at, set()) + + +def known_workflow_dispatch_run_ids(workflow: str, ref: str, repo: str) -> set[str]: + try: + return {str(item["databaseId"]) for item in list_workflow_dispatch_runs(workflow, ref, repo)} + except SystemExit: + raise + except Exception as exc: + raise SystemExit(f"could not list existing R2 Pages runs before dispatch: {exc}") from exc + + +def wait_for_run(repo: str, run_id: str, timeout_seconds: int, poll_seconds: int) -> None: + deadline = time.monotonic() + timeout_seconds + while True: + result = run(["gh", "run", "view", run_id, "--repo", repo, "--json", "status,conclusion,url"]) + data = json.loads(result.stdout) + status = data.get("status") + conclusion = data.get("conclusion") or "" + url = data.get("url") or run_id + print(f"R2 Pages run {run_id}: status={status} conclusion={conclusion} url={url}") + if status == "completed": + if conclusion == "success": + return + raise SystemExit(f"R2 Pages run {run_id} finished with conclusion={conclusion}") + if time.monotonic() >= deadline: + raise SystemExit(f"timed out 
waiting for R2 Pages run {run_id}") + time.sleep(poll_seconds) + + +def extract_h1(document: str) -> str: + match = H1_RE.search(document) + if not match: + return "" + text = re.sub(r"<[^>]+>", "", match.group(1)) + return " ".join(html.unescape(text).split()) + + +def fetch_text(url: str, timeout_seconds: int = 30) -> str: + request = urllib.request.Request(url, headers={"User-Agent": "openclaw-docs-i18n-canary/1.0"}) + with urllib.request.urlopen(request, timeout=timeout_seconds) as response: # noqa: S310 - workflow-provided HTTPS URL. + return response.read().decode("utf-8", errors="replace") + + +def verify_live_h1(url: str, expected_h1: str, timeout_seconds: int, poll_seconds: int) -> None: + if not url and not expected_h1: + return + if not url or not expected_h1: + raise SystemExit("live smoke requires both --live-url and --expect-h1") + deadline = time.monotonic() + timeout_seconds + cache_buster = int(time.time()) + separator = "&" if "?" in url else "?" + smoke_url = f"{url}{separator}_openclaw_i18n_canary={cache_buster}" + last_h1 = "" + while True: + try: + last_h1 = extract_h1(fetch_text(smoke_url)) + print(f"Live canary h1: {last_h1!r} from {smoke_url}") + if last_h1 == expected_h1: + return + except (urllib.error.URLError, TimeoutError) as exc: + print(f"Live canary fetch failed: {exc}") + if time.monotonic() >= deadline: + raise SystemExit(f"live canary h1 did not become {expected_h1!r}; last h1 was {last_h1!r}") + time.sleep(poll_seconds) + + +def parse_args() -> argparse.Namespace: + parser = argparse.ArgumentParser( + description="Dispatch R2 Pages and wait for the upload result.", + formatter_class=argparse.RawDescriptionHelpFormatter, + epilog="""Outputs: + Prints the dispatched R2 Pages run and exits non-zero unless it completes successfully. + When --live-url and --expect-h1 are provided, also verifies the live page h1. + +Examples: + GH_TOKEN=... 
GITHUB_REPOSITORY=openclaw/docs python .github/scripts/i18n/dispatch_r2_pages.py + python .github/scripts/i18n/dispatch_r2_pages.py --repo openclaw/docs --ref main --timeout-seconds 1800 + python .github/scripts/i18n/dispatch_r2_pages.py --live-url https://docs.openclaw.ai/zh-CN/channels/line --expect-h1 LINE +""", + ) + parser.add_argument("--workflow", default="r2-pages.yml") + parser.add_argument("--ref", default="main") + parser.add_argument("--repo", default=os.environ.get("GITHUB_REPOSITORY", "")) + parser.add_argument("--artifact-scope", default="full") + parser.add_argument("--force-upload", default=True, action=argparse.BooleanOptionalAction) + parser.add_argument("--live-url", default="") + parser.add_argument("--expect-h1", default="") + parser.add_argument("--timeout-seconds", default=3600, type=int) + parser.add_argument("--poll-seconds", default=10, type=int) + return parser.parse_args() + + +def main() -> None: + args = parse_args() + if not args.repo: + raise SystemExit("missing repository: pass --repo or set GITHUB_REPOSITORY") + if args.timeout_seconds < 1: + raise SystemExit("timeout-seconds must be >= 1") + if args.poll_seconds < 1: + raise SystemExit("poll-seconds must be >= 1") + + # GitHub's dispatch API can omit the new run URL; snapshot first so fallback + # resolution cannot attach this deploy gate to a pre-existing R2 run. 
+ known_run_ids = known_workflow_dispatch_run_ids(args.workflow, args.ref, args.repo) + started_at = datetime.now(UTC) + run_id = dispatch(args.workflow, args.ref, args.repo, args.artifact_scope, args.force_upload) + if not run_id: + run_id = find_dispatched_run(args.workflow, args.ref, args.repo, started_at, known_run_ids) + wait_for_run(args.repo, run_id, args.timeout_seconds, args.poll_seconds) + verify_live_h1(args.live_url, args.expect_h1, args.timeout_seconds, args.poll_seconds) + + +if __name__ == "__main__": + main() diff --git a/.github/scripts/i18n/package_artifact.py b/.github/scripts/i18n/package_artifact.py index f1c448efaf..95b0a37c5d 100644 --- a/.github/scripts/i18n/package_artifact.py +++ b/.github/scripts/i18n/package_artifact.py @@ -144,7 +144,9 @@ def package_artifact(workspace: Path, openclaw_sync_dir: Path) -> dict[str, obje allowed = read_pending_allowed(workspace, locale, locale_slug, shard_index, shard_total) shard_changed = [line for line in changed if line in allowed] - if shard_total == 1: + if os.environ.get("ARTIFACT_ROLE") == "canary": + shard_deleted = [] + elif shard_total == 1: shard_deleted = deleted else: shard_deleted = [ diff --git a/.github/scripts/i18n/tests/test_i18n_scripts.py b/.github/scripts/i18n/tests/test_i18n_scripts.py index dcac5d3def..bf54e2606b 100644 --- a/.github/scripts/i18n/tests/test_i18n_scripts.py +++ b/.github/scripts/i18n/tests/test_i18n_scripts.py @@ -41,6 +41,8 @@ def load_module(name: str): plan_full = load_module("plan_full") provider_preflight = load_module("provider_preflight") summarize_full = load_module("summarize_full") +commit_locale_artifact = load_module("commit_locale_artifact") +dispatch_r2_pages = load_module("dispatch_r2_pages") @contextmanager @@ -85,12 +87,20 @@ def test_translate_workflows_call_existing_scripts_without_inline_python_or_node called_scripts: set[Path] = set() heredoc_pattern = re.compile(r"(?:python|node)\s+-\s+<<['\"]?(?:PY|NODE)['\"]?") - script_call_pattern = 
re.compile(r"python\s+(\.github/scripts/i18n/[A-Za-z0-9_./-]+\.py)\b") + script_call_pattern = re.compile( + r"python\s+(?:" + r"(?P<repo>\.github/scripts/i18n/[A-Za-z0-9_./-]+\.py)" + r"|\"\$\{I18N_SCRIPT_DIR\}/(?P<temp>[A-Za-z0-9_-]+\.py)\"" + r")(?=\s|$)" + ) for workflow in workflows: text = workflow.read_text(encoding="utf-8") self.assertIsNone(heredoc_pattern.search(text), f"{workflow} still contains inline Python/Node heredoc") for match in script_call_pattern.finditer(text): - called_scripts.add(REPO_ROOT / match.group(1)) + if match.group("repo"): + called_scripts.add(REPO_ROOT / match.group("repo")) + else: + called_scripts.add(SCRIPT_DIR / match.group("temp")) expected_scripts = set(SCRIPT_DIR.glob("*.py")) - {SCRIPT_DIR / "__init__.py"} self.assertEqual(expected_scripts, called_scripts) @@ -172,6 +182,7 @@ def test_full_workflow_keeps_only_weekly_and_manual_triggers(self) -> None: self.assertIn("schedule:", text) self.assertIn("workflow_dispatch:", text) self.assertIn("target_locale:", text) + self.assertIn("canary_only:", text) self.assertIn("cancel-in-progress: false", text) def test_full_workflow_gates_batches_after_canary(self) -> None: @@ -180,10 +191,28 @@ def test_full_workflow_gates_batches_after_canary(self) -> None: for index in range(1, 7): self.assertIn(f"translate-batch-{index}:", text) self.assertIn("needs.translate-canary.result == 'success'", text) + self.assertIn("inputs.canary_only != true", text) self.assertIn("artifact_role: canary", text) + self.assertIn("canary_source_path: channels/line.md", text) + self.assertIn("canary_live_path: channels/line", text) + self.assertIn("canary_expected_h1: LINE", text) + self.assertRegex(text, r"translate-canary:[\s\S]*?artifact_role: canary[\s\S]*?commit_locale: true") self.assertIn("inputs.commit_locale || inputs.artifact_role == 'canary'", reusable) self.assertIn("inputs.artifact_role == 'canary' || steps.apply.outputs.changed_count != '0'", reusable) self.assertIn("inputs.commit_locale && 
steps.apply.outputs.changed_count != '0'", reusable) + self.assertIn("Fail uncommitted locale refresh", reusable) + self.assertIn("inputs.artifact_role == 'canary' || (inputs.commit_locale && steps.locale_commit.outputs.committed == 'true')", reusable) + self.assertIn("ARTIFACT_DIR: .openclaw-sync/i18n-artifacts/${{ inputs.locale_slug }}-s${{ inputs.shard_index }}of${{ inputs.shard_total }}", reusable) + self.assertIn('echo "I18N_SCRIPT_DIR=${I18N_SCRIPT_DIR}" >> "$GITHUB_ENV"', reusable) + self.assertIn("ref: ${{ github.workflow_sha }}", reusable) + self.assertIn('python "${I18N_SCRIPT_DIR}/build_pending_manifest.py"', reusable) + self.assertIn('python "${I18N_SCRIPT_DIR}/commit_locale_artifact.py"', reusable) + self.assertIn('python "${I18N_SCRIPT_DIR}/dispatch_r2_pages.py" "${args[@]}"', reusable) + self.assertIn('--live-url "${CANARY_LIVE_URL}" --expect-h1 "${CANARY_EXPECTED_H1}"', reusable) + finalize_reusable = (REPO_ROOT / ".github/workflows/translate-finalize-reusable.yml").read_text(encoding="utf-8") + self.assertIn('echo "I18N_SCRIPT_DIR=${I18N_SCRIPT_DIR}" >> "$GITHUB_ENV"', finalize_reusable) + self.assertIn("ref: ${{ github.workflow_sha }}", finalize_reusable) + self.assertIn('python "${I18N_SCRIPT_DIR}/dispatch_r2_pages.py"', finalize_reusable) self.assertIn("provider-preflight:", text) self.assertIn("Translate Full completed with failed or cancelled work", text) @@ -289,6 +318,7 @@ def test_pending_manifest_filters_locale_generated_and_shards_pending_docs(self) self.assertEqual(2, result.all_count) self.assertEqual(2, result.total_pending_count) self.assertEqual(1, result.pending_count) + self.assertEqual("index.md", result.shard_files[0].name) self.assertTrue(result.shard_files[0].as_posix().endswith("/docs/index.md")) self.assertEqual(str(result.shard_files[0]), result.pending_path.read_text(encoding="utf-8").strip()) @@ -336,6 +366,27 @@ def test_pending_manifest_canary_limit_keeps_total_count_but_limits_sample(self) self.assertEqual(2, 
result.total_pending_count) self.assertEqual(1, result.pending_count) + def test_pending_manifest_canary_prefers_configured_source_page(self) -> None: + with tempfile.TemporaryDirectory() as tmp: + tmp_path = Path(tmp) + shutil.copytree(FIXTURES / "pending-docs" / "docs", tmp_path / "docs") + + result = pending.build_pending_manifest( + docs_root=tmp_path / "docs", + openclaw_sync_dir=tmp_path / ".openclaw-sync", + locale="fr", + locale_slug="fr", + mode="full", + shard_index=0, + shard_total=1, + pending_limit=1, + canary_source_path="guide/setup.mdx", + ) + + self.assertEqual(2, result.total_pending_count) + self.assertEqual(1, result.pending_count) + self.assertTrue(result.shard_files[0].as_posix().endswith("/docs/guide/setup.mdx")) + def test_package_artifact_keeps_only_allowed_changed_paths_and_payload(self) -> None: with tempfile.TemporaryDirectory() as tmp: repo = Path(tmp) @@ -419,6 +470,194 @@ def test_package_artifact_failure_writes_empty_payload_contract(self) -> None: self.assertEqual("", (artifact / "changed-files.txt").read_text(encoding="utf-8")) self.assertEqual("", (artifact / "deleted-files.txt").read_text(encoding="utf-8")) + def test_canary_package_excludes_unrelated_pruned_deletes(self) -> None: + with tempfile.TemporaryDirectory() as tmp: + repo = Path(tmp) + init_repo(repo) + (repo / ".openclaw-sync").mkdir() + (repo / "docs/fr").mkdir(parents=True) + (repo / "docs/.i18n").mkdir(parents=True) + (repo / "docs/index.md").write_text("# Index\n", encoding="utf-8") + (repo / "docs/fr/index.md").write_text("# Old Index FR\n", encoding="utf-8") + (repo / "docs/fr/removed.md").write_text("# Removed FR\n", encoding="utf-8") + run_git(repo, "add", ".") + run_git(repo, "commit", "-m", "initial") + + (repo / "docs/fr/index.md").write_text("# New Index FR\n", encoding="utf-8") + (repo / "docs/fr/removed.md").unlink() + (repo / "docs/.i18n/fr.tm.jsonl").write_text('{"ok":true}\n', encoding="utf-8") + (repo / 
".openclaw-sync/docs-i18n-fr-s0of1.txt").write_text(str(repo / "docs/index.md") + "\n", encoding="utf-8") + + with chdir(repo), env( + { + "GITHUB_WORKSPACE": str(repo), + "LOCALE": "fr", + "LOCALE_SLUG": "fr", + "SOURCE_SHA": "source-a", + "MODE": "full", + "SHARD_INDEX": "0", + "SHARD_TOTAL": "1", + "WORKER_PARALLEL": "3", + "THINKING_EFFORT": "medium", + "PENDING_COUNT": "1", + "TOTAL_PENDING_COUNT": "2", + "ALL_COUNT": "2", + "ARTIFACT_ROLE": "canary", + "TRANSLATE_OUTCOME": "success", + "MDX_CHECK_OUTCOME": "skipped", + "MDX_REPAIR_OUTCOME": "skipped", + "MDX_SCOPE_OUTCOME": "skipped", + "MDX_RECHECK_OUTCOME": "skipped", + } + ): + metadata = package_artifact.package_artifact(repo, Path(".openclaw-sync")) + + artifact = repo / ".openclaw-sync/artifacts/fr-s0of1" + self.assertEqual(2, metadata["changed_count"]) + self.assertEqual(0, metadata["deleted_count"]) + self.assertEqual(["docs/.i18n/fr.tm.jsonl", "docs/fr/index.md"], (artifact / "changed-files.txt").read_text(encoding="utf-8").splitlines()) + self.assertEqual("", (artifact / "deleted-files.txt").read_text(encoding="utf-8")) + + def test_canary_commit_scope_allows_only_sampled_page_and_tm(self) -> None: + with tempfile.TemporaryDirectory() as tmp: + repo = Path(tmp) + init_repo(repo) + (repo / ".openclaw-sync").mkdir() + (repo / "docs/fr").mkdir(parents=True) + (repo / "docs/.i18n").mkdir(parents=True) + (repo / "docs/index.md").write_text("# Index\n", encoding="utf-8") + (repo / "docs/fr/index.md").write_text("# Old Index FR\n", encoding="utf-8") + (repo / "docs/.i18n/fr.tm.jsonl").write_text('{"old":true}\n', encoding="utf-8") + run_git(repo, "add", ".") + run_git(repo, "commit", "-m", "initial") + + (repo / "docs/fr/index.md").write_text("# New Index FR\n", encoding="utf-8") + (repo / "docs/.i18n/fr.tm.jsonl").write_text('{"ok":true}\n', encoding="utf-8") + artifact = repo / ".openclaw-sync/i18n-artifacts/fr-s0of1" + artifact.mkdir(parents=True) + (artifact / 
"changed-files.txt").write_text("docs/.i18n/fr.tm.jsonl\ndocs/fr/index.md\n", encoding="utf-8") + (artifact / "deleted-files.txt").write_text("", encoding="utf-8") + + with chdir(repo): + allowed = commit_locale_artifact.artifact_allowed("fr", str(artifact)) + commit_locale_artifact.enforce_canary_scope("fr", allowed) + + def test_canary_commit_scope_rejects_unrelated_locale_deletes_not_in_artifact(self) -> None: + with tempfile.TemporaryDirectory() as tmp: + repo = Path(tmp) + init_repo(repo) + (repo / ".openclaw-sync").mkdir() + (repo / "docs/fr").mkdir(parents=True) + (repo / "docs/.i18n").mkdir(parents=True) + (repo / "docs/index.md").write_text("# Index\n", encoding="utf-8") + (repo / "docs/fr/index.md").write_text("# Old Index FR\n", encoding="utf-8") + (repo / "docs/fr/removed.md").write_text("# Removed FR\n", encoding="utf-8") + (repo / "docs/.i18n/fr.tm.jsonl").write_text('{"old":true}\n', encoding="utf-8") + run_git(repo, "add", ".") + run_git(repo, "commit", "-m", "initial") + + (repo / "docs/fr/index.md").write_text("# New Index FR\n", encoding="utf-8") + (repo / "docs/fr/removed.md").unlink() + artifact = repo / ".openclaw-sync/i18n-artifacts/fr-s0of1" + artifact.mkdir(parents=True) + (artifact / "changed-files.txt").write_text("docs/fr/index.md\n", encoding="utf-8") + (artifact / "deleted-files.txt").write_text("", encoding="utf-8") + + with chdir(repo): + allowed = commit_locale_artifact.artifact_allowed("fr", str(artifact)) + with self.assertRaises(SystemExit): + commit_locale_artifact.enforce_canary_scope("fr", allowed) + + def test_canary_artifact_scope_rejects_deleted_paths(self) -> None: + with tempfile.TemporaryDirectory() as tmp: + artifact = Path(tmp) / "artifact" + artifact.mkdir() + (artifact / "changed-files.txt").write_text("docs/fr/index.md\n", encoding="utf-8") + (artifact / "deleted-files.txt").write_text("docs/fr/removed.md\n", encoding="utf-8") + + with self.assertRaises(SystemExit): + commit_locale_artifact.artifact_allowed("fr", 
str(artifact)) + + def test_dispatch_r2_pages_parses_run_urls(self) -> None: + self.assertEqual("28277584371", dispatch_r2_pages.parse_run_id("https://github.com/openclaw/docs/actions/runs/28277584371")) + + def test_dispatch_r2_pages_selects_recent_workflow_dispatch(self) -> None: + calls = {"count": 0} + now = "2026-06-27T03:43:01Z" + + def fake_run(args: list[str], check: bool = True) -> subprocess.CompletedProcess[str]: + calls["count"] += 1 + payload = [{"databaseId": 123, "createdAt": now, "status": "queued", "url": "https://github.com/openclaw/docs/actions/runs/123"}] + return subprocess.CompletedProcess(args=args, returncode=0, stdout=json.dumps(payload), stderr="") + + with patch.object(dispatch_r2_pages, "run", fake_run), patch.object(dispatch_r2_pages.time, "sleep", lambda _: None): + run_id = dispatch_r2_pages.find_recent_run("r2-pages.yml", "main", "openclaw/docs", dispatch_r2_pages.parse_time(now)) + + self.assertEqual("123", run_id) + self.assertEqual(1, calls["count"]) + + def test_dispatch_r2_pages_ignores_known_recent_runs(self) -> None: + now = "2026-06-27T03:43:01Z" + + def fake_list(workflow: str, ref: str, repo: str) -> list[dict]: + self.assertEqual("r2-pages.yml", workflow) + self.assertEqual("main", ref) + self.assertEqual("openclaw/docs", repo) + return [ + {"databaseId": 123, "createdAt": now, "status": "completed", "url": "https://github.com/openclaw/docs/actions/runs/123"}, + {"databaseId": 456, "createdAt": now, "status": "queued", "url": "https://github.com/openclaw/docs/actions/runs/456"}, + ] + + with patch.object(dispatch_r2_pages, "list_workflow_dispatch_runs", fake_list), patch.object(dispatch_r2_pages.time, "sleep", lambda _: None): + run_id = dispatch_r2_pages.find_dispatched_run( + "r2-pages.yml", + "main", + "openclaw/docs", + dispatch_r2_pages.parse_time(now), + {"123"}, + ) + + self.assertEqual("456", run_id) + + def test_dispatch_r2_pages_rejects_ambiguous_new_runs(self) -> None: + now = "2026-06-27T03:43:01Z" + + def 
fake_list(workflow: str, ref: str, repo: str) -> list[dict]: + return [ + {"databaseId": 123, "createdAt": now, "status": "queued", "url": "https://github.com/openclaw/docs/actions/runs/123"}, + {"databaseId": 456, "createdAt": now, "status": "queued", "url": "https://github.com/openclaw/docs/actions/runs/456"}, + ] + + with patch.object(dispatch_r2_pages, "list_workflow_dispatch_runs", fake_list), patch.object(dispatch_r2_pages.time, "sleep", lambda _: None): + with self.assertRaises(SystemExit): + dispatch_r2_pages.find_dispatched_run( + "r2-pages.yml", + "main", + "openclaw/docs", + dispatch_r2_pages.parse_time(now), + set(), + ) + + def test_dispatch_r2_pages_extracts_h1_text(self) -> None: + document = '

<h1>LINE</h1>

' + + self.assertEqual("LINE", dispatch_r2_pages.extract_h1(document)) + + def test_dispatch_r2_pages_live_h1_retries_until_expected(self) -> None: + seen: list[str] = [] + + def fake_fetch(url: str, timeout_seconds: int = 30) -> str: + seen.append(url) + if len(seen) == 1: + return "

" + return "

<h1>LINE</h1>

" + + with patch.object(dispatch_r2_pages, "fetch_text", fake_fetch), patch.object(dispatch_r2_pages.time, "sleep", lambda _: None): + dispatch_r2_pages.verify_live_h1("https://docs.openclaw.ai/zh-CN/channels/line", "LINE", 30, 1) + + self.assertEqual(2, len(seen)) + self.assertIn("_openclaw_i18n_canary=", seen[0]) + def test_package_artifact_failure_writes_visible_github_status(self) -> None: with tempfile.TemporaryDirectory() as tmp: repo = Path(tmp) diff --git a/.github/workflows/translate-all.yml b/.github/workflows/translate-all.yml index 6768d54eb0..6beba6733c 100644 --- a/.github/workflows/translate-all.yml +++ b/.github/workflows/translate-all.yml @@ -37,6 +37,11 @@ on: - id - pl - th + canary_only: + description: Run only the canary translation, publish, and live smoke. + required: false + default: false + type: boolean permissions: actions: write @@ -154,9 +159,12 @@ jobs: worker_parallel: "3" thinking_effort: "medium" pending_limit: "1" + canary_source_path: channels/line.md + canary_live_path: channels/line + canary_expected_h1: LINE artifact_prefix: i18n-canary artifact_role: canary - commit_locale: false + commit_locale: true secrets: inherit translate-batch-1: @@ -165,7 +173,7 @@ jobs: - prepare - plan - translate-canary - if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && needs.plan.outputs.batch_1_count != '0' + if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && inputs.canary_only != true && needs.plan.outputs.batch_1_count != '0' strategy: fail-fast: false max-parallel: 3 @@ -194,7 +202,7 @@ jobs: - plan - translate-canary - translate-batch-1 - if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && needs.plan.outputs.batch_2_count != '0' + if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && 
inputs.canary_only != true && needs.plan.outputs.batch_2_count != '0' strategy: fail-fast: false max-parallel: 3 @@ -223,7 +231,7 @@ jobs: - plan - translate-canary - translate-batch-2 - if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && needs.plan.outputs.batch_3_count != '0' + if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && inputs.canary_only != true && needs.plan.outputs.batch_3_count != '0' strategy: fail-fast: false max-parallel: 3 @@ -252,7 +260,7 @@ jobs: - plan - translate-canary - translate-batch-3 - if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && needs.plan.outputs.batch_4_count != '0' + if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && inputs.canary_only != true && needs.plan.outputs.batch_4_count != '0' strategy: fail-fast: false max-parallel: 3 @@ -281,7 +289,7 @@ jobs: - plan - translate-canary - translate-batch-4 - if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && needs.plan.outputs.batch_5_count != '0' + if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && inputs.canary_only != true && needs.plan.outputs.batch_5_count != '0' strategy: fail-fast: false max-parallel: 3 @@ -310,7 +318,7 @@ jobs: - plan - translate-canary - translate-batch-5 - if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && needs.plan.outputs.batch_6_count != '0' + if: always() && needs.prepare.outputs.should_translate == 'true' && needs.translate-canary.result == 'success' && inputs.canary_only != true && needs.plan.outputs.batch_6_count != '0' strategy: fail-fast: false max-parallel: 3 @@ -362,6 +370,7 @@ jobs: env: SELECTED_LOCALES: ${{ 
needs.plan.outputs.selected_locales }} LOCALE_COUNT: ${{ needs.plan.outputs.locale_count }} + CANARY_ONLY: ${{ inputs.canary_only || false }} PROVIDER_RESULT: ${{ needs.provider-preflight.result }} CANARY_RESULT: ${{ needs.translate-canary.result }} BATCH_1_RESULT: ${{ needs.translate-batch-1.result }} @@ -378,6 +387,7 @@ jobs: echo echo "- selected locales: \`${SELECTED_LOCALES}\`" echo "- locale count: \`${LOCALE_COUNT}\`" + echo "- canary only: \`${CANARY_ONLY}\`" echo "- provider preflight: \`${PROVIDER_RESULT}\`" echo "- canary: \`${CANARY_RESULT}\`" echo "- batch 1: \`${BATCH_1_RESULT}\`" diff --git a/.github/workflows/translate-finalize-reusable.yml b/.github/workflows/translate-finalize-reusable.yml index cb745a4ea1..9ba26d0e76 100644 --- a/.github/workflows/translate-finalize-reusable.yml +++ b/.github/workflows/translate-finalize-reusable.yml @@ -25,6 +25,24 @@ jobs: group: docs-i18n-finalize cancel-in-progress: false steps: + # The artifact commit targets latest main, while the workflow control + # scripts stay pinned to the workflow ref under test. + - name: Check out workflow scripts + uses: actions/checkout@v6 + with: + ref: ${{ github.workflow_sha }} + path: .openclaw-sync/workflow-ref + sparse-checkout: .github/scripts/i18n + persist-credentials: false + + - name: Stage workflow scripts + run: | + set -euo pipefail + I18N_SCRIPT_DIR="${RUNNER_TEMP}/openclaw-i18n-scripts" + echo "I18N_SCRIPT_DIR=${I18N_SCRIPT_DIR}" >> "$GITHUB_ENV" + mkdir -p "${I18N_SCRIPT_DIR}" + cp -R .openclaw-sync/workflow-ref/.github/scripts/i18n/. 
"${I18N_SCRIPT_DIR}/" + - name: Check out latest main uses: actions/checkout@v6 with: @@ -47,7 +65,7 @@ jobs: EXPECTED_LOCALES: zh-cn=zh-CN zh-tw=zh-TW ja-jp=ja-JP es=es pt-br=pt-BR ko=ko de=de fr=fr hi=hi ar=ar it=it vi=vi nl=nl fa=fa ru=ru tr=tr uk=uk id=id pl=pl th=th run: | set -euo pipefail - python .github/scripts/i18n/apply_artifacts.py \ + python "${I18N_SCRIPT_DIR}/apply_artifacts.py" \ --source-sha "${SOURCE_SHA}" \ --mode "${MODE}" \ --shard-total "${SHARD_TOTAL}" \ @@ -126,7 +144,7 @@ jobs: GH_TOKEN: ${{ github.token }} run: | set -euo pipefail - gh workflow run pages.yml --ref main + python "${I18N_SCRIPT_DIR}/dispatch_r2_pages.py" - name: Fail incomplete translation run if: steps.apply.outputs.stale != 'true' && steps.apply.outputs.incomplete_count != '0' diff --git a/.github/workflows/translate-locale-reusable.yml b/.github/workflows/translate-locale-reusable.yml index d5596da6cc..ec325efbc5 100644 --- a/.github/workflows/translate-locale-reusable.yml +++ b/.github/workflows/translate-locale-reusable.yml @@ -34,6 +34,10 @@ on: required: false default: "0" type: string + canary_source_path: + required: false + default: "" + type: string artifact_prefix: required: false default: i18n @@ -46,6 +50,14 @@ on: required: false default: false type: boolean + canary_live_path: + required: false + default: channels/line + type: string + canary_expected_h1: + required: false + default: LINE + type: string jobs: translate: @@ -57,6 +69,24 @@ jobs: group: translate-${{ inputs.locale_slug }}-s${{ inputs.shard_index }}of${{ inputs.shard_total }} cancel-in-progress: false steps: + # Keep workflow control-plane scripts pinned to the workflow ref. The docs + # content checkout below may intentionally point at latest main. 
+ - name: Check out workflow scripts + uses: actions/checkout@v6 + with: + ref: ${{ github.workflow_sha }} + path: .openclaw-sync/workflow-ref + sparse-checkout: .github/scripts/i18n + persist-credentials: false + + - name: Stage workflow scripts + run: | + set -euo pipefail + I18N_SCRIPT_DIR="${RUNNER_TEMP}/openclaw-i18n-scripts" + echo "I18N_SCRIPT_DIR=${I18N_SCRIPT_DIR}" >> "$GITHUB_ENV" + mkdir -p "${I18N_SCRIPT_DIR}" + cp -R .openclaw-sync/workflow-ref/.github/scripts/i18n/. "${I18N_SCRIPT_DIR}/" + - name: Checkout publish repo uses: actions/checkout@v6 with: @@ -69,7 +99,7 @@ jobs: env: SOURCE_SHA: ${{ inputs.source_sha }} run: | - python .github/scripts/i18n/read_source_metadata.py + python "${I18N_SCRIPT_DIR}/read_source_metadata.py" - name: Allow requested source id: stale @@ -106,7 +136,7 @@ jobs: env: LOCALE: ${{ inputs.locale }} run: | - python .github/scripts/i18n/prune_stale_locale_pages.py + python "${I18N_SCRIPT_DIR}/prune_stale_locale_pages.py" - name: Build pending docs file list id: pending @@ -118,8 +148,9 @@ jobs: SHARD_INDEX: ${{ inputs.shard_index }} SHARD_TOTAL: ${{ inputs.shard_total }} PENDING_LIMIT: ${{ inputs.pending_limit }} + CANARY_SOURCE_PATH: ${{ inputs.canary_source_path }} run: | - python .github/scripts/i18n/build_pending_manifest.py + python "${I18N_SCRIPT_DIR}/build_pending_manifest.py" - name: Translate changed docs into locale id: translate_docs @@ -190,7 +221,7 @@ jobs: env: LOCALE: ${{ inputs.locale }} run: | - python .github/scripts/i18n/mdx_repair_scope.py snapshot \ + python "${I18N_SCRIPT_DIR}/mdx_repair_scope.py" snapshot \ --baseline "${RUNNER_TEMP}/${LOCALE}.repair-baseline.txt" - name: Repair translated MDX @@ -216,7 +247,7 @@ jobs: env: LOCALE: ${{ inputs.locale }} run: | - python .github/scripts/i18n/mdx_repair_scope.py enforce \ + python "${I18N_SCRIPT_DIR}/mdx_repair_scope.py" enforce \ --baseline "${RUNNER_TEMP}/${LOCALE}.repair-baseline.txt" - name: Recheck translated MDX @@ -253,7 +284,7 @@ jobs: run: | set 
-euo pipefail - python .github/scripts/i18n/package_artifact.py + python "${I18N_SCRIPT_DIR}/package_artifact.py" - name: Upload locale artifact if: steps.stale.outputs.skip != 'true' @@ -287,6 +318,24 @@ jobs: group: docs-i18n-finalize cancel-in-progress: false steps: + # Finalizers apply artifacts to latest main, but deploy/commit control + # logic must come from this workflow ref so branch canaries test the fix. + - name: Check out workflow scripts + uses: actions/checkout@v6 + with: + ref: ${{ github.workflow_sha }} + path: .openclaw-sync/workflow-ref + sparse-checkout: .github/scripts/i18n + persist-credentials: false + + - name: Stage workflow scripts + run: | + set -euo pipefail + I18N_SCRIPT_DIR="${RUNNER_TEMP}/openclaw-i18n-scripts" + echo "I18N_SCRIPT_DIR=${I18N_SCRIPT_DIR}" >> "$GITHUB_ENV" + mkdir -p "${I18N_SCRIPT_DIR}" + cp -R .openclaw-sync/workflow-ref/.github/scripts/i18n/. "${I18N_SCRIPT_DIR}/" + - name: Check out latest main uses: actions/checkout@v6 with: @@ -308,7 +357,7 @@ jobs: run: | set -euo pipefail - python .github/scripts/i18n/apply_artifacts.py \ + python "${I18N_SCRIPT_DIR}/apply_artifacts.py" \ --source-sha "${SOURCE_SHA}" \ --mode "${MODE}" \ --shard-total "${SHARD_TOTAL}" \ @@ -341,21 +390,42 @@ jobs: id: locale_commit if: inputs.commit_locale && steps.apply.outputs.changed_count != '0' env: + ARTIFACT_ROLE: ${{ inputs.artifact_role }} BASE_SOURCE_SHA: ${{ steps.apply.outputs.base_source_sha }} LOCALE: ${{ inputs.locale }} + LOCALE_SLUG: ${{ inputs.locale_slug }} + SHARD_INDEX: ${{ inputs.shard_index }} + SHARD_TOTAL: ${{ inputs.shard_total }} + ARTIFACT_DIR: .openclaw-sync/i18n-artifacts/${{ inputs.locale_slug }}-s${{ inputs.shard_index }}of${{ inputs.shard_total }} run: | set -euo pipefail - python .github/scripts/i18n/commit_locale_artifact.py + python "${I18N_SCRIPT_DIR}/commit_locale_artifact.py" + + - name: Fail uncommitted locale refresh + if: inputs.commit_locale && steps.apply.outputs.changed_count != '0' && 
steps.locale_commit.outputs.committed != 'true' + run: | + { + echo "Locale artifact applied changes but did not commit them." + echo "Failing so the translation status does not report an unpublished refresh as successful." + } >&2 + exit 1 - name: Dispatch locale docs deploy - if: inputs.commit_locale && steps.locale_commit.outputs.committed == 'true' + if: inputs.artifact_role == 'canary' || (inputs.commit_locale && steps.locale_commit.outputs.committed == 'true') env: GH_TOKEN: ${{ github.token }} + ARTIFACT_ROLE: ${{ inputs.artifact_role }} + CANARY_LIVE_URL: https://docs.openclaw.ai/${{ inputs.locale }}/${{ inputs.canary_live_path }} + CANARY_EXPECTED_H1: ${{ inputs.canary_expected_h1 }} run: | set -euo pipefail - gh workflow run pages.yml --ref main + args=() + if [ "${ARTIFACT_ROLE}" = "canary" ]; then + args+=(--live-url "${CANARY_LIVE_URL}" --expect-h1 "${CANARY_EXPECTED_H1}") + fi + python "${I18N_SCRIPT_DIR}/dispatch_r2_pages.py" "${args[@]}" - name: Fail incomplete locale artifact if: steps.apply.outputs.incomplete_count != '0' diff --git a/docs/.i18n/translation-workflow.md b/docs/.i18n/translation-workflow.md index dd7db0d366..3e5aaa5cea 100644 --- a/docs/.i18n/translation-workflow.md +++ b/docs/.i18n/translation-workflow.md @@ -33,7 +33,7 @@ Internal note for the docs publish pipeline. This file is under `docs/.i18n`, wh Top-level full workflow concurrency is serialized with `cancel-in-progress: false`. A new full run waits for a running full run instead of cancelling it. -Manual `target_locale` accepts `all` or one locale slug such as `fr`, `ja-jp`, or `zh-cn`. A single-locale rerun uses that locale for the canary sample, then schedules only that locale in the first full batch. +Manual `target_locale` accepts `all` or one locale slug such as `fr`, `ja-jp`, or `zh-cn`. A single-locale rerun uses that locale for the canary sample, then schedules only that locale in the first full batch. 
Manual `canary_only=true` runs only the canary translation, R2 upload, and live smoke without starting follow-up full batches. ## Debounce Policy @@ -70,7 +70,7 @@ provider/key preflight -> status summary ``` -The canary is a deterministic one-document sample from the first selected locale. It uploads a `canary` artifact, applies it through the same artifact validation path as locale commits, and runs the aggregate docs check without committing or publishing. If it fails translation or validation, later batches are skipped. If it succeeds, the selected locales, including the canary locale, run in normal full batches. If a later locale fails, already successful locales remain committed and published, and the failed locale can be rerun manually. +The canary is a deterministic one-document sample from the first selected locale. It prefers `channels/line.md` because that page is easy to inspect on the live site and exercises fixed glossary terms such as `LINE`; if that page is not pending, it falls back to the smallest pending source page. The canary uploads a `canary` artifact, applies it through the same artifact validation path as locale commits, runs the aggregate docs check, commits that one-page locale refresh when there is a git diff, then dispatches and waits for an R2 Pages full upload. The R2 deploy is required even when the canary page already matches `main`, because `main` can be current while R2 is stale. After upload, the canary live-smokes `https://docs.openclaw.ai/<locale>/channels/line` and requires the page `<h1>` to be `LINE`. Canary artifacts include only the sampled locale page and that locale translation memory; unrelated pruned locale pages are not published by the probe. Before writing `main`, canary commits are guarded again against the downloaded artifact contract so only the sampled page and translation memory can be committed. If it fails translation, validation, commit, R2 upload, or live smoke, later batches are skipped. If it succeeds, the selected locales, including the canary locale, run in normal full batches unless `canary_only=true` was requested. If a later locale fails, already successful locales remain committed and published, and the failed locale can be rerun manually. ## Artifact Contract @@ -95,11 +95,11 @@ payload/docs/.i18n/<locale>.tm.jsonl ## Commit And Deploy Policy -Full locale jobs are the commit and publish unit. After a locale succeeds, a separate write-permission commit job downloads that locale artifact, applies it to latest `main`, runs `npm run docs:check`, commits only `docs/<locale>/**` and `docs/.i18n/<locale>.tm.jsonl`, pushes with rebase/retry under the shared locale finalizer concurrency, and dispatches `pages.yml`. +Full locale jobs are the commit and publish unit. After a locale succeeds, a separate write-permission commit job downloads that locale artifact, applies it to latest `main`, runs `npm run docs:check`, commits only `docs/<locale>/**` and `docs/.i18n/<locale>.tm.jsonl`, pushes with rebase/retry under the shared locale finalizer concurrency, and dispatches `r2-pages.yml` with a full upload. The dispatch step waits for the R2 Pages run and fails if the upload fails. If an artifact applied changes but the locale commit did not land, the finalizer fails instead of reporting an unpublished refresh as successful. Artifact application is intentionally conservative when source metadata has moved. The apply step uses latest `main`, copies only payload pages whose embedded `x-i18n.source_hash` still matches the current source page, and skips stale translation memory. 
If `main` moves again between apply/validation and push, the commit script skips that locale commit so the next manual or weekly run can re-evaluate from the new base. -Incremental translation keeps the aggregate finalizer. The finalizer downloads available artifacts, applies valid successful payloads, rejects stale or failed artifacts, runs `npm run docs:check`, pushes one aggregate i18n commit, dispatches `pages.yml`, and fails when required locale artifacts are missing or failed. +Incremental translation keeps the aggregate finalizer. The finalizer downloads available artifacts, applies valid successful payloads, rejects stale or failed artifacts, runs `npm run docs:check`, pushes one aggregate i18n commit, dispatches and waits for `r2-pages.yml` with a full upload, and fails when required locale artifacts are missing or failed. ## Automatic Verification @@ -132,7 +132,7 @@ Before merging workflow recovery changes: 1. Trigger `Translate Full` with a deliberately invalid translation key in a test context and confirm the provider preflight fails before locale jobs start. 2. Trigger or simulate a canary failure and confirm follow-up full batches are skipped. 3. Trigger `Translate Full` with `target_locale=fr` and confirm only `fr` runs. -4. Trigger a small manual full run and confirm a successful locale commits independently and dispatches `pages.yml`. +4. Trigger a manual `canary_only=true` run and confirm the canary waits for `r2-pages.yml` and live-smokes the LINE page. 5. Observe or simulate a later locale failure and confirm earlier successful locale commits remain published. 6. Rerun only the failed locale with `target_locale=<locale>` and confirm it commits independently. 7. Confirm release events do not start `Translate Full`. 
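Reviewer note: the dispatch tests in this patch exercise a fallback for resolving the R2 run when the dispatch call returns no run URL. A minimal sketch of that fallback follows, assuming each run record carries the `databaseId` and `createdAt` fields shown in the test fixtures. `pick_new_run` is a hypothetical name, not a function in `dispatch_r2_pages.py`, and the polling and retry loop the real script appears to use is omitted.

```python
from datetime import datetime, timezone


def parse_time(value: str) -> datetime:
    # Timestamps in the fixtures look like 2026-06-27T03:43:01Z (UTC).
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def pick_new_run(runs: list[dict], started_at: datetime, known_ids: set[str]) -> str:
    """Return the id of the single run that appeared after the snapshot.

    Hypothetical helper: `known_ids` is the snapshot taken before dispatch,
    so a run that already existed can never be mistaken for the new one.
    """
    candidates = [
        str(run["databaseId"])
        for run in runs
        if str(run["databaseId"]) not in known_ids
        and parse_time(run["createdAt"]) >= started_at
    ]
    if len(candidates) > 1:
        # Two unknown runs means the deploy gate cannot tell which one is ours.
        raise SystemExit(f"ambiguous dispatched runs: {', '.join(candidates)}")
    return candidates[0] if candidates else ""


if __name__ == "__main__":
    now = "2026-06-27T03:43:01Z"
    runs = [
        {"databaseId": 123, "createdAt": now},
        {"databaseId": 456, "createdAt": now},
    ]
    print(pick_new_run(runs, parse_time(now), {"123"}))  # prints 456
```

Failing on ambiguity rather than picking the newest run matches the intent stated in the script comment: the deploy gate should not attach to a run it did not start.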
From 97858c1134f22c540cf3e778ce2ed5f16e9ae86f Mon Sep 17 00:00:00 2001 From: masonxhuang Date: Sat, 27 Jun 2026 15:57:04 +0800 Subject: [PATCH 3/5] fix(i18n): scope R2 publishes for translation canaries --- .../scripts/i18n/build_pending_manifest.py | 10 +- .github/scripts/i18n/dispatch_r2_pages.py | 61 +++++--- .../scripts/i18n/tests/test_i18n_scripts.py | 141 +++++++++++++++++- .github/workflows/r2-pages.yml | 40 ++++- .../workflows/translate-locale-reusable.yml | 15 +- docs/.i18n/translation-workflow.md | 4 +- scripts/docs-site/r2-upload.mjs | 64 +++++++- 7 files changed, 305 insertions(+), 30 deletions(-) diff --git a/.github/scripts/i18n/build_pending_manifest.py b/.github/scripts/i18n/build_pending_manifest.py index a7edd98000..7924b19cd7 100644 --- a/.github/scripts/i18n/build_pending_manifest.py +++ b/.github/scripts/i18n/build_pending_manifest.py @@ -119,8 +119,14 @@ def build_pending_manifest( pending_files = sorted(pending_files) shard_files = [file for index, file in enumerate(pending_files) if index % shard_total == shard_index] if pending_limit: - canary_source = (docs_root / canary_source_path).resolve() if canary_source_path else None - if canary_source in shard_files: + if canary_source_path: + canary_source = (docs_root / canary_source_path).resolve() + try: + canary_source.relative_to(docs_root.resolve()) + except ValueError as exc: + raise SystemExit(f"configured canary source must stay under docs: {canary_source_path}") from exc + if canary_source not in shard_files: + raise SystemExit(f"configured canary source is not pending in this shard: {canary_source_path}") # Prefer a user-visible page with known glossary coverage so the # canary proves both translation and the deployed page content. 
shard_files = [canary_source] diff --git a/.github/scripts/i18n/dispatch_r2_pages.py b/.github/scripts/i18n/dispatch_r2_pages.py index 77a2415810..0984fffca3 100644 --- a/.github/scripts/i18n/dispatch_r2_pages.py +++ b/.github/scripts/i18n/dispatch_r2_pages.py @@ -11,6 +11,8 @@ --ref: Git ref to dispatch. Default: main. --repo: GitHub repository. Default: GITHUB_REPOSITORY. --artifact-scope: R2 artifact scope input. Default: full. + --locale: Locale code for locale/page scoped uploads. + --page-path: Locale-relative page route for page scoped uploads. --force-upload: Force R2 object audit/upload input. Default: true. --live-url: Optional live URL to verify after upload. --expect-h1: Expected h1 text for live URL verification. @@ -26,6 +28,7 @@ Examples: GH_TOKEN=... GITHUB_REPOSITORY=openclaw/docs python .github/scripts/i18n/dispatch_r2_pages.py python .github/scripts/i18n/dispatch_r2_pages.py --repo openclaw/docs --ref main --timeout-seconds 1800 + python .github/scripts/i18n/dispatch_r2_pages.py --artifact-scope page --locale zh-CN --page-path channels/line --no-force-upload """ from __future__ import annotations @@ -63,23 +66,34 @@ def parse_run_id(output: str) -> str: return match.group(1) if match else "" -def dispatch(workflow: str, ref: str, repo: str, artifact_scope: str, force_upload: bool) -> str: - result = run( - [ - "gh", - "workflow", - "run", - workflow, - "--repo", - repo, - "--ref", - ref, - "-f", - f"artifact_scope={artifact_scope}", - "-f", - f"force_upload={'true' if force_upload else 'false'}", - ] - ) +def dispatch( + workflow: str, + ref: str, + repo: str, + artifact_scope: str, + force_upload: bool, + locale: str = "", + page_path: str = "", +) -> str: + command = [ + "gh", + "workflow", + "run", + workflow, + "--repo", + repo, + "--ref", + ref, + "-f", + f"artifact_scope={artifact_scope}", + "-f", + f"force_upload={'true' if force_upload else 'false'}", + ] + if locale: + command.extend(["-f", f"locale={locale}"]) + if page_path: + 
command.extend(["-f", f"page_path={page_path}"]) + result = run(command) output = "\n".join(part for part in [result.stdout.strip(), result.stderr.strip()] if part) if output: print(output) @@ -209,6 +223,7 @@ def parse_args() -> argparse.Namespace: Examples: GH_TOKEN=... GITHUB_REPOSITORY=openclaw/docs python .github/scripts/i18n/dispatch_r2_pages.py python .github/scripts/i18n/dispatch_r2_pages.py --repo openclaw/docs --ref main --timeout-seconds 1800 + python .github/scripts/i18n/dispatch_r2_pages.py --artifact-scope locale --locale ja-JP --no-force-upload python .github/scripts/i18n/dispatch_r2_pages.py --live-url https://docs.openclaw.ai/zh-CN/channels/line --expect-h1 LINE """, ) @@ -216,6 +231,8 @@ def parse_args() -> argparse.Namespace: parser.add_argument("--ref", default="main") parser.add_argument("--repo", default=os.environ.get("GITHUB_REPOSITORY", "")) parser.add_argument("--artifact-scope", default="full") + parser.add_argument("--locale", default="") + parser.add_argument("--page-path", default="") parser.add_argument("--force-upload", default=True, action=argparse.BooleanOptionalAction) parser.add_argument("--live-url", default="") parser.add_argument("--expect-h1", default="") @@ -237,7 +254,15 @@ def main() -> None: # resolution cannot attach this deploy gate to a pre-existing R2 run. 
known_run_ids = known_workflow_dispatch_run_ids(args.workflow, args.ref, args.repo) started_at = datetime.now(UTC) - run_id = dispatch(args.workflow, args.ref, args.repo, args.artifact_scope, args.force_upload) + run_id = dispatch( + args.workflow, + args.ref, + args.repo, + args.artifact_scope, + args.force_upload, + args.locale, + args.page_path, + ) if not run_id: run_id = find_dispatched_run(args.workflow, args.ref, args.repo, started_at, known_run_ids) wait_for_run(args.repo, run_id, args.timeout_seconds, args.poll_seconds) diff --git a/.github/scripts/i18n/tests/test_i18n_scripts.py b/.github/scripts/i18n/tests/test_i18n_scripts.py index bf54e2606b..48984d26a7 100644 --- a/.github/scripts/i18n/tests/test_i18n_scripts.py +++ b/.github/scripts/i18n/tests/test_i18n_scripts.py @@ -208,13 +208,25 @@ def test_full_workflow_gates_batches_after_canary(self) -> None: self.assertIn('python "${I18N_SCRIPT_DIR}/build_pending_manifest.py"', reusable) self.assertIn('python "${I18N_SCRIPT_DIR}/commit_locale_artifact.py"', reusable) self.assertIn('python "${I18N_SCRIPT_DIR}/dispatch_r2_pages.py" "${args[@]}"', reusable) - self.assertIn('--live-url "${CANARY_LIVE_URL}" --expect-h1 "${CANARY_EXPECTED_H1}"', reusable) + self.assertIn("--artifact-scope page", reusable) + self.assertIn('--locale "${{ inputs.locale }}"', reusable) + self.assertIn('--page-path "${{ inputs.canary_live_path }}"', reusable) + self.assertIn("--artifact-scope locale", reusable) + self.assertIn("--no-force-upload", reusable) + self.assertIn('--live-url "${CANARY_LIVE_URL}"', reusable) + self.assertIn('--expect-h1 "${CANARY_EXPECTED_H1}"', reusable) finalize_reusable = (REPO_ROOT / ".github/workflows/translate-finalize-reusable.yml").read_text(encoding="utf-8") self.assertIn('echo "I18N_SCRIPT_DIR=${I18N_SCRIPT_DIR}" >> "$GITHUB_ENV"', finalize_reusable) self.assertIn("ref: ${{ github.workflow_sha }}", finalize_reusable) self.assertIn('python "${I18N_SCRIPT_DIR}/dispatch_r2_pages.py"', finalize_reusable) 
self.assertIn("provider-preflight:", text) self.assertIn("Translate Full completed with failed or cancelled work", text) + r2_pages = (REPO_ROOT / ".github/workflows/r2-pages.yml").read_text(encoding="utf-8") + self.assertIn("- locale", r2_pages) + self.assertIn("- page", r2_pages) + self.assertIn("R2_UPLOAD_SCOPE: ${{ steps.artifact-scope.outputs.upload_scope }}", r2_pages) + self.assertIn("R2_UPLOAD_LOCALE: ${{ inputs.locale || '' }}", r2_pages) + self.assertIn("R2_UPLOAD_PAGE_PATH: ${{ inputs.page_path || '' }}", r2_pages) def test_prepare_path_selection_matches_incremental_rules(self) -> None: self.assertTrue(prepare.is_translatable_doc_path("docs/guide/setup.mdx")) @@ -387,6 +399,24 @@ def test_pending_manifest_canary_prefers_configured_source_page(self) -> None: self.assertEqual(1, result.pending_count) self.assertTrue(result.shard_files[0].as_posix().endswith("/docs/guide/setup.mdx")) + def test_pending_manifest_canary_rejects_missing_configured_source_page(self) -> None: + with tempfile.TemporaryDirectory() as tmp: + tmp_path = Path(tmp) + shutil.copytree(FIXTURES / "pending-docs" / "docs", tmp_path / "docs") + + with self.assertRaises(SystemExit): + pending.build_pending_manifest( + docs_root=tmp_path / "docs", + openclaw_sync_dir=tmp_path / ".openclaw-sync", + locale="fr", + locale_slug="fr", + mode="full", + shard_index=0, + shard_total=1, + pending_limit=1, + canary_source_path="channels/line.md", + ) + def test_package_artifact_keeps_only_allowed_changed_paths_and_payload(self) -> None: with tempfile.TemporaryDirectory() as tmp: repo = Path(tmp) @@ -581,6 +611,35 @@ def test_canary_artifact_scope_rejects_deleted_paths(self) -> None: def test_dispatch_r2_pages_parses_run_urls(self) -> None: self.assertEqual("28277584371", dispatch_r2_pages.parse_run_id("https://github.com/openclaw/docs/actions/runs/28277584371")) + def test_dispatch_r2_pages_passes_scoped_inputs(self) -> None: + captured: list[str] = [] + + def fake_run(args: list[str], check: bool = 
True) -> subprocess.CompletedProcess[str]: + captured.extend(args) + return subprocess.CompletedProcess( + args=args, + returncode=0, + stdout="https://github.com/openclaw/docs/actions/runs/28277584371\n", + stderr="", + ) + + with patch.object(dispatch_r2_pages, "run", fake_run): + run_id = dispatch_r2_pages.dispatch( + "r2-pages.yml", + "main", + "openclaw/docs", + "page", + False, + "zh-CN", + "channels/line", + ) + + self.assertEqual("28277584371", run_id) + self.assertIn("artifact_scope=page", captured) + self.assertIn("force_upload=false", captured) + self.assertIn("locale=zh-CN", captured) + self.assertIn("page_path=channels/line", captured) + def test_dispatch_r2_pages_selects_recent_workflow_dispatch(self) -> None: calls = {"count": 0} now = "2026-06-27T03:43:01Z" @@ -658,6 +717,86 @@ def fake_fetch(url: str, timeout_seconds: int = 30) -> str: self.assertEqual(2, len(seen)) self.assertIn("_openclaw_i18n_canary=", seen[0]) + def test_r2_upload_page_scope_filters_manifest_entries(self) -> None: + result = self._run_r2_upload_scope("page", "zh-CN", "channels/line") + + self.assertEqual(0, result.returncode, result.stderr) + self.assertIn("r2 upload scope: page (3/7 manifest entries, partial=true)", result.stdout) + self.assertIn("r2 dry-run put: zh-CN/channels/line\n", result.stdout) + self.assertIn("r2 dry-run put: zh-CN/channels/line/index.html", result.stdout) + self.assertIn("r2 dry-run put: zh-CN/channels/line.md", result.stdout) + self.assertNotIn("zh-CN/channels/sms", result.stdout) + self.assertNotIn("ja-JP/channels/line", result.stdout) + self.assertNotIn("assets/docs-site.css", result.stdout) + self.assertNotIn("pagefind/pagefind.js", result.stdout) + + def test_r2_upload_locale_scope_filters_manifest_entries(self) -> None: + result = self._run_r2_upload_scope("locale", "zh-CN") + + self.assertEqual(0, result.returncode, result.stderr) + self.assertIn("r2 upload scope: locale (5/7 manifest entries, partial=true)", result.stdout) + self.assertIn("r2 
dry-run put: zh-CN/channels/line/index.html", result.stdout) + self.assertIn("r2 dry-run put: zh-CN/channels/sms/index.html", result.stdout) + self.assertIn("r2 dry-run put: pagefind/pagefind.js", result.stdout) + self.assertNotIn("ja-JP/channels/line", result.stdout) + self.assertNotIn("assets/docs-site.css", result.stdout) + + def _run_r2_upload_scope(self, scope: str, locale: str, page_path: str = "") -> subprocess.CompletedProcess[str]: + with tempfile.TemporaryDirectory() as tmp: + tmp_path = Path(tmp) + dist = tmp_path / "dist" + files = tmp_path / "files" + dist.mkdir() + files.mkdir() + entries = [] + for key in [ + "zh-CN/channels/line", + "zh-CN/channels/line/index.html", + "zh-CN/channels/line.md", + "zh-CN/channels/sms/index.html", + "ja-JP/channels/line/index.html", + "pagefind/pagefind.js", + "assets/docs-site.css", + ]: + file_path = files / key.replace("/", "__") + file_path.write_text(key, encoding="utf-8") + digest = hashlib.sha256(file_path.read_bytes()).hexdigest() + entries.append( + { + "cacheControl": "public, max-age=60", + "contentType": "text/html; charset=utf-8", + "file": str(file_path), + "key": key, + "sha256": digest, + } + ) + + manifest = tmp_path / "manifest.json" + manifest.write_text(json.dumps({"entries": entries, "generatedAt": "2026-06-27T00:00:00Z", "version": 1}), encoding="utf-8") + remote_manifest = tmp_path / "remote.json" + remote_manifest.write_text(json.dumps({"entries": [], "generatedAt": "2026-06-26T00:00:00Z", "version": 1}), encoding="utf-8") + + test_env = os.environ.copy() + test_env.update( + { + "R2_UPLOAD_DRY_RUN": "1", + "R2_UPLOAD_MANIFEST_PATH": str(manifest), + "R2_UPLOAD_REMOTE_MANIFEST_PATH": str(remote_manifest), + "R2_UPLOAD_SCOPE": scope, + "R2_UPLOAD_LOCALE": locale, + } + ) + if page_path: + test_env["R2_UPLOAD_PAGE_PATH"] = page_path + return subprocess.run( + ["node", str(REPO_ROOT / "scripts/docs-site/r2-upload.mjs")], + cwd=tmp_path, + env=test_env, + text=True, + stdout=subprocess.PIPE, + 
stderr=subprocess.PIPE, + ) + def test_package_artifact_failure_writes_visible_github_status(self) -> None: with tempfile.TemporaryDirectory() as tmp: repo = Path(tmp) diff --git a/.github/workflows/r2-pages.yml b/.github/workflows/r2-pages.yml index 2c758ad888..a2257bce82 100644 --- a/.github/workflows/r2-pages.yml +++ b/.github/workflows/r2-pages.yml @@ -23,6 +23,18 @@ on: - auto - full - shell + - locale + - page + locale: + description: "Locale code for locale/page scoped uploads." + required: false + type: string + default: "" + page_path: + description: "Locale-relative page route for page scoped uploads, for example channels/line." + required: false + type: string + default: "" force_upload: description: "Audit R2 objects before upload; unchanged objects remain cache hits." required: false @@ -64,15 +76,27 @@ jobs: BEFORE_SHA: ${{ github.event.before || '' }} EVENT_NAME: ${{ github.event_name }} REQUESTED_SCOPE: ${{ inputs.artifact_scope || 'auto' }} + REQUESTED_LOCALE: ${{ inputs.locale || '' }} + REQUESTED_PAGE_PATH: ${{ inputs.page_path || '' }} run: | set -euo pipefail scope="${REQUESTED_SCOPE}" if [ -z "${scope}" ]; then scope="auto"; fi case "${scope}" in - auto|full|shell) ;; + auto|full|shell|locale|page) ;; *) echo "Invalid artifact scope: ${scope}" >&2; exit 1 ;; esac + if [ "${scope}" = "locale" ] || [ "${scope}" = "page" ]; then + if [ -z "${REQUESTED_LOCALE}" ]; then + echo "locale is required for ${scope} artifact scope" >&2 + exit 1 + fi + fi + if [ "${scope}" = "page" ] && [ -z "${REQUESTED_PAGE_PATH}" ]; then + echo "page_path is required for page artifact scope" >&2 + exit 1 + fi if [ "${scope}" = "auto" ]; then scope="full" @@ -98,13 +122,17 @@ jobs: { echo "scope=${scope}" - echo "partial=$([ "${scope}" = "shell" ] && echo 1 || echo 0)" + case "${scope}" in + shell|locale|page) echo "upload_scope=${scope}" ;; + *) echo "upload_scope=all" ;; + esac + echo "partial=$([[ "${scope}" = "shell" || "${scope}" = "locale" || "${scope}" = "page" ]] 
&& echo 1 || echo 0)" echo "deploy_worker=${deploy_worker}" } >> "${GITHUB_OUTPUT}" echo "Artifact scope: ${scope}" - name: Read source metadata - if: steps.artifact-scope.outputs.scope == 'full' + if: steps.artifact-scope.outputs.scope == 'full' || steps.artifact-scope.outputs.scope == 'locale' || steps.artifact-scope.outputs.scope == 'page' id: source-meta run: | node - <<'NODE' @@ -116,7 +144,7 @@ jobs: NODE - name: Check out OpenClaw source - if: steps.artifact-scope.outputs.scope == 'full' + if: steps.artifact-scope.outputs.scope == 'full' || steps.artifact-scope.outputs.scope == 'locale' || steps.artifact-scope.outputs.scope == 'page' uses: actions/checkout@v6 with: repository: ${{ steps.source-meta.outputs.repository }} @@ -288,7 +316,9 @@ jobs: R2_UPLOAD_CONCURRENCY: 64 R2_UPLOAD_PARTIAL: ${{ steps.artifact-scope.outputs.partial }} R2_UPLOAD_FORCE: ${{ github.event_name == 'workflow_dispatch' && inputs.force_upload == true && '1' || '0' }} - R2_UPLOAD_SCOPE: ${{ steps.artifact-scope.outputs.scope == 'shell' && 'shell' || 'all' }} + R2_UPLOAD_SCOPE: ${{ steps.artifact-scope.outputs.upload_scope }} + R2_UPLOAD_LOCALE: ${{ inputs.locale || '' }} + R2_UPLOAD_PAGE_PATH: ${{ inputs.page_path || '' }} run: npm run docs:r2:upload - name: Deploy Worker before live smoke diff --git a/.github/workflows/translate-locale-reusable.yml b/.github/workflows/translate-locale-reusable.yml index ec325efbc5..fd3bfd1267 100644 --- a/.github/workflows/translate-locale-reusable.yml +++ b/.github/workflows/translate-locale-reusable.yml @@ -423,7 +423,20 @@ jobs: args=() if [ "${ARTIFACT_ROLE}" = "canary" ]; then - args+=(--live-url "${CANARY_LIVE_URL}" --expect-h1 "${CANARY_EXPECTED_H1}") + args+=( + --artifact-scope page + --locale "${{ inputs.locale }}" + --page-path "${{ inputs.canary_live_path }}" + --no-force-upload + --live-url "${CANARY_LIVE_URL}" + --expect-h1 "${CANARY_EXPECTED_H1}" + ) + else + args+=( + --artifact-scope locale + --locale "${{ inputs.locale }}" + 
--no-force-upload + ) fi python "${I18N_SCRIPT_DIR}/dispatch_r2_pages.py" "${args[@]}" diff --git a/docs/.i18n/translation-workflow.md b/docs/.i18n/translation-workflow.md index 3e5aaa5cea..6f39806e94 100644 --- a/docs/.i18n/translation-workflow.md +++ b/docs/.i18n/translation-workflow.md @@ -70,7 +70,7 @@ provider/key preflight -> status summary ``` -The canary is a deterministic one-document sample from the first selected locale. It prefers `channels/line.md` because that page is easy to inspect on the live site and exercises fixed glossary terms such as `LINE`; if that page is not pending, it falls back to the smallest pending source page. The canary uploads a `canary` artifact, applies it through the same artifact validation path as locale commits, runs the aggregate docs check, commits that one-page locale refresh when there is a git diff, then dispatches and waits for an R2 Pages full upload. The R2 deploy is required even when the canary page already matches `main`, because `main` can be current while R2 is stale. After upload, the canary live-smokes `https://docs.openclaw.ai//channels/line` and requires the page `
<h1>
` to be `LINE`. Canary artifacts include only the sampled locale page and that locale translation memory; unrelated pruned locale pages are not published by the probe. Before writing `main`, canary commits are guarded again against the downloaded artifact contract so only the sampled page and translation memory can be committed. If it fails translation, validation, commit, R2 upload, or live smoke, later batches are skipped. If it succeeds, the selected locales, including the canary locale, run in normal full batches unless `canary_only=true` was requested. If a later locale fails, already successful locales remain committed and published, and the failed locale can be rerun manually. +The canary is a deterministic one-document sample from the first selected locale. It uses `channels/line.md` because that page is easy to inspect on the live site and exercises fixed glossary terms such as `LINE`; if that configured page is not pending in the canary shard, the canary fails instead of silently switching pages. The canary uploads a `canary` artifact, applies it through the same artifact validation path as locale commits, runs the aggregate docs check, commits that one-page locale refresh when there is a git diff, then dispatches and waits for an R2 Pages single-page upload. The R2 deploy is required even when the canary page already matches `main`, because `main` can be current while R2 is stale. After upload, the canary live-smokes `https://docs.openclaw.ai//channels/line` and requires the page `
<h1>
` to be `LINE`. Canary artifacts include only the sampled locale page and that locale translation memory; unrelated pruned locale pages are not published by the probe. Before writing `main`, canary commits are guarded again against the downloaded artifact contract so only the sampled page and translation memory can be committed. If it fails translation, validation, commit, R2 upload, or live smoke, later batches are skipped. If it succeeds, the selected locales, including the canary locale, run in normal full batches unless `canary_only=true` was requested. If a later locale fails, already successful locales remain committed and published, and the failed locale can be rerun manually. ## Artifact Contract @@ -95,7 +95,7 @@ payload/docs/.i18n/.tm.jsonl ## Commit And Deploy Policy -Full locale jobs are the commit and publish unit. After a locale succeeds, a separate write-permission commit job downloads that locale artifact, applies it to latest `main`, runs `npm run docs:check`, commits only `docs//**` and `docs/.i18n/.tm.jsonl`, pushes with rebase/retry under the shared locale finalizer concurrency, and dispatches `r2-pages.yml` with a full upload. The dispatch step waits for the R2 Pages run and fails if the upload fails. If an artifact applied changes but the locale commit did not land, the finalizer fails instead of reporting an unpublished refresh as successful. +Full locale jobs are the commit and publish unit. After a locale succeeds, a separate write-permission commit job downloads that locale artifact, applies it to latest `main`, runs `npm run docs:check`, commits only `docs//**` and `docs/.i18n/.tm.jsonl`, pushes with rebase/retry under the shared locale finalizer concurrency, and dispatches `r2-pages.yml` with a locale-scoped upload. Locale-scoped upload includes that locale's page objects plus regenerated `pagefind/` search shards, because Pagefind is a shared index derived from translated page content. 
The dispatch step waits for the R2 Pages run and fails if the upload fails. If an artifact applied changes but the locale commit did not land, the finalizer fails instead of reporting an unpublished refresh as successful. Artifact application is intentionally conservative when source metadata has moved. The apply step uses latest `main`, copies only payload pages whose embedded `x-i18n.source_hash` still matches the current source page, and skips stale translation memory. If `main` moves again between apply/validation and push, the commit script skips that locale commit so the next manual or weekly run can re-evaluate from the new base. diff --git a/scripts/docs-site/r2-upload.mjs b/scripts/docs-site/r2-upload.mjs index 8807149041..a67467e1c2 100644 --- a/scripts/docs-site/r2-upload.mjs +++ b/scripts/docs-site/r2-upload.mjs @@ -3,6 +3,8 @@ import crypto from "node:crypto"; import fs from "node:fs"; import path from "node:path"; +import { localeLabels } from "./config.mjs"; + const root = process.cwd(); const bucket = process.env.CLOUDFLARE_R2_BUCKET || "openclaw-docs"; const manifestPath = resolvePath(process.env.R2_UPLOAD_MANIFEST_PATH || "dist/docs-r2-manifest.json"); @@ -24,6 +26,8 @@ const putAll = process.env.R2_UPLOAD_PUT_ALL === "1"; const dryRun = process.env.R2_UPLOAD_DRY_RUN === "1"; const remoteManifestPath = process.env.R2_UPLOAD_REMOTE_MANIFEST_PATH || ""; const uploadScope = process.env.R2_UPLOAD_SCOPE || "all"; +const uploadLocale = normalizeLocale(process.env.R2_UPLOAD_LOCALE || ""); +const uploadPagePath = normalizePagePath(process.env.R2_UPLOAD_PAGE_PATH || ""); const partialUpload = process.env.R2_UPLOAD_PARTIAL === "1" || uploadScope !== "all"; const protectedKeys = new Set([ "llms-full.txt", @@ -41,6 +45,9 @@ if (!dryRun && !secretAccessKey) throw new Error("OPENCLAW_R2_SECRET_ACCESS_KEY const manifest = JSON.parse(fs.readFileSync(manifestPath, "utf8")); if (!Array.isArray(manifest.entries)) throw new Error("dist/docs-r2-manifest.json must 
contain an entries array"); const scopedEntries = filterEntriesByScope(manifest.entries, uploadScope); +if ((uploadScope === "page" || uploadScope === "locale") && scopedEntries.length === 0) { + throw new Error(`R2_UPLOAD_SCOPE=${uploadScope} matched zero manifest entries`); +} const remoteManifest = await getRemoteManifest(); if (partialUpload && !dryRun && remoteManifest.status !== "hit" && process.env.R2_UPLOAD_ALLOW_PARTIAL_WITHOUT_REMOTE !== "1") { throw new Error("Partial R2 upload requires an existing remote manifest; run a full upload first or set R2_UPLOAD_ALLOW_PARTIAL_WITHOUT_REMOTE=1"); @@ -84,8 +91,12 @@ function filterEntriesByScope(entries, scope) { return entries; case "shell": return entries.filter((entry) => isShellScopedEntry(entry)); + case "locale": + return entries.filter((entry) => isLocaleScopedEntry(entry, uploadLocale)); + case "page": + return entries.filter((entry) => isPageScopedEntry(entry, uploadLocale, uploadPagePath)); default: - throw new Error(`R2_UPLOAD_SCOPE must be all or shell, got ${scope}`); + throw new Error(`R2_UPLOAD_SCOPE must be all, shell, locale, or page, got ${scope}`); } } @@ -98,6 +109,57 @@ function isShellScopedEntry(entry) { return entry.contentType === "text/html; charset=utf-8" || key.endsWith(".html"); } +function isLocaleScopedEntry(entry, locale) { + const key = entry.key; + // Locale page changes regenerate Pagefind's global shards. Uploading those + // search objects keeps a successful locale publish discoverable without + // falling back to a full-site page upload. + if (key.startsWith("pagefind/")) return true; + return key === locale || key.startsWith(`${locale}/`); +} + +function isPageScopedEntry(entry, locale, pagePath) { + const keys = pageScopedKeys(locale, pagePath); + return keys.has(entry.key); +} + +function pageScopedKeys(locale, pagePath) { + const routeBase = pagePath === "index" ? 
locale : `${locale}/${pagePath}`; + return new Set([ + routeBase, + `${routeBase}/index.html`, + pagePath === "index" ? `${locale}/index.md` : `${routeBase}.md`, + ]); +} + +function normalizeLocale(value) { + if (uploadScope !== "page" && uploadScope !== "locale") return ""; + const locale = value.trim(); + if (!locale) throw new Error(`R2_UPLOAD_LOCALE is required for R2_UPLOAD_SCOPE=${uploadScope}`); + if (locale === "en") throw new Error("R2_UPLOAD_LOCALE=en is not supported for translation-scoped uploads"); + if (!Object.hasOwn(localeLabels, locale)) throw new Error(`R2_UPLOAD_LOCALE is not a configured locale: ${locale}`); + return locale; +} + +function normalizePagePath(value) { + if (uploadScope !== "page") return ""; + const raw = value.trim(); + if (!raw) throw new Error("R2_UPLOAD_PAGE_PATH is required for R2_UPLOAD_SCOPE=page"); + if (raw.startsWith("/") || raw.includes("\\") || raw.includes("?") || raw.includes("#")) { + throw new Error(`R2_UPLOAD_PAGE_PATH must be a clean relative docs route, got ${raw}`); + } + const withoutExtension = raw.replace(/\.(md|mdx)$/u, ""); + const normalized = path.posix.normalize(withoutExtension); + if (normalized === "." || normalized.startsWith("../") || normalized.includes("/../")) { + throw new Error(`R2_UPLOAD_PAGE_PATH must stay inside the locale docs tree, got ${raw}`); + } + const pagePath = normalized.replace(/\/index$/u, "") || "index"; + if (pagePath.split("/").some((segment) => !segment || segment === "." 
|| segment === "..")) { + throw new Error(`R2_UPLOAD_PAGE_PATH contains an invalid segment: ${raw}`); + } + return pagePath; +} + function writeUploadManifest(localManifest, currentRemoteManifest, entries) { if (!partialUpload) return { path: manifestPath }; From bcf2a814489031027699a6c379d4d034ce5c6385 Mon Sep 17 00:00:00 2001 From: masonxhuang Date: Sat, 27 Jun 2026 16:09:35 +0800 Subject: [PATCH 4/5] fix(i18n): dispatch scoped R2 deploys from workflow ref --- .../scripts/i18n/tests/test_i18n_scripts.py | 3 +++ .github/workflows/r2-pages.yml | 21 +++++++++++++++++-- .../workflows/translate-locale-reusable.yml | 2 ++ 3 files changed, 24 insertions(+), 2 deletions(-) diff --git a/.github/scripts/i18n/tests/test_i18n_scripts.py b/.github/scripts/i18n/tests/test_i18n_scripts.py index 48984d26a7..7fde034f61 100644 --- a/.github/scripts/i18n/tests/test_i18n_scripts.py +++ b/.github/scripts/i18n/tests/test_i18n_scripts.py @@ -209,6 +209,7 @@ def test_full_workflow_gates_batches_after_canary(self) -> None: self.assertIn('python "${I18N_SCRIPT_DIR}/commit_locale_artifact.py"', reusable) self.assertIn('python "${I18N_SCRIPT_DIR}/dispatch_r2_pages.py" "${args[@]}"', reusable) self.assertIn("--artifact-scope page", reusable) + self.assertIn('--ref "${{ github.ref_name }}"', reusable) self.assertIn('--locale "${{ inputs.locale }}"', reusable) self.assertIn('--page-path "${{ inputs.canary_live_path }}"', reusable) self.assertIn("--artifact-scope locale", reusable) @@ -224,6 +225,8 @@ def test_full_workflow_gates_batches_after_canary(self) -> None: r2_pages = (REPO_ROOT / ".github/workflows/r2-pages.yml").read_text(encoding="utf-8") self.assertIn("- locale", r2_pages) self.assertIn("- page", r2_pages) + self.assertIn("Refresh scoped docs content from main", r2_pages) + self.assertIn("SCOPED_CONTENT_SHA: ${{ steps.scoped-content.outputs.content_sha || '' }}", r2_pages) self.assertIn("R2_UPLOAD_SCOPE: ${{ steps.artifact-scope.outputs.upload_scope }}", r2_pages) 
self.assertIn("R2_UPLOAD_LOCALE: ${{ inputs.locale || '' }}", r2_pages) self.assertIn("R2_UPLOAD_PAGE_PATH: ${{ inputs.page_path || '' }}", r2_pages) diff --git a/.github/workflows/r2-pages.yml b/.github/workflows/r2-pages.yml index a2257bce82..b972f17a61 100644 --- a/.github/workflows/r2-pages.yml +++ b/.github/workflows/r2-pages.yml @@ -131,6 +131,20 @@ jobs: } >> "${GITHUB_OUTPUT}" echo "Artifact scope: ${scope}" + - name: Refresh scoped docs content from main + if: github.event_name == 'workflow_dispatch' && (steps.artifact-scope.outputs.scope == 'locale' || steps.artifact-scope.outputs.scope == 'page') + id: scoped-content + run: | + set -euo pipefail + + git fetch --quiet origin main:refs/remotes/origin/main + content_sha="$(git rev-parse refs/remotes/origin/main)" + # Scoped translation deploys must test the workflow/ref under review + # while publishing the content that the finalizer just wrote to main. + git checkout "${content_sha}" -- docs .openclaw-sync/source.json + echo "content_sha=${content_sha}" >> "${GITHUB_OUTPUT}" + echo "Scoped docs content refreshed from main ${content_sha}." + - name: Read source metadata if: steps.artifact-scope.outputs.scope == 'full' || steps.artifact-scope.outputs.scope == 'locale' || steps.artifact-scope.outputs.scope == 'page' id: source-meta @@ -192,13 +206,16 @@ jobs: - name: Check current docs main if: steps.artifact-scope.outputs.scope != 'none' || steps.artifact-scope.outputs.deploy_worker == '1' id: current-main + env: + SCOPED_CONTENT_SHA: ${{ steps.scoped-content.outputs.content_sha || '' }} run: | set -euo pipefail git fetch --quiet origin main latest="$(git rev-parse refs/remotes/origin/main)" - if [ "${GITHUB_SHA}" != "${latest}" ]; then + expected="${SCOPED_CONTENT_SHA:-${GITHUB_SHA}}" + if [ "${expected}" != "${latest}" ]; then echo "stale=true" >> "${GITHUB_OUTPUT}" - echo "::notice::Docs main moved to ${latest}; skipping stale R2 upload for ${GITHUB_SHA}." 
+ echo "::notice::Docs main moved to ${latest}; skipping stale R2 upload for ${expected}." else echo "stale=false" >> "${GITHUB_OUTPUT}" fi diff --git a/.github/workflows/translate-locale-reusable.yml b/.github/workflows/translate-locale-reusable.yml index fd3bfd1267..35f8671414 100644 --- a/.github/workflows/translate-locale-reusable.yml +++ b/.github/workflows/translate-locale-reusable.yml @@ -424,6 +424,7 @@ jobs: args=() if [ "${ARTIFACT_ROLE}" = "canary" ]; then args+=( + --ref "${{ github.ref_name }}" --artifact-scope page --locale "${{ inputs.locale }}" --page-path "${{ inputs.canary_live_path }}" @@ -433,6 +434,7 @@ jobs: ) else args+=( + --ref "${{ github.ref_name }}" --artifact-scope locale --locale "${{ inputs.locale }}" --no-force-upload From fe85b3e5cfc50367be3b7d15d56fe69eef9cca9b Mon Sep 17 00:00:00 2001 From: masonxhuang Date: Sat, 27 Jun 2026 17:46:06 +0800 Subject: [PATCH 5/5] fix(i18n): queue R2 deploys instead of cancelling --- .github/scripts/i18n/tests/test_i18n_scripts.py | 1 + .github/workflows/r2-pages.yml | 2 +- docs/.i18n/translation-workflow.md | 2 +- 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/.github/scripts/i18n/tests/test_i18n_scripts.py b/.github/scripts/i18n/tests/test_i18n_scripts.py index 7fde034f61..1bb6ef8c88 100644 --- a/.github/scripts/i18n/tests/test_i18n_scripts.py +++ b/.github/scripts/i18n/tests/test_i18n_scripts.py @@ -225,6 +225,7 @@ def test_full_workflow_gates_batches_after_canary(self) -> None: r2_pages = (REPO_ROOT / ".github/workflows/r2-pages.yml").read_text(encoding="utf-8") self.assertIn("- locale", r2_pages) self.assertIn("- page", r2_pages) + self.assertRegex(r2_pages, r"group: r2-pages\s+cancel-in-progress: false") self.assertIn("Refresh scoped docs content from main", r2_pages) self.assertIn("SCOPED_CONTENT_SHA: ${{ steps.scoped-content.outputs.content_sha || '' }}", r2_pages) self.assertIn("R2_UPLOAD_SCOPE: ${{ steps.artifact-scope.outputs.upload_scope }}", r2_pages) diff --git 
a/.github/workflows/r2-pages.yml b/.github/workflows/r2-pages.yml index b972f17a61..61b41caf9b 100644 --- a/.github/workflows/r2-pages.yml +++ b/.github/workflows/r2-pages.yml @@ -52,7 +52,7 @@ permissions: concurrency: group: r2-pages - cancel-in-progress: true + cancel-in-progress: false jobs: deploy: diff --git a/docs/.i18n/translation-workflow.md b/docs/.i18n/translation-workflow.md index 6f39806e94..24ab48da23 100644 --- a/docs/.i18n/translation-workflow.md +++ b/docs/.i18n/translation-workflow.md @@ -95,7 +95,7 @@ payload/docs/.i18n/.tm.jsonl ## Commit And Deploy Policy -Full locale jobs are the commit and publish unit. After a locale succeeds, a separate write-permission commit job downloads that locale artifact, applies it to latest `main`, runs `npm run docs:check`, commits only `docs//**` and `docs/.i18n/.tm.jsonl`, pushes with rebase/retry under the shared locale finalizer concurrency, and dispatches `r2-pages.yml` with a locale-scoped upload. Locale-scoped upload includes that locale's page objects plus regenerated `pagefind/` search shards, because Pagefind is a shared index derived from translated page content. The dispatch step waits for the R2 Pages run and fails if the upload fails. If an artifact applied changes but the locale commit did not land, the finalizer fails instead of reporting an unpublished refresh as successful. +Full locale jobs are the commit and publish unit. After a locale succeeds, a separate write-permission commit job downloads that locale artifact, applies it to latest `main`, runs `npm run docs:check`, commits only `docs//**` and `docs/.i18n/.tm.jsonl`, pushes with rebase/retry under the shared locale finalizer concurrency, and dispatches `r2-pages.yml` with a locale-scoped upload. Locale-scoped upload includes that locale's page objects plus regenerated `pagefind/` search shards, because Pagefind is a shared index derived from translated page content. 
R2 Pages runs use the single `r2-pages` concurrency group with `cancel-in-progress: false`, so locale deploy dispatches queue instead of cancelling earlier deploys. The dispatch step waits for the R2 Pages run and fails if the upload fails. If an artifact applied changes but the locale commit did not land, the finalizer fails instead of reporting an unpublished refresh as successful. Artifact application is intentionally conservative when source metadata has moved. The apply step uses latest `main`, copies only payload pages whose embedded `x-i18n.source_hash` still matches the current source page, and skips stale translation memory. If `main` moves again between apply/validation and push, the commit script skips that locale commit so the next manual or weekly run can re-evaluate from the new base.
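Reviewer note on the scoped uploads in this series: the `locale` and `page` scopes added to `scripts/docs-site/r2-upload.mjs` come down to a key filter over the manifest. The sketch below restates that filter in Python for illustration only. It is a paraphrase of `pageScopedKeys` and `isLocaleScopedEntry` from the patch, not the shipped code; the function names are hypothetical, and it omits the `shell` scope and the locale and path normalization.

```python
from __future__ import annotations


def page_scoped_keys(locale: str, page_path: str) -> set[str]:
    """Keys published for one page: the route, its index.html, and its .md."""
    route_base = locale if page_path == "index" else f"{locale}/{page_path}"
    markdown_key = f"{locale}/index.md" if page_path == "index" else f"{route_base}.md"
    return {route_base, f"{route_base}/index.html", markdown_key}


def filter_keys_by_scope(
    keys: list[str], scope: str, locale: str = "", page_path: str = ""
) -> list[str]:
    """Filter manifest keys for the 'all', 'locale', and 'page' scopes."""
    if scope == "all":
        return list(keys)
    if scope == "locale":
        # Pagefind shards are shared across locales, so a locale publish
        # uploads them along with that locale's own objects.
        return [
            key
            for key in keys
            if key.startswith("pagefind/") or key == locale or key.startswith(f"{locale}/")
        ]
    if scope == "page":
        wanted = page_scoped_keys(locale, page_path)
        return [key for key in keys if key in wanted]
    raise ValueError(f"unsupported scope in this sketch: {scope}")


# Same seven keys as the fixture in test_i18n_scripts.py.
FIXTURE_KEYS = [
    "zh-CN/channels/line",
    "zh-CN/channels/line/index.html",
    "zh-CN/channels/line.md",
    "zh-CN/channels/sms/index.html",
    "ja-JP/channels/line/index.html",
    "pagefind/pagefind.js",
    "assets/docs-site.css",
]

page_keys = filter_keys_by_scope(FIXTURE_KEYS, "page", "zh-CN", "channels/line")
locale_keys = filter_keys_by_scope(FIXTURE_KEYS, "locale", "zh-CN")
```

With the fixture keys, the page scope selects three entries and the locale scope selects five, which lines up with the `3/7` and `5/7` counts asserted in the new upload tests.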