Add CI guard for manual .xlf edits in human-authored PRs by Evangelink · Pull Request #8902 · microsoft/testfx

Evangelink · 2026-06-07T08:33:01Z

Summary

The repository policy in copilot-instructions.md → Localization Guidelines is explicit:

NEVER manually modify *.xlf files. Instead, regenerate them by running dotnet msbuild <project>.csproj /t:UpdateXlf on the owning project.

Despite that, hand-edited .xlf files do occasionally land in PRs. They get silently reverted by the next OneLocBuild bulk update, so the fix is lost and any regression returns. This PR encodes the rule as a CI check so it's a property of the repository rather than reviewer vigilance.

What the guard does

Triggers on every pull_request_target whose diff touches **/*.xlf or **/*.resx (workflow definition is sourced from the base ref, making it tamper-resistant — see "Tamper-resistance" below).
Bot PRs are exempt (dotnet-bot, dotnet-maestro, any *[bot] account) — OneLocBuild owns the bulk update path.
For human PRs, every changed .xlf is mapped to its expected .resx path. The mapping uses the repo-standard <dir>/Resources/xlf/<Base>.<locale>.xlf → <dir>/Resources/<Base>.resx layout convention for strict full-path matching, with a basename-only fallback for any future non-standard layout. If the matching .resx isn't in the diff, the check fails with a clear remediation message.

Why blocking rather than a soft notice

The policy uses "NEVER" — this is a hard rule, not a guideline.
The escape hatch is trivial (touch the .resx too, or get the OneLocBuild bot to do it), so the cost of a false positive is low.
Translation typo fixes that are genuinely one-off go via the OneLocBuild submission flow; hand-edits get overwritten anyway.

If reviewers prefer a soft warning (::notice annotation), I can switch the workflow — just say the word. I went with blocking because the policy wording justifies it.

Tamper-resistance

Both the workflow file and the script are sourced from the BASE commit so a PR cannot weaken its own gate:

Workflow: triggered by pull_request_target, which runs the workflow definition from the base ref. actions/checkout is pinned to ref: ${{ github.event.pull_request.head.sha }} so the git diff still evaluates the PR's tree.
Script: the run step does git show "$BASE_REF:.github/scripts/check-xlf-manual-edit.ps1" into $RUNNER_TEMP and invokes that copy. Falls back to the PR working copy only when the script doesn't yet exist at the base (bootstrap PRs).
Permissions: permissions: contents: read only, no secrets used, no execution of PR-controlled code — so the elevated default token associated with pull_request_target is inert here and the standard PWN risk doesn't apply.

Bootstrap (this PR only)

Because pull_request_target sources the workflow file from the base ref, the guard step does not run on this introducing PR — main doesn't yet contain xlf-manual-edit-guard.yml. The PR is verified instead by the local smoke-test scenarios listed under "Tested locally" below. Once merged, subsequent PRs touching **/*.xlf or **/*.resx will trigger the guard from the merged-in base workflow file.

Files added

.github/scripts/check-xlf-manual-edit.ps1 — PowerShell diff parser with tight ^[A-Za-z0-9._/+\-]+$ path-safety regex (rejects context-line decoys), repo-standard Resources/xlf layout-aware .xlf→.resx mapping, exit-code reporting
.github/workflows/xlf-manual-edit-guard.yml — paths-scoped pull_request_target workflow, contents: read only, shell: pwsh (pre-installed on ubuntu-latest), 5-minute timeout

Tested locally

Seven synthetic diffs covering: bot exemption, [bot]-suffix wildcard, locale-bearing .xlf with matching .resx, no-locale .xlf, multi-segment locale (zh-Hans), the testfx Resources/xlf/ cross-directory layout, the cross-project basename-collision case (e.g. ExtensionResources.resx appears in 9 different projects — would have false-passed under basename-only matching), and a unified-diff with misleading context lines that mention .xlf paths in comments. All cases produced the expected exit code.

Part of a small batch of repo-maintenance improvements identified from a one-month review of PR review-comment patterns (companion PRs: #8898 expert-reviewer extensions, #8899 msbuild-reviewer Rule B-3, #8900 PublicAPI init-accessor guard, #8901 CLI help-text drift reminder).

The repository policy (.github/copilot-instructions.md, Localization Guidelines section) is explicit: NEVER manually modify *.xlf files. Instead, regenerate them by running `dotnet msbuild <project>.csproj /t:UpdateXlf` on the owning project. The OneLocBuild bot (dotnet-bot) owns the bulk translation update path. A human PR may legitimately touch *.xlf files only when those changes are the output of /t:UpdateXlf after a *.resx edit -- in which case the same PR will also include the *.resx change. A PR that hand-edits *.xlf without any *.resx change is the defect this guard catches. What the guard does: * On every pull_request whose diff touches *.xlf or *.resx, run .github/scripts/check_xlf_manual_edit.py. * Bot PRs (dotnet-bot, dotnet-maestro, any *[bot] account) are exempt and the script exits 0 immediately. * For human PRs the script extracts the basename of each changed *.xlf (e.g. `Resources/xlf/Resource.cs.xlf` -> `Resource`) and checks whether the diff also includes a *.resx with the same basename. Basename matching is used (not strict path matching) because testfx places *.xlf in a `xlf/` subdirectory while the matching *.resx lives in the parent `Resources/` directory. * If any *.xlf is unaccompanied, the workflow fails with a clear remediation message (edit the *.resx, run /t:UpdateXlf, commit both). Why blocking rather than a soft notice: * The policy uses "NEVER" -- this is a hard rule, not a guideline. * The escape hatch is trivial (touch the *.resx too, or get the OneLocBuild bot to do it), so the cost of a false positive is low. * Translation typo fixes that are intentionally one-off can be done via the OneLocBuild submission flow; manual *.xlf hand-edits would be overwritten on the next bulk update anyway. Why this matters now: * Hand-edited *.xlf files get silently reverted by the next OneLocBuild run, so the fix is lost and the bug regresses. * The current PR-review process catches this inconsistently; encoding the rule as a CI check makes it a property of the repository rather than reviewer vigilance. Tested locally with five synthetic diffs covering: bot exemption, [bot]-suffix wildcard, locale-bearing *.xlf with matching *.resx, no-locale *.xlf, multi-segment locale (zh-Hans), and the testfx `Resources/xlf/` cross-directory layout. All cases produced the expected exit code. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds a pull-request-time CI guard that enforces the repo’s localization policy: human-authored PRs must not include standalone .xlf changes (they must be accompanied by a corresponding .resx change), while bot-authored localization updates are exempt.

Changes:

Introduces a new pull_request workflow scoped to .xlf/.resx changes that runs a fast Python-based validation step.
Adds a Python script that detects whether changed .xlf files have a matching .resx basename also changed in the same PR, and emits a clear remediation report on failure.
Implements bot author exemptions (known bot list + *[bot] suffix) to allow OneLocBuild/dependency-bot localization updates.

Show a summary per file

File	Description
`.github/workflows/xlf-manual-edit-guard.yml`	New PR workflow to run the guard on `.xlf`/`.resx` diffs with minimal permissions and short timeout.
`.github/scripts/check_xlf_manual_edit.py`	Implements the diff path parsing, bot exemption logic, basename matching, and step-summary reporting used by the workflow.

Copilot's findings

Files reviewed: 2/2 changed files
Comments generated: 0

Evangelink

Note

🤖 Automated review by GitHub Copilot. Posted via a maintainer's GitHub token, so it appears under their account — the account owner did not write or approve this content personally. Generated by the Expert Code Review workflow. To request a follow-up action, reply by tagging @copilot directly.

Review Summary

This PR adds a new GitHub Actions workflow and a Python script to enforce the repository's localization policy: .xlf files must never be edited by hand — they must be regenerated via dotnet msbuild /t:UpdateXlf after editing the corresponding .resx.

Verdict table

Dimension	Result
Logic / Correctness	✅ LGTM
Security	✅ LGTM
CI/CD Workflow Quality	✅ LGTM
Error Handling	💡 One suggestion (non-blocking, see inline comment)

Notable positives

Regex is correct. r"\.([a-zA-Z]{2,3}(?:-[A-Za-z0-9]+)*)\.xlf$" correctly strips the locale suffix (e.g. Strings.de.xlf → Strings) and falls back gracefully to Path.stem for locale-less XLF files. Basename-only matching is intentionally permissive to accommodate xlf/ subdirectory layouts.
Bot exemption logic is clean: both the KNOWN_BOTS allowlist and the [bot]-suffix catch cover all expected automated actors (OneLocBuild, Maestro, GitHub Actions).
Minimal permissions (contents: read only) and timeout-minutes: 5 are appropriate.
pull_request (not pull_request_target) is the correct, safer trigger for a read-only guard that needs to run on fork PRs.
fetch-depth: 0 is required and present for the three-dot git diff origin/<base>...HEAD diff to resolve correctly.
write_step_summary is guarded against OSError so the script never crashes if GITHUB_STEP_SUMMARY is unavailable.

One non-blocking suggestion

See the inline comment on _run: subprocess.run itself can raise FileNotFoundError / OSError if git is not on PATH. In the ubuntu-latest CI environment this never fires in practice, but a small try/except OSError guard around the call would make the documented exit code 2 ("usage / IO error") accurate for that edge case and produce a friendlier error message than a raw Python traceback.

Generated by Expert Code Review (on open) for issue #8902 · sonnet46 3.4M

Youssef1313

I haven't looked at current implementation. But I think the only reliable way to detect this properly would be:

Revert all XLF changes in the given PR.
Touch the corresponding resx (so that it has higher timestamp and MSBuild can see that UpdateXlf is not update to date)
Run dotnet build /t:UpdateXlf on the corresponding project, and ensure it's the same set of changes that were in the PR.

* Move bot-author exemption from Python to the workflow `if:`. Per the reviewer, the Python script should focus on diff analysis and not need to know about workflow context. Removed `KNOWN_BOTS`, `is_bot()`, `--author`, and the `PR_AUTHOR` env var from the script. The workflow now skips the whole job via: if: >- github.event.pull_request.user.type != 'Bot' && github.event.pull_request.user.login != 'dotnet-bot' && github.event.pull_request.user.login != 'dotnet-maestro' `user.type == 'Bot'` covers GitHub App identities; the explicit logins cover the dotnet-bot / dotnet-maestro user accounts. * Pin actions to SHA per the repository's lock-file convention. The workflow now uses: actions/checkout@34e1148 # v4.3.1 actions/setup-python@a26af69 # v5.6.0 These are the current tip-of-v4 / tip-of-v5 commits as of 2026-06-07. * Empty except: narrow `except Exception` to `except (AttributeError, ValueError, OSError)` and add an explanatory comment. The previous bare except swallowed any programmer error from `sys.stdout.reconfigure`; the new tuple matches the documented failure modes only. * Drive-by while in the workflow: also switch the diff base from `origin/<base.ref>` to the immutable `base.sha` from the event payload, mirroring the fix applied to PR #8900. A base-branch advance between the event and the checkout fetch could otherwise pull in unrelated commits. * Rename the env var to `BASE_REF` (with `BASE_SHA` fallback in the script's `--base` default for backward compatibility) for consistency with the rest of this PR series. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The previous fix added an explanatory comment ABOVE the try/except, but github-code-quality's empty-except rule looks for a comment INSIDE the except body. Move/extend the comment so it lives next to the `pass`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Copilot's findings

Files reviewed: 2/2 changed files
Comments generated: 3

Match testfx's existing PowerShell tooling convention (eng/common/, eng/scripts/, .github/scripts/scan-duplicates.ps1) instead of bringing in a Python runtime dependency for a single guard script. - Adds .github/scripts/check-xlf-manual-edit.ps1 with the same logic: parse a diff (unified or name-only), strip the `.<locale>.xlf` suffix to compute the resx basename, flag every .xlf with no matching .resx change in the same PR. - Removes .github/scripts/check_xlf_manual_edit.py. - Drops actions/setup-python from the workflow and switches the run step to `shell: pwsh` (cross-platform on ubuntu-latest). - Updates `paths:` to track the new .ps1 filename. - Preserves the BASE_REF / BASE_SHA env-var fallback so existing workflow invocations keep working. Smoke-tested locally with synthetic name-only diff inputs: (a) .xlf alone → exit 1 with report, (b) .xlf + matching .resx → 0, (c) no .xlf changes → 0, (d) .xlf + .zh-Hans.xlf + .resx → 0. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Three concerns flagged on the previous Python version still applied to the PowerShell port: 1. Guard script sourced from the PR checkout (untrusted) A PR could weaken its own gate by editing the script in the same commit. The workflow now sources the script from the base commit via `git show "$BASE_REF:.github/scripts/check-xlf-manual-edit.ps1"` into $RUNNER_TEMP and runs that copy, falling back to the PR copy only when the script doesn't yet exist at base (bootstrap PRs). 2. Basename-only matching false-passes on cross-project basename collisions `ExtensionResources.resx` appears in 9 different projects in this repo. A PR could hand-edit `proj-A/.../xlf/ExtensionResources.de.xlf` while touching an unrelated `proj-B/.../ExtensionResources.resx`, and the basename-only check would let it through. Added `Get-XlfExpectedResx` that uses the repo-standard layout mapping `<dir>/Resources/xlf/<Base>.*.xlf` → `<dir>/Resources/<Base>.resx` for strict full-path matching. Basename fallback retained for any future non-standard layout. 3. Unified-diff fallback parser too lax Context lines were trimmed before classification, so a code-comment line like ` /// see some/file.xlf` got promoted to a path. The parser now (a) classifies hunk-header / context / +/- lines via the RAW (untrimmed) prefix, and (b) rejects any candidate path that doesn't match the `^[A-Za-z0-9._/+\-]+$` repo-safe alphabet. Smoke-tested locally with: - Cross-project basename collision → correctly FLAGS (was missed pre-fix). - Matching same-project pair → PASS. - Multi-locale .xlf with single matching .resx → PASS. - Hand-edit with no .resx → FLAGS. - Unified diff with misleading context lines → only flags the real .xlf header, ignores the comment-line decoys. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Copilot's findings

Files reviewed: 2/2 changed files
Comments generated: 3

…tance Two related Copilot review comments on a09f2ed flagged that even with the script sourced from the base commit, the workflow file itself was still sourced from the PR head — so a PR could edit `.github/workflows/xlf-manual-edit-guard.yml` (e.g. broaden the bot `if:`) and bypass the gate while still producing a "successful" check. Fix: switch the trigger from `on: pull_request` to `on: pull_request_target` so the workflow definition is taken from the BASE ref (tamper-resistant). This required two compensating changes: 1. actions/checkout: pin `ref: ${{ github.event.pull_request.head.sha }}`. `pull_request_target` otherwise checks out the base ref, leaving HEAD == base and the diff empty — silently turning the gate into a no-op. 2. Header comment: documents the tamper-resistance model end-to-end: workflow from base, script from base, PR HEAD only used for the diff subject. Also documents why the usual `pull_request_target` PWN risk (elevated default GITHUB_TOKEN on fork PRs) does not apply here: the workflow has `permissions: contents: read` only, uses no secrets, and never executes PR-controlled code (it only runs our own base-sourced script via pwsh). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings June 7, 2026 08:33

Copilot started reviewing on behalf of Evangelink June 7, 2026 08:33 View session

github-code-quality Bot found potential problems Jun 7, 2026

View reviewed changes

Comment thread .github/scripts/check_xlf_manual_edit.py Fixed

Copilot AI reviewed Jun 7, 2026

View reviewed changes

Evangelink commented Jun 7, 2026

View reviewed changes

Comment thread .github/scripts/check_xlf_manual_edit.py Outdated

Youssef1313 reviewed Jun 7, 2026

View reviewed changes

Comment thread .github/workflows/xlf-manual-edit-guard.yml Outdated

Youssef1313 reviewed Jun 7, 2026

View reviewed changes

Comment thread .github/workflows/xlf-manual-edit-guard.yml Outdated

github-code-quality Bot found potential problems Jun 7, 2026

View reviewed changes

Comment thread .github/scripts/check_xlf_manual_edit.py Fixed

Copilot AI review requested due to automatic review settings June 7, 2026 11:24

Copilot started reviewing on behalf of Evangelink June 7, 2026 11:24 View session

Copilot AI reviewed Jun 7, 2026

View reviewed changes

Comment thread .github/workflows/xlf-manual-edit-guard.yml Outdated

Comment thread .github/scripts/check_xlf_manual_edit.py Outdated

Comment thread .github/scripts/check_xlf_manual_edit.py Outdated

Evangelink and others added 2 commits June 8, 2026 10:08

Copilot AI review requested due to automatic review settings June 8, 2026 08:41

Copilot started reviewing on behalf of Evangelink June 8, 2026 08:42 View session

Copilot AI reviewed Jun 8, 2026

View reviewed changes

Comment thread .github/workflows/xlf-manual-edit-guard.yml

Comment thread .github/workflows/xlf-manual-edit-guard.yml

Comment thread .github/workflows/xlf-manual-edit-guard.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add CI guard for manual .xlf edits in human-authored PRs#8902

Add CI guard for manual .xlf edits in human-authored PRs#8902
Evangelink wants to merge 6 commits into
mainfrom
dev/amauryleve/xlf-manual-edit-guard

Evangelink commented Jun 7, 2026 •

edited

Loading

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Evangelink left a comment

Uh oh!

Uh oh!

Youssef1313 left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Evangelink commented Jun 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What the guard does

Why blocking rather than a soft notice

Tamper-resistance

Bootstrap (this PR only)

Files added

Tested locally

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

Evangelink left a comment

Choose a reason for hiding this comment

Review Summary

Verdict table

Notable positives

One non-blocking suggestion

Uh oh!

Uh oh!

Youssef1313 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Copilot's findings

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Copilot's findings

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Evangelink commented Jun 7, 2026 •

edited

Loading