fix(path-cli): unify test env lock to stop cmd_pathbase racing the cache (red main)#113
Merged
Merged
Conversation
…ache `main` went red after #112 with a flaky failure in `cmd_resume::tests::resolve_input_url_uses_cache_on_hit_without_refetching` ("download ... failed (500): boom") — the cache-hit test reached the network despite a seeded cache. Root cause: two independent mutexes guarded the *same* global resource, the process environment. `cmd_pathbase`'s `EnvGuard` serialized via a private `ENV_LOCK`, while `cmd_resume`/`cmd_cache`/`cmd_export`/`config` use `config::TEST_ENV_LOCK`. Both touch `TOOLPATH_CONFIG_DIR`, so under parallel `cargo test` an `EnvGuard::set("TOOLPATH_CONFIG_DIR", dirB)` could run while the cache-hit test held `TEST_ENV_LOCK` with `dirA`, making `config_dir()` read `dirB` → cache miss → download → 500. (`std::env::set_var`/`var_os` are also not thread-safe, so concurrent access across the two locks is a data race on the environ regardless.) The `EnvGuard` SAFETY comment's claim that "no other tests mutate or read TOOLPATH_CONFIG_DIR" was simply false. #112 only changed production upload code; it surfaced a latent flake by chance. Fix: `EnvGuard` now acquires the shared `config::TEST_ENV_LOCK` and the private `ENV_LOCK` is deleted, leaving exactly one env lock in the crate so all env-touching tests are mutually exclusive. (The PATH/HOME resume guards already run only inside `TEST_ENV_LOCK`-holding tests.) Test-only change. Verified: 14 full-suite runs (incl. 6 at 32 threads) green; clippy clean.
|
🔍 Preview deployed: https://b19868e7.toolpath.pages.dev |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Symptom
mainis red. The CI run for the #112 merge (run 28177878573) failed one test:That test seeds the cache and points downloads at a 500 mock to prove
resolve_inputnever hits the network on a cache hit. It reached the network — a cache miss.Root cause — two locks, one global resource
The failure is a test-isolation race, not a #112 regression (#112 changed only production upload code; it surfaced a latent flake by chance — the parent run passed by luck, and the test passes in isolation).
The process environment was guarded by two independent mutexes:
cmd_pathbase'sEnvGuard→ a privateENV_LOCKcmd_resume/cmd_cache/cmd_export/config→config::TEST_ENV_LOCKBoth mutate/read
TOOLPATH_CONFIG_DIR. Under parallelcargo test, anEnvGuard::set("TOOLPATH_CONFIG_DIR", dirB)can run while the cache-hit test holdsTEST_ENV_LOCKwithdirA, soconfig_dir()readsdirB→ the seeded cache file isn't there → download → 500. (std::env::set_var/var_osare also not thread-safe, so concurrent access across the two locks is a data race on the environ regardless.) TheEnvGuardSAFETY comment claiming "no other tests mutate or readTOOLPATH_CONFIG_DIR" was false.Fix
EnvGuardnow acquires the sharedconfig::TEST_ENV_LOCK; the privateENV_LOCKis deleted. There is now exactly one env lock in the crate, so every env-touching test is mutually exclusive by construction. (The PATH/HOME resume guards already run only insideTEST_ENV_LOCK-holding tests, so they were never part of the race.)Test-only change; no production code.
Verification
path-cli --libruns green, including 6 atRUST_TEST_THREADS=32cargo clippy -p path-cli --all-targets -- -D warningscleanNeed help on this PR? Tag
/codesmithwith what you need. Autofix is disabled.