Fix DiceGenetic.mate continuous-mutation typo pinning to lower bound (#441)#472
Open
jbbqqf wants to merge 1 commit into
Open
Fix DiceGenetic.mate continuous-mutation typo pinning to lower bound (#441)#472jbbqqf wants to merge 1 commit into
jbbqqf wants to merge 1 commit into
Conversation
`mate()` mutates a continuous feature with `np.random.uniform(low, high)`, but the historical implementation passed `feature_range[feat_name][0]` for both arguments. Every continuous mutation therefore collapsed to the lower bound of the feature's range, leaving the genetic search to explore continuous axes only via the parent-gene branches (prob < 0.80) — half the algorithm's diversity strategy was a no-op. The two adjacent `np.random.uniform` calls in this file (do_random_init at line 92, do_KD_init at line 124) both correctly pass `low, high`, so this is a clear typo rather than an intentional design choice. Fixing it restores the documented genetic-search behaviour and is consistent with multiple user reports that the genetic explainer "ignores" the user's permitted feature range / features_to_vary intent on continuous columns (see interpretml#441 and interpretml#260 comment threads). Adds `TestMateContinuousMutation` driving 200 mutation events on a single continuous feature with range [0, 100] and asserting that at least one sample lands in the upper half. The test fails on `origin/main` (every sample is 0.0) and passes on this branch. Adds a code comment explaining the choice for future reviewers, since the pattern would otherwise be easy to re-introduce in another mutation path. Closes interpretml#441. Signed-off-by: Jean-Baptiste Braun <jbaptiste.braun@gmail.com>
3afa543 to
4647afa
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
DiceGenetic.mate()mutates a continuous feature withnp.random.uniform(low, high), but the call passedfeature_range[feat_name][0]for both arguments. As a result everycontinuous mutation collapsed to the lower bound of the feature's
declared range, leaving the genetic search to explore continuous axes
only via the parent-gene branches (prob < 0.80). Half of the
diversity-injection strategy was a no-op.
The two adjacent
np.random.uniformcalls in the same file(
do_random_initat line ~92 anddo_KD_initat line ~124) bothcorrectly pass
low, high, so this is a clear typo rather than anintentional design choice — and it matches several user reports that
the genetic explainer "ignores" the user's
permitted_range/features_to_varyintent on continuous columns (see #441 and #260comment threads).
Why
One-character typo fix in the
else:mutation branch ofDiceGenetic.mate. Restores the documented diversity behaviour of thegenetic algorithm.
I left a 4-line comment at the call site explaining the prior bug —
without it, a future reviewer or refactor could easily re-introduce
the same typo in another mutation path.
Reproduce BEFORE/AFTER yourself (copy-paste)
What I ran locally
origin/mainwith all 200 samples == 0.0 (every mutation pinned to the lower bound of the feature range, exactly as the typo dictates)np.random.uniform(low, high)call sites confirms the fix matches the established pattern in this fileEdge cases
mate[0, 100]mate[0, 100]matenp.random.choicedo_random_initlow, high)do_KD_initlow, high)AI disclosure
This change was prepared with the assistance of Claude (Anthropic).
The author reviewed every line and is responsible for the final result.