-
-
Notifications
You must be signed in to change notification settings - Fork 34.5k
gh-146044: Fix ctrl-w (unix-word-rubout) to use whitespace word boundaries #146174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 3 commits
678eefd
853cb8c
99023b2
e6d27d9
9f54090
140faf1
c8d1759
0a25f0b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -417,6 +417,22 @@ def bow(self, p: int | None = None) -> int: | |||||
| p -= 1 | ||||||
| return p + 1 | ||||||
|
|
||||||
| def bow_whitespace(self, p: int | None = None) -> int: | ||||||
| """Return the 0-based index of the whitespace-delimited word break | ||||||
| preceding p most immediately. | ||||||
|
|
||||||
| p defaults to self.pos; only whitespace is considered a word | ||||||
| boundary, matching the behavior of unix-word-rubout in bash/readline.""" | ||||||
| if p is None: | ||||||
| p = self.pos | ||||||
| b = self.buffer | ||||||
| p -= 1 | ||||||
| while p >= 0 and b[p] in (" ", "\n"): | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
| p -= 1 | ||||||
| while p >= 0 and b[p] not in (" ", "\n"): | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OOC newlines are also counted but what about tabs? do we also convert them to 4-indents in the REPL or? |
||||||
| p -= 1 | ||||||
| return p + 1 | ||||||
|
|
||||||
| def eow(self, p: int | None = None) -> int: | ||||||
| """Return the 0-based index of the word break following p most | ||||||
| immediately. | ||||||
|
|
||||||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -558,3 +558,47 @@ def test_control_characters(self): | |||||
| reader, _ = handle_all_events(events) | ||||||
| self.assert_screen_equal(reader, 'flag = "🏳️\\u200d🌈"', clean=True) | ||||||
| self.assert_screen_equal(reader, 'flag {o}={z} {s}"🏳️\\u200d🌈"{z}'.format(**colors)) | ||||||
|
|
||||||
|
|
||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Remove those extra blanks |
||||||
| class TestBowWhitespace(TestCase): | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This entire test case can have one single reference to the GH issue as a comment and we can remove them from the methods. However I would prefer that we extend the existing test case with the bow tests and place them where existing ones are. |
||||||
| def test_bow_whitespace_stops_at_whitespace(self): | ||||||
| # GH#146044 | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. # See https://github.com/cpython/issues/146044Use a link looking like that so that I can click on it in my IDE. I created the link from memory sonjust check that it is the correct URL. |
||||||
| # unix-word-rubout (ctrl-w) should use whitespace boundaries, | ||||||
| # not punctuation boundaries like bow() does | ||||||
| reader = prepare_reader(prepare_console([])) | ||||||
| reader.buffer = list("foo.bar baz") | ||||||
| reader.pos = len(reader.buffer) # cursor at end | ||||||
|
|
||||||
| # bow_whitespace from end should jump to start of "baz" | ||||||
| result = reader.bow_whitespace() | ||||||
| self.assertEqual(result, 8) # index of 'b' in "baz" | ||||||
|
|
||||||
| def test_bow_whitespace_includes_punctuation_in_word(self): | ||||||
| # GH#146044 | ||||||
| reader = prepare_reader(prepare_console([])) | ||||||
| reader.buffer = list("foo.bar(baz) qux") | ||||||
| reader.pos = 12 # cursor after ")" | ||||||
|
|
||||||
| # bow_whitespace should treat "foo.bar(baz)" as one word | ||||||
| result = reader.bow_whitespace() | ||||||
| self.assertEqual(result, 0) | ||||||
|
|
||||||
| def test_bow_stops_at_punctuation(self): | ||||||
| # Verify existing bow() still uses syntax_table (punctuation boundary) | ||||||
| reader = prepare_reader(prepare_console([])) | ||||||
| reader.buffer = list("foo.bar baz") | ||||||
| reader.pos = len(reader.buffer) | ||||||
|
|
||||||
| result = reader.bow() | ||||||
| self.assertEqual(result, 8) # same — "baz" is all word chars | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Avoid LLM long dashes and use regular english please |
||||||
|
|
||||||
| def test_bow_vs_bow_whitespace_difference(self): | ||||||
| # The key difference: bow() stops at '.', bow_whitespace() does not | ||||||
| reader = prepare_reader(prepare_console([])) | ||||||
| reader.buffer = list("foo.bar") | ||||||
| reader.pos = len(reader.buffer) | ||||||
|
|
||||||
| # bow() stops at '.' → returns index of 'b' in "bar" | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
| self.assertEqual(reader.bow(), 4) | ||||||
| # bow_whitespace() treats entire "foo.bar" as one word | ||||||
| self.assertEqual(reader.bow_whitespace(), 0) | ||||||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,3 @@ | ||||||
| Fix ``unix-word-rubout`` (Ctrl-W) in the REPL to use whitespace-only word | ||||||
| boundaries, matching bash/readline behavior. Previously it used | ||||||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
| syntax-table boundaries which treated punctuation as word separators. | ||||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe unix_bow() instead? or bow_ws()? since we use quite short names.