Skip to content

Fix double-shufti accelerator missing matches across stream chunk boundaries#402

Open
yoavwizstein wants to merge 2 commits into
VectorCamp:developfrom
yoavwizstein:fix/double-shufti-streaming-cross-chunk
Open

Fix double-shufti accelerator missing matches across stream chunk boundaries#402
yoavwizstein wants to merge 2 commits into
VectorCamp:developfrom
yoavwizstein:fix/double-shufti-streaming-cross-chunk

Conversation

@yoavwizstein

Copy link
Copy Markdown

What's wrong

Some patterns that accelerate with a two-char-class double-shufti miss matches in
streaming mode when a match's two-byte anchor lands on a chunk boundary. Works in
block mode and as a single chunk, so streaming isn't split-invariant. Silent
false negative.

Patterns I hit it with:

[0-9]{1,2}c[a-f]*
[b-d]{1,3}[cd]+\d*
[a-c]{1,3}[ac][ab]*
[ac]{1,3}[cd]+

Why

The double accel needs byte[i] and byte[i+1], so the last byte of the buffer is
handled in check_last_byte. It only stopped there if the byte completed a
single-char pattern. If the byte is just the first char of a pair, it got
skipped. Fine at real end of data, but at a chunk boundary the second char is in
the next chunk, so the match is lost. The accel can't tell EOD from a chunk
boundary, so it has to be careful with the last byte either way.

This is the streaming sibling of #358. That was the block-mode false negative
from #325, fixed in #368 by scanning the last byte up to c_end. Correct for
block, but it reintroduced the skip at chunk boundaries, and no test caught it
because the double accel had no streaming split test.

Fix

One line in check_last_byte: also stop when the last byte is in the first-char
class, returning buf_end-1. That's a resume point, not a match, so no false
positives. The engine re-checks the byte and carries state into the next chunk.
Covers every engine that uses the accel (LimEx, McClellan, Gough, Sheng,
McSheng).

Tests

Added DoubleShuftiStreaming.SplitInvariant: block matches vs streaming at every
split offset. Fails on develop (drops offset 64 at split 63 for the first
pattern), passes with the fix. Also updated the internal shufti.cpp assertions
for the new last-byte behavior.

unit-internal passes (14196), the new test and Identical pass (43), and a fuzz of
2.18M chunked scans gave 0 FN and 0 FP.

Performance

Not in the SIMD loop, just one check at the tail. Ran hsbench in block mode,
patched vs the same tree with only this reverted, interleaved and pinned to one
core, median of 12: realistic corpus +1.4%, worst case -1.2%. The realistic
number being positive means it's all noise.

Alternative

Could thread an is_eod flag down so block stays byte-identical and only streaming
changes. Cleaner in principle, but it touches every engine's accel call site for
no measurable gain. Happy to do that instead if you'd prefer.

…ng mode

shuftiDoubleExec finds the first position i where byte[i] is in class1 and
byte[i+1] is in class2, so it must look one byte past the position it reports.
shuftiDoubleExecReal excludes the final buffer byte from its main loop and
defers it to check_last_byte, which zero-pads the absent byte[i+1].

check_last_byte only stopped on the last byte when it completed a single-char
(wildcard second char) pattern. When the last byte is merely the first char of
a double it was skipped. That is correct at true end-of-data, but at an
intermediate stream chunk boundary the second char is the first byte of the
next chunk, so the straddling match was silently dropped - a streaming
split-invariance violation.

Also stop on the last byte when it is in the first-char class (last_elem !=
0xff), returning buf_end-1. That is a resume point, not a reported match: the
engine re-examines the byte, carries the pending-first-char state into the next
chunk (fixing the false negative), and at true EOD finds nothing (no false
positive). The change is off the SIMD hot loop.

Add DoubleShuftiStreaming.SplitInvariant (block match set vs every streaming
split) and update the internal shufti tests to the new last-byte contract.
@yoavwizstein yoavwizstein marked this pull request as ready for review July 1, 2026 08:59
@markos

markos commented Jul 1, 2026

Copy link
Copy Markdown

hi @yoavwizstein, thanks for your PR. As this is obviously an AI-generated PR, I will have to confirm that it is a human that sent it and that you actually understand the problem it solves and how it solves it. I will not accepted completely automated PRs and will reject them right away, unless there is a person involved that will actually take credit and responsibility. Even though this is a simple PR, this will only start to get more prevalent and I need to protect the project -and my time.
I hope you understand.

@yoavwizstein

Copy link
Copy Markdown
Author

Hi markos, definitely human here :)
was using AI to investigate false negatives(eg regex SHOULD match but does not) on 5.4.11 which i then confirmed to still happen on develop atleast on stream mode, the research AND the development was AI assisted.
The bug here is real! you get false-negatives for the end of a chunk/streamed block, which can be easily reproduced, should i hand over a replication? or are the regexes and caption above sufficient?

lmk if you need anything else from me!

@markos

markos commented Jul 1, 2026

Copy link
Copy Markdown

I just wanted to make sure that it's an actual human that submitted the PR and can explain it. I will do some manual testing to confirm myself -though your explanation seems very logical and thank you for the catch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants