C++: Get rid of an ugly workaround in dataflow#21227
Conversation
There was a problem hiding this comment.
Pull request overview
Refactors the C/C++ IR dataflow SSA implementation to correctly model postfix increment/decrement by introducing an explicit “saved value” SSA variable, eliminating the previous location-based workaround and restoring correct dataflow behavior.
Changes:
- Added new dataflow tests covering postfix/prefix crements, compound assignments, and conditional-expression cases.
- Updated expected AST/IR flow outputs for the new tests.
- Updated SSA implementation to represent postfix crements via dedicated “saved” source variables/defs/uses, replacing the prior hack.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cpp/ql/test/library-tests/dataflow/dataflow-tests/test.cpp | Adds new test cases validating desired postfix crement flow behavior and related edge cases. |
| cpp/ql/test/library-tests/dataflow/dataflow-tests/test-source-sink.expected | Updates expected AST/IR flow results corresponding to the new tests. |
| cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll | Implements the new SSA modeling for postfix crements by adding saved-value source variables and wiring corresponding defs/uses. |
Comments suppressed due to low confidence (2)
cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll:50
- Same grammar nit in the doc comment here: "generated by an postfix" should be "generated by a postfix".
* Holds if `store` is the `StoreInstruction` generated by an postfix
cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll:697
- This call site uses
hasBaseSourceVariableAndIndirectrion; if you fix the typo in the predicate name, make sure this reference is updated to match.
this.hasBaseSourceVariableAndIndirectrion(v, indirection)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
IdrissRio
left a comment
There was a problem hiding this comment.
It took me some time to review this since my IR knowledge is fairly limited. That said, the approach makes a lot of sense and the result is much cleaner. I only have one question regarding conditional post increment.
I locally tested pointer dereference with postfix sink(*p++) and it works, still I would add it as a test (up to you).
Moreover DCA looks good.
| sink(x); | ||
|
|
||
| x = source(); | ||
| sink((b ? x : y)++); // $ ast MISSING: ir |
There was a problem hiding this comment.
Is this due to strict_count(left.getAUse()) == 2? If so, anything that looks like (non trivial expression)++ would be considered missing right?
There was a problem hiding this comment.
I didnt investigate it too deeply, but I think this is related to our less than ideal handling of value categories related to ternary expressions. We have a related internal issue for this. I'll make sure to link this comment to that issue.
|
|
||
| private module SourceVariables { | ||
| /** | ||
| * Holds if `store` is the `StoreInstruction` generated by n postfix |
There was a problem hiding this comment.
Small typo. It should be a postifix :)
Thanks a lot, Idriss! 🙇 We have some mean query tests of the form |
For a long time we've had an ugly hack in the C/C++ dataflow library related to postfix operator uses. This hack comes from the fact that we want dataflow in cases like:
but not in:
The hack got this by modifying the location of the read attached to
x++. However, it's both ugly and error prone (and, in fact, for some range analysis work I'm doing this was causing infinite looping in the shared range analysis library).This PR gets rid of this hack and instead properly models this in our SSA library. So for something like:
we create a new SSA source variable, definition and use such that the above program is more faithfully represented as:
this ensures that we get the correct dataflow behavior.
Commit-by-commit review recommended.
I've tested that the
isUseAfterPostfixCrement0predicate indeed correspond exactly to the uses of postfix increments and decrements in lots of databases (by checking whether all theStoreInstructionsfor whichisUseAfterPostfixCrement0(store, _)holds come from here, and vice versa).