Commit 23d8497

Minimal Push/Pop support (#2207)
This is the first stage of adding support for stacky/multivaluey things. It adds new push/pop instructions, and so far just shows that they can be read and written, and that the optimizer doesn't do anything immediately wrong on them. No fuzzer support, since there isn't a "correct" way to use these yet. The current test shows some "incorrect" usages of them, which at least shows that we can parse and emit them, but we should replace them with proper usages of push/pop once we actually have those (see comments in the tests). This should be enough to unblock exceptions (which need a pop in try-catches). It is also a step towards multivalue (I added some docs about that), but most of multivalue is left to be done.
1 parent 7d1ff56 commit 23d8497

24 files changed, +419 -14 lines changed

README.md

Lines changed: 18 additions & 0 deletions
@@ -67,6 +67,11 @@ There are a few differences between Binaryen IR and the WebAssembly language:
      WebAssembly's official text format is primarily a linear instruction list
      (with s-expression extensions). Binaryen can't read the linear style, but
      it can read a wasm text file if it contains only s-expressions.
+   * Binaryen uses Stack IR to optimize "stacky" code (that can't be
+     represented in structured form).
+   * In rare cases stacky code must be represented in Binaryen IR as well, like
+     popping a value in an exception catch. To support that Binaryen IR has
+     `push` and `pop` instructions.
  * Types and unreachable code
    * WebAssembly limits block/if/loop types to none and the concrete value types
      (i32, i64, f32, f64). Binaryen IR has an unreachable type, and it allows
@@ -103,6 +108,19 @@ There are a few differences between Binaryen IR and the WebAssembly language:
      emitted when generating wasm. Instead its list of operands will be directly
      used in the containing node. Such a block is sometimes called an "implicit
      block".
+ * Multivalue
+   * Binaryen will not represent multivalue instructions and values directly.
+     Binaryen's main focus is on optimization of wasm, and therefore the question
+     of whether we should have multivalue in the main IR is whether it justifies
+     the extra complexity there. Experiments show that the shrinking of code
+     size thanks to multivalue is useful but small, just 1-3% or so. Given that,
+     we prefer to keep the main IR simple, and focus on multivalue optimizations
+     in Stack IR, which is more suitable for such things.
+   * Binaryen does still need to implement the "ABI" level of multivalue, that
+     is, we need multivalue calls because those may cross module boundaries,
+     and so they are observable externally. To support that, Binaryen may use
+     `push` and `pop` as mentioned earlier; another option is to add LLVM-like
+     `extractvalue/composevalue` instructions.
 
 As a result, you might notice that round-trip conversions (wasm => Binaryen IR
 => wasm) change code a little in some corner cases.
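
As a rough illustration of the `push`/`pop` idea in the README text above, the following Python sketch models an implicit value stack. It is a toy under stated assumptions: `ValueStack` and its methods are hypothetical names invented for this note, not Binaryen APIs, and the commit message itself says there is no "correct" way to use these instructions yet.

```python
from dataclasses import dataclass, field


@dataclass
class ValueStack:
    """Toy model of an implicit value stack (hypothetical, not Binaryen code)."""

    items: list = field(default_factory=list)

    def push(self, wasm_type: str, value) -> None:
        # `push` consumes one child value and produces no result of its own.
        self.items.append((wasm_type, value))

    def pop(self, wasm_type: str):
        # `<type>.pop` has no children; it yields the value on top of the stack.
        if not self.items:
            raise RuntimeError("pop from empty stack")
        found_type, value = self.items.pop()
        if found_type != wasm_type:
            raise TypeError(f"expected {wasm_type}, found {found_type}")
        return value


# Assumed scenario: a multivalue call leaves an i32 and an f64 on the stack,
# and the caller retrieves them with typed pops, last pushed first.
stack = ValueStack()
stack.push("i32", 7)
stack.push("f64", 2.5)
second = stack.pop("f64")
first = stack.pop("i32")
```

The typed `pop` mirrors the `i32.pop` / `f64.pop` spelling added in this commit; the type check on mismatch is an assumption of the sketch.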

scripts/gen-s-parser.py

Lines changed: 5 additions & 0 deletions
@@ -41,6 +41,11 @@
     ("data.drop", "makeDataDrop(s)"),
     ("memory.copy", "makeMemoryCopy(s)"),
     ("memory.fill", "makeMemoryFill(s)"),
+    ("push", "makePush(s)"),
+    ("i32.pop", "makePop(i32)"),
+    ("i64.pop", "makePop(i64)"),
+    ("f32.pop", "makePop(f32)"),
+    ("f64.pop", "makePop(f64)"),
     ("i32.load", "makeLoad(s, i32, /*isAtomic=*/false)"),
     ("i64.load", "makeLoad(s, i64, /*isAtomic=*/false)"),
     ("f32.load", "makeLoad(s, f32, /*isAtomic=*/false)"),

src/gen-s-parser.inc

Lines changed: 39 additions & 9 deletions
@@ -200,6 +200,9 @@ switch (op[0]) {
         default: goto parse_error;
       }
     }
+    case 'p':
+      if (strcmp(op, "f32.pop") == 0) { return makePop(f32); }
+      goto parse_error;
     case 'r':
       if (strcmp(op, "f32.reinterpret_i32") == 0) { return makeUnary(s, UnaryOp::ReinterpretInt32); }
       goto parse_error;
@@ -459,9 +462,17 @@ switch (op[0]) {
         default: goto parse_error;
       }
     }
-    case 'p':
-      if (strcmp(op, "f64.promote_f32") == 0) { return makeUnary(s, UnaryOp::PromoteFloat32); }
-      goto parse_error;
+    case 'p': {
+      switch (op[5]) {
+        case 'o':
+          if (strcmp(op, "f64.pop") == 0) { return makePop(f64); }
+          goto parse_error;
+        case 'r':
+          if (strcmp(op, "f64.promote_f32") == 0) { return makeUnary(s, UnaryOp::PromoteFloat32); }
+          goto parse_error;
+        default: goto parse_error;
+      }
+    }
     case 'r':
       if (strcmp(op, "f64.reinterpret_i64") == 0) { return makeUnary(s, UnaryOp::ReinterpretInt64); }
       goto parse_error;
@@ -1089,9 +1100,17 @@ switch (op[0]) {
     case 'o':
       if (strcmp(op, "i32.or") == 0) { return makeBinary(s, BinaryOp::OrInt32); }
       goto parse_error;
-    case 'p':
-      if (strcmp(op, "i32.popcnt") == 0) { return makeUnary(s, UnaryOp::PopcntInt32); }
-      goto parse_error;
+    case 'p': {
+      switch (op[7]) {
+        case '\0':
+          if (strcmp(op, "i32.pop") == 0) { return makePop(i32); }
+          goto parse_error;
+        case 'c':
+          if (strcmp(op, "i32.popcnt") == 0) { return makeUnary(s, UnaryOp::PopcntInt32); }
+          goto parse_error;
+        default: goto parse_error;
+      }
+    }
     case 'r': {
       switch (op[5]) {
         case 'e': {
@@ -1757,9 +1776,17 @@ switch (op[0]) {
     case 'o':
       if (strcmp(op, "i64.or") == 0) { return makeBinary(s, BinaryOp::OrInt64); }
       goto parse_error;
-    case 'p':
-      if (strcmp(op, "i64.popcnt") == 0) { return makeUnary(s, UnaryOp::PopcntInt64); }
-      goto parse_error;
+    case 'p': {
+      switch (op[7]) {
+        case '\0':
+          if (strcmp(op, "i64.pop") == 0) { return makePop(i64); }
+          goto parse_error;
+        case 'c':
+          if (strcmp(op, "i64.popcnt") == 0) { return makeUnary(s, UnaryOp::PopcntInt64); }
+          goto parse_error;
+        default: goto parse_error;
+      }
+    }
     case 'r': {
       switch (op[5]) {
         case 'e': {
@@ -2198,6 +2225,9 @@ switch (op[0]) {
   case 'n':
     if (strcmp(op, "nop") == 0) { return makeNop(); }
     goto parse_error;
+  case 'p':
+    if (strcmp(op, "push") == 0) { return makePush(s); }
+    goto parse_error;
   case 'r':
     if (strcmp(op, "return") == 0) { return makeReturn(s); }
     goto parse_error;
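
The generated switches above now branch on `op[7]` for `i32.pop` versus `i32.popcnt` and on `op[5]` for `f64.pop` versus `f64.promote_f32`. The Python sketch below shows one way such an index could be found: treat each name as NUL-terminated and take the first position where the names differ. The helper names `first_split_index` and `split_cases` are hypothetical and are not taken from `scripts/gen-s-parser.py`, whose actual algorithm is not shown in this diff.

```python
def first_split_index(names):
    """Return the first index at which the NUL-terminated names differ."""
    terminated = [name + "\0" for name in names]
    shortest = min(len(name) for name in terminated)
    for index in range(shortest):
        if len({name[index] for name in terminated}) > 1:
            return index
    raise ValueError("names are identical")


def split_cases(names):
    """Group names by the character found at the first differing index."""
    index = first_split_index(names)
    cases = {}
    for name in names:
        key = (name + "\0")[index]
        cases.setdefault(key, []).append(name)
    return index, cases


i32_index, i32_cases = split_cases(["i32.pop", "i32.popcnt"])
f64_index, f64_cases = split_cases(["f64.pop", "f64.promote_f32"])
```

Appending the terminator is what lets a name that is a strict prefix of another (`i32.pop` inside `i32.popcnt`) get its own `'\0'` case, matching the `case '\0':` arms in the generated code.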

src/ir/ExpressionAnalyzer.cpp

Lines changed: 2 additions & 0 deletions
@@ -209,6 +209,8 @@ template<typename T> void visitImmediates(Expression* curr, T& visitor) {
     }
     void visitNop(Nop* curr) {}
     void visitUnreachable(Unreachable* curr) {}
+    void visitPush(Push* curr) {}
+    void visitPop(Pop* curr) {}
   } singleton(curr, visitor);
 }

src/ir/ReFinalize.cpp

Lines changed: 2 additions & 0 deletions
@@ -157,6 +157,8 @@ void ReFinalize::visitReturn(Return* curr) { curr->finalize(); }
 void ReFinalize::visitHost(Host* curr) { curr->finalize(); }
 void ReFinalize::visitNop(Nop* curr) { curr->finalize(); }
 void ReFinalize::visitUnreachable(Unreachable* curr) { curr->finalize(); }
+void ReFinalize::visitPush(Push* curr) { curr->finalize(); }
+void ReFinalize::visitPop(Pop* curr) { curr->finalize(); }
 
 void ReFinalize::visitFunction(Function* curr) {
   // we may have changed the body from unreachable to none, which might be bad

src/ir/effects.h

Lines changed: 2 additions & 0 deletions
@@ -368,6 +368,8 @@ struct EffectAnalyzer
   }
   void visitNop(Nop* curr) {}
   void visitUnreachable(Unreachable* curr) { branches = true; }
+  void visitPush(Push* curr) { calls = true; }
+  void visitPop(Pop* curr) { calls = true; }
 
   // Helpers
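
The hunk above marks both `push` and `pop` with `calls = true`. The diff does not state why; a plausible reading is that treating them like calls is a conservative way to stop the optimizer from reordering them. The Python sketch below illustrates that reading with a toy effect summary. The `Effects` class, its conflict rule, and the `EFFECTS` table are assumptions made for this note and are much simpler than Binaryen's `EffectAnalyzer`.

```python
from dataclasses import dataclass


@dataclass
class Effects:
    """Toy effect summary (hypothetical; not Binaryen's EffectAnalyzer)."""

    calls: bool = False
    reads_memory: bool = False
    writes_memory: bool = False

    def touches_state(self) -> bool:
        return self.calls or self.reads_memory or self.writes_memory

    def may_conflict_with(self, other: "Effects") -> bool:
        # Assumed rule: a call may do anything, so it conflicts with any
        # other node that touches state; a write conflicts the same way.
        if self.calls and other.touches_state():
            return True
        if other.calls and self.touches_state():
            return True
        return (self.writes_memory and other.touches_state()) or (
            other.writes_memory and self.touches_state()
        )


# Effect summaries per instruction; push and pop are flagged as calls,
# mirroring the two added lines in the diff.
EFFECTS = {
    "nop": Effects(),
    "load": Effects(reads_memory=True),
    "push": Effects(calls=True),
    "pop": Effects(calls=True),
}


def can_reorder(first: str, second: str) -> bool:
    """Two instructions may be swapped only if their effects do not conflict."""
    return not EFFECTS[first].may_conflict_with(EFFECTS[second])
```

Under these assumed rules, `can_reorder("push", "pop")` is false while `can_reorder("nop", "push")` is true, which is the kind of conservative behavior the `calls` flag would give.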

src/ir/utils.h

Lines changed: 4 additions & 0 deletions
@@ -146,6 +146,8 @@ struct ReFinalize
   void visitHost(Host* curr);
   void visitNop(Nop* curr);
   void visitUnreachable(Unreachable* curr);
+  void visitPush(Push* curr);
+  void visitPop(Pop* curr);
 
   void visitFunction(Function* curr);
 
@@ -203,6 +205,8 @@ struct ReFinalizeNode : public OverriddenVisitor<ReFinalizeNode> {
   void visitHost(Host* curr) { curr->finalize(); }
   void visitNop(Nop* curr) { curr->finalize(); }
   void visitUnreachable(Unreachable* curr) { curr->finalize(); }
+  void visitPush(Push* curr) { curr->finalize(); }
+  void visitPop(Pop* curr) { curr->finalize(); }
 
   void visitFunctionType(FunctionType* curr) { WASM_UNREACHABLE(); }
   void visitExport(Export* curr) { WASM_UNREACHABLE(); }

src/passes/DeadCodeElimination.cpp

Lines changed: 4 additions & 0 deletions
@@ -308,6 +308,10 @@ struct DeadCodeElimination
         DELEGATE(MemoryCopy);
       case Expression::Id::MemoryFillId:
         DELEGATE(MemoryFill);
+      case Expression::Id::PushId:
+        DELEGATE(Push);
+      case Expression::Id::PopId:
+        DELEGATE(Pop);
       case Expression::Id::InvalidId:
         WASM_UNREACHABLE();
       case Expression::Id::NumExpressionIds:

src/passes/Precompute.cpp

Lines changed: 2 additions & 0 deletions
@@ -130,6 +130,8 @@ class PrecomputingExpressionRunner
   Flow visitMemoryCopy(MemoryCopy* curr) { return Flow(NOTPRECOMPUTABLE_FLOW); }
   Flow visitMemoryFill(MemoryFill* curr) { return Flow(NOTPRECOMPUTABLE_FLOW); }
   Flow visitHost(Host* curr) { return Flow(NOTPRECOMPUTABLE_FLOW); }
+  Flow visitPush(Push* curr) { return Flow(NOTPRECOMPUTABLE_FLOW); }
+  Flow visitPop(Pop* curr) { return Flow(NOTPRECOMPUTABLE_FLOW); }
 
   void trap(const char* why) override { throw NonstandaloneException(); }
 };
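
In the hunk above, `push` and `pop` return `NOTPRECOMPUTABLE_FLOW`, like the memory and host operations next to them. A minimal Python sketch of the same idea follows: a constant folder that gives up on any subtree containing a node it cannot evaluate at compile time. The tuple encoding of expressions and the `NOT_PRECOMPUTABLE` sentinel are assumptions for this note, not Binaryen data structures.

```python
# Sentinel meaning "this expression cannot be evaluated ahead of time".
NOT_PRECOMPUTABLE = object()


def precompute(expr):
    """Fold a toy expression tree, or return NOT_PRECOMPUTABLE.

    Expressions are tuples: ("const", value), ("add", left, right),
    ("push", child), ("pop", type_name).
    """
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "add":
        left = precompute(expr[1])
        right = precompute(expr[2])
        # One non-precomputable child makes the whole expression so.
        if left is NOT_PRECOMPUTABLE or right is NOT_PRECOMPUTABLE:
            return NOT_PRECOMPUTABLE
        return left + right
    # push, pop, and anything unknown depend on runtime stack state.
    return NOT_PRECOMPUTABLE


folded = precompute(("add", ("const", 2), ("const", 3)))
blocked = precompute(("add", ("const", 2), ("pop", "i32")))
```

Here `folded` is a plain integer, while `blocked` is the sentinel because one operand is a `pop`.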

src/passes/Print.cpp

Lines changed: 22 additions & 3 deletions
@@ -57,7 +57,8 @@ static Name printableLocal(Index index, Function* func) {
 
 // Prints the internal contents of an expression: everything but
 // the children.
-struct PrintExpressionContents : public Visitor<PrintExpressionContents> {
+struct PrintExpressionContents
+  : public OverriddenVisitor<PrintExpressionContents> {
   Function* currFunction = nullptr;
   std::ostream& o;
 
@@ -1150,11 +1151,17 @@ struct PrintExpressionContents : public Visitor<PrintExpressionContents> {
   }
   void visitNop(Nop* curr) { printMinor(o, "nop"); }
   void visitUnreachable(Unreachable* curr) { printMinor(o, "unreachable"); }
+  void visitPush(Push* curr) { prepareColor(o) << "push"; }
+  void visitPop(Pop* curr) {
+    prepareColor(o) << printType(curr->type);
+    o << ".pop";
+    restoreNormalColor(o);
+  }
 };
 
 // Prints an expression in s-expr format, including both the
 // internal contents and the nested children.
-struct PrintSExpression : public Visitor<PrintSExpression> {
+struct PrintSExpression : public OverriddenVisitor<PrintSExpression> {
   std::ostream& o;
   unsigned indent = 0;
 
@@ -1205,7 +1212,7 @@ struct PrintSExpression : public Visitor<PrintSExpression> {
 
   void visit(Expression* curr) {
     printDebugLocation(curr);
-    Visitor<PrintSExpression>::visit(curr);
+    OverriddenVisitor<PrintSExpression>::visit(curr);
   }
 
   void setMinify(bool minify_) {
@@ -1621,6 +1628,18 @@ struct PrintSExpression : public Visitor<PrintSExpression> {
     PrintExpressionContents(currFunction, o).visit(curr);
     o << ')';
   }
+  void visitPush(Push* curr) {
+    o << '(';
+    PrintExpressionContents(currFunction, o).visit(curr);
+    incIndent();
+    printFullLine(curr->value);
+    decIndent();
+  }
+  void visitPop(Pop* curr) {
+    o << '(';
+    PrintExpressionContents(currFunction, o).visit(curr);
+    o << ')';
+  }
   // Module-level visitors
   void visitFunctionType(FunctionType* curr, Name* internalName = nullptr) {
     o << "(func";
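
The printer changes above suggest the text shapes `(push <child>)` with the child on its own indented line, and `(<type>.pop)` with no children. The Python sketch below reproduces those shapes for a toy tuple-encoded tree. The function `print_expr`, the tuple encoding, and the one-space indent step are assumptions for this note; the exact whitespace Binaryen emits is not shown in this diff.

```python
def print_expr(expr, indent=0):
    """Render a toy expression as an s-expression string.

    Expressions are tuples: ("const", type_name, value),
    ("pop", type_name), ("push", child).
    """
    pad = " " * indent
    kind = expr[0]
    if kind == "const":
        return f"{pad}({expr[1]}.const {expr[2]})"
    if kind == "pop":
        # A pop has no children: the type prefix and ".pop" are its whole body.
        return f"{pad}({expr[1]}.pop)"
    if kind == "push":
        # A push prints its single child on its own, more indented line.
        child = print_expr(expr[1], indent + 1)
        return f"{pad}(push\n{child}\n{pad})"
    raise ValueError(f"unknown expression kind: {kind}")


pop_text = print_expr(("pop", "i32"))
push_text = print_expr(("push", ("const", "i32", 1)))
```

In this sketch `pop_text` is the single line `(i32.pop)`, and `push_text` spans three lines with the constant nested inside the `push`.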
