Skip to content

Commit f7251cf

Browse files
branchseerclaude
andcommitted
docs: add comprehensive performance analysis
Includes detailed timing analysis from Chrome tracing (tracing-chrome) across global CLI, NAPI startup, vite-task session, and task execution phases. Key finding: vite.config.ts loading via JS callbacks accounts for 99% of the overhead before task execution (~930ms for 10-package monorepo). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent a2ab05b commit f7251cf

1 file changed

Lines changed: 291 additions & 0 deletions

File tree

performance.md

Lines changed: 291 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,291 @@
1+
# vite-plus Performance Analysis
2+
3+
Performance measurements from E2E tests (Ubuntu, GitHub Actions runner).
4+
5+
**Test projects**: vibe-dashboard (single-package), frm-stack (multi-package monorepo)
6+
**Node.js**: 24.14.0 (managed by vite-plus js_runtime)
7+
**Commands tested**: `vp fmt`, `vp test`, `vp run build`, `vp run lint:check`, `vp run @yourcompany/api#test`
8+
9+
## Architecture Overview
10+
11+
A `vp` command invocation traverses multiple layers:
12+
13+
```
14+
User runs `vp run lint:check`
15+
16+
├─ [1] Global CLI (Rust binary `vp`) ~3-8ms
17+
│ ├─ argv0 processing ~40μs
18+
│ ├─ Node.js runtime resolution ~1.3ms
19+
│ ├─ Module resolution (oxc_resolver) ~170μs
20+
│ └─ Delegates to local CLI via exec(node bin.js)
21+
22+
├─ [2] Node.js startup + NAPI loading ~3.7ms
23+
│ └─ bin.ts entry → import NAPI binding → call run()
24+
25+
├─ [3] Rust core via NAPI (vite-task session)
26+
│ ├─ Session init ~60μs
27+
│ ├─ load_package_graph (workspace discovery) ~4ms
28+
│ ├─ load_user_config_file × N (JS callbacks) ~160ms first, ~3-12ms subsequent
29+
│ ├─ handle_command + resolve (JS callbacks) ~1.3ms
30+
│ └─ Task execution (spawns child processes)
31+
32+
└─ [4] Task spawns (child processes)
33+
├─ Spawn 1: pnpm install / dependsOn ~0.95-1.05s
34+
└─ Spawn 2: actual command varies (1-6s)
35+
```
36+
37+
## Phase 1: Global CLI (Rust binary)
38+
39+
Measured via Chrome tracing from the `vp` binary process.
40+
Timestamps are relative to process start (microseconds).
41+
42+
### Breakdown (6 invocations, vibe-dashboard, Ubuntu)
43+
44+
| Stage | Time from start | Duration |
45+
|---|---|---|
46+
| argv0 processing | 37-57μs | ~40μs |
47+
| Runtime resolution start | 482-684μs | ~500μs |
48+
| Node.js version selected | 714-1042μs | ~300μs |
49+
| LTS alias resolved | 723-1075μs | ~10μs |
50+
| Version index cache check | 1181-1541μs | ~400μs |
51+
| Node.js version resolved | 1237-1593μs | ~50μs |
52+
| Node.js cache confirmed | 1302-1627μs | ~50μs |
53+
| **oxc_resolver start** | **3058-7896μs** ||
54+
| oxc_resolver complete | 3230-8072μs | **~170μs** |
55+
| Delegation to Node.js | 3275-8160μs | ~40μs |
56+
57+
### Key Observations
58+
59+
- **Total global CLI overhead**: 3.3ms - 8.2ms per invocation
60+
- **oxc_resolver** is extremely fast (~170μs), resolving `vite-plus/package.json` via node_modules
61+
- **Dominant variable cost**: Gap between "Node cached" and "oxc_resolver start" (1.7-6.6ms). This includes CLI argument parsing, command dispatch, and resolver initialization
62+
- **Node.js runtime resolution** consistently uses cached version index and cached Node.js binary (~1.3ms)
63+
64+
## Phase 2: Node.js Startup + NAPI Loading
65+
66+
Measured from NAPI-side Chrome traces (frm-stack project).
67+
68+
The NAPI `run()` function is first called at **~3.7ms** from Node.js process start:
69+
70+
| Event | Time (μs) | Notes |
71+
|---|---|---|
72+
| NAPI `run()` entered | 3,682 | First trace event from NAPI module |
73+
| `napi_run: start` | 3,950 | After ThreadsafeFunction setup |
74+
| `cli::main` span begins | 4,116 | CLI argument processing starts |
75+
76+
This means **Node.js startup + ES module loading + NAPI binding initialization takes ~3.7ms**.
77+
78+
## Phase 3: Rust Core via NAPI (vite-task)
79+
80+
### NAPI-side Detailed Breakdown (frm-stack `vp run lint:check`)
81+
82+
From Chrome trace, all times in μs from process start:
83+
84+
```
85+
3,682 NAPI run() entered
86+
3,950 napi_run: start
87+
4,116 cli::main begins
88+
4,742 execute_vite_task_command begins
89+
4,865 session::init begins
90+
4,907 init_with begins
91+
4,923 init_with ends ── 16μs
92+
4,924 session::init ends ── 59μs
93+
4,925 session::main begins
94+
4,931 plan_from_cli_run_resolved begins
95+
4,935 plan_query begins
96+
4,941 load_task_graph begins
97+
4,943 task_graph::load begins
98+
4,944 load_package_graph begins ━━ 3.8ms
99+
8,764 load_package_graph ends
100+
8,779 load_user_config_file #1 begins ━━ 164ms (first vite.config.ts load)
101+
173,248 load_user_config_file #1 ends
102+
173,265 load_user_config_file #2 begins ━━ 12ms
103+
185,212 load_user_config_file #2 ends
104+
185,221 load_user_config_file #3 begins ━━ 3.4ms
105+
188,666 load_user_config_file #3 ends
106+
188,675 load_user_config_file #4 begins ━━ 741ms (cold import of workspace package config)
107+
929,476 load_user_config_file #4 ends
108+
... (subsequent loads: ~3-5ms each)
109+
```
110+
111+
### Critical Finding: vite.config.ts Loading is the Bottleneck
112+
113+
The **`load_user_config_file`** callback (which calls back into JavaScript to load `vite.config.ts` for each workspace package) dominates the task graph loading time:
114+
115+
| Config Load | Duration | Notes |
116+
|---|---|---|
117+
| First package | **164ms** | Cold import: requires JS module resolution + transpilation |
118+
| Second package | **12ms** | Warm: shared dependencies already cached |
119+
| Third package | **3.4ms** | Warm: nearly all deps cached |
120+
| Fourth package (different deps) | **741ms** | Cold: imports new heavy dependencies |
121+
| Subsequent packages | **3-5ms** each | All warm |
122+
123+
**For frm-stack (10 packages), total config loading: ~930ms** — this is the single largest cost.
124+
125+
### Callback Timing (`handle_command` + `resolve`)
126+
127+
After the task graph is loaded, vite-task calls back into JavaScript to resolve the tool binary:
128+
129+
```
130+
937,757 handle_command begins
131+
937,868 resolve begins
132+
937,873 js_resolver begins (test command)
133+
939,126 js_resolver ends ── 1.25ms
134+
939,187 resolve ends
135+
939,189 handle_command ends ── 1.43ms
136+
```
137+
138+
The `js_resolver` callback (which locates the test runner binary via JavaScript) takes **~1.25ms**.
139+
140+
### Session Init Timing Comparison
141+
142+
| Stage | frm-stack (10 packages) | Notes |
143+
|---|---|---|
144+
| Session init | ~60μs | Minimal setup |
145+
| load_package_graph | ~4ms | Workspace discovery |
146+
| load_user_config_file (all) | **~930ms** | JS callbacks, dominant cost |
147+
| handle_command + resolve | ~1.4ms | Tool binary resolution |
148+
| **Total before task execution** | **~936ms** | |
149+
150+
## Phase 4: Task Execution (vibe-dashboard)
151+
152+
### Spawn Timing (First Run — Cold)
153+
154+
| Command | Spawn 1 (setup) | Spawn 2 (execution) | Total |
155+
|---|---|---|---|
156+
| `vp fmt` | 1.05s (977 reads, 50 writes) | 1.00s (163 reads, 1 write) | ~2.1s |
157+
| `vp test` | 0.96s (977 reads, 50 writes) | 5.71s (4699 reads, 26 writes) | ~6.7s |
158+
| `vp run build` | 0.95s (977 reads, 50 writes) | 1.61s (3753 reads, 17 writes) | ~2.6s |
159+
160+
### Spawn Timing (Second Run — Cache Available)
161+
162+
| Command | Spawn 1 (setup) | Spawn 2 (execution) | Total | Delta |
163+
|---|---|---|---|---|
164+
| `vp fmt` | 0.95s (977 reads, 50 writes) | 0.97s (167 reads, 3 writes) | ~1.9s | -0.2s |
165+
| `vp test` | 0.95s (977 reads, 50 writes) | 4.17s (1930 reads, 4 writes) | ~5.1s | **-1.6s** |
166+
| `vp run build` | 0.96s (977 reads, 50 writes) | **cache hit (replayed)** | ~1.0s | **-1.6s** |
167+
168+
### Key Observations
169+
170+
- **Spawn 1 is constant** (~0.95-1.05s, 977 path_reads, 50 path_writes) regardless of command or cache state. This is the workspace/task-graph loading + pnpm resolution overhead.
171+
- **`vp run build` cache hit**: On second run, the build was fully replayed from cache, saving 1.19s. The 977-read spawn 1 still executes.
172+
- **`vp test` improvement**: Second run read 1930 paths (vs 4699), suggesting OS filesystem caching reduced disk I/O.
173+
174+
## Phase 5: Task Cache Effectiveness
175+
176+
vite-task implements a file-system-aware task cache at `node_modules/.vite/task-cache`.
177+
178+
| Command | First Run | Cache Run | Cache Hit? | Savings |
179+
|---|---|---|---|---|
180+
| `vp fmt` | 2.1s | 1.9s | No ||
181+
| `vp test` | 6.7s | 5.1s | No | -1.6s (OS cache) |
182+
| `vp run build` | 2.6s | 1.0s | **Yes** | **-1.6s** (1.19s from task cache) |
183+
184+
**Only `vp run build` was cache-eligible.** Formatting and test commands are not cached (side effects / non-deterministic outputs).
185+
186+
## End-to-End Timeline: Full Command Lifecycle
187+
188+
Combining all phases for a single `vp run lint:check` invocation (frm-stack):
189+
190+
```
191+
T+0.00ms Global CLI starts (Rust binary)
192+
T+0.04ms argv0 processed
193+
T+0.50ms Runtime resolution begins
194+
T+1.30ms Node.js version resolved (cached)
195+
T+3.30ms oxc_resolver finds local vite-plus ── ~170μs
196+
T+3.35ms exec(node, [dist/bin.js, "run", "lint:check"]) ── process replaced
197+
─── Node.js process starts ───
198+
T+3.70ms NAPI run() called (Node.js startup overhead)
199+
T+4.00ms napi_run: start
200+
T+4.12ms cli::main begins
201+
T+4.74ms execute_vite_task_command begins
202+
T+4.94ms load_package_graph begins
203+
T+8.76ms load_package_graph ends ── 3.8ms
204+
T+8.78ms load_user_config_file #1 begins (JS callback)
205+
T+173ms load_user_config_file #1 ends ── 164ms ★ bottleneck
206+
... (more config loads)
207+
T+937ms handle_command begins
208+
T+939ms handle_command ends (js_resolver: 1.25ms)
209+
T+940ms Task execution starts (child process spawn)
210+
... (actual command runs)
211+
```
212+
213+
**Total overhead before task execution: ~940ms**, of which **~930ms (99%) is vite.config.ts loading**.
214+
215+
## Wall-Clock Timelines (vibe-dashboard, Ubuntu)
216+
217+
### First Run
218+
219+
```
220+
19:16:44.039 vp fmt — pnpm download starts
221+
19:16:44.170 vp fmt — cache dir created
222+
19:16:45.158 vp fmt — spawn 1 finished (setup)
223+
19:16:46.028 vp fmt — spawn 2 finished (biome) Total: ~2.0s
224+
19:16:46.082 vp test — pnpm resolution starts
225+
19:16:46.084 vp test — cache dir created
226+
19:16:47.057 vp test — spawn 1 finished (setup)
227+
19:16:52.750 vp test — spawn 2 finished (vitest) Total: ~6.7s
228+
19:16:52.846 vp run build — cache dir created
229+
19:16:53.793 vp run build — spawn 1 finished (setup)
230+
19:16:55.398 vp run build — spawn 2 finished (vite build) Total: ~2.6s
231+
```
232+
233+
**Total first run: ~11.4s** (3 commands sequential)
234+
235+
### Cache Run
236+
237+
```
238+
19:16:56.446 vp fmt — cache dir created
239+
19:16:57.399 vp fmt — spawn 1 finished
240+
19:16:58.368 vp fmt — spawn 2 finished Total: ~1.9s
241+
19:16:58.441 vp test — cache dir created
242+
19:16:59.390 vp test — spawn 1 finished
243+
19:17:03.556 vp test — spawn 2 finished Total: ~5.1s
244+
19:17:03.641 vp run build — cache dir created
245+
19:17:04.596 vp run build — spawn 1 finished
246+
19:17:05.040 vp run build — cache replayed Total: ~1.4s
247+
```
248+
249+
**Total cache run: ~8.6s** (-24% from first run)
250+
251+
## Summary of Bottlenecks
252+
253+
| Bottleneck | Time | % of overhead | Optimization opportunity |
254+
|---|---|---|---|
255+
| vite.config.ts loading (cold) | **164-741ms** per package | **99%** | Cache config results, lazy loading, parallel loading |
256+
| Spawn 1 (pnpm/setup) | **~1s** || Persistent process, avoid re-resolving |
257+
| load_package_graph | **~4ms** | <1% | Already fast |
258+
| Session init | **~60μs** | <0.01% | Already fast |
259+
| Global CLI overhead | **~5ms** | <0.5% | Already fast |
260+
| Node.js + NAPI startup | **~3.7ms** | <0.4% | Already fast |
261+
| oxc_resolver | **~170μs** | <0.02% | Already fast |
262+
| js_resolver callback | **~1.25ms** | <0.1% | Already fast |
263+
264+
**The single most impactful optimization would be caching or parallelizing `load_user_config_file` calls.** The first cold load takes 164ms, and when new heavy dependencies are encountered, loads can take 741ms. For a 10-package monorepo, this accumulates to ~930ms of config loading before any task runs.
265+
266+
## Inter-Process Communication
267+
268+
vite-task uses Unix shared memory (`/dev/shm`) for parent-child process communication during task execution:
269+
- Creates persistent mapping at `/shmem_<hash>`
270+
- Maps memory into address space for fast IPC
271+
- Cleaned up after spawn completion
272+
273+
## Known Issues
274+
275+
### Windows: Trace files break formatter
276+
When `VITE_LOG_OUTPUT=chrome-json` is set, trace files (`trace-*.json`) are written to the project working directory. On Windows, `vp fmt` picks up these files and fails with "Unterminated string constant" because the trace files contain very long PATH strings.
277+
278+
**Recommendation**: Add `trace-*.json` to formatter ignore patterns, or write trace files to a dedicated directory outside the workspace.
279+
280+
### NAPI trace files empty for some projects
281+
The Chrome tracing `FlushGuard` stored in a static `OnceLock` is never dropped when `process.exit()` is called. Fixed by adding `shutdownTracing()` NAPI function called before exit (commit `72b23304`). Some projects (frm-stack) produce traces because their exit path differs.
282+
283+
## Methodology
284+
285+
- **Tracing**: Rust `tracing` crate with `tracing-chrome` subscriber (Chrome DevTools JSON format)
286+
- **Environment variables**: `VITE_LOG=debug`, `VITE_LOG_OUTPUT=chrome-json`
287+
- **CI environment**: GitHub Actions ubuntu-latest runner
288+
- **Measurement PRs**:
289+
- vite-task: https://github.com/voidzero-dev/vite-task/pull/178
290+
- vite-plus: https://github.com/voidzero-dev/vite-plus/pull/663
291+
- **Trace sources**: Global CLI traces (6 files, vibe-dashboard), NAPI traces (20 files, frm-stack)

0 commit comments

Comments
 (0)