fd/eventfd: wake all registered pollers on write, not just one #2491

Open
stlankes wants to merge 1 commit into hermit-os:main from stlankes:event


Conversation

@stlankes

Contributor

A write made the eventfd readable but woke only the oldest waiter (`pop_front`). Because the poll(2)-based fd multiplexing registers a fresh waker on every poll, the read queue accumulates stale wakers. The single wake could therefore hit a dead waker and miss the live one, so a cross-thread wakeup (e.g. tokio's I/O driver) could be lost. This change drains the queue and wakes every registered poller, matching level-triggered readiness.
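To illustrate the difference described above, here is a minimal, self-contained sketch. It is not the Hermit kernel code: `EventState`, `wake_one`, `wake_all`, `CountingWaker`, and `demo` are hypothetical names chosen for this example, and the counter update that a real eventfd write performs is left out. The sketch registers a stale waker ahead of a live one and compares waking only the front of the queue with draining it.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

/// Hypothetical stand-in for the eventfd read queue (not the Hermit type).
struct EventState {
    read_queue: VecDeque<Waker>,
}

impl EventState {
    /// Old behavior as described in the PR text: wake only the oldest waker.
    fn wake_one(&mut self) {
        if let Some(waker) = self.read_queue.pop_front() {
            waker.wake();
        }
    }

    /// Behavior described by the PR: drain the queue and wake every poller.
    fn wake_all(&mut self) {
        for waker in self.read_queue.drain(..) {
            waker.wake();
        }
    }
}

/// Test waker that counts how often it was woken.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Queues a stale waker first, then a live one, performs one "write",
/// and returns how many times each waker fired as (stale, live).
fn demo(wake_all: bool) -> (usize, usize) {
    let stale = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let live = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let mut state = EventState { read_queue: VecDeque::new() };
    state.read_queue.push_back(Waker::from(stale.clone()));
    state.read_queue.push_back(Waker::from(live.clone()));
    if wake_all {
        state.wake_all();
    } else {
        state.wake_one();
    }
    (stale.0.load(Ordering::SeqCst), live.0.load(Ordering::SeqCst))
}

fn main() {
    println!("wake one: {:?}", demo(false));
    println!("wake all: {:?}", demo(true));
}
```

In this sketch, the wake-one path leaves the live waker untouched whenever a stale waker sits ahead of it, which is the lost-wakeup case the description refers to. Waking every entry costs some spurious wakeups, but a poller that finds nothing to read simply polls again.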

@github-actions github-actions Bot left a comment


Benchmark Results

| Benchmark | Current: 926a17a | Previous: f83a235 | Performance Ratio |
| --- | --- | --- | --- |
| startup_benchmark Build Time | 76.57 s | 77.08 s | 0.99 |
| startup_benchmark File Size | 0.73 MB | 0.73 MB | 1.00 |
| Startup Time - 1 core | 0.74 s (±0.02 s) | 0.73 s (±0.02 s) | 1.02 |
| Startup Time - 2 cores | 0.72 s (±0.02 s) | 0.75 s (±0.02 s) | 0.96 |
| Startup Time - 4 cores | 0.74 s (±0.02 s) | 0.76 s (±0.02 s) | 0.98 |
| multithreaded_benchmark Build Time | 76.92 s | 80.66 s | 0.95 |
| multithreaded_benchmark File Size | 0.86 MB | 0.86 MB | 1.00 |
| Multithreaded Pi Efficiency - 2 Threads | 90.43 % (±4.97 %) | 89.44 % (±5.91 %) | 1.01 |
| Multithreaded Pi Efficiency - 4 Threads | 43.63 % (±2.40 %) | 44.06 % (±2.65 %) | 0.99 |
| Multithreaded Pi Efficiency - 8 Threads | 25.72 % (±1.21 %) | 25.56 % (±1.89 %) | 1.01 |
| micro_benchmarks Build Time | 73.30 s | 87.20 s | 0.84 |
| micro_benchmarks File Size | 0.87 MB | 0.87 MB | 1.00 |
| Scheduling time - 1 thread | 75.19 ticks (±1.12 ticks) | 67.04 ticks (±1.91 ticks) | 1.12 |
| Scheduling time - 2 threads | 40.31 ticks (±3.78 ticks) | 39.33 ticks (±3.42 ticks) | 1.02 |
| Micro - Time for syscall (getpid) | 3.58 ticks (±0.35 ticks) | 3.43 ticks (±0.24 ticks) | 1.04 |
| Memcpy speed - (built_in) block size 4096 | 83332.79 MByte/s (±57641.25 MByte/s) | 84527.91 MByte/s (±58322.62 MByte/s) | 0.99 |
| Memcpy speed - (built_in) block size 1048576 | 30601.84 MByte/s (±24632.77 MByte/s) | 30845.87 MByte/s (±25037.16 MByte/s) | 0.99 |
| Memcpy speed - (built_in) block size 16777216 | 29689.78 MByte/s (±24394.89 MByte/s) | 27492.52 MByte/s (±22708.20 MByte/s) | 1.08 |
| Memset speed - (built_in) block size 4096 | 83461.91 MByte/s (±57733.48 MByte/s) | 84666.80 MByte/s (±58421.97 MByte/s) | 0.99 |
| Memset speed - (built_in) block size 1048576 | 31334.89 MByte/s (±25053.50 MByte/s) | 31593.14 MByte/s (±25470.12 MByte/s) | 0.99 |
| Memset speed - (built_in) block size 16777216 | 30456.42 MByte/s (±24833.78 MByte/s) | 28269.66 MByte/s (±23193.64 MByte/s) | 1.08 |
| Memcpy speed - (rust) block size 4096 | 74417.37 MByte/s (±52137.63 MByte/s) | 74722.10 MByte/s (±52209.99 MByte/s) | 1.00 |
| Memcpy speed - (rust) block size 1048576 | 30530.38 MByte/s (±24581.65 MByte/s) | 30824.11 MByte/s (±25054.28 MByte/s) | 0.99 |
| Memcpy speed - (rust) block size 16777216 | 29665.68 MByte/s (±24368.16 MByte/s) | 27597.67 MByte/s (±22766.03 MByte/s) | 1.07 |
| Memset speed - (rust) block size 4096 | 74973.55 MByte/s (±52546.16 MByte/s) | 74865.22 MByte/s (±52302.23 MByte/s) | 1.00 |
| Memset speed - (rust) block size 1048576 | 31271.76 MByte/s (±25015.03 MByte/s) | 31565.59 MByte/s (±25484.76 MByte/s) | 0.99 |
| Memset speed - (rust) block size 16777216 | 30433.51 MByte/s (±24807.93 MByte/s) | 28374.92 MByte/s (±23250.83 MByte/s) | 1.07 |
| alloc_benchmarks Build Time | 73.42 s | 81.01 s | 0.91 |
| alloc_benchmarks File Size | 0.81 MB | 0.81 MB | 1.00 |
| Allocations - Allocation success | 91.32 % | 91.32 % | 1 |
| Allocations - Deallocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Pre-fail Allocations | 61.46 % | 61.46 % | 1 |
| Allocations - Average Allocation time | 5842.40 Ticks (±154.64 Ticks) | 6133.98 Ticks (±129.62 Ticks) | 0.95 |
| Allocations - Average Allocation time (no fail) | 6441.15 Ticks (±171.37 Ticks) | 6784.80 Ticks (±184.79 Ticks) | 0.95 |
| Allocations - Average Deallocation time | 1187.21 Ticks (±338.54 Ticks) | 1292.85 Ticks (±344.36 Ticks) | 0.92 |
| mutex_benchmark Build Time | 74.01 s | 81.90 s | 0.90 |
| mutex_benchmark File Size | 0.87 MB | 0.87 MB | 1.00 |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 12.14 ns (±0.49 ns) | 12.08 ns (±0.27 ns) | 1.00 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 39.06 ns (±2.19 ns) | 17.94 ns (±3.62 ns) | 2.18 |

This comment was automatically generated by workflow using github-action-benchmark.

2 participants