Harden quartile statistics and measurement timeout handling by oleksandr-pavlyk · Pull Request #392 · NVIDIA/nvbench

oleksandr-pavlyk · 2026-06-26T21:49:41Z

This PR hardens the robust timing-summary utilities and the measurement paths that consume them.

Changes include:

Use std::move instead of std::forward when consuming an rvalue vector.
Avoid generating derived cold-measurement timing summaries when no samples were accepted, but keep wall-time summaries and timeout diagnostics available for zero-sample timeout cases.
Constrain direct percentile/quartile helpers to floating-point value types.
Guard percentile/quartile routines against NaN inputs before sorting or selection.
Add regression coverage for quartile behavior around the sort/selection threshold, including duplicate-heavy inputs.
Document the sort/selection threshold rationale.
Preserve legacy stdev/relative summary tags for low-sample runs by emitting the standard-deviation unavailable sentinel.
Treat missing, NaN, and negative stdev noise as unavailable in timeout diagnostics, while preserving +inf as the over-threshold sentinel.
Share timeout-warning logic between cold and CPU-only measurement paths.

Testing

Added/updated testing/statistics.cu coverage for quartiles, NaN inputs, invalid noise inputs, and threshold-boundary behavior.
Added testing/measure_timeout_warnings.cu coverage for timeout-warning noise diagnostics.
Existing measurement tests continue to exercise cold and CPU-only summary generation.

Changes are driven by agentic review.

Also add comment within percentile_rank to document precondition on input values checked with assert statement. Also, sharpened the comment around percentile_rank function

Prepare duplicate heavy input and check sort-based quartile computation result with selection-based one. std::nth_element only guarantees that the nth element is the value that would appear there in sorted order; it does not fully sort equal partitions. Bugs in the selection implementation, especially when selecting Q1 from the left half and Q3 from the right half after selecting the median, are more likely to show up when many samples equal the quartile values.

…ulated Cold measurement can discard throttled trials before incrementing the accepted sample count, then stop on timeout with zero recorded samples. In that case, only emit the sample-size summary and skip derived timing, bandwidth, clock, and bulk summaries that require accepted samples. This avoids divide-by-zero mean calculations and quartile/IQR computation over empty sample vectors. Keep timeout diagnostics reachable for zero-sample runs and add an explicit warning when no accepted cold samples were recorded. Factor timeout warning emission into a private helper so the zero-sample and normal paths share the same diagnostic logic. Suppress low-sample relative stdev noise Add a statistics helper that returns no relative standard-deviation noise until there are enough samples for a meaningful estimate. Use it for cold CPU/GPU and CPU-only summaries so the low-sample +inf stdev sentinel is not published as real relative noise or used for max-noise timeout warnings. Add statistics coverage for suppressing the low-sample sentinel and computing relative stdev noise once the sample threshold is reached. compute_standard_deviation_noise return nullopt if standard deviation is not finite Test verify that noise is nullopt when not enough samples are accumulated Added statistics::has_enough_samples_for_noise_estimate(...) Used it in standard_deviation, compute_standard_deviation_noise, compute_robust_noise. Added timeout diagnostics in cold and CPU-only paths. if max-noise is configured and the run timed out before enough samples exist to estimate noise, the log now says that explicitly, otherwise the existing “over noise threshold” warning remains unchanged. Added a statistics test assertion for the new sample-count predicate.

Add fixed expected-value assertions for quartile tests around the sort/selection switch point, including duplicate-heavy inputs. This keeps the tests from only proving that both implementations agree with each other.

Introduce new header file with inline implementation. Use it from measure_cold.cuh and measure_cpu_only.cxx

oleksandr-pavlyk · 2026-06-26T21:49:58Z

@coderabbitai full review

coderabbitai · 2026-06-26T21:57:27Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 63e553c4-61d8-43ef-bb86-49405d96d8da

📥 Commits

Reviewing files that changed from the base of the PR and between b1c61e5 and d230a16.

📒 Files selected for processing (8)

nvbench/detail/measure_cold.cu
nvbench/detail/measure_cold.cuh
nvbench/detail/measure_cpu_only.cxx
nvbench/detail/measure_timeout_warnings.cuh
nvbench/detail/statistics.cuh
testing/CMakeLists.txt
testing/measure_timeout_warnings.cu
testing/statistics.cu

📝 Walkthrough

Summary by CodeRabbit

New Features
- Enhanced measurement-timeout diagnostics with more contextual warnings when limits are exceeded or noise can’t be estimated.
Bug Fixes
- Cleaner handling for runs with zero accepted samples (early, explicit warning) and improved relative stdev noise behavior.
- Safer NaN-aware percentile/quartile outputs and more consistent noise estimation gating.
Tests
- Added dedicated timeout-warning tests and expanded coverage for NaN quartiles/percentiles, duplicate-heavy inputs, and standard-deviation-noise calculations.

Walkthrough

Adds shared statistics helpers, NaN-aware percentile/quartile handling, centralized timeout-warning logging for CPU-only and cold measurements, and expanded tests for NaN, duplicate-heavy, and invalid-noise cases.

Changes

Statistics and timeout diagnostics

Layer / File(s)	Summary
Noise helpers `nvbench/detail/statistics.cuh`	Adds a shared sample-size predicate, uses it in standard-deviation and robust-noise gating, and adds `compute_standard_deviation_noise(...)`.
NaN-aware percentiles `nvbench/detail/statistics.cuh`	Makes percentile and quartile helpers return quiet NaNs for NaN-containing inputs and updates the quartile-selection crossover comment.
Timeout warnings in CPU-only measurements `nvbench/detail/measure_timeout_warnings.cuh`, `nvbench/detail/measure_cpu_only.cxx`	Adds shared timeout-warning logging helpers and routes CPU-only timeout warnings through them with the new standard-deviation noise value.
Timeout warnings in cold measurements `nvbench/detail/measure_cold.cuh`, `nvbench/detail/measure_cold.cu`	Declares and uses `log_timeout_warnings(...)`, moves the walltime summary earlier, and adds the zero-accepted-samples early return.
Statistics validation cases `testing/CMakeLists.txt`, `testing/measure_timeout_warnings.cu`, `testing/statistics.cu`	Adds the new timeout-warning test target and expands statistics tests for NaN, duplicate-heavy, and invalid-noise cases.

_{Comment @coderabbitai help to get the list of available commands.}

… constant Expose quartile threshold value, use it in testing to test around that value.

Keep legacy stdev/relative summary tags present even when too few samples are available to compute a meaningful standard-deviation noise estimate. Use the standard-deviation unavailable sentinel for those values so existing summary consumers continue to see the expected tags. Factor the sentinel into the statistics helpers and use it from both standard_deviation() and stdev_noise_or_sentinel(), keeping the schema compatibility behavior explicit and tested.

…navailable Add a focused test target, nvbench.test.measure_timeout_warnings, covering: - NaN stdev noise -> “unable to estimate noise” - negative stdev noise -> “unable to estimate noise” - +inf stdev noise -> “over noise threshold”

oleksandr-pavlyk · 2026-06-28T12:59:28Z

@coderabbitai full review

check_noise_warning() now takes std::optional<nvbench::float64_t>, matching the production helper, and the test now covers std::nullopt explicitly in addition to NaN, negative, and +inf.

oleksandr-pavlyk · 2026-06-28T13:24:14Z

@coderabbitai full review

Document that percentile helpers return quiet NaNs for NaN-containing inputs. Make quartile expected-value tests compute ranks from the documented round(p / 100 * (n - 1)) rule instead of reusing statistics::percentile_rank(), so rank regressions are caught independently. Extend timeout-warning coverage to exercise the too-few-samples max-noise path in addition to unavailable, invalid, and infinite stdev-noise inputs.

oleksandr-pavlyk · 2026-06-28T13:51:09Z

@coderabbitai full review

oleksandr-pavlyk added 9 commits June 26, 2026 16:28

Replace forwarding with semantically more accurate std::move

214d286

Also add comment within percentile_rank to document precondition on input values checked with assert statement. Also, sharpened the comment around percentile_rank function

Test quartile values across selection threshold

d9cdd8b

Add fixed expected-value assertions for quartile tests around the sort/selection switch point, including duplicate-heavy inputs. This keeps the tests from only proving that both implementations agree with each other.

Add NaN guards to percentiles and quartiles computation routines

04290dd

Add tests for handling of NaNs in quartile routine inputs

86eb2a8

Add comment re magic sort/select threshold value

7069a6b

Refactor logic of emitting warnings between cold and cpu-only measures

b0932b0

Introduce new header file with inline implementation. Use it from measure_cold.cuh and measure_cpu_only.cxx

Add static assertion that ValueType is a floating-point type

6b85a9b

This comment was marked as outdated.

Sign in to view

This comment was marked as resolved.

Sign in to view

oleksandr-pavlyk added 5 commits June 26, 2026 17:02

Check consistency of sort- vs. select-based quartiles using threshold…

caa2f46

… constant Expose quartile threshold value, use it in testing to test around that value.

Collapsed two branches with identical bodies

b6b0dd1

test_compute_standard_deviation_noise exercises other invalid inputs

5546726

This comment was marked as outdated.

Sign in to view

This comment was marked as resolved.

Sign in to view

Test nullopt explicitly in warning check test

36d8c5b

check_noise_warning() now takes std::optional<nvbench::float64_t>, matching the production helper, and the test now covers std::nullopt explicitly in addition to NaN, negative, and +inf.

This comment was marked as outdated.

Sign in to view

This comment was marked as resolved.

Sign in to view

This comment was marked as outdated.

Sign in to view

oleksandr-pavlyk marked this pull request as ready for review June 28, 2026 13:57

oleksandr-pavlyk requested a review from fbusato June 28, 2026 13:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Harden quartile statistics and measurement timeout handling#392

Harden quartile statistics and measurement timeout handling#392
oleksandr-pavlyk wants to merge 16 commits into
NVIDIA:mainfrom
oleksandr-pavlyk:harden-robust-stats-and-timeout-diag

oleksandr-pavlyk commented Jun 26, 2026 •

edited

Loading

Uh oh!

oleksandr-pavlyk commented Jun 26, 2026

Uh oh!

This comment was marked as outdated.

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading

Summary by CodeRabbit

Walkthrough

Changes

Uh oh!

This comment was marked as resolved.

Uh oh!

oleksandr-pavlyk commented Jun 28, 2026

Uh oh!

This comment was marked as outdated.

This comment was marked as resolved.

Uh oh!

oleksandr-pavlyk commented Jun 28, 2026

Uh oh!

This comment was marked as outdated.

This comment was marked as resolved.

Uh oh!

oleksandr-pavlyk commented Jun 28, 2026

Uh oh!

This comment was marked as outdated.

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

oleksandr-pavlyk commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Testing

Uh oh!

oleksandr-pavlyk commented Jun 26, 2026

Uh oh!

This comment was marked as outdated.

coderabbitai Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Walkthrough

Changes

Uh oh!

This comment was marked as resolved.

Uh oh!

oleksandr-pavlyk commented Jun 28, 2026

Uh oh!

This comment was marked as outdated.

This comment was marked as resolved.

Uh oh!

oleksandr-pavlyk commented Jun 28, 2026

Uh oh!

This comment was marked as outdated.

This comment was marked as resolved.

Uh oh!

oleksandr-pavlyk commented Jun 28, 2026

Uh oh!

This comment was marked as outdated.

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

oleksandr-pavlyk commented Jun 26, 2026 •

edited

Loading

coderabbitai Bot commented Jun 26, 2026 •

edited

Loading