Add reference examples and guidance (especially for multi-device scenarios) for ValidateCompiledModelCompatibilityInfo #29168
adrastogi wants to merge 4 commits
Pull request overview
This PR clarifies the intended semantics of OrtEpFactory::ValidateCompiledModelCompatibilityInfo—especially for multi-device EP scenarios—by documenting the required “per-device verdict then worst-of fold” behavior, and updates the example plugin EP + tests to match that guidance.
Changes:
- Expanded `ValidateCompiledModelCompatibilityInfo` documentation to define multi-device interpretation and required verdict folding rules (with guidance for “no opinion” vs “bad artifact” cases).
- Updated the example plugin EP’s compatibility validation to compute per-device compatibility and combine results using the documented reduction.
- Added end-to-end and unit tests covering single-device, “no opinion”, and multi-device folding behavior.
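
The "per-device verdict, then worst-of fold" reduction described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Verdict` enum, `Severity`, `Combine`, and `CombineAll` are hypothetical names, and the sketch assumes that "no opinion" acts as the identity element and that the severity order is optimal < prefer recompilation < unsupported.

```cpp
#include <vector>

// Hypothetical stand-in for the EP compatibility result; the names and the
// severity ordering below are assumptions made for illustration only.
enum class Verdict {
  kNotApplicable,        // "no opinion": the info is not recognized for this device
  kOptimal,              // artifact can be used as-is
  kPreferRecompilation,  // usable, but recompiling would be better
  kUnsupported,          // artifact cannot be used on this device
};

// Rank used by the worst-of comparison; a higher rank is a worse verdict.
// kNotApplicable ranks below every real verdict so it behaves as the identity.
constexpr int Severity(Verdict v) {
  switch (v) {
    case Verdict::kNotApplicable:       return -1;
    case Verdict::kOptimal:             return 0;
    case Verdict::kPreferRecompilation: return 1;
    case Verdict::kUnsupported:         return 2;
  }
  return -1;
}

// Combine two verdicts by keeping the worse one.
constexpr Verdict Combine(Verdict a, Verdict b) {
  return Severity(b) > Severity(a) ? b : a;
}

// Fold the per-device verdicts into a single result for the whole device set.
// An empty set, or a set where every device has no opinion, stays kNotApplicable.
inline Verdict CombineAll(const std::vector<Verdict>& per_device) {
  Verdict result = Verdict::kNotApplicable;  // identity element of the fold
  for (Verdict v : per_device) {
    result = Combine(result, v);
  }
  return result;
}
```

Under these assumptions, a set where one device reports optimal and another reports unsupported folds to unsupported, while a device with no opinion leaves the other devices' verdicts unchanged.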
| File | Description |
|---|---|
| onnxruntime/test/autoep/test_model_package.cc | Adds tests that exercise the public compatibility API against the example plugin EP, including multi-device folding behavior. |
| onnxruntime/test/autoep/library/example_plugin_ep/ep_factory.cc | Refactors the example EP factory’s compatibility validation to evaluate per-device and combine verdicts per the documented rules. |
| onnxruntime/test/autoep/library/example_plugin_ep/compatibility_combine.h | Introduces a shared helper to combine per-device compatibility verdicts using the documented identity + worst-of rule. |
| include/onnxruntime/core/session/onnxruntime_ep_c_api.h | Adds detailed guidance and a normative algorithm for multi-device compatibility validation and result combination. |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 3
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- * \param[in] num_devices Number of entries in `devices`.
+ * \param[in] num_devices Number of entries in `devices`. May be 0 when no device-specific context is available;
+ * in that case evaluate `compatibility_info` against the EP's own configuration and do NOT
+ * dereference `devices`.
Do we ever call this with zero devices? OrtApis::GetModelCompatibilityForEpDevices requires at least one.
 * All devices provided belong to the same execution provider instance that this factory creates. The set represents
 * the devices the EP would run the model on *together* (e.g., multi-adapter or multi-GPU scenarios), NOT a menu of
Maybe a concrete example would help, e.g. the OV EP has OrtEpDevice instances for CPU, GPU and NPU. How many calls to this API are we expecting, and with what combinations?
 * Required implementation when num_devices > 1 (a "best of any device" result is NOT permitted -- a single verdict
 * cannot convey which device it applies to, so ORT would otherwise be told a model is runnable on a set that
 * contains a device it cannot run on):
Wouldn't the user provide the same set of OrtEpDevice instances when running the model and that's a hard requirement?
E.g. say the devices were CPU and NPU, and the model is optimal for NPU. It's up to the EP which devices it actually uses at runtime, so as long as the optimal device is in that set, why should I downgrade the rating due to extra devices that I am not forced to use?
Description
`ValidateCompiledModelCompatibilityInfo` is underspecified in that it does not provide enough guidance for EP implementors on how to handle multi-device cases. This change attempts to fill that gap by defining the intended prioritization of the various compatibility states. It also updates the reference implementation in the test plugin EP.
Motivation and Context
This change is being made in response to customer feedback.