feat: add STS web identity and stabilize live e2e #128
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 84c5551c4c
    let sts_tls_config = crate::sts::tls::OperatorStsTlsConfig::from_env();
    let tls_server_config = if sts_tls_config.enabled {
        let material =
            crate::sts::tls::load_or_create_sts_tls_material(&client, &sts_tls_config).await?;
Keep controller running when STS bootstrap fails
This `?` makes STS bootstrap a hard prerequisite for the entire operator process. If STS TLS setup fails (for example, a missing or invalid `sts-tls` with `OPERATOR_STS_TLS_AUTO=false`), the function returns early and the reconcile controller never starts. Since STS is enabled by default, an STS-only misconfiguration can cause a full control-plane outage; this should degrade by disabling STS and continuing controller startup.
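One way to express the suggested degradation is to turn the bootstrap result into an explicit mode rather than propagating it with `?`. The sketch below is only an illustration: `TlsMaterial`, `BootstrapError`, `StsMode`, and `resolve_sts_mode` are hypothetical names that do not appear in the PR, and the real loader is async and takes a Kubernetes client.

```rust
use std::fmt;

/// Hypothetical stand-in for the TLS material returned by the real loader.
#[derive(Debug, Clone, PartialEq)]
struct TlsMaterial {
    cert_pem: String,
}

/// Hypothetical bootstrap error; the PR's actual error type is not shown.
#[derive(Debug)]
struct BootstrapError(String);

impl fmt::Display for BootstrapError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Outcome of STS startup, decided independently of controller startup.
#[derive(Debug, PartialEq)]
enum StsMode {
    /// STS TLS material is available, so the endpoint can be served.
    Serving(TlsMaterial),
    /// STS is switched off for this process; the controller still runs.
    Disabled { reason: String },
}

/// Map a bootstrap result to a mode instead of returning early on error.
fn resolve_sts_mode(result: Result<TlsMaterial, BootstrapError>) -> StsMode {
    match result {
        Ok(material) => StsMode::Serving(material),
        Err(error) => {
            // Log and degrade: the caller continues with controller startup.
            eprintln!("STS bootstrap failed, continuing without STS: {error}");
            StsMode::Disabled {
                reason: error.to_string(),
            }
        }
    }
}
```

With this shape, the caller would start the reconcile controller unconditionally and spawn the STS server only for `StsMode::Serving`, so an STS-only misconfiguration is logged and no longer blocks the controller.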
            error = %error.code(),
            "TokenReview denied STS request"
        );
        return xml_response(StatusCode::BAD_REQUEST, error.as_xml());
Return 500 for TokenReview backend failures
The TokenReview path can produce `StsError::InternalError` (e.g., a Kubernetes API failure or RBAC regression), but this branch always returns HTTP 400. That misclassifies server-side outages as client input errors, which makes failures harder to detect and can prevent correct retry behavior. This branch should map `InternalError` to 500 instead of always using 400.
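The status mapping suggested above could look roughly like the following. This is a sketch under assumptions: `StsErrorKind` and `http_status_for` are hypothetical, only `InternalError` is named in the review, and `InvalidIdentityToken` stands in for whatever client-side variants the real `StsError` has.

```rust
/// Hypothetical subset of STS error kinds. Only `InternalError` is named in
/// the review; `InvalidIdentityToken` is an assumed client-side variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StsErrorKind {
    InvalidIdentityToken,
    InternalError,
}

/// Pick the HTTP status from the error kind instead of hard-coding 400.
/// The match has no wildcard arm, so adding a variant forces a decision here.
fn http_status_for(kind: StsErrorKind) -> u16 {
    match kind {
        // Server-side failure (e.g. the TokenReview call itself failed).
        StsErrorKind::InternalError => 500,
        // The caller sent a token that was rejected.
        StsErrorKind::InvalidIdentityToken => 400,
    }
}
```

In the handler, the returned code would replace the fixed `StatusCode::BAD_REQUEST` passed to `xml_response`, so monitoring and client retry logic can tell backend outages from rejected tokens.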
Type of Change
Related Issues
N/A
Summary of Changes
This PR adds the operator STS web identity path and stabilizes the live e2e workflow around it.
- Adds the `PolicyBinding` API, generated CRDs, RBAC, Helm/k8s-dev manifests, and an operator STS service endpoint.
- Makes `e2e-live-run` repeatable by resetting Tenant/PVC/PV/hostPath fixtures before the suites run, while `sts_functional` reuses the Ready smoke Tenant instead of recreating storage.

Checklist
- `make pre-commit` (fmt-check + clippy + test + console-lint + console-fmt-check)
- `[Unreleased]` (if user-visible change)

Impact
Verification
Additional Notes
The repeated `make e2e-live-run` verification checks that local PVs and hostPath data are reset between live runs and that STS reuses the smoke Tenant without destabilizing PVC binding.

Thank you for your contribution! Please ensure your PR follows the community standards (CODE_OF_CONDUCT.md) and sign the CLA if this is your first contribution.