Skip to content

feat: migrate infracost to console repo#3626

Open
kinjalh wants to merge 1 commit into
masterfrom
infracost-deployment-op-migration
Open

feat: migrate infracost to console repo#3626
kinjalh wants to merge 1 commit into
masterfrom
infracost-deployment-op-migration

Conversation

@kinjalh

@kinjalh kinjalh commented May 28, 2026

Copy link
Copy Markdown
Member

Test Plan

https://console.kinjal-gitgud.onplural.sh/

Checklist

  • I have added a meaningful title and summary to convey the impact of this PR to a user.
  • I have deployed the agent to a test environment and verified that it works as expected.
    • Agent starts successfully.
    • Service creation works without any issues when using raw manifests and Helm templates.
    • Service creation works when resources contain both CRD and CRD instances.
    • Service templating works correctly.
    • Service errors are reported properly and visible in the UI.
    • Service updates are reflected properly in the cluster.
    • Service resync triggers immediately and works as expected.
    • Sync waves annotations are respected.
    • Sync phases annotations are respected. Phases are executed in the correct order.
    • Sync hook delete policies are respected. Resources are not recreated once they reach the desired state.
    • Service deletion works and cleanups resources properly.
    • Services can be recreated after deletion.
    • Service detachment works and keeps resources unaffected.
    • Services can be recreated after detachment.
    • Service component trees are working as expected.
    • Cluster health statuses are being updated.
    • Agent logs do not contain any errors (after running for at least 30 minutes).
    • There are no visible anomalies in Datadog (after running for at least 30 minutes).
  • I have added tests to cover my changes.
  • If required, I have updated the Plural documentation accordingly.

@kinjalh kinjalh added the enhancement New feature or request label May 28, 2026
@greptile-apps

greptile-apps Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR migrates the infracost integration from the deployment-operator repo into the console repo, adding cost estimation for Terraform stack runs. When an INFRACOST_API_KEY env var is present on the stack run, the harness converts the binary plan to JSON, invokes infracost breakdown, and reports per-resource cost estimates back to the console API.

  • A new multi-arch Dockerfile stage downloads the infracost binary from the official GitHub release, enabling support beyond linux/amd64.
  • controller_hooks.go calls Infracost() after plan, with soft error handling so a failure never blocks the run.
  • The New() function in terraform.go adds a nil guard for config.Run but the guard comes after config.Run.Parallelism and config.Run.Refresh are dereferenced unconditionally in the struct literal, leaving a panic path open.

Confidence Score: 3/5

Safe to merge for the infracost feature itself, but the restructured New() function in terraform.go has a nil dereference that will panic if config.Run is ever nil — worth fixing before this lands.

The infracost logic is well-structured and the Dockerfile change is clean. The concern is in terraform.go's New(): the PR adds a config.Run != nil guard but the struct literal above it still directly accesses config.Run.Parallelism and config.Run.Refresh, so the guard provides no protection against a nil panic. Whether config.Run can realistically be nil in production depends on all callers, but the partial guard makes the intent unclear and leaves a latent crash path.

go/deployment-operator/pkg/harness/tool/terraform/terraform.go needs the nil guard corrected to cover all three config.Run field accesses.

Important Files Changed

Filename Overview
go/deployment-operator/pkg/harness/tool/terraform/terraform.go Refactors New() to populate tf.env from config.Run.Env(), but the added nil guard for config.Run comes after two unconditional dereferences of config.Run.Parallelism and config.Run.Refresh in the struct literal, leaving the panic risk unresolved.
go/deployment-operator/pkg/harness/tool/terraform/infracost.go New file implementing infracost integration: plan-to-JSON conversion, subprocess execution, and report parsing. Logic is sound; minor inconsistency in env passing for terraform show vs. infracost subprocess.
go/deployment-operator/pkg/harness/tool/terraform/infracost_types.go Defines JSON-mapping structs for the infracost report format and scope constants; InfracostResourceScopeFree is declared but never used.
go/deployment-operator/pkg/harness/controller/controller_hooks.go Calls in.tool.Infracost() after afterPlan, logs errors non-fatally, and forwards results to UpdateStackRun. Error handling is intentionally soft (infracost is optional).
go/deployment-operator/pkg/harness/tool/v1/tool.go Adds default Infracost() implementation on DefaultTool returning nil, nil — correct no-op for tools that don't support cost estimation.
go/deployment-operator/pkg/harness/tool/v1/types.go Adds Infracost() to the Tool interface; straightforward interface extension.
go/deployment-operator/dockerfiles/harness/terraform.Dockerfile Adds a multi-arch infracost binary download stage using the official GitHub release tarball, then copies the binary into the final image.
go/deployment-operator/pkg/harness/tool/terraform/terraform_types.go Adds the env []string field to Terraform struct for storing stack-run environment variables.

Reviews (1): Last reviewed commit: "migrate infracost to console repo" | Re-trigger Greptile

Comment on lines +270 to 282
tf := &Terraform{
DefaultTool: v1.DefaultTool{Scanner: config.Scanner},
workDir: config.WorkDir,
dir: config.ExecDir,
variables: config.Variables,
parallelism: config.Run.Parallelism,
refresh: config.Run.Refresh,
}).init()
}
if config.Run != nil {
tf.env = config.Run.Env()
}
return tf.init()
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Nil guard added after unconditional dereferences

The struct literal on lines 275–276 accesses config.Run.Parallelism and config.Run.Refresh before the if config.Run != nil check on line 278. If config.Run is ever nil, the function panics at the struct literal — the guard only protects tf.env. The nil check should wrap all three config.Run accesses, or the existing accesses should be moved inside the guard.

Comment on lines +59 to +65
type InfracostResourceScope string

const (
InfracostResourceScopeBreakdown InfracostResourceScope = "breakdown"
InfracostResourceScopePastBreakdown InfracostResourceScope = "past_breakdown"
InfracostResourceScopeDiff InfracostResourceScope = "diff"
InfracostResourceScopeFree InfracostResourceScope = "free"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 InfracostResourceScopeFree constant is unused

InfracostResourceScopeFree is declared but never referenced in convertResource or anywhere else in this PR. Resources with no hourly or monthly cost are silently dropped via the if hourlyCost == nil && monthlyCost == nil early return in convertResource, so this scope value is never emitted. Either remove the constant or use it to represent explicitly free resources.

Comment on lines +117 to +131
klog.V(log.LogLevelExtended).InfoS("executing", "command", "terraform show -json "+in.planFileName)

runErr := cmd.Run()
if closeErr := tmpFile.Close(); closeErr != nil && runErr == nil {
runErr = closeErr
}
if runErr != nil {
_ = os.Remove(tmpFile.Name())
return "", fmt.Errorf("failed converting plan to JSON: %s: %w", stderr.String(), runErr)
}

klog.V(log.LogLevelTrace).InfoS("converted terraform plan to JSON", "tempFile", filepath.Base(tmpFile.Name()))
return tmpFile.Name(), nil
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 terraform show subprocess does not inherit stack-run env vars

exec.CommandContext without setting cmd.Env inherits the calling process's OS environment, not the user-configured in.env passed to the infracost subprocess via harnessexec.WithEnv(in.env). For the terraform show -json call this is likely benign (plan conversion only reads a local file), but the inconsistency could bite if a custom provider or wrapper reads a credential from the stack-run env. Consider adding cmd.Env = in.env for parity.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@kinjalh kinjalh changed the title migrate infracost to console repo feat: migrate infracost to console repo Jun 2, 2026
@kinjalh kinjalh force-pushed the infracost-deployment-op-migration branch from 889ef17 to 8c71b43 Compare June 2, 2026 15:13
@kinjalh kinjalh force-pushed the infracost-deployment-op-migration branch from 8c71b43 to c34f0b4 Compare June 5, 2026 14:39
@michaeljguarino

Copy link
Copy Markdown
Member

@kinjalh I'm not the right person to review this, make sure sebastian, marcin or lukasz look it over.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants