Skip to content

feat(verifier): implement local directory support#37

Merged
yusufozturk merged 2 commits into
mainfrom
DT-819
Jun 30, 2026
Merged

feat(verifier): implement local directory support#37
yusufozturk merged 2 commits into
mainfrom
DT-819

Conversation

@erenaslandev

@erenaslandev erenaslandev commented Jun 29, 2026

Copy link
Copy Markdown
Member

Summary by CodeRabbit

  • New Features

    • Added support for verifying generator outputs from a local filesystem directory, not just S3-backed storage.
    • Updated verifier configuration to select local vs S3 mode, with correct path/glob handling and format-specific reads.
    • Compose-based runs now automatically initialize and mount shared data for local verification, and inject only the required verifier environment variables.
  • Bug Fixes

    • Improved validation to enforce exactly one verifier source (either local or S3) and reject invalid local paths outside the shared mount.
  • Tests

    • Expanded coverage for local SQL generation, prelude behavior, and compose rendering.

@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d072dbea-ac4e-48c4-8a5f-d02aa514b12e

📥 Commits

Reviewing files that changed from the base of the PR and between 2288beb and f3ce343.

📒 Files selected for processing (4)
  • internal/config/case.go
  • internal/config/verifier_test.go
  • internal/orchestrator/docker.go
  • internal/orchestrator/verifier_render_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • internal/config/verifier_test.go
  • internal/config/case.go
  • internal/orchestrator/docker.go

Walkthrough

Adds local filesystem verification as an alternative to S3-backed verification. VerifierConfig gains a local_dir field and validation rules for mutual exclusivity and /data-scoped paths. The verifier container reads local output paths directly, and Docker Compose adds shared-volume and initialization wiring for local mode.

Changes

Local Filesystem Verifier Mode

Layer / File(s) Summary
VerifierConfig schema and validation
internal/config/case.go, internal/config/verifier_test.go
Adds local_dir and IsLocal() to VerifierConfig; validateVerifier() enforces exactly one of s3_bucket/local_dir, rejects paths outside /data, and skips emulator requirements for local mode. Tests cover local, mutual-exclusion, path, and format cases.
Verifier container local source and prelude
containers/verifier/main.go, containers/verifier/main_test.go
Adds LocalDir to runtime config, reads VERIFIER_LOCAL_DIR, validates mutual exclusivity, and switches sourceExpr and prelude between local filesystem reads and S3/httpfs setup. Tests cover local glob patterns and SQL redaction behavior.
Docker Compose shared-data volume and data-init service
internal/orchestrator/docker.go, internal/orchestrator/verifier_render_test.go
Adds VerifierLocalDir compose wiring, forces shared-data usage for local verifier runs, renders the data-init service, and conditions verifier env wiring on local versus S3-backed mode. Tests cover env vars, volume mounts, and clustered dependencies.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • yusufozturk

Poem

🐇 No S3 bucket? Hop along, hooray,
local paths now lead the verifier day.
With shared-data mounted and data-init bright,
the bunny reads files by filesystem light.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly matches the main change: adding local directory support to the verifier.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch DT-819

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
internal/orchestrator/verifier_render_test.go (1)

120-145: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Cover the data-init startup gate in this render test.

The local verifier path depends on data-init completing before the subject starts, but this test would still pass if that service or dependency were removed. Assert both the service and depends_on.data-init.condition.

Suggested test addition
 	subject := services["subject"].(map[string]any)
 	if !hasSharedDataMount(subject) {
 		t.Errorf("subject missing shared-data:/data mount for local verifier:\n%s", out)
 	}
+
+	if _, ok := services["data-init"].(map[string]any); !ok {
+		t.Fatalf("data-init service not rendered for local verifier:\n%s", out)
+	}
+	deps, _ := subject["depends_on"].(map[string]any)
+	dataInitDep, _ := deps["data-init"].(map[string]any)
+	if dataInitDep["condition"] != "service_completed_successfully" {
+		t.Errorf("subject missing data-init completion dependency:\n%s", out)
+	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/orchestrator/verifier_render_test.go` around lines 120 - 145, The
local verifier render test only checks mounts and env vars, so it can miss
regressions where the startup gate is removed. Update
TestComposeRendersLocalVerifier in verifier_render_test.go to also assert that
the rendered services include data-init and that the subject service’s
depends_on entry for data-init uses the expected condition, alongside the
existing shared-data mount checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@containers/verifier/main.go`:
- Around line 181-195: The local-mode glob built in the path selection logic for
the verifier misses files directly under cfg.LocalDir because the current
recursive pattern only matches subdirectories. Update the glob construction in
the verifier’s reader helper to include both top-level and recursive matches for
local directories (for example by combining patterns before sqlQuote is
applied), while keeping the S3 branch unchanged and preserving the existing
read_parquet/read_avro selection.

In `@internal/config/case.go`:
- Around line 1904-1916: The verifier validation in tc.Verifier needs to reject
local_dir values outside the shared /data mount, since the current
mutual-exclusion checks in the case validation path only verify presence and
exclusivity. Update the validation logic around tc.Verifier.IsLocal() to require
a cleaned container path rooted under /data for local_dir, and return a clear
error from the same case-validation flow when the path is outside that mount.

In `@internal/orchestrator/docker.go`:
- Around line 179-184: The clustered subject service block in docker generation
is missing the same local /data wiring that the singular subject branch uses.
Update the service template logic around the subject rendering path so clustered
subjects either also depend on data-init and mount shared-data:/data, or are
rejected during validation when Verifier.LocalDir is enabled. Use the existing
docker template branches for .Subjects, subject, data-init, and VerifierLocalDir
to locate and mirror the current singular behavior.
- Line 596: The data-init command in docker.go is directly templating
VerifierLocalDir into a root shell command, which allows shell metacharacters to
alter execution. Update the Docker command construction used by the relevant
init/container setup to pass VerifierLocalDir via an environment variable
instead of interpolating it into /bin/sh -c, and ensure the shell reference is
quoted/safely expanded when used in the command string.

---

Nitpick comments:
In `@internal/orchestrator/verifier_render_test.go`:
- Around line 120-145: The local verifier render test only checks mounts and env
vars, so it can miss regressions where the startup gate is removed. Update
TestComposeRendersLocalVerifier in verifier_render_test.go to also assert that
the rendered services include data-init and that the subject service’s
depends_on entry for data-init uses the expected condition, alongside the
existing shared-data mount checks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 80f0eccc-cd89-42b2-9bc9-f6582747270d

📥 Commits

Reviewing files that changed from the base of the PR and between 3c78e70 and 2288beb.

📒 Files selected for processing (6)
  • containers/verifier/main.go
  • containers/verifier/main_test.go
  • internal/config/case.go
  • internal/config/verifier_test.go
  • internal/orchestrator/docker.go
  • internal/orchestrator/verifier_render_test.go

Comment thread containers/verifier/main.go
Comment thread internal/config/case.go
Comment thread internal/orchestrator/docker.go
Comment thread internal/orchestrator/docker.go Outdated
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 30, 2026

Copy link
Copy Markdown

Deploying pipebench with  Cloudflare Pages  Cloudflare Pages

Latest commit: f3ce343
Status: ✅  Deploy successful!
Preview URL: https://9bc10105.pipebench.pages.dev
Branch Preview URL: https://dt-819.pipebench.pages.dev

View logs

@yusufozturk yusufozturk merged commit 9beffb9 into main Jun 30, 2026
5 checks passed
@yusufozturk yusufozturk deleted the DT-819 branch June 30, 2026 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants