
feat(host-to-org): route verified custom domains to their org (CL-3)#500

Merged: TerrifiedBug merged 2 commits into main from feat/cl-3-custom-domain-routing on Jun 8, 2026
TerrifiedBug (Owner) commented Jun 8, 2026

CL-3 — custom-domain routing slice

Today resolveOrgIdFromHost (src/lib/host-to-org.ts) resolves an org from a
<slug>.vectorflow.sh subdomain and falls back to DEFAULT_ORG_ID for any
other host. So a custom hostname pointed at the platform (e.g. logs.acme.com
CNAME'd in) never routes to its org, even when the org has proven ownership via
a verified OrganizationDomainClaim (DNS-TXT).

Change

resolveOrgIdFromHost now resolves in two ordered paths:

  1. Subdomain path (unchanged hot path): if the first label is a valid slug,
    look up Organization.slug. The <orgSlug>.vectorflow.sh scheme + slug
    grammar (isValidOrgSlug) are preserved and take precedence.
  2. Custom-domain path (new): when no slug matches, look up a verified
    OrganizationDomainClaim (verifiedAt not null) whose domain equals the
    full hostname → that org.

Only when neither matches do we fall back to DEFAULT_ORG_ID (still fails open;
cross-org leakage is prevented by RLS + per-org JWT secrets, not this lookup).

Caching

Added a short (30s) in-process TTL cache keyed by normalised hostname, caching
both positive and negative results so custom-domain hosts don't pay a DB
round-trip per request. Oldest-first eviction caps memory (1024 entries) against
abusive Host: headers. Transient DB errors are not cached (fail open +
retry next request). The short TTL is the staleness bound that holds across
multiple server instances (newly-verified / removed claims start / stop routing
within the TTL).
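
As a rough illustration of the caching behaviour described above, here is a minimal sketch, not the code in src/lib/host-to-org.ts. The names `HostOrgCache` and `resolveCached`, the `"default"` value for `DEFAULT_ORG_ID`, the lowercase/trim normalisation, and the synchronous `lookup` callback (standing in for the async DB read) are all assumptions made for the example.

```typescript
// Hypothetical sketch of a bounded TTL cache for host -> org resolution.
// All names and values here are illustrative, not the PR's actual code.
const DEFAULT_ORG_ID = "default"; // assumed placeholder value

type CacheEntry = { orgId: string; expiresAt: number };

class HostOrgCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(
    ttlMs: number = 30_000,
    maxEntries: number = 1024,
    now: () => number = () => Date.now(),
  ) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(host: string): string | undefined {
    const entry = this.entries.get(host);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: drop it so the caller re-resolves.
      this.entries.delete(host);
      return undefined;
    }
    return entry.orgId;
  }

  set(host: string, orgId: string): void {
    // Delete first so a refreshed key moves to the end of insertion order.
    this.entries.delete(host);
    if (this.entries.size >= this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(host, { orgId, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}

// `lookup` returns an org id, or null when nothing matches; it may throw on
// a transient error. Misses are cached as DEFAULT_ORG_ID; errors are not.
function resolveCached(
  host: string,
  cache: HostOrgCache,
  lookup: (hostname: string) => string | null,
): string {
  const key = host.trim().toLowerCase(); // assumed normalisation
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  try {
    const orgId = lookup(key) ?? DEFAULT_ORG_ID;
    cache.set(key, orgId);
    return orgId;
  } catch {
    // Fail open without caching, so the next request retries the lookup.
    return DEFAULT_ORG_ID;
  }
}
```

Injecting the clock keeps expiry testable without timers. Because each server instance would hold its own copy of such a cache, the TTL is the only cross-instance staleness bound, which is why it is kept short.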

Runtime boundary (documented in code)

The custom-domain lookup is a DB read. resolveOrgIdFromHost is consumed only
by the Node auth layer — src/auth.ts (per-org NextAuth instance + OIDC)
and src/app/api/scim/v2/auth.ts. The edge middleware src/proxy.ts
deliberately does not call this resolver (it only does auth-gating + CSP nonce
and resolves org from the session, not the host), so no DB access is added at
the edge. createContext (src/trpc/init.ts) resolves org from the session's
OrgMember, so it is unaffected.

Migration

None. OrganizationDomainClaim already exists, and @@index([domain])
(plus the partial unique index OrganizationDomainClaim_domain_verified_unique
on verifiedAt IS NOT NULL rows) already cover the equality probe
WHERE domain = ? AND verifiedAt IS NOT NULL. No schema change.

Tests

Extended src/lib/__tests__/host-to-org.test.ts (18 pass):

  • verified claim for logs.acme.com → its org;
  • unverified claim → does NOT resolve (query filters verifiedAt: { not: null }) → DEFAULT_ORG_ID;
  • unknown custom domain → DEFAULT_ORG_ID;
  • existing subdomain / OSS / IP / fail-open cases still pass;
  • cache: repeated host resolves once (asserts the claim + slug lookups are each called once).

npx vitest run src/lib/__tests__/host-to-org.test.ts → 18/18. Filtered
tsc --noEmit clean for the changed files. Direct consumer
(scim/v2/Groups/route.test.ts) still passes.

Editor/agent/cloud/clickhouse paths are not locally verifiable; unit tests +
types are the bar for this change. No project-wide lint/build was run.

Acceptance: a verified custom domain routes to its org.

Resolve the org for a non-subdomain Host via a verified
OrganizationDomainClaim (verifiedAt not null) whose domain equals the
full hostname, falling back to DEFAULT_ORG_ID only when no verified
claim matches. The existing <orgSlug>.vectorflow.sh subdomain path and
slug grammar are preserved and still take precedence.

Add a short (30s) in-process TTL cache keyed by hostname (positive +
negative) so custom-domain hosts don't pay a DB round-trip per request,
with oldest-first eviction to bound memory against abusive Host headers.

The custom-domain lookup is a DB read, consumed only by the Node auth
layer (src/auth.ts per-org NextAuth/OIDC, SCIM auth). The edge
middleware (src/proxy.ts) deliberately does not call this resolver and
stays on the subdomain/auth-gate path, so no DB access is added at the
edge. Boundary documented in code.

No migration: @@index([domain]) and the partial unique index on
verified rows already cover the equality probe.
…L-3)

The slug path ran on any multi-label host's first label, so a custom domain
(e.g. logs.acme.com) whose first label collides with an existing org slug (logs)
was misrouted to the slug-org, shadowing the verified claim (confused-deputy on
auth/OIDC/SCIM config). Reorder to claim-first: a DNS-verified full-host
OrganizationDomainClaim wins over the slug-prefix match; a claim can't exist for
*.vectorflow.sh (no tenant DNS-TXT), so genuine subdomains are unaffected. Cache
covers the extra indexed probe on the subdomain hot path. +shadowing test.
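
The claim-first ordering from this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged implementation: the `OrgLookups` interface and its two methods are hypothetical stand-ins for the DB reads (synchronous here for brevity), the slug regex is a guess at the grammar `isValidOrgSlug` enforces, `"default"` is a placeholder for `DEFAULT_ORG_ID`, and the IP/OSS handling and caching are omitted.

```typescript
// Hypothetical sketch of claim-first host resolution; names and values are
// illustrative and the lookups stand in for the real (async) DB queries.
const DEFAULT_ORG_ID = "default"; // assumed placeholder value

interface OrgLookups {
  // Org id for a claim where domain = hostname AND verifiedAt IS NOT NULL.
  findVerifiedClaimOrg(domain: string): string | null;
  // Org id for Organization.slug = slug.
  findOrgBySlug(slug: string): string | null;
}

// Assumed slug grammar; the real isValidOrgSlug may differ.
function isValidOrgSlug(label: string): boolean {
  return /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/.test(label);
}

function resolveOrgIdFromHost(host: string, db: OrgLookups): string {
  // Assumed normalisation: trim, lowercase, strip a trailing :port.
  const hostname = host.trim().toLowerCase().replace(/:\d+$/, "");

  // 1. A verified claim on the full hostname wins.
  const claimed = db.findVerifiedClaimOrg(hostname);
  if (claimed !== null) return claimed;

  // 2. Otherwise try the first label of a multi-label host as an org slug.
  const labels = hostname.split(".");
  const first = labels[0] ?? "";
  if (labels.length > 1 && isValidOrgSlug(first)) {
    const bySlug = db.findOrgBySlug(first);
    if (bySlug !== null) return bySlug;
  }

  // 3. Nothing matched: fail open to the default org.
  return DEFAULT_ORG_ID;
}
```

With a verified claim for logs.acme.com and an unrelated org whose slug is `logs`, step 1 returns the claiming org before the slug probe runs, which is the shadowing case this commit fixes. A host such as logs.vectorflow.sh has no claim, so it still resolves through the slug path.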
@TerrifiedBug merged commit 175f5ed into main Jun 8, 2026
14 checks passed
@TerrifiedBug deleted the feat/cl-3-custom-domain-routing branch June 8, 2026 14:06