Reorganize backend by topic (vertical slice) [stacked on #112]#116
Reorganize backend by topic (vertical slice) [stacked on #112]#116anth-volk wants to merge 12 commits into
Conversation
Rename backend/tooling/ -> backend/engine/ and backend/scripts/build_reference.py -> backend/engine/reference.py, so the deterministic PolicyEngine compute layer is one topic folder (#114 slice 1). - Update all tooling.* imports -> engine.* (internal engine files, agent_tools.py, tool_definitions.py, tests/test_agent_tools.py). - Update run paths: modal_app.py, backend/Dockerfile (python engine/reference.py), plus the reference.py header and the chatbot/prompts comment strings. - reference.md / scope_descriptor.md output paths are unchanged (parent.parent still resolves to backend/). No behaviour change. Full suite: 155 passed, 10 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the single backend/model_config.py with a config/ topic package (#114 slice 2), and move the chat-only model constants out of chatbot.py so all model-call configuration lives in one place: - config/sampling.py DEFAULT_TEMPERATURE, SUGGESTION_TEMPERATURE - config/models.py DEFAULT_FAST_MODEL, DEFAULT_COMPLEX_MODEL, TITLE_MODEL, SUGGESTION_MODEL, SUGGESTION_TIMEOUT_SECS, FAST_MODEL_MAX_INPUT_TOKENS (the last five moved from chatbot.py — kills the fast-here/complex-there asymmetry) - config/clients.py get_sync_client - config/scope.py load_scope_descriptor (path fixed parent -> parent.parent now that it sits one level under backend/) - config/__init__.py re-exports the public surface, so importers just swap the module name (from model_config import X -> from config import X). Importers updated: gateway.py, evaluation/providers.py, routes/chatbot.py. MAX_ITERATIONS stays in chatbot.py (orchestration, not model config). No behaviour change. Full suite: 155 passed, 10 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Split the single backend/prompts.py into a prompts/ topic package by audience (#114 slice 3): - prompts/system.py the compute SYSTEM_PROMPT sections + CHARTS_MODE_DIRECTIVE - prompts/gateway.py DEFAULT_SCOPE_DESCRIPTOR, lightweight_system, the gateway classifier prompt, and the 4 per-outcome writer directives - prompts/meta.py TITLE_SYSTEM, SUGGESTION_SYSTEM - prompts/__init__.py re-exports the public surface, so every `from prompts import X` site (gateway.py, chatbot.py, evaluation/runner.py, test_prompts.py) is unchanged. No behaviour change. Full suite: 155 passed, 10 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move backend/tool_definitions.py -> tools/definitions.py (the schemas the model sees) and backend/agent_tools.py -> tools/dispatch.py (TOOL_HANDLERS + execute_tool), so the model-facing tool seam is one topic folder (#114 slice 4). Importers point at the specific submodule rather than a re-export facade, so lightweight TOOL_DEFINITIONS consumers (gateway, gateway_config) don't pull in the heavy dispatch+engine import chain: - chatbot.py, evaluation/runner.py: from tools.definitions/from tools.dispatch - gateway.py, gateway_config.py: from tools.definitions import TOOL_DEFINITIONS - tests: import tools.dispatch as agent_tools / tools.definitions as tool_definitions (keeps the module-attribute + private-name access intact) No behaviour change. Full suite: 155 passed, 10 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Split the gateway into a topic package (#114 slice 5): - gateway/runtime.py was gateway.py (run_gateway, emit_plan, GatewayVerdict) - gateway/policy.py was gateway_config.py (criticality, gate, inferable) - gateway/__init__.py re-exports the runtime surface, so chatbot.py and the eval runner's lazy `from gateway import run_gateway` are unchanged. runtime.py now imports `from gateway.policy import ...`. test_gateway.py uses `from gateway import runtime as gateway` so its `gateway.get_sync_client` patch still targets the namespace run_gateway resolves against, and `from gateway.policy import ...` for the gate tests. No behaviour change. Gateway tests 33 passed; full suite 155 passed, 10 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Split the 301-line route file into a billing/ topic package (#114 slice 6): - billing/pricing.py pricing table + calculate_cost_gbp - billing/credits.py Supabase credit accounting (get_supabase, check_balance, record_usage, ...) - billing/stripe.py Stripe checkout-session + webhook integration (the SDK calls + credit side-effects, extracted verbatim) - billing/routes.py the /billing router + 4 thin endpoints - billing/__init__.py re-exports router (main.py) and check_balance/record_usage (the chat route's dynamic imports become `from billing import ...`) Function bodies are relocated verbatim; route paths are unchanged (/billing/balance,/usage,/checkout,/webhook all still register). Importers updated: main.py, chatbot.py (3 dynamic imports), test_billing.py. No behaviour change. Full suite: 155 passed, 10 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Split the 348-line route file into a conversations/ topic package (#114 slice 7): - conversations/models.py ChatConversation SQLModel + get_engine + ensure_table - conversations/schemas.py request/response models - conversations/store.py CRUD endpoints - conversations/sharing.py share + read-shared endpoints - conversations/reports.py GitHub-issue report builder + endpoint - conversations/routes.py prefix-less aggregator that include_router's the three endpoint groups (each carries the /conversations prefix so empty "" paths are legal) - conversations/__init__.py re-exports router + ensure_table (main.py startup) Function bodies relocated verbatim; the five route paths are unchanged. conftest's fixture now patches conversations.models._engine (via `from conversations import models as conversations`). No behaviour change. Full suite: 155 passed, 10 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Decompose the 707-line chat route into a chat/ topic package (#114 slice 8): - chat/orchestrator.py the SSE streaming tool loop (stream_chat) — verbatim - chat/system_blocks.py reference doc + scope descriptor + block assembly - chat/model_selection.py model pick + turn-classification helpers - chat/schemas.py ChatMessage / ChatRequest / TitleRequest - chat/suggestions.py follow-up suggestion chips - chat/titles.py title generation - chat/routes.py the /chat router + thin decorated endpoints - chat/__init__.py re-exports router The async Anthropic client factory moves to config/clients.get_async_client (get_sync_client's sibling). routes/ is now empty and removed. Importers updated: main.py (import chat), test_prompts.py (chat.system_blocks), test_api.py (patches chat.orchestrator: get_async_client / _generate_followup_suggestions / execute_tool; imports chat.routes.chat_message). Route paths /chat/message and /chat/title are unchanged. No behaviour change. Full suite: 155 passed, 10 skipped (incl. the mocked end-to-end stream + tool-loop test). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rename the eval harness package backend/evaluation/ -> backend/eval/ (#114 slice 9). Updated only the module-path references (from eval.* / import eval.* / python -m eval.*) — the word "evaluation" in docstrings is untouched. - internal eval imports, run.py, __init__.py - tests/test_evaluation.py - Makefile (python -m eval.sync_policyengine_uk, python -m eval.run) SUITE_DIRS still resolves to evals/cases/* (parents[2] depth unchanged). No behaviour change. Full suite: 155 passed, 10 skipped. make eval-ai-offline: 95 passed, 0 failed, 41 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move the HTTP surface into an api/ topic package (#114 slice 10, final): - api/main.py the FastAPI app, CORS, router mounting, /health, /version - api/errors.py NaNSafeJSONResponse + the rate-limit exception handler (wired via app.add_exception_handler instead of a decorator) - api/rate_limit.py the slowapi limiter + key funcs (verbatim) - api/__init__.py Importers updated: chat/routes.py + tests/test_api.py (from api.rate_limit import ...); test_api.py + modal_app.py (from api.main import app); Dockerfile CMD (uvicorn api.main:app). `uvicorn api.main:app` resolves; all routers mount. No behaviour change. Full suite: 155 passed, 10 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reflect the #114 reorg in the AI-facing skill docs: - uk-chat-runtime.md §Source Boundaries rewritten to the new packages (chat/, gateway/, prompts/, tools/, engine/, config/, api/); also fixes the prompts/, engine/, and config/ references in the Tool Boundary and Deterministic sections. - ai-evals.md: backend/evaluation/ -> backend/eval/. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
Beta preview is ready.
|
Small fixes from the #116 review (no behaviour change): - config/clients.py: extract _api_key() so get_sync_client / get_async_client stop duplicating the env-var fetch + RuntimeError guard (DRY; the async factory was newly added in the chat slice). - gateway/policy.py: docstring referenced the old prompts.py/chatbot.py paths -> prompts/ / chat/. - .gitignore: comment referenced scripts/build_reference.py -> engine/reference.py. - gateway/runtime.py: import order (config before gateway.policy). Full suite: 155 passed, 10 skipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
vahid-ahmadi
left a comment
There was a problem hiding this comment.
Review — backend reorg to package-by-topic (draft, stacked on #112)
I verified this as what it claims to be: a faithful pure-structural refactor. I checked the low-similarity splits closely (the risky ones), confirmed the re-export shims, and confirmed the deploy wiring. Findings below.
Faithfulness — verified ✅
- The big split
routes/chatbot.py(707 lines) →chat/{orchestrator,routes,schemas,suggestions,system_blocks,titles,model_selection}: every function/route present once with identical bodies; constants unchanged (MAX_ITERATIONS=30, max_tokens 16000/200/32,MAX_RESULT_CHARS,max_retries=2); inline model/sampling constants moved toconfig/*with identical values. SSE tool loop kept verbatim. prompts.py→prompts/{system,gateway,meta}: every triple-quoted prompt literal is byte-identical (no wording/whitespace drift), sameSYSTEM_PROMPT_SECTIONSordering and join.routes/billing.py→billing/*: all pricing constants, Stripe params, webhook signature handling, and 4 routes preserved.main.py→api/{main,errors}androutes/conversations.py→conversations/*: app config, CORS, routers, startup hook, thechat_conversationsDB model (column-for-column), andensure_table()all preserved.- Re-export
__init__shims (prompts,gateway,config,billing,conversations,chat) re-export the public names external importers actually use, sofrom prompts import …/from gateway import run_gatewayetc. still resolve to the same objects. - Entrypoints updated consistently:
Dockerfile(uvicorn api.main:app,python engine/reference.py),modal_app.py(from api.main import app, build step),Makefile(python -m eval.run). No staleevaluation./scripts//main:appreferences left. - Test coverage preserved: I counted
def test_across all six touched test files — identical base vs head (no tests dropped or merged). The162 → 155 passeddelta between this and #112's description is measurement context, not lost coverage; worth a one-line "CI green" confirmation when you de-draft.
The one buried behavioural change — fine, but mildly mis-described
The PR calls the parent → parent.parent edit in the scope-descriptor path a "fix". It's actually compensation: load_scope_descriptor moved from backend/model_config.py to backend/config/scope.py (one directory deeper), so it needs an extra .parent to resolve to the same backend/scope_descriptor.md. Net runtime target is unchanged — correct, but "fix" overstates it; it'd be clearer to describe it as path-compensation for the move so a future reader doesn't go hunting for the bug it "fixed".
Other small undocumented deltas (benign)
api/main.pyswaps@app.exception_handler(RateLimitExceeded)forapp.add_exception_handler(RateLimitExceeded, rate_limit_handler)— equivalent in Starlette, but it's a behavioural-looking change inside a "pure move" PR; worth a word in the commit body.conversations/models.pydrops an unusedtimezoneimport. Harmless.- New
tests/conftest.py(rate-limit env bumps, sqlite isolation, arequires_compiledskip marker) is a good addition and well-commented.
My actual concern: the stacking order
The code is sound; the risk is process. This is a 74-file move stacked on #112, which is uncalibrated and explicitly expects tuning (its own follow-up is "run the live eval and tune the gateway prompt + promotions against misroutes"). Any such tuning — prompt edits, the keyword-promotion fixes I flagged on #112, criticality tweaks — lands in the exact files this PR moves (gateway/*, prompts/*, chat/*), forcing a rebase of all 74 files each time. Refactoring on top of a moving, unmerged target is avoidable churn.
Recommendation, matching the draft status:
- Land #112 and let it stabilise through its live calibration.
- Then do this reorg against
main(retarget as the description says).
Keeping it a draft until #112 merges is exactly right. Once it's rebased onto a settled #112, this is a clean, well-sequenced refactor (one topic package per commit, test-green at each) and I'd be happy to approve it. Nice work on the leaf→root commit discipline.
Summary
Reorganizes the backend from its mixed structure (HTTP-layer
routes/, topictooling/+evaluation/, grab-bag package root) into package-by-topic: eachfolder is one requirement, each file one sub-responsibility — including that
topic's own route file and config. Pure structural change; runtime behaviour
is identical.
Final tree:
api/ · billing/ · chat/ · config/ · conversations/ · engine/ · eval/ · gateway/ · prompts/ · tools/(+tests/).Closes #115. Implements the proposal in #114.
This branch is based on
feat/chat-gateway(#112) and the PR targets it, notmain. Three packages (config/,gateway/, the gateway-awarechat/split)come from files that only exist on #112, and most slices collide with #112's
edits to
chatbot.py/prompts.py/evaluation/*/build_reference.py.Merge after #112, then retarget this PR to
main.How it was done
A 10-step loop — one topic package per commit, leaf → root, each independently
runnable and test-green:
refactor(engine)tooling/+build_reference.py→engine/refactor(config)model_config.py+ chat-only model constants →config/{models,sampling,clients,scope}refactor(prompts)prompts.py→prompts/{system,gateway,meta}refactor(tools)tool_definitions.py/agent_tools.py→tools/{definitions,dispatch}refactor(gateway)gateway.py/gateway_config.py→gateway/{runtime,policy}refactor(billing)routes/billing.py→billing/{pricing,credits,stripe,routes}refactor(conversations)routes/conversations.py→conversations/{models,schemas,store,sharing,reports,routes}refactor(chat)routes/chatbot.py(707 lines) →chat/{orchestrator,system_blocks,model_selection,schemas,titles,suggestions,routes}refactor(eval)evaluation/→eval/refactor(api)main.py/rate_limit.py→api/{main,errors,rate_limit}docs(skills)uk-chat-runtime.md+ai-evals.mdpath mapsDesign notes:
__init__s (config, prompts, gateway, billing, conversations,chat) keep most importers unchanged. Tests that reach into module internals
got precise submodule aliases (
chat.orchestrator,gateway.runtime,tools.dispatch,conversations.models).config/collapses theDEFAULT_FAST_MODEL-here /DEFAULT_COMPLEX_MODEL-thereasymmetry — all model-call config now lives in one place; the
scope_descriptor.mdpath was fixed (parent→parent.parent).chat/orchestrator.pykeeps the SSE streaming tool loop verbatim; the mockedend-to-end stream + tool-loop test validates it.
/chat/*,/billing/*,/conversations/*), thereference.md/scope_descriptor.mdoutput locations, and theevals/cases/*resolution are all unchanged.Verification
make eval-ai-offline: 95 passed, 0 failed, 41 skipped (baseline unchanged).uvicorn api.main:appboots;/health+/versionreturn 200; startupensure_tablewired.modal_app.py,backend/Dockerfile,Makefile).🤖 Generated with Claude Code