feat(harness): surface skill-load errors + per-invoke max_llm_calls #612

Merged: zakahan merged 2 commits into main from feat/harness-skill-error-maxcalls on Jun 15, 2026

@yaozheng-fang (Collaborator)

What

Two harness-server fixes/additions.

1. Skill-load errors are surfaced (not silently skipped)

Previously build_skill_toolset caught any per-skill failure and only logged a
warning, leaving the agent with no skills while ADK's SkillToolset still
told the model it had them — so the model answered "I can't find that skill."
(Most commonly the skill failed ADK's Frontmatter validation, e.g. its
description exceeds the hardcoded 1024-char limit.)

Now it fast-fails with a SkillLoadError naming the skill and the reason:

  • a base skill (from harness.yaml) that fails aborts server startup, so
    a bad config surfaces at deploy instead of shipping a skill-less runtime;
  • a one-time override skill that fails is caught in /harness/invoke and its
    reason is returned to the caller (HTTP 200, in the response output).
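
The fail-fast behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: only the names `SkillLoadError` and `build_skill_toolset` come from the description, while the signatures, the `load_one` callback, the `invoke` helper, the response dict shape, and `fake_loader` are assumptions made for the example.

```python
from typing import Any, Callable, Dict, Iterable, List


class SkillLoadError(Exception):
    """Carries the failing skill's name and the underlying reason."""

    def __init__(self, skill_name: str, reason: str) -> None:
        self.skill_name = skill_name
        self.reason = reason
        super().__init__(f"Skill '{skill_name}' failed to load: {reason}")


def build_skill_toolset(
    skill_names: Iterable[str], load_one: Callable[[str], Any]
) -> List[Any]:
    """Load every skill, raising on the first failure instead of skipping it."""
    loaded = []
    for name in skill_names:
        try:
            loaded.append(load_one(name))
        except Exception as exc:
            # Assumed wrapping strategy: keep the original error as the cause.
            raise SkillLoadError(name, str(exc)) from exc
    return loaded


def invoke(override_skills: Iterable[str], load_one: Callable[[str], Any]) -> Dict[str, Any]:
    """Hypothetical invoke handler: a bad override skill is reported, not raised."""
    try:
        skills = build_skill_toolset(override_skills, load_one)
    except SkillLoadError as err:
        return {"status_code": 200, "output": str(err)}
    return {"status_code": 200, "output": f"loaded {len(skills)} skill(s)"}


def fake_loader(name: str) -> str:
    """Stand-in for real skill loading; mimics a description-length check."""
    descriptions = {"good-skill": "short", "bad-skill": "x" * 1025}
    if len(descriptions[name]) > 1024:
        raise ValueError("description must be at most 1024 characters")
    return name
```

In this sketch, server startup would call `build_skill_toolset` for the base skills without catching the exception, so a bad base skill stops the process, while `invoke(["bad-skill"], fake_loader)` returns the reason in its output.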

2. max_llm_calls, threaded into the runner's RunConfig

  • A harness default: harness.yaml → MAX_LLM_CALLS env → HarnessConfig.
  • A per-invocation override: run_agent_request.max_llm_calls.

Both flow into runner.run(run_config=RunConfig(max_llm_calls=...)). CLI:
veadk harness add --max-llm-calls N and veadk harness invoke --max-llm-calls N.
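
One way to picture how the two sources combine is the sketch below. It assumes the per-invocation value wins over the harness default, and that when neither is set the argument is omitted so ADK's `RunConfig` applies its own default. The helper names (`harness_default_from_env`, `resolve_max_llm_calls`, `run_config_kwargs`) are hypothetical and not taken from the PR.

```python
import os
from typing import Dict, Mapping, Optional


def harness_default_from_env(env: Optional[Mapping[str, str]] = None) -> Optional[int]:
    """Read the harness-wide default from MAX_LLM_CALLS, or None if unset."""
    env = os.environ if env is None else env
    raw = env.get("MAX_LLM_CALLS")
    if raw is None or raw.strip() == "":
        return None
    return int(raw)


def resolve_max_llm_calls(
    per_call: Optional[int], harness_default: Optional[int]
) -> Optional[int]:
    """Assumed precedence: per-invocation override first, then harness default."""
    return per_call if per_call is not None else harness_default


def run_config_kwargs(
    per_call: Optional[int], harness_default: Optional[int]
) -> Dict[str, int]:
    """Build keyword arguments for RunConfig, omitting the field when unset."""
    value = resolve_max_llm_calls(per_call, harness_default)
    return {} if value is None else {"max_llm_calls": value}
```

The resulting dict would be passed along as `RunConfig(**run_config_kwargs(...))`; for example, a per-call value of 2 with a harness default of 50 yields `{"max_llm_calls": 2}`, and leaving both unset yields `{}`.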

Verified locally

  • Override skill clawhub/heardlyapp/first-class-usps (description longer than
    1024 characters) → invoke returns
    Skill '...' failed to load: ... description must be at most 1024 characters.
  • Valid skill clawhub/harrylabsj/route-weaver → loads; the model calls
    load_skill and follows the skill workflow.
  • A bad base skill → server startup raises SkillLoadError.
  • max_llm_calls: per-call 2 / 7 and the harness default 50 all appear in
    the runner's run config; the per-invoke value takes effect dynamically.
  • ruff + pyright clean.

🤖 Generated with Claude Code

Skill loading no longer silently skips failures (which left the agent with no
skills while the model was told it had them). build_skill_toolset now fast-fails
with a SkillLoadError naming the skill and reason:
- a base skill that fails aborts server startup (deploy surfaces the bad config);
- a one-time override skill that fails is caught in /harness/invoke and its
  reason is returned to the caller (HTTP 200, in the response output).

Add max_llm_calls, threaded into the runner's RunConfig:
- a harness default (harness.yaml -> MAX_LLM_CALLS env -> HarnessConfig), and
- a per-invocation override via run_agent_request.max_llm_calls.
CLI: 'veadk harness add --max-llm-calls' and 'veadk harness invoke --max-llm-calls'.

Verified locally: bad skill returns its error (e.g. ADK's 'description must be at
most 1024 characters'); a valid skill (route-weaver) loads and the model uses it;
max_llm_calls=2/7 per call and the 50 default all show up in the run config.
…tests

max_llm_calls is now optional everywhere (HarnessConfig / HarnessApp): when
neither the harness default nor the per-call override is set, the runner uses
ADK RunConfig's own default (500) instead of a forced 100.

Update the contract tests to include the new max_llm_calls field on
HarnessConfig and RunAgentRequest.
@zakahan zakahan merged commit f79f7a2 into main Jun 15, 2026
16 checks passed