feat(harness): surface skill-load errors + per-invoke max_llm_calls #612
Merged
Skill loading no longer silently skips failures (which left the agent with no skills while the model was told it had them). build_skill_toolset now fast-fails with a SkillLoadError naming the skill and reason:
- a base skill that fails aborts server startup (deploy surfaces the bad config);
- a one-time override skill that fails is caught in /harness/invoke and its reason is returned to the caller (HTTP 200, in the response output).

Add max_llm_calls, threaded into the runner's RunConfig:
- a harness default (harness.yaml -> MAX_LLM_CALLS env -> HarnessConfig), and
- a per-invocation override via run_agent_request.max_llm_calls.

CLI: 'veadk harness add --max-llm-calls' and 'veadk harness invoke --max-llm-calls'.

Verified locally:
- a bad skill returns its error (e.g. ADK's 'description must be at most 1024 characters');
- a valid skill (route-weaver) loads and the model uses it;
- max_llm_calls=2/7 per call and the 50 default all show up in the run config.
…tests

max_llm_calls is now optional everywhere (HarnessConfig / HarnessApp): when neither the harness default nor the per-call override is set, the runner uses ADK RunConfig's own default (500) instead of a forced 100. Update the contract tests to include the new max_llm_calls field on HarnessConfig and RunAgentRequest.
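The precedence described in the two commit messages (per-call override, then harness default, then the runner's own default) could be sketched roughly as follows. This is only an illustration under assumptions: the helper names resolve_max_llm_calls and run_config_kwargs are hypothetical and do not come from the PR, and a plain dict stands in for ADK's RunConfig so the snippet runs without any dependencies.

```python
from typing import Optional


def resolve_max_llm_calls(
    per_call: Optional[int], harness_default: Optional[int]
) -> Optional[int]:
    """Pick the effective limit: per-call override first, then the harness default.

    Hypothetical helper. Returns None when neither value is set.
    """
    if per_call is not None:
        return per_call
    return harness_default


def run_config_kwargs(
    per_call: Optional[int], harness_default: Optional[int]
) -> dict:
    """Build keyword arguments for a RunConfig-like object (hypothetical helper)."""
    limit = resolve_max_llm_calls(per_call, harness_default)
    # Omit the key entirely when nothing is set, so the runner's own
    # default applies instead of a value forced by the harness.
    return {} if limit is None else {"max_llm_calls": limit}


print(run_config_kwargs(2, 50))       # {'max_llm_calls': 2}
print(run_config_kwargs(None, 50))    # {'max_llm_calls': 50}
print(run_config_kwargs(None, None))  # {}
```

Leaving the key out, rather than passing a sentinel, is one simple way to let the downstream default apply when both values are unset.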
zakahan
approved these changes
Jun 15, 2026
What
Two harness-server fixes/additions.

1. Skill-load errors are surfaced (not silently skipped)
Previously build_skill_toolset caught any per-skill failure and only logged a warning, leaving the agent with no skills while ADK's SkillToolset still told the model it had them, so the model answered "I can't find that skill." (Most commonly the skill failed ADK's Frontmatter validation, e.g. its description exceeds the hardcoded 1024-char limit.)

Now it fast-fails with a SkillLoadError naming the skill and the reason:
- a base skill (harness.yaml) that fails aborts server startup, so a bad config surfaces at deploy instead of shipping a skill-less runtime;
- a one-time override skill that fails is caught in /harness/invoke and its reason is returned to the caller (HTTP 200, in the response output).

2. max_llm_calls, threaded into the runner's RunConfig
- a harness default: harness.yaml → MAX_LLM_CALLS env → HarnessConfig.
- a per-invocation override via run_agent_request.max_llm_calls.

Both flow into runner.run(run_config=RunConfig(max_llm_calls=...)). CLI: veadk harness add --max-llm-calls N and veadk harness invoke --max-llm-calls N.

Verified locally
- clawhub/heardlyapp/first-class-usps (desc > 1024) → invoke returns Skill '...' failed to load: ... description must be at most 1024 characters.
- clawhub/harrylabsj/route-weaver → loads; the model calls load_skill and follows the skill workflow.
- … SkillLoadError.
- max_llm_calls: per-call 2/7 and the harness default 50 all appear in the runner's run config; the per-invoke value takes effect dynamically.
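To make the fast-fail behaviour concrete, here is a minimal, dependency-free sketch. It is not the PR's implementation: load_skills, invoke, and demo_loader are hypothetical stand-ins for build_skill_toolset, the /harness/invoke handler, and ADK's skill validation. Only the SkillLoadError name, the error message shape, and the HTTP 200 behaviour are taken from the description above.

```python
from typing import Callable, Dict, List


class SkillLoadError(Exception):
    """Raised when one skill cannot be loaded; names the skill and the reason."""

    def __init__(self, skill: str, reason: str) -> None:
        super().__init__(f"Skill '{skill}' failed to load: {reason}")
        self.skill = skill
        self.reason = reason


def load_skills(names: List[str], loader: Callable[[str], dict]) -> List[dict]:
    """Load every skill, stopping at the first failure instead of skipping it."""
    loaded = []
    for name in names:
        try:
            loaded.append(loader(name))
        except Exception as exc:  # any per-skill failure becomes a SkillLoadError
            raise SkillLoadError(name, str(exc)) from exc
    return loaded


def invoke(override_skills: List[str], loader: Callable[[str], dict]) -> Dict[str, object]:
    """Hypothetical invoke handler: report an override-skill failure to the caller."""
    try:
        skills = load_skills(override_skills, loader)
    except SkillLoadError as err:
        # The request itself still succeeds; the reason goes into the output.
        return {"status": 200, "output": str(err)}
    return {"status": 200, "output": f"loaded {len(skills)} skill(s)"}


def demo_loader(name: str) -> dict:
    """Toy loader that mimics a 1024-character description limit (assumption)."""
    description = "x" * 2000 if name == "too-long" else "short description"
    if len(description) > 1024:
        raise ValueError("description must be at most 1024 characters")
    return {"name": name, "description": description}


print(invoke(["too-long"], demo_loader)["output"])
print(invoke(["ok-skill"], demo_loader)["output"])
```

For base skills, the same load_skills call would run at startup without the surrounding try/except, so the SkillLoadError propagates and stops the server from starting.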
🤖 Generated with Claude Code