Skip to content

Refactor: separate package-format detection from classification and normalization strategy #782

@danielmeppiel

Description

@danielmeppiel

Follow-up from #781 (which fixed the immediate ordering bug behind #780).

Problem

detect_package_type() in src/apm_cli/models/validation.py is a positional-priority if/elif cascade that conflates three concerns:

  1. Evidence gathering -- did we see apm.yml, SKILL.md, hooks/*.json, plugin.json, agents/skills/commands/?
  2. Classification -- which PackageType enum value to assign.
  3. Normalization-strategy selection -- which downstream synthesizer to run (e.g. normalize_plugin_directory for MARKETPLACE_PLUGIN).

The cascade's first-match-wins semantics means every new bundled-format introduces a potential ordering bug. #780 was the latest one; the next one is just a matter of time.

Proposed shape

Move to a composition model:

  • FormatDetector interface, one impl per format: ApmYmlDetector, SkillMdDetector, HookJsonDetector, ClaudePluginDetector. Each returns Optional[FormatEvidence] independent of the others.
  • PackageFormatRegistry runs all detectors and produces a DetectionReport (set of evidences -- not a single PackageType).
  • NormalizationPlanner reads the report and decides which normalizers to run. Mixed packages become first-class (e.g. a Claude plugin that legitimately ships hooks runs both the plugin synthesizer AND the hooks pass-through, instead of being shoehorned into one classification).

The DetectionEvidence dataclass + gather_detection_evidence() helper landed in #781 are intentionally a step in this direction -- they decouple the evidence scan from classification so observability code can already use the richer payload.

Benefits

  • New formats are additive -- no ordering bugs possible.
  • Mixed packages are first-class (today's implicit "MARKETPLACE_PLUGIN happens to also handle hooks" becomes explicit).
  • The DetectionReport doubles as the observability payload (verbose detection traces, deploy-summary labels, near-miss warnings).
  • The classification logic stops being a positional puzzle.

Cost

Medium -- touches models/validation.py, install/sources.py, plus all consumers of PackageType (lockfile recording, install summary, drift detection). Should land behind a feature flag for one release if downstream test churn is high.

Out of scope for this issue

  • Tightening _has_hook_json to only count files matching the actual Claude hooks schema (today: any *.json in hooks/).
  • Introducing an explicit MIXED PackageType -- the DetectionReport model makes the enum redundant for mixed cases.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions