Commit e717787

Copilot, pelikhan, github-actions[bot], and claude authored
Skip non-frontmatter Markdown files during compile-all workflow discovery (#27387)
* Initial plan

* fix: skip non-frontmatter markdown files during workflow compilation
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3aba940b-5a8f-4680-be99-b6aedaf23188
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

* perf: tighten frontmatter prefix scan in workflow filter
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3aba940b-5a8f-4680-be99-b6aedaf23188
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

* style: refine frontmatter line extraction for markdown filter
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3aba940b-5a8f-4680-be99-b6aedaf23188
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

* test: tighten frontmatter filter tests and setup assertions
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3aba940b-5a8f-4680-be99-b6aedaf23188
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

* test: cover frontmatter filter edge cases
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/3aba940b-5a8f-4680-be99-b6aedaf23188
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

* docs(adr): add draft ADR-27387 for frontmatter-based markdown filtering
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: align frontmatter discovery with parser semantics
  Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d59aec7b-aba8-41e6-a13d-92a4ca0198b5
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent befcc0d commit e717787

6 files changed

Lines changed: 198 additions & 3 deletions
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
# ADR-27387: Filter Non-Frontmatter Markdown Files During compile-all Discovery

**Date**: 2026-04-20
**Status**: Draft
**Deciders**: pelikhan, Copilot

---

## Part 1 — Narrative (Human-Friendly)

### Context

The gh-aw `compile-all` command discovers workflow sources by listing every `*.md` file (excluding `README.md`) found in `.github/workflows/`. Some repositories legitimately place non-workflow documentation — notes, runbooks, guides — as Markdown files in that same directory. These files have no YAML frontmatter, so the compiler encounters them, fails to parse them, and emits noisy `no frontmatter found` errors. The result is inflated compatibility-failure counts and degraded signal-to-noise for developers running compile-all.

### Decision

We will add a `filterMarkdownFilesWithFrontmatter` step that runs immediately after Markdown file discovery and before any compilation attempt. The filter reads the first line of each candidate file and keeps only those whose first line is exactly `---` — the YAML frontmatter delimiter used by every gh-aw workflow. Files that are empty or whose first line is anything other than `---` are silently skipped. The filter is applied in both `compileAllWorkflowFiles` (in `compile_file_operations.go`) and `compileAllFilesInDirectory` (in `compile_pipeline.go`); the underlying `getMarkdownWorkflowFiles` listing function is left unchanged so that non-compile callers are not affected.

### Alternatives Considered

#### Alternative 1: Suppress the "no frontmatter" error in the parser

The parser could downgrade the `no frontmatter found` diagnostic from an error to a debug-level log, letting compile-all silently continue. This approach avoids the additional I/O of a pre-scan. However, it only masks the symptom: the compiler still attempts a full parse of every non-workflow file, wasting CPU, and the file-discovery boundary between "any Markdown" and "workflow Markdown" remains blurred. It was rejected because it papers over the root cause rather than establishing a clean separation.

#### Alternative 2: Enforce a separate directory for non-workflow docs

Repositories could be required to keep documentation Markdown in a directory other than `.github/workflows/`. This would make the directory semantics unambiguous and eliminate the need for a filter entirely. It was rejected because it is a breaking change for existing repositories that already co-locate docs and workflows, and it imposes a structural constraint on users that has no benefit beyond compile-time convenience.

#### Alternative 3: Name-based convention for workflow files

Workflow Markdown files could be required to follow a specific naming pattern (e.g., end with `-workflow.md`), and compile-all would only process matching files. This would make the filter a simple `filepath.Match` with no I/O. It was rejected because it would require renaming every existing workflow, making it a major backwards-incompatible change.
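
To make the rejected name-based alternative concrete, here is a minimal sketch of what such a filter might look like. The pattern `*-workflow.md` comes from the example in the paragraph above; the function name `matchWorkflowNames` and the sample paths are made up for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// workflowPattern is the hypothetical naming convention from Alternative 3.
const workflowPattern = "*-workflow.md"

// matchWorkflowNames keeps paths whose base name matches workflowPattern.
// No file is opened, which is the appeal of this alternative.
func matchWorkflowNames(paths []string) ([]string, error) {
	kept := make([]string, 0, len(paths))
	for _, p := range paths {
		ok, err := filepath.Match(workflowPattern, filepath.Base(p))
		if err != nil {
			return nil, fmt.Errorf("bad pattern %q: %w", workflowPattern, err)
		}
		if ok {
			kept = append(kept, p)
		}
	}
	return kept, nil
}

func main() {
	kept, err := matchWorkflowNames([]string{
		".github/workflows/triage-workflow.md",
		".github/workflows/triage.md", // an existing workflow that would need renaming
		".github/workflows/notes.md",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(kept) // prints [.github/workflows/triage-workflow.md]
}
```

The second path shows the backwards-compatibility cost: a workflow that compiles today would be ignored until it is renamed.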

### Consequences

#### Positive
- `compile-all` no longer generates false `no frontmatter found` errors for documentation files co-located with workflows.
- Compatibility-failure counts become more accurate, improving the signal value of compile-all output.
- The fix is surgically scoped: `getMarkdownWorkflowFiles` and all non-compile callers are untouched.

#### Negative
- A workflow file whose frontmatter block is accidentally missing or malformed (e.g., starts with a BOM or a blank line before `---`) will be silently skipped rather than producing a clear error, potentially hiding real authoring mistakes.
- The filter reads file contents for every discovered Markdown file before compilation begins, adding one extra `os.ReadFile` per file over the previous behavior.

#### Neutral
- The first-line check is intentionally strict (`bytes.Equal(firstLine, []byte("---"))`); any leading whitespace or UTF-8 BOM before `---` will cause the file to be skipped. This is consistent with how YAML frontmatter is defined in the gh-aw spec.
- Both compile pipelines now share a single filtering function, reducing future drift risk.
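
The strict check described in the Neutral section and the check in the committed Go code (`strings.TrimSpace(firstLine) != "---"` in `pkg/cli/workflows.go`) do not accept the same inputs. The sketch below compares the two on a few edge cases. The function names `strictOpener` and `trimmedOpener` are invented for this comparison, and `firstLine` is a simplified stand-in for reading up to the first newline.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// firstLine returns the text before the first '\n', or all of it if there is none.
func firstLine(content string) string {
	line, _, _ := strings.Cut(content, "\n")
	return line
}

// strictOpener follows the ADR wording: the first line must be exactly "---".
func strictOpener(content string) bool {
	return bytes.Equal([]byte(firstLine(content)), []byte("---"))
}

// trimmedOpener follows the committed code: surrounding whitespace is ignored.
func trimmedOpener(content string) bool {
	return strings.TrimSpace(firstLine(content)) == "---"
}

func main() {
	samples := []string{
		"---\non: push\n",       // plain LF opener
		"---\r\non: push\r\n",   // CRLF line endings
		" ---\non: push\n",      // leading space
		"\ufeff---\non: push\n", // UTF-8 byte order mark
		"\n---\non: push\n",     // blank first line
	}
	for _, s := range samples {
		fmt.Printf("%-12q strict=%-5t trimmed=%t\n",
			firstLine(s), strictOpener(s), trimmedOpener(s))
	}
}
```

Under these assumptions the two checks disagree on the CRLF and leading-space cases, which matches the `workflow-crlf.md` and `leading-whitespace.md` fixtures in the tests added by this commit. Both reject the BOM and blank-first-line cases, because `strings.TrimSpace` does not treat U+FEFF as whitespace.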

---

## Part 2 — Normative Specification (RFC 2119)

> The key words **MUST**, **MUST NOT**, **REQUIRED**, **SHALL**, **SHALL NOT**, **SHOULD**, **SHOULD NOT**, **RECOMMENDED**, **MAY**, and **OPTIONAL** in this section are to be interpreted as described in [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119).

### Markdown File Discovery

1. Implementations **MUST** apply the frontmatter filter to the list of Markdown files produced by `getMarkdownWorkflowFiles` before passing any file to the compiler.
2. Implementations **MUST** retain a Markdown file for compilation if and only if the file's first line (the bytes before the first `\n`) is exactly the three-byte sequence `---`.
3. Implementations **MUST NOT** pass a Markdown file to the compiler when the file is empty (zero bytes).
4. Implementations **MUST NOT** pass a Markdown file to the compiler when its first line contains any bytes other than `---` (including leading whitespace or BOM characters).
5. Implementations **SHOULD** emit a debug-level log entry naming each skipped file so that developers can diagnose unexpected omissions.
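
One way to read rules 2 through 5 together is as a function that either keeps a file or returns a reason for skipping it, which the caller then logs. The sketch below is an assumption-laden illustration: `skipReason`, its reason strings, and the `debug:` logger prefix are all invented here, and the standard `log` package stands in for whatever debug logger gh-aw actually uses.

```go
package main

import (
	"bytes"
	"log"
	"os"
)

// skipReason explains why a candidate is dropped; "" means the file is kept.
// Illustrative only: the reasons and the name are not taken from gh-aw.
func skipReason(content []byte) string {
	if len(content) == 0 {
		return "file is empty" // rule 3
	}
	firstLine, _, _ := bytes.Cut(content, []byte("\n"))
	if !bytes.Equal(firstLine, []byte("---")) {
		return "first line is not exactly ---" // rules 2 and 4
	}
	return ""
}

func main() {
	debug := log.New(os.Stderr, "debug: ", 0)
	candidates := []struct{ name, content string }{
		{"valid.md", "---\non: push\n---\n# Valid Workflow"},
		{"docs.md", "# Documentation"},
		{"empty.md", ""},
	}
	for _, c := range candidates {
		if reason := skipReason([]byte(c.content)); reason != "" {
			debug.Printf("skipping %s: %s", c.name, reason) // rule 5
		}
	}
}
```

Returning the reason rather than logging inside the check keeps the predicate free of side effects, which makes it straightforward to unit test.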

### Compile Pipeline Integration

1. Implementations **MUST** apply the frontmatter filter in every code path that calls `getMarkdownWorkflowFiles` and subsequently compiles the resulting files.
2. Implementations **MUST NOT** modify `getMarkdownWorkflowFiles` to incorporate the filter; the filter **MUST** remain a separate, composable step.
3. Implementations **MAY** cache file-read results within a single compile-all invocation to avoid reading the same file twice if the filter and the compiler would otherwise both read it.
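
The optional caching in item 3 is not implemented in this commit. If someone wanted to try it, a per-invocation memoizing reader might look roughly like the following. The `readCache` type and its methods are hypothetical, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// readCache memoizes file contents for the lifetime of one compile-all run.
// Hypothetical type: nothing like it appears in the commit.
type readCache struct {
	read  func(path string) ([]byte, error)
	files map[string][]byte
}

func newReadCache(read func(string) ([]byte, error)) *readCache {
	return &readCache{read: read, files: make(map[string][]byte)}
}

// Get returns the cached content for path, reading it on first use.
func (c *readCache) Get(path string) ([]byte, error) {
	if data, ok := c.files[path]; ok {
		return data, nil
	}
	data, err := c.read(path)
	if err != nil {
		return nil, err // errors are not cached, so a retry reads again
	}
	c.files[path] = data
	return data, nil
}

func main() {
	underlyingReads := 0
	// A fake reader keeps the sketch off the disk; os.ReadFile has the
	// same signature and could be passed in its place.
	cache := newReadCache(func(path string) ([]byte, error) {
		underlyingReads++
		return []byte("---\non: push\n---\n# Workflow"), nil
	})
	for i := 0; i < 2; i++ { // once for the filter, once for the compiler
		if _, err := cache.Get("valid.md"); err != nil {
			panic(err)
		}
	}
	fmt.Println("underlying reads:", underlyingReads) // prints "underlying reads: 1"
}
```

Whether this is worthwhile depends on file sizes: the committed filter reads only the first line, so the duplicated I/O it would save is small.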

### Error Handling

1. Implementations **MUST** propagate any `os.ReadFile` error that occurs during filtering as a compile-time error; they **MUST NOT** silently skip a file whose content cannot be read due to an I/O error.

### Conformance

An implementation is considered conformant with this ADR if it satisfies all **MUST** and **MUST NOT** requirements above. Failure to meet any **MUST** or **MUST NOT** requirement constitutes non-conformance.

---

*This is a DRAFT ADR generated by the [Design Decision Gate](https://github.com/github/gh-aw/actions/runs/24679557357) workflow. The PR author must review, complete, and finalize this document before the PR can merge.*

pkg/cli/commands_file_watching_test.go

Lines changed: 33 additions & 0 deletions
@@ -15,6 +15,8 @@ import (
 
 	"github.com/github/gh-aw/pkg/testutil"
 	"github.com/github/gh-aw/pkg/workflow"
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
 )
 
 // TestWatchAndCompileWorkflows tests the watchAndCompileWorkflows function
@@ -189,6 +191,37 @@ func TestCompileAllWorkflowFiles(t *testing.T) {
 		}
 	})
 
+	t.Run("compile all skips markdown files without frontmatter", func(t *testing.T) {
+		tempDir := testutil.TempDir(t, "test-*")
+		workflowsDir := filepath.Join(tempDir, ".github/workflows")
+		err := os.MkdirAll(workflowsDir, 0o755)
+		require.NoError(t, err)
+
+		validWorkflow := filepath.Join(workflowsDir, "valid.md")
+		validContent := "---\non: push\nengine: claude\n---\n# Valid Workflow\n\nContent"
+		err = os.WriteFile(validWorkflow, []byte(validContent), 0o644)
+		require.NoError(t, err)
+
+		docsFile := filepath.Join(workflowsDir, "docs.md")
+		err = os.WriteFile(docsFile, []byte("# Documentation\n\nNo frontmatter here."), 0o644)
+		require.NoError(t, err)
+
+		compiler := workflow.NewCompiler()
+		stats, err := compileAllWorkflowFiles(compiler, workflowsDir, false)
+		if err != nil {
+			t.Fatalf("compileAllWorkflowFiles failed: %v", err)
+		}
+
+		assert.Equal(t, 1, stats.Total, "Should compile only frontmatter-based markdown workflows")
+		assert.Equal(t, 0, stats.Errors, "Valid workflow should compile without errors")
+
+		validLockFile := filepath.Join(workflowsDir, "valid.lock.yml")
+		assert.FileExists(t, validLockFile, "Expected lock file for valid workflow")
+
+		docsLockFile := filepath.Join(workflowsDir, "docs.lock.yml")
+		assert.NoFileExists(t, docsLockFile, "Should not emit lock file for documentation markdown without frontmatter")
+	})
+
 	t.Run("compile all handles glob error", func(t *testing.T) {
 		// Use a malformed glob pattern that will cause filepath.Glob to error
 		invalidDir := "/tmp/gh-aw/[invalid"

pkg/cli/compile_file_operations.go

Lines changed: 6 additions & 2 deletions
@@ -125,10 +125,14 @@ func compileAllWorkflowFiles(compiler *workflow.Compiler, workflowsDir string, v
 	if err != nil {
 		return stats, fmt.Errorf("failed to find markdown files: %w", err)
 	}
+	mdFiles, err = filterMarkdownFilesWithFrontmatter(mdFiles)
+	if err != nil {
+		return stats, fmt.Errorf("failed to filter markdown files: %w", err)
+	}
 	if len(mdFiles) == 0 {
-		compileHelpersLog.Printf("No markdown files found in %s", workflowsDir)
+		compileHelpersLog.Printf("No workflow markdown files found in %s after frontmatter filtering", workflowsDir)
 		if verbose {
-			fmt.Fprintln(os.Stderr, console.FormatInfoMessage("No markdown files found in "+workflowsDir))
+			fmt.Fprintln(os.Stderr, console.FormatInfoMessage("No workflow markdown files found in "+workflowsDir+" (workflow files must start with a frontmatter opener on the first line)"))
 		}
 		return stats, nil
 	}

pkg/cli/compile_pipeline.go

Lines changed: 5 additions & 1 deletion
@@ -233,9 +233,13 @@ func compileAllFilesInDirectory(
 	if err != nil {
 		return nil, fmt.Errorf("failed to find markdown files: %w", err)
 	}
+	mdFiles, err = filterMarkdownFilesWithFrontmatter(mdFiles)
+	if err != nil {
+		return nil, fmt.Errorf("failed to filter markdown files: %w", err)
+	}
 
 	if len(mdFiles) == 0 {
-		return nil, fmt.Errorf("no markdown files found in %s", workflowsDir)
+		return nil, fmt.Errorf("no workflow markdown files found in %s (workflow files must start with a frontmatter opener on the first line)", workflowsDir)
 	}
 
 	compileOrchestrationLog.Printf("Found %d markdown files to compile", len(mdFiles))

pkg/cli/workflows.go

Lines changed: 37 additions & 0 deletions
@@ -1,9 +1,11 @@
 package cli
 
 import (
+	"bufio"
 	"encoding/json"
 	"errors"
 	"fmt"
+	"io"
 	"os"
 	"os/exec"
 	"path/filepath"
@@ -288,6 +290,41 @@ func getMarkdownWorkflowFiles(workflowDir string) ([]string, error) {
 	return mdFiles, nil
 }
 
+// filterMarkdownFilesWithFrontmatter keeps only markdown files that begin with frontmatter.
+func filterMarkdownFilesWithFrontmatter(mdFiles []string) ([]string, error) {
+	workflowFiles := make([]string, 0, len(mdFiles))
+	for _, file := range mdFiles {
+		fd, err := os.Open(file)
+		if err != nil {
+			return nil, fmt.Errorf("failed to read workflow file %s: %w", file, err)
+		}
+
+		reader := bufio.NewReader(fd)
+		firstLine, readErr := reader.ReadString('\n')
+		closeErr := fd.Close()
+		if closeErr != nil {
+			return nil, fmt.Errorf("failed to close workflow file %s: %w", file, closeErr)
+		}
+		if readErr != nil && !errors.Is(readErr, io.EOF) {
+			return nil, fmt.Errorf("failed to read workflow file %s: %w", file, readErr)
+		}
+
+		if firstLine == "" {
+			workflowsLog.Printf("Skipping empty markdown file: %s", file)
+			continue
+		}
+
+		if strings.TrimSpace(firstLine) != "---" {
+			workflowsLog.Printf("Skipping markdown file without frontmatter: %s", file)
+			continue
+		}
+
+		workflowFiles = append(workflowFiles, file)
+	}
+
+	return workflowFiles, nil
+}
+
 // fastParseTitle scans markdown content for the first H1 header, skipping an
 // optional frontmatter block, without performing a full YAML parse.
 //

pkg/cli/workflows_test.go

Lines changed: 39 additions & 0 deletions
@@ -8,6 +8,7 @@ import (
 	"testing"
 
 	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
 )
 
 func TestIsWorkflowFile(t *testing.T) {
@@ -235,3 +236,41 @@ func TestGetMarkdownWorkflowFilesExcludesREADME(t *testing.T) {
 	// Verify total count
 	assert.Len(t, files, 5, "Should have exactly 5 workflow files (excluding README variants)")
 }
+
+func TestFilterMarkdownFilesWithFrontmatter(t *testing.T) {
+	tempDir := t.TempDir()
+	workflowsDir := filepath.Join(tempDir, ".github", "workflows")
+	err := os.MkdirAll(workflowsDir, 0o755)
+	require.NoError(t, err)
+
+	testFiles := map[string]string{
+		"workflow1.md": "---\non: push\n---\n# Workflow 1",
+		"workflow-crlf.md": "---\r\non: push\r\n---\r\n# Workflow CRLF",
+		"docs.md": "# This is documentation",
+		"empty.md": "",
+		"leading-whitespace.md": " ---\non: push\n---\n# Valid Frontmatter Start",
+		"delimiter-not-first.md": "# Header\n---\non: push\n---\n# Not Valid Frontmatter Start",
+	}
+
+	for filename, content := range testFiles {
+		path := filepath.Join(workflowsDir, filename)
+		err := os.WriteFile(path, []byte(content), 0o644)
+		require.NoError(t, err)
+	}
+
+	inputFiles := []string{
+		filepath.Join(workflowsDir, "workflow1.md"),
+		filepath.Join(workflowsDir, "workflow-crlf.md"),
+		filepath.Join(workflowsDir, "docs.md"),
+		filepath.Join(workflowsDir, "empty.md"),
+		filepath.Join(workflowsDir, "leading-whitespace.md"),
+		filepath.Join(workflowsDir, "delimiter-not-first.md"),
+	}
+
+	filtered, err := filterMarkdownFilesWithFrontmatter(inputFiles)
+	require.NoError(t, err)
+	assert.Len(t, filtered, 3)
+	assert.Contains(t, filtered, filepath.Join(workflowsDir, "workflow1.md"))
+	assert.Contains(t, filtered, filepath.Join(workflowsDir, "workflow-crlf.md"))
+	assert.Contains(t, filtered, filepath.Join(workflowsDir, "leading-whitespace.md"))
+}
