|
1 | | -# Kusto tooling |
| 1 | +# Metrics |
2 | 2 |
|
3 | | -CLI tools to fetch data from the Kusto API. |
| 3 | +The metrics subject provides CLI tools for fetching analytics data from Kusto (Azure Data Explorer) about GitHub Docs usage. These tools help content strategists, writers, and engineers understand page performance, user behavior, and content effectiveness. |
4 | 4 |
|
5 | | -## Installation and authentication |
| 5 | +## Purpose & Scope |
6 | 6 |
|
7 | | -1. Install the Azure CLI with `brew install azure-cli`. |
8 | | - * If you have the option to **not** update all your brew packages, choose that, or it will take a really long time. |
9 | | -1. Run `az login`. |
10 | | - * You'll have to run `az login` whenever your session expires. The sessions are fairly long lasting. |
11 | | -1. Enter your `<username>@githubazure.com` credentials. |
12 | | - * These will get cached for future logins. |
13 | | -1. At the prompt in Terminal asking which subscription you want to use, just press Enter to choose the default. |
14 | | -1. Open or create an `.env` file in the root directory of your checkout (this file is already in `.gitignore` so it won't be tracked by Git). |
15 | | -1. Add the `KUSTO_CLUSTER` and `KUSTO_DATABASE` values to the `.env` (_these values are pinned in slack_): |
16 | | - ``` |
17 | | - KUSTO_CLUSTER='<value>' |
18 | | - KUSTO_DATABASE='<value>' |
19 | | - ``` |
| 7 | +This subject is responsible for: |
| 8 | +- Providing CLI tools to query Kusto for docs analytics |
| 9 | +- `docstat` - Get metrics for a single URL (views, users, bounces, etc.) |
| 10 | +- `docsaudit` - Get metrics for an entire content directory |
| 11 | +- Kusto query abstractions for common metrics |
| 12 | +- Authentication and connection to Azure Kusto |
| 13 | +- Date range calculations for time-series queries |
20 | 14 |
|
21 | | -## docstat usage |
| 15 | +## Architecture & Key Assets |
22 | 16 |
|
23 | | -Run `npm run docstat -- <URL>` on any GitHub Docs URL to gather a set of default metrics about it, including 30d views, users, view duration, bounces, helpfulness score, and exits to support. |
| 17 | +### Key capabilities and their locations |
24 | 18 |
|
25 | | -Notes: |
26 | | -* If the URL doesn't include a version, `docstat` will return data that includes **all versions** (so FPT, Cloud, Server, etc.). |
27 | | - * If you want data for FPT only, pass the `--fptOnly` option. |
28 | | -* `docstat` only accepts URLs with an `en` language code or no language code, and it only fetches English data. |
| 19 | +- `lib/kusto-client.ts` - `getKustoClient()`: Creates authenticated Kusto client using Azure CLI |
| 20 | +- `lib/kusto-client.ts` - `runQuery()`: Executes Kusto queries and returns results |
| 21 | +- `scripts/docstat.ts` - CLI tool: Fetches metrics for a single docs URL |
| 22 | +- `scripts/docsaudit.ts` - CLI tool: Audits entire content directories with CSV output |
| 23 | +- `queries/*.ts` - Pre-defined Kusto queries for specific metrics |
29 | 24 |
|
30 | | -To see all the options: |
31 | | -``` |
32 | | -npm run docstat -- --help |
33 | | -``` |
34 | | -You can combine options like this: |
35 | | -``` |
36 | | -npm run docstat -- https://docs.github.com/copilot/tutorials/modernize-legacy-code --compare --range 60 |
37 | | -``` |
38 | | -Use `--redirects` to include `redirect_from` frontmatter paths in the queries (this is helpful if the article may have moved recently): |
39 | | -``` |
40 | | -npm run docstat -- https://docs.github.com/copilot/tutorials/modernize-legacy-code --redirects |
41 | | -``` |
42 | | -Use the `--json` (or `-j`) option to output JSON: |
43 | | -``` |
44 | | -npm run docstat -- https://docs.github.com/copilot/tutorials/modernize-legacy-code --json |
45 | | -``` |
46 | | -If you want to pass the results of the JSON to `jq`, you need to use `silent` mode: |
47 | | -``` |
48 | | -npm run --silent docstat -- https://docs.github.com/copilot/tutorials/modernize-legacy-code --json | jq .data.users |
49 | | -``` |
| 25 | +## Setup & Usage |
| 26 | + |
| 27 | +### Installation and authentication |
| 28 | + |
| 29 | +1. Install Azure CLI: |
| 30 | + ```bash |
| 31 | + brew install azure-cli |
| 32 | + ``` |
| 33 | + |
| 34 | +2. Log in with your Azure credentials:
| 35 | + ```bash |
| 36 | + az login |
| 37 | + ``` |
| 38 | + Use your `<username>@githubazure.com` credentials. |
50 | 39 |
|
51 | | -## docsaudit usage |
| 40 | +3. Add Kusto configuration to `.env` file (values pinned in Slack): |
| 41 | + ``` |
| 42 | + KUSTO_CLUSTER='<value>' |
| 43 | + KUSTO_DATABASE='<value>' |
| 44 | + ``` |
52 | 45 |
|
53 | | -Run `npm run docsaudit` on a top-level content directory to gather data about its files—including title, path, versions, 30d views, and 30d users—and output it to a CSV file. |
| 46 | +### docstat usage |
54 | 47 |
|
55 | | -To see all the options: |
| 48 | +Get metrics for a single URL: |
| 49 | + |
| 50 | +```bash |
| 51 | +npm run docstat -- <URL> |
56 | 52 | ``` |
57 | | -npm run docsaudit -- --help |
| 53 | + |
| 54 | +Example: |
| 55 | +```bash |
| 56 | +npm run docstat -- https://docs.github.com/copilot/tutorials/modernize-legacy-code |
58 | 57 | ``` |
59 | | -Run the script on any top-level content directory: |
| 58 | + |
| 59 | +Default metrics returned: |
| 60 | +- 30-day views |
| 61 | +- 30-day unique users |
| 62 | +- Average view duration |
| 63 | +- Bounce rate |
| 64 | +- Helpfulness score (survey data) |
| 65 | +- Exits to support |
| 66 | + |
| 67 | +#### Options |
| 68 | + |
| 69 | +```bash |
| 70 | +# Compare with previous period |
| 71 | +npm run docstat -- <URL> --compare |
| 72 | + |
| 73 | +# Custom date range (60 days) |
| 74 | +npm run docstat -- <URL> --range 60 |
| 75 | + |
| 76 | +# Include redirects from frontmatter |
| 77 | +npm run docstat -- <URL> --redirects |
| 78 | + |
| 79 | +# FPT data only (default includes all versions) |
| 80 | +npm run docstat -- <URL> --fptOnly |
| 81 | + |
| 82 | +# JSON output |
| 83 | +npm run docstat -- <URL> --json |
| 84 | + |
| 85 | +# Combine options |
| 86 | +npm run docstat -- <URL> --compare --range 60 --redirects |
60 | 87 | ``` |
61 | | -npm run docsaudit -- <content directory name> |
| 88 | + |
| 89 | +#### JSON output with jq |
| 90 | + |
| 91 | +```bash |
| 92 | +npm run --silent docstat -- <URL> --json | jq .data.users |
62 | 93 | ``` |
63 | | -For example: |
| 94 | + |
| 95 | +### docsaudit usage |
| 96 | + |
| 97 | +Audit an entire content directory: |
| 98 | + |
| 99 | +```bash |
| 100 | +npm run docsaudit -- <content-directory> |
64 | 101 | ``` |
| 102 | + |
| 103 | +Example: |
| 104 | +```bash |
65 | 105 | npm run docsaudit -- actions |
66 | 106 | ``` |
67 | 107 |
|
68 | | -## Future development |
| 108 | +Output includes: |
| 109 | +- Title |
| 110 | +- Path |
| 111 | +- Versions |
| 112 | +- 30-day views |
| 113 | +- 30-day unique users |
| 114 | + |
| 115 | +Results are saved to a CSV file in the project root. |
| 116 | + |
| 117 | +## Data & External Dependencies |
| 118 | + |
| 119 | +### Data sources |
| 120 | +- Kusto (Azure Data Explorer) - GitHub's data warehouse for analytics |
| 121 | +- Docs event data - Page views, user interactions, surveys |
| 122 | +- Content frontmatter - For path resolution and redirect detection |
| 123 | + |
| 124 | +### Dependencies |
| 125 | +- `azure-kusto-data` - Official Azure Kusto SDK |
| 126 | +- Azure CLI - For authentication (`az login`) |
| 127 | +- Environment variables: `KUSTO_CLUSTER`, `KUSTO_DATABASE` |
| 128 | + |
| 129 | +### Authentication |
| 130 | +- Uses Azure CLI identity via `withAzLoginIdentity()` |
| 131 | +- Sessions are long-lasting but expire periodically |
| 132 | +- Re-run `az login` when session expires |
| 133 | + |
| 134 | +### Queries |
| 135 | +Pre-defined queries in `queries/` directory: |
| 136 | +- `views.ts` - Total page views |
| 137 | +- `users.ts` - Unique users |
| 138 | +- `view-duration.ts` - Average view duration
| 139 | +- `bounces.ts` - Percentage of single-page sessions |
| 140 | +- `survey-score.ts` - Helpfulness rating from surveys |
| 141 | +- `exits-to-support.ts` - Clicks on support links |
| 142 | + |
| 143 | +## Cross-links & Ownership |
| 144 | + |
| 145 | +### Related subjects |
| 146 | +- [`src/events`](../events/README.md) - Source of analytics event data |
| 147 | +- [`src/frame`](../frame/README.md) - Frontmatter reading for path resolution |
| 148 | +- Kusto database - Contains aggregated event data |
| 149 | + |
| 150 | +### Internal documentation |
| 151 | +For Kusto cluster details and database schema, see internal Docs Engineering documentation. Credentials are pinned in the #docs-engineering Slack channel. |
| 152 | + |
| 153 | +### Ownership |
| 154 | +- Team: Docs Content (with engineering support and reviews) |
| 155 | +- Data questions: #docs-data |
| 156 | + |
| 157 | +## Current State & Next Steps |
| 158 | + |
| 159 | +### Known limitations |
| 160 | +- The date range option (`--range <number>`) only sets the start date, `<number>` days ago; the end date is always the current date
| 161 | +- Only English (`en`) language data is supported |
| 162 | +- Queries are hardcoded in `queries/` directory |
| 163 | +- URLs without version include all versions (FPT, GHEC, GHES combined) |
| 164 | + |
| 165 | +### Metrics available |
| 166 | +Current metrics: |
| 167 | +- Views (page view count) |
| 168 | +- Users (unique user count) |
| 169 | +- View duration (average time on page) |
| 170 | +- Bounces (single-page sessions) |
| 171 | +- Survey score (helpfulness rating) |
| 172 | +- Exits to support (support link clicks) |
| 173 | + |
| 174 | +### Adding a new query |
| 175 | + |
| 176 | +1. Create a new file in `src/metrics/queries/`
| 177 | +2. Export a function that returns a Kusto query string
| 178 | +3. Import and call it in `docstat.ts` or `docsaudit.ts`
| 179 | +4. Update the CLI options if needed
| 180 | + |
| 181 | +Example: |
| 182 | +```typescript |
| 183 | +// queries/my-metric.ts |
| 184 | +export function getMyMetric(path: string, startDate: string, endDate: string): string { |
| 185 | + return ` |
| 186 | + PageViews |
| 187 | + | where Timestamp between (datetime(${startDate}) .. datetime(${endDate})) |
| 188 | + | where Path == "${path}" |
| 189 | + | summarize Count = count() |
| 190 | + ` |
| 191 | +} |
| 192 | +``` |
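As a rough illustration of how a query builder like the one above might be driven, the sketch below derives a start and end date from a `--range`-style day count and passes them to the builder. It assumes dates are passed as `YYYY-MM-DD` strings. The helpers `getDateRange` and `escapeKustoString` are hypothetical and do not come from the repository; the `PageViews` table and its columns are copied from the example above. Escaping the path is an added precaution, based on Kusto string literals using backslash escapes.

```typescript
// Hypothetical helpers for illustration; none of these names come from the repository.

// Compute a date range that ends at `now` and starts `days` days earlier.
function getDateRange(
  days: number,
  now: Date = new Date(),
): { startDate: string; endDate: string } {
  const msPerDay = 24 * 60 * 60 * 1000
  const start = new Date(now.getTime() - days * msPerDay)
  // Keep only the YYYY-MM-DD part (UTC) so the value fits inside datetime(...).
  const toDay = (d: Date): string => d.toISOString().slice(0, 10)
  return { startDate: toDay(start), endDate: toDay(now) }
}

// Escape backslashes and double quotes so a path cannot end the string literal early.
function escapeKustoString(value: string): string {
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"')
}

// Same shape as the example above, with the path escaped before interpolation.
function getMyMetric(path: string, startDate: string, endDate: string): string {
  return `
    PageViews
    | where Timestamp between (datetime(${startDate}) .. datetime(${endDate}))
    | where Path == "${escapeKustoString(path)}"
    | summarize Count = count()
  `
}

// Build a query covering the 30 days before a fixed reference date.
const { startDate, endDate } = getDateRange(30, new Date('2024-03-31T12:00:00Z'))
console.log(getMyMetric('/copilot/tutorials/modernize-legacy-code', startDate, endDate))
```

Passing `now` as a parameter keeps the date calculation deterministic, which makes it easier to test than a function that always reads the system clock.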
| 193 | + |
| 194 | +### Troubleshooting |
| 195 | + |
| 196 | +**Azure login expired:** |
| 197 | +```bash |
| 198 | +az login |
| 199 | +``` |
69 | 200 |
|
70 | | -Applies to all scripts: |
| 201 | +**Missing environment variables:** |
| 202 | +Check that the `.env` file defines both `KUSTO_CLUSTER` and `KUSTO_DATABASE` (the values are pinned in Slack).
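A check like this could also be scripted. The following is a minimal sketch, not the tooling's actual validation: the function name `findMissingKustoEnv` is hypothetical, and the `'<value>'` string is the same placeholder used in the setup instructions.

```typescript
// Hypothetical helper; the real scripts may validate configuration differently.
const REQUIRED_KUSTO_VARS = ['KUSTO_CLUSTER', 'KUSTO_DATABASE'] as const

// Return the names of required variables that are unset or blank.
function findMissingKustoEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_KUSTO_VARS.filter((name) => (env[name] ?? '').trim() === '')
}

// Example with only one of the two variables set.
const missing = findMissingKustoEnv({ KUSTO_CLUSTER: '<value>' })
if (missing.length > 0) {
  console.log(`Missing environment variables: ${missing.join(', ')}`)
}
```

In a real script, the argument would typically be `process.env` after the `.env` file has been loaded.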
71 | 203 |
|
72 | | -* The date range option only accepts a start date (via `-r <number>`, where the number means "`<number>` days ago"). The end date will always be the current date. |
73 | | - * In the future, we can add an option to set a custom end date. |
| 204 | +**No data found:** |
| 205 | +- Verify URL is correct and includes `https://docs.github.com` |
| 206 | +- Check date range (older content may have limited data) |
| 207 | +- Try `--redirects` if article was recently moved |
74 | 208 |
|
75 | | -* The only Kusto queries available are hardcoded in the `kusto/queries` directory. |
76 | | - * In the future, we can hardcode more queries, add the ability to send custom queries, or perhaps create pre-defined sets of queries. |
| 209 | +**Permission errors:** |
| 210 | +Ensure your Azure account has read access to the Kusto database. Contact #docs-data if needed. |