|
1 | 1 | # Events |
2 | 2 |
|
3 | | -We record events from the browser into our data pipeline to aggregate anonymous data about how folks are using the Docs. |
| 3 | +The events subject handles client-side analytics: it records user interactions in the browser and sends them to GitHub's data pipeline. The data is anonymous and helps us understand how people use docs.github.com and where the docs need improvement. |
4 | 4 |
|
5 | | -## Why events |
| 5 | +## Purpose & Scope |
6 | 6 |
|
7 | | -Data helps us to understand where our Docs are successful, and where we need to improve. |
| 7 | +This subject is responsible for: |
| 8 | +- Recording browser events (page views, clicks, searches, surveys, etc.) |
| 9 | +- Validating event data against JSON schemas |
| 10 | +- Sending events to Hydro (GitHub's data warehouse) |
| 11 | +- Analyzing survey comments with sentiment analysis |
| 12 | +- Providing React components for event tracking |
| 13 | +- Exposing the server-side event endpoint (`POST /events`) |
8 | 14 |
|
9 | | -## How to view events |
| 15 | +## Architecture & Key Assets |
10 | 16 |
|
11 | | -1. We send a `POST /events` request from the browser. |
12 | | -2. Any data sent we check against our JSON schema. |
13 | | -3. After passing the schema check, we send the data along the path to the warehouse. |
| 17 | +### Key capabilities and their locations |
14 | 18 |
|
15 | | -## How to work on event |
| 19 | +- `middleware.ts` - Express router handling `POST /events` endpoint, validates and publishes events |
| 20 | +- `lib/schema.ts` - JSON Schema definitions for all event types using AJV validation |
| 21 | +- `components/events.ts` - Client-side utilities for sending events from the browser |
16 | 22 |
|
17 | | -When adding or changing properties, make sure to update the schema in both the JS file as well as the schema for the warehouse. |
| 23 | +## Setup & Usage |
18 | 24 |
|
19 | | -## How to get help for events |
| 25 | +### Event flow |
| 26 | + |
| 27 | +1. Browser sends `POST /events` request with event data |
| 28 | +2. Middleware validates against JSON schema |
| 29 | +3. If valid, event is sent to Hydro data warehouse |
| 30 | +4. If invalid, the event itself is not forwarded to the warehouse; the validation error is logged instead |
| 31 | + |
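The event flow above can be sketched as a single validate-then-route function. This is a minimal illustration, not the code in `middleware.ts`: the names `EventBody`, `Validator`, `Sinks`, `handleEvent`, and `requireLinkUrl` are all hypothetical, and the validator is a hand-written stand-in for the AJV-compiled schema check.

```typescript
// Hypothetical shapes for illustration; none of these names come from the repository.
type EventBody = { type: string; [key: string]: unknown }

// A validator returns a list of error messages; an empty list means "valid".
type Validator = (body: EventBody) => string[]

interface Sinks {
  publish: (body: EventBody) => void // stand-in for the Hydro client
  logError: (errors: string[]) => void // stand-in for the error logger
}

// Steps 2-4 of the flow: validate, then either publish the event or log the errors.
function handleEvent(
  body: EventBody,
  validate: Validator,
  sinks: Sinks,
): 'published' | 'rejected' {
  const errors = validate(body)
  if (errors.length > 0) {
    sinks.logError(errors)
    return 'rejected'
  }
  sinks.publish(body)
  return 'published'
}

// Toy validator (assumption): a `link` event must carry a string `link_url`.
const requireLinkUrl: Validator = (body) =>
  body.type === 'link' && typeof body.link_url !== 'string'
    ? ['link_url is required']
    : []
```

Injecting the validator and the sinks keeps the routing logic testable without a network connection or a real schema.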
| 32 | +### Event types |
| 33 | + |
| 34 | +Supported event types (see `EventType` enum): |
| 35 | +- `page` - Page view |
| 36 | +- `exit` - User leaving page |
| 37 | +- `link` - Link click |
| 38 | +- `search` - Search query |
| 39 | +- `survey` - Survey response |
| 40 | + |
| 41 | +### Sending events from the browser |
| 42 | + |
| 43 | +```typescript |
| 44 | +import { sendEvent } from '@/events/components/events' |
| 45 | + |
| 46 | +sendEvent({ |
| 47 | + type: 'link', |
| 48 | + link_url: 'https://example.com', |
| 49 | +}) |
| 50 | +``` |
| 51 | + |
| 52 | +### Event schema structure |
| 53 | + |
| 54 | +All events require a `context` object with: |
| 55 | +- `event_id` (UUID) |
| 56 | +- `user` (UUID) - Anonymous user identifier |
| 57 | +- `version` - Schema version |
| 58 | +- `created` - Timestamp |
| 59 | +- `path` - Current page path |
| 60 | +- Browser metadata (user agent, viewport size, etc.) |
| 61 | + |
| 62 | +Each event type has additional required/optional fields defined in `lib/schema.ts`. |
| 63 | + |
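As a rough picture of the `context` requirements listed above, the sketch below checks the five named fields by hand. It assumes every field is a string and that `created` is an ISO 8601 timestamp; the real definitions live in `lib/schema.ts` and may differ. `EventContext`, `UUID_RE`, and `contextProblems` are hypothetical names, and the browser metadata fields are left out.

```typescript
// Assumed shape of the required context fields (browser metadata omitted).
interface EventContext {
  event_id: string // UUID
  user: string // UUID, anonymous user identifier
  version: string // schema version
  created: string // timestamp, assumed to be ISO 8601
  path: string // current page path
}

const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

// Returns a human-readable problem for each field that fails a check.
function contextProblems(context: Record<string, unknown>): string[] {
  const problems: string[] = []
  const required: (keyof EventContext)[] = ['event_id', 'user', 'version', 'created', 'path']
  for (const key of required) {
    if (typeof context[key] !== 'string') problems.push(`${key} must be a string`)
  }
  for (const key of ['event_id', 'user']) {
    const value = context[key]
    if (typeof value === 'string' && !UUID_RE.test(value)) {
      problems.push(`${key} must be a UUID`)
    }
  }
  const created = context['created']
  if (typeof created === 'string' && Number.isNaN(Date.parse(created))) {
    problems.push('created must be a date-time')
  }
  return problems
}
```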
| 64 | +### Local testing |
| 65 | + |
| 66 | +Test event validation locally: |
| 67 | +```bash |
| 68 | +npm run test -- src/events/tests |
| 69 | +``` |
| 70 | + |
| 71 | +Test comment analysis: |
| 72 | +```bash |
| 73 | +tsx src/events/scripts/analyze-comment-cli.ts "This is a great article!" |
| 74 | +``` |
| 75 | + |
| 76 | +## Data & External Dependencies |
| 77 | + |
| 78 | +### Data inputs |
| 79 | +- Browser events from client-side JavaScript |
| 80 | +- Survey responses and comments |
| 81 | +- User context (language, version, product, path) |
| 82 | +- Browser metadata (user agent, viewport, etc.) |
| 83 | + |
| 84 | +### Dependencies |
| 85 | +- Hydro API - GitHub's data warehouse |
| 86 | +- AJV - JSON schema validation |
| 87 | +- AI comment analysis service (internal) |
| 88 | +- `@/versions`, `@/products`, `@/languages` - For enum validation |
| 89 | + |
| 90 | +### Schema validation |
| 91 | + |
| 92 | +Schemas enforce: |
| 93 | +- Required fields for each event type |
| 94 | +- Enum values (languages, versions, products, tools) |
| 95 | +- Format validation (UUID, date-time, URI) |
| 96 | +- Additional properties not allowed |
| 97 | + |
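To make the "additional properties not allowed" rule concrete, here is a simplified, hypothetical schema for a `link` event written as a TypeScript constant. It omits the `context` object for brevity and is not copied from `lib/schema.ts`. The helper `unknownKeys` is also an assumption; it only mimics what a JSON Schema validator reports when `additionalProperties` is `false`.

```typescript
// Simplified, hypothetical schema; the real one also requires `context`.
const linkSchema = {
  type: 'object',
  additionalProperties: false, // reject any key not listed under `properties`
  required: ['type', 'link_url'],
  properties: {
    type: { type: 'string', enum: ['link'] },
    link_url: { type: 'string', format: 'uri' },
  },
} as const

// Lists the keys in `body` that the schema does not declare.
function unknownKeys(
  schema: { properties: Record<string, unknown> },
  body: Record<string, unknown>,
): string[] {
  return Object.keys(body).filter((key) => !(key in schema.properties))
}
```

Rejecting undeclared keys means a typo in a property name fails validation instead of silently producing a column that never fills in.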
| 98 | +### Data outputs |
| 99 | +- Events sent to Hydro data warehouse |
| 100 | +- Validation errors logged to Failbot (production) |
| 101 | +- Survey sentiment analysis results |
| 102 | + |
| 103 | +## Cross-links & Ownership |
| 104 | + |
| 105 | +### Related subjects |
| 106 | +- [`src/observability`](../observability/README.md) - Error logging and monitoring |
| 107 | +- [`src/versions`](../versions/README.md) - Version enum validation |
| 108 | +- [`src/products`](../products/README.md) - Product enum validation |
| 109 | +- [`src/languages`](../languages/README.md) - Language enum validation |
| 110 | +- [`src/tools`](../tools/README.md) - Tool enum validation |
| 111 | + |
| 112 | +### Internal documentation |
| 113 | +For detailed internal documentation about the data pipeline and Hydro, see the internal Docs Engineering repository. |
| 114 | + |
| 115 | +### Ownership |
| 116 | +- Team: Docs Engineering (code and analytics), Data Engineering (data pipeline) |
| 117 | + |
| 118 | +## Current State & Next Steps |
| 119 | + |
| 120 | +### Known limitations |
| 121 | +- Survey comment sentiment analysis requires network call (adds latency) |
| 122 | +- Event validation errors are deduplicated with LRU cache to prevent spam |
| 123 | +- In production, events are fire-and-forget (don't wait for response) |
| 124 | +- Validation errors sent to Hydro to track schema mismatches |
| 125 | + |
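The LRU-based deduplication mentioned above could look roughly like the sketch below. It is an assumption about the general technique, not the repository's implementation: `SeenRecently` and `firstSighting` are hypothetical names, and the cache relies on the insertion-order guarantee of `Map` rather than on an LRU library.

```typescript
// Minimal LRU set built on Map, which iterates keys in insertion order.
class SeenRecently {
  private readonly keys = new Map<string, true>()
  private readonly maxSize: number

  constructor(maxSize: number) {
    this.maxSize = maxSize
  }

  // Returns true the first time a key is seen, or again after it was evicted.
  firstSighting(key: string): boolean {
    if (this.keys.has(key)) {
      // Refresh recency: re-inserting moves the key to the end of the Map.
      this.keys.delete(key)
      this.keys.set(key, true)
      return false
    }
    this.keys.set(key, true)
    if (this.keys.size > this.maxSize) {
      // The first key in iteration order is the least recently used.
      const oldest = this.keys.keys().next().value
      if (oldest !== undefined) this.keys.delete(oldest)
    }
    return true
  }
}
```

A caller would report a validation error only when `firstSighting(errorMessage)` returns `true`, so a burst of identical errors produces a single report.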
| 126 | +### Adding a new event type |
| 127 | + |
| 128 | +1. Add event type to `EventType` enum in `types.ts` |
| 129 | +2. Add type-specific properties to `EventPropsByType` in `types.ts` |
| 130 | +3. Add schema definition to `lib/schema.ts` |
| 131 | +4. Update warehouse schema (internal process) |
| 132 | +5. Add client-side tracking code in components as needed |
| 133 | +6. Test validation with unit tests |
| 134 | + |
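Steps 1 and 2 above might look like the following sketch. The enum members and property shapes shown here are assumptions for illustration, and `print` with its `print_format` property is a made-up event type; check `types.ts` for the real definitions. `EventOf` and `extraProps` are hypothetical helpers.

```typescript
// Step 1 (sketch): add the new member to the enum.
enum EventType {
  page = 'page',
  link = 'link',
  print = 'print', // hypothetical new event type
}

// Step 2 (sketch): declare the type-specific properties for each event type.
type EventPropsByType = {
  [EventType.page]: {}
  [EventType.link]: { link_url: string }
  [EventType.print]: { print_format: string } // hypothetical property
}

// Hypothetical helper type: an event is its `type` plus that type's properties.
type EventOf<T extends EventType> = { type: T } & EventPropsByType[T]

// The compiler rejects this object if `print_format` is missing or misspelled.
const printEvent: EventOf<EventType.print> = {
  type: EventType.print,
  print_format: 'pdf',
}

// Lists the type-specific property names carried by an event.
function extraProps(event: { type: EventType }): string[] {
  return Object.keys(event).filter((key) => key !== 'type').sort()
}
```

Keeping the enum and the property map side by side lets the compiler flag an event type that has no property definition, which is a cheap check before the schema and warehouse steps.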
| 135 | +### Survey comment analysis |
| 136 | + |
| 137 | +Survey responses with comments are analyzed for sentiment: |
| 138 | +- Positive/negative/neutral rating assigned |
| 139 | +- Language detection for comment text |
| 140 | +- Results stored in `survey_rating` and `survey_comment_language` fields |
| 141 | + |
| 142 | +### Monitoring and debugging |
| 143 | + |
| 144 | +- Validation errors appear in server logs |
| 145 | +- Production validation errors sent to Hydro for tracking |
| 146 | +- Use `analyze-comment-cli.ts` to test sentiment analysis locally |
20 | 147 |
|
21 | | -For hubbers, see the internal docs in the internal engineering repository. |
|