Commit faf9a44

Expand README for src/events (#58885)
1 parent fd8f05c commit faf9a44

1 file changed

Lines changed: 137 additions & 11 deletions

src/events/README.md

@@ -1,21 +1,147 @@
11
# Events
22

3-
We record events from the browser into our data pipeline to aggregate anonymous data about how folks are using the Docs.
3+
The events subject handles client-side analytics by recording user interactions from the browser and sending them to GitHub's data pipeline. Events track anonymous usage data to help understand how users interact with docs.github.com and identify areas for improvement.
44

5-
## Why events
5+
## Purpose & Scope
66

7-
Data helps us to understand where our Docs are successful, and where we need to improve.
7+
This subject is responsible for:
8+
- Recording browser events (page views, clicks, searches, surveys, etc.)
9+
- Validating event data against JSON schemas
10+
- Sending events to Hydro (GitHub's data warehouse)
11+
- Analyzing survey comments with sentiment analysis
12+
- Providing React components for event tracking
13+
- Exposing the server-side event endpoint (`POST /events`)
814

9-
## How to view events
15+
## Architecture & Key Assets
1016

11-
1. We send a `POST /events` request from the browser.
12-
2. Any data sent we check against our JSON schema.
13-
3. After passing the schema check, we send the data along the path to the warehouse.
17+
### Key capabilities and their locations
1418

15-
## How to work on event
19+
- `middleware.ts` - Express router that handles the `POST /events` endpoint, validating and publishing events
20+
- `lib/schema.ts` - JSON Schema definitions for all event types using AJV validation
21+
- `components/events.ts` - Client-side utilities for sending events from the browser
1622

17-
When adding or changing properties, make sure to update the schema in both the JS file as well as the schema for the warehouse.
23+
## Setup & Usage
1824

19-
## How to get help for events
25+
### Event flow
26+
27+
1. Browser sends `POST /events` request with event data
28+
2. Middleware validates against JSON schema
29+
3. If valid, event is sent to Hydro data warehouse
30+
4. If invalid, a validation error is logged and the invalid event itself is not sent to the warehouse
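
The flow above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the code in `middleware.ts`: `IncomingEvent`, `validateEvent`, `handleEvent`, and the `publish`/`logError` callbacks are hypothetical names, and the hand-written checks stand in for the AJV-based schema validation.

```typescript
// Hypothetical sketch of the event flow; the real logic lives in middleware.ts.
type IncomingEvent = { type?: unknown; context?: unknown }

type FlowResult =
  | { status: 'published'; event: IncomingEvent }
  | { status: 'rejected'; errors: string[] }

// Stand-in for schema validation (the real code validates with AJV).
function validateEvent(event: IncomingEvent): string[] {
  const errors: string[] = []
  if (typeof event.type !== 'string') errors.push('type must be a string')
  if (typeof event.context !== 'object' || event.context === null) {
    errors.push('context must be an object')
  }
  return errors
}

// Steps 2-4: validate, then either publish the event or log the errors and drop it.
function handleEvent(
  event: IncomingEvent,
  publish: (event: IncomingEvent) => void,
  logError: (errors: string[]) => void,
): FlowResult {
  const errors = validateEvent(event)
  if (errors.length > 0) {
    logError(errors)
    return { status: 'rejected', errors }
  }
  publish(event)
  return { status: 'published', event }
}
```

Passing `publish` and `logError` as callbacks keeps the sketch testable without a warehouse client or logger.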
31+
32+
### Event types
33+
34+
Supported event types (see `EventType` enum):
35+
- `page` - Page view
36+
- `exit` - User leaving page
37+
- `link` - Link click
38+
- `search` - Search query
39+
- `survey` - Survey response
40+
41+
### Sending events from the browser
42+
43+
```typescript
44+
import { sendEvent } from '@/events/components/events'
45+
46+
sendEvent({
47+
  type: 'link',
48+
  link_url: 'https://example.com',
49+
})
50+
```
51+
52+
### Event schema structure
53+
54+
All events require a `context` object with:
55+
- `event_id` (UUID)
56+
- `user` (UUID) - Anonymous user identifier
57+
- `version` - Schema version
58+
- `created` - Timestamp
59+
- `path` - Current page path
60+
- Browser metadata (user agent, viewport size, etc.)
61+
62+
Each event type has additional required/optional fields defined in `lib/schema.ts`.
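
As a rough illustration of the `context` fields listed above, the shape might look like the sketch below. `EventContext`, `isValidContext`, and the `user_agent` field name are assumptions made for this example, as is the rule that `path` starts with `/`; `lib/schema.ts` remains the source of truth.

```typescript
// Hypothetical shape derived from the field list; not copied from lib/schema.ts.
interface EventContext {
  event_id: string // UUID
  user: string // anonymous user identifier (UUID)
  version: string // schema version
  created: string // timestamp, assumed here to be ISO 8601
  path: string // current page path
  user_agent?: string // example browser metadata; the field name is assumed
}

const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i

// Lightweight sanity check that mirrors the kinds of constraints a schema would enforce.
function isValidContext(context: EventContext): boolean {
  return (
    UUID_PATTERN.test(context.event_id) &&
    UUID_PATTERN.test(context.user) &&
    context.version.length > 0 &&
    !isNaN(Date.parse(context.created)) &&
    context.path.charAt(0) === '/'
  )
}

const exampleContext: EventContext = {
  event_id: '123e4567-e89b-42d3-a456-426614174000',
  user: '123e4567-e89b-42d3-a456-426614174001',
  version: '1.0.0',
  created: '2024-01-01T00:00:00.000Z',
  path: '/en/get-started',
}
```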
63+
64+
### Local testing
65+
66+
Test event validation locally:
67+
```bash
68+
npm run test -- src/events/tests
69+
```
70+
71+
Test comment analysis:
72+
```bash
73+
tsx src/events/scripts/analyze-comment-cli.ts "This is a great article!"
74+
```
75+
76+
## Data & External Dependencies
77+
78+
### Data inputs
79+
- Browser events from client-side JavaScript
80+
- Survey responses and comments
81+
- User context (language, version, product, path)
82+
- Browser metadata (user agent, viewport, etc.)
83+
84+
### Dependencies
85+
- Hydro API - GitHub's data warehouse
86+
- AJV - JSON schema validation
87+
- AI comment analysis service (internal)
88+
- `@/versions`, `@/products`, `@/languages` - For enum validation
89+
90+
### Schema validation
91+
92+
Schemas enforce:
93+
- Required fields for each event type
94+
- Enum values (languages, versions, products, tools)
95+
- Format validation (UUID, date-time, URI)
96+
- Additional properties not allowed
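
To make these rules concrete, here is a hedged sketch of what a schema fragment and its checks could look like. `linkEventSchema` and `checkAgainstSchema` are hypothetical; the hand-rolled checker covers only required fields, unknown properties, and enum membership, whereas the real code delegates all of this (including format validation) to AJV.

```typescript
type PropertyDefinition = { type: string; enum?: readonly string[]; format?: string }

// Hypothetical schema fragment; the real definitions live in lib/schema.ts.
const linkEventSchema = {
  type: 'object',
  additionalProperties: false,
  required: ['type', 'link_url'],
  properties: {
    type: { type: 'string', enum: ['link'] },
    link_url: { type: 'string', format: 'uri' },
  } as Record<string, PropertyDefinition>,
}

// Minimal stand-in for a JSON Schema validator (formats are not checked here).
function checkAgainstSchema(data: Record<string, unknown>): string[] {
  const errors: string[] = []
  for (const key of linkEventSchema.required) {
    if (!(key in data)) errors.push(`missing required property: ${key}`)
  }
  for (const key of Object.keys(data)) {
    const definition = linkEventSchema.properties[key]
    if (!definition) {
      errors.push(`additional property not allowed: ${key}`)
    } else if (definition.enum && definition.enum.indexOf(data[key] as string) === -1) {
      errors.push(`invalid value for ${key}`)
    }
  }
  return errors
}
```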
97+
98+
### Data outputs
99+
- Events sent to Hydro data warehouse
100+
- Validation errors logged to Failbot (production)
101+
- Survey sentiment analysis results
102+
103+
## Cross-links & Ownership
104+
105+
### Related subjects
106+
- [`src/observability`](../observability/README.md) - Error logging and monitoring
107+
- [`src/versions`](../versions/README.md) - Version enum validation
108+
- [`src/products`](../products/README.md) - Product enum validation
109+
- [`src/languages`](../languages/README.md) - Language enum validation
110+
- [`src/tools`](../tools/README.md) - Tool enum validation
111+
112+
### Internal documentation
113+
For detailed internal documentation about the data pipeline and Hydro, see the internal Docs Engineering repository.
114+
115+
### Ownership
116+
- Team: Docs Engineering (code and analytics), Data Engineering (data pipeline)
117+
118+
## Current State & Next Steps
119+
120+
### Known limitations
121+
- Survey comment sentiment analysis requires a network call (adds latency)
122+
- Event validation errors are deduplicated with an LRU cache to prevent spam
123+
- In production, events are fire-and-forget (the sender does not wait for a response)
124+
- Validation errors are sent to Hydro to track schema mismatches
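
The LRU-based deduplication mentioned above could work along these lines. This is a sketch under assumptions: `SeenErrors` and `shouldReport` are hypothetical names, and the actual code may use a library cache with different sizing or expiry.

```typescript
// Hypothetical LRU-style deduplication of error reports.
class SeenErrors {
  private readonly entries = new Map<string, true>()
  private readonly maxSize: number

  constructor(maxSize: number) {
    this.maxSize = maxSize
  }

  // Returns true the first time a key is seen, false for repeats still in the cache.
  shouldReport(key: string): boolean {
    if (this.entries.has(key)) {
      // Map preserves insertion order, so re-inserting marks the key as most recent.
      this.entries.delete(key)
      this.entries.set(key, true)
      return false
    }
    this.entries.set(key, true)
    if (this.entries.size > this.maxSize) {
      // The first key in iteration order is the least recently used one.
      const oldest = this.entries.keys().next().value
      if (oldest !== undefined) this.entries.delete(oldest)
    }
    return true
  }
}
```

A bounded cache like this trades completeness for memory: an error evicted from the cache is reported again the next time it occurs.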
125+
126+
### Adding a new event type
127+
128+
1. Add event type to `EventType` enum in `types.ts`
129+
2. Add type-specific properties to `EventPropsByType` in `types.ts`
130+
3. Add schema definition to `lib/schema.ts`
131+
4. Update warehouse schema (internal process)
132+
5. Add client-side tracking code in components as needed
133+
6. Test validation with unit tests
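
Steps 1 and 2 might look roughly like the sketch below. All names are hypothetical (`SketchEventType`, `SketchEventPropsByType`, the invented `bookmark` type and its `bookmark_label` field), and a string union is used in place of the `EventType` enum to keep the example short; `types.ts` holds the real definitions.

```typescript
// Step 1 (sketch): add the new type alongside the existing ones.
type SketchEventType = 'link' | 'bookmark' // 'bookmark' is an invented example type

// Step 2 (sketch): map each event type to its type-specific properties.
type SketchEventPropsByType = {
  link: { link_url: string }
  bookmark: { bookmark_label: string }
}

type SketchEvent<T extends SketchEventType> = { type: T } & SketchEventPropsByType[T]

// The compiler requires the matching properties for whichever type is passed.
function makeEvent<T extends SketchEventType>(
  type: T,
  props: SketchEventPropsByType[T],
): SketchEvent<T> {
  return { type, ...props }
}

const bookmarkEvent = makeEvent('bookmark', { bookmark_label: 'Saved for later' })
```

Tying properties to the type at compile time catches mismatches before the runtime schema check does, though the schema in `lib/schema.ts` still needs its own entry.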
134+
135+
### Survey comment analysis
136+
137+
Survey responses with comments are analyzed for sentiment:
138+
- Positive/negative/neutral rating assigned
139+
- Language detection for comment text
140+
- Results stored in `survey_rating` and `survey_comment_language` fields
141+
142+
### Monitoring and debugging
143+
144+
- Validation errors appear in server logs
145+
- Production validation errors sent to Hydro for tracking
146+
- Use `analyze-comment-cli.ts` to test sentiment analysis locally
20147

21-
For hubbers, see the internal docs in the internal engineering repository.
