Commit 7c1ed5d

Expand README for src/shielding (#58891)
1 parent d77a3f8 commit 7c1ed5d

1 file changed

Lines changed: 155 additions & 28 deletions

src/shielding/README.md

@@ -1,41 +1,168 @@
11
# Shielding
22

3-
## Overview
3+
The shielding subject protects docs.github.com from junk requests, abuse, and unnecessary server load. It implements various middleware to detect and handle suspicious traffic patterns, invalid requests, and rate limiting.
44

5-
Essentially code in our server that controls the prevention of "junk requests" is scripted HTTP requests to endpoints that are _not_ made by regular browser users.
5+
## Purpose & Scope
66

7-
For example, there's middleware code that sees if a `GET` request
8-
comes in with a bunch of random looking query strings keys. This would cause a PASS on the CDN but would not actually matter to the rendering. In this
9-
case, we spot this early and return a redirect response to the same URL
10-
without the unrecognized query string keys so that if the request follows
11-
redirects, the eventual 200 would be normalized by a common URL so the CDN
12-
can serve a HIT.
7+
This subject is responsible for:
8+
- Detecting and handling invalid or suspicious requests
9+
- Rate limiting suspicious traffic patterns
10+
- Normalizing URLs to improve CDN cache hit rates
11+
- Preventing abuse from scripted/bot traffic
12+
- Redirecting malformed requests
13+
- Protecting backend servers from unnecessary work
1314

14-
Here's an in-time discussion post that summaries the _need_ and much of the
15-
recent things we've done to fortify our backend servers to avoid unnecessary
16-
work loads:
15+
Shielding code guards against "junk requests": scripted HTTP requests that are not made by regular browser users.
1716

18-
**[How we have fortified Docs for better resiliency and availability (June 2023)](https://github.com/github/docs-engineering/discussions/3262)**
17+
## Architecture & Key Assets
1918

20-
## How it works
19+
### Key capabilities and their locations
2120

22-
At its root, the `src/shielding/frame/middleware/index.ts` is injected into our
23-
Express server. From there, it loads all its individual middleware handlers.
21+
- `middleware/index.ts` - Main entry point that orchestrates all shielding middleware and rate limiting
22+
- Individual middleware files - Each focuses on a single abuse pattern identified from log analysis
23+
- Rate limiting logic - Uses `createRateLimiter()` for suspicious and API routes
2424

25-
Each middleware is one file that focuses on a single use-case. The
26-
use-cases are borne from studying log files to
27-
spot patterns of request abuse.
25+
## Setup & Usage
2826

29-
> [!NOTE]
30-
> Some shielding "tricks" appear in other places throughout the code
31-
> base such as controlling the 404 response for `/assets/*` URLs.
27+
### How it works
3228

33-
## Rate limiting
29+
1. `src/shielding/middleware/index.ts` is injected into the Express server
30+
2. From there, it loads all the individual middleware handlers
31+
3. Each middleware focuses on a single use-case/abuse pattern
32+
4. Abuse patterns are discovered by studying log files
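
The steps above can be sketched as a small chain of handlers. This is only an illustration under stated assumptions, not the contents of `middleware/index.ts`: the `SketchRequest`/`SketchResponse` types stand in for Express's request and response objects, and `rejectPhpProbes`, `passThrough`, and `composeShielding` are hypothetical names. In a real Express app the same composition would more likely be done with a router and `router.use(...)`.

```typescript
// Minimal stand-ins for Express's request/response types so the sketch is self-contained.
interface SketchRequest {
  path: string;
}
interface SketchResponse {
  statusCode: number;
}
type Handler = (req: SketchRequest, res: SketchResponse, next: () => void) => void;

// Hypothetical handler: one file, one abuse pattern.
const rejectPhpProbes: Handler = (req, res, next) => {
  if (/\.php$/.test(req.path)) {
    res.statusCode = 404; // respond early and do not call next()
    return;
  }
  next();
};

// Hypothetical handler that lets every request through.
const passThrough: Handler = (_req, _res, next) => next();

// The "index" role: run each handler in order until one of them responds.
function composeShielding(handlers: Handler[]): Handler {
  return (req, res, done) => {
    let index = 0;
    const runNext = (): void => {
      if (index >= handlers.length) {
        done(); // every handler passed the request along
        return;
      }
      const handler = handlers[index];
      index += 1;
      handler(req, res, runNext);
    };
    runNext();
  };
}

const shielding = composeShielding([rejectPhpProbes, passThrough]);
```

Keeping each handler independent means a new abuse pattern can be added or removed by changing one entry in the list.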
3433

35-
We rate limit at multiple levels:
34+
### Rate limiting
35+
36+
Three levels of rate limiting:
37+
38+
1. **CDN (Fastly)** - First line of defense
39+
2. **Suspicious routes** - Via shielding middleware
40+
- Only rate limited if deemed suspicious based on checked parameters
41+
- Implemented in `middleware/index.ts` with `createRateLimiter()`
42+
3. **API routes** - Via API declaration
43+
- Limited to a certain number of requests per minute, regardless of request characteristics
44+
- Implemented in `src/frame/middleware/api.ts`
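
The difference between levels 2 and 3 can be sketched as follows. This is a simplified fixed-window counter written for illustration; it is not the project's `createRateLimiter()`, and the names `FixedWindowLimiter`, `isRateLimited`, and `RouteKind`, as well as the idea of a single shared limiter, are assumptions.

```typescript
// Illustrative fixed-window limiter; NOT the project's createRateLimiter().
class FixedWindowLimiter {
  private maxPerWindow: number;
  private windowMs: number;
  private hits: { [key: string]: { count: number; windowStart: number } } = {};

  constructor(maxPerWindow: number, windowMs: number) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should receive a 429.
  allow(key: string, now: number): boolean {
    const entry = this.hits[key];
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request from this key, or the previous window has expired.
      this.hits[key] = { count: 1, windowStart: now };
      return true;
    }
    entry.count += 1;
    return entry.count <= this.maxPerWindow;
  }
}

type RouteKind = "page" | "api";

function isRateLimited(
  limiter: FixedWindowLimiter,
  kind: RouteKind,
  looksSuspicious: boolean,
  clientKey: string,
  now: number,
): boolean {
  // Level 2: non-API routes are only counted when the request looks suspicious.
  if (kind === "page" && !looksSuspicious) return false;
  // Level 3: API routes are always counted, whatever the request looks like.
  return !limiter.allow(clientKey, now);
}
```

Passing `now` in explicitly keeps the sketch deterministic; a real middleware would typically read the clock itself.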
45+
46+
### Common shielding patterns
47+
48+
**Invalid query strings:**
49+
- Request: `GET /path?random=abc&weird=xyz`
50+
- Action: Redirect to `/path` (normalized URL)
51+
- Benefit: CDN can serve cached response for normalized URL
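
A minimal sketch of the query-string case, assuming a hypothetical allowlist of recognized keys (`RECOGNIZED_KEYS` and `normalizedRedirectTarget` are illustrative names; the real list of keys and the real middleware are not shown in this README):

```typescript
// Hypothetical allowlist; the actual set of recognized keys is an assumption.
const RECOGNIZED_KEYS: string[] = ["query", "page", "version"];

// Returns the normalized URL to redirect to, or null if no redirect is needed.
function normalizedRedirectTarget(originalUrl: string): string | null {
  // A dummy base lets the WHATWG URL API parse a path-only URL.
  const url = new URL(originalUrl, "http://localhost");
  const kept = new URLSearchParams();
  let dropped = false;

  url.searchParams.forEach((value, key) => {
    if (RECOGNIZED_KEYS.indexOf(key) !== -1) {
      kept.append(key, value);
    } else {
      dropped = true; // unrecognized key: leave it out of the normalized URL
    }
  });

  if (!dropped) return null;
  const search = kept.toString();
  return search ? `${url.pathname}?${search}` : url.pathname;
}
```

With this allowlist, `GET /path?random=abc&weird=xyz` would be redirected to `/path`, while a request that uses only recognized keys is served without a redirect.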
52+
53+
**Malformed URLs:**
54+
- Invalid characters or patterns in URL
55+
- Action: Return 400 or redirect to corrected URL
56+
- Benefit: Prevent errors propagating to application code
57+
58+
**Invalid paths:**
59+
- Suspicious path patterns (probes, exploits)
60+
- Action: Reject with appropriate status code
61+
- Benefit: Prevent unnecessary processing
62+
63+
### Running tests
64+
65+
```bash
66+
npm run test -- src/shielding/tests
67+
```
68+
69+
## Data & External Dependencies
70+
71+
### Data inputs
72+
- HTTP request metadata (path, query strings, headers)
73+
- Known good/bad patterns from log analysis
74+
- CDN cache behavior data
75+
76+
### Dependencies
77+
- Express middleware
78+
- Rate limiting library (likely `express-rate-limit` or similar)
79+
- `@/frame` - Express server integration
80+
- CDN configuration (Fastly)
81+
82+
### Data outputs
83+
- HTTP responses (redirects, 400s, 429s for rate limit)
84+
- Cache-friendly normalized URLs
85+
- Reduced backend server load
86+
87+
## Cross-links & Ownership
88+
89+
### Related subjects
90+
- [`src/frame`](../frame/README.md) - Express middleware pipeline integration
91+
- [`src/observability`](../observability/README.md) - Logging suspicious traffic patterns
92+
- CDN configuration - Fastly edge rules
93+
94+
### Internal documentation
95+
For detailed discussion on resilience and availability improvements, see:
96+
- [How we have fortified Docs for better resiliency and availability (June 2023)](https://github.com/github/docs-engineering/discussions/3262)
97+
98+
### Ownership
99+
- Team: Docs Engineering
100+
101+
## Current State & Next Steps
102+
103+
### Shielding strategies
104+
105+
Each middleware implements a specific strategy based on observed abuse:
106+
- Query string normalization for CDN optimization
107+
- Path validation to reject probes/exploits
108+
- Header validation to detect bot traffic
109+
- Next.js path handling for framework-specific patterns
110+
111+
### Known limitations
112+
- Shielding is reactive (based on observing abuse patterns)
113+
- Some legitimate traffic may be affected if patterns overlap with abuse
114+
- Rate limits are tuned based on historical data
115+
- Some shielding logic exists outside this subject (e.g., `/assets/*` 404 handling)
116+
117+
### Adding new shielding middleware
118+
119+
1. Identify abuse pattern from logs
120+
2. Create new middleware file in `src/shielding/middleware/`
121+
3. Implement detection and handling logic
122+
4. Add to orchestrator in `index.ts`
123+
5. Add tests in `tests/`
124+
6. Monitor impact on CDN cache hit rate and server load
125+
126+
### Monitoring shielding effectiveness
127+
128+
Key metrics:
129+
- CDN cache hit rate (should increase)
130+
- Backend server load (should decrease)
131+
- 4xx/5xx error rates (monitor for false positives)
132+
- Rate limit triggers (logged in observability)
133+
134+
Check #docs-ops and monitoring dashboards for ongoing effectiveness.
135+
136+
### Configuration
137+
138+
Rate limit configuration:
139+
- Thresholds tuned based on traffic patterns
140+
- Different limits for different route types
141+
- Suspicious request detection parameters
142+
143+
CDN integration:
144+
- Works with Fastly configuration
145+
- Ensures normalized URLs maximize cache hits
146+
- Some shielding happens at CDN edge
147+
- Dashboard for real-time shielding metrics
148+
149+
### Troubleshooting
150+
151+
**Legitimate traffic blocked:**
152+
- Check shielding logs in Splunk
153+
- Identify which middleware triggered
154+
- Adjust pattern matching or rate limits
155+
- Consider allowlist for specific use cases
156+
157+
**Abuse still getting through:**
158+
- Analyze logs for new patterns
159+
- Add new middleware to handle pattern
160+
- Adjust existing middleware thresholds
161+
- Consider CDN-level blocking
162+
163+
**CDN cache hit rate not improving:**
164+
- Verify URL normalization is working
165+
- Check that redirects are followed
166+
- Analyze cache miss patterns
167+
- Coordinate with CDN configuration
36168

37-
1. CDN (Fastly)
38-
2. All routes via [src/shielding/frame/index.ts](./middleware/index.ts) and the `createRateLimiter()` middleware.
39-
- These routes are _only_ rate limited if they are deemed suspicious based on parameters we check.
40-
3. API routes via their declaration in [src/frame/middleware/api.ts](../frame/middleware/api.ts) using the `createRateLimiter()` middleware.
41-
- These routes are limited to a certain # of requests per minute, regardless of what the request looks like.
