"Our load test passed with 10,000 users". Great. But does your test simulate the real behavior of 10,000 different people, or 10,000 robots doing the same thing? The difference between real and simulated traffic can be the difference between success and disaster in production.
A load test is only as good as its ability to simulate reality.
The Illusion of Synthetic Traffic
The problem
Typical test:
- 1000 virtual users
- All login
- All search "product X"
- All add to cart
- All checkout
Reality:
- 70% just browse
- 20% search different products
- 8% add to cart
- 2% checkout
- Each at different speeds
Why this matters
Synthetic test:
- Cache hit rate: 99% (all search "product X")
- DB connections: stable
- Result: "System handles 10K users"
Real production:
- Cache hit rate: 40% (diverse searches)
- DB connections: spikes from varied queries
- Result: System crashes at 3K users
Fundamental Differences
1. Action distribution
Real Traffic:
┌───────────────┬────────────┐
│ Action        │ % of users │
├───────────────┼────────────┤
│ Home          │ 100%       │
│ Browse        │ 80%        │
│ Search        │ 45%        │
│ Product View  │ 60%        │
│ Add to Cart   │ 15%        │
│ Checkout      │ 3%         │
│ Payment       │ 2.5%       │
└───────────────┴────────────┘
Typical Synthetic Test:
- Everyone goes through all steps
- Artificial 100% funnel
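One way to avoid the artificial 100% funnel is to split virtual users across separate k6 scenarios, one per behavior. This is a minimal sketch, assuming an 80/15/3 split and an illustrative shop.example.com host; the request bodies and durations are placeholders, not a prescribed setup:

import http from 'k6/http';
import { sleep } from 'k6';

// Illustrative split: most VUs only browse, few reach checkout
export const options = {
  scenarios: {
    browsers:   { executor: 'constant-vus', vus: 80, duration: '30m', exec: 'browse' },
    cart_users: { executor: 'constant-vus', vus: 15, duration: '30m', exec: 'addToCart' },
    buyers:     { executor: 'constant-vus', vus: 3,  duration: '30m', exec: 'checkout' },
  },
};

export function browse() {
  http.get('https://shop.example.com/'); // assumed host
  sleep(15);
}

export function addToCart() {
  browse();
  http.post('https://shop.example.com/cart', JSON.stringify({ productId: 123 }));
  sleep(10);
}

export function checkout() {
  addToCart();
  http.post('https://shop.example.com/checkout');
  sleep(5);
}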
2. Think time
Real Traffic:
- User reads page: 30s-5min
- Decides to buy: variable
- Gets distracted, comes back: common
Synthetic Test:
- Fixed think time: 5s
- No variation
- Robotic behavior
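A rough sketch of how variable think time can be generated: randomLogNormal here is a hypothetical helper, not a k6 built-in (only sleep comes from k6). A log-normal distribution produces many short pauses and the occasional very long one, which is closer to how people actually read pages than a fixed sleep(5).

import { sleep } from 'k6';

// Hypothetical helper: log-normal think time around a given median (seconds)
function randomLogNormal(median, sigma) {
  // Box-Muller transform for a standard normal sample
  const u1 = 1 - Math.random(); // avoid log(0)
  const u2 = Math.random();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return median * Math.exp(sigma * z);
}

export default function () {
  // ... perform a request ...
  sleep(randomLogNormal(20, 0.8)); // median ~20s, occasionally several minutes
}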
3. Unique data
Real Traffic:
- Millions of different products
- Thousands of search terms
- Each user has unique history
Synthetic Test:
- 10 test products
- 5 search terms
- Sequential IDs
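In k6, one way to avoid a tiny, repetitive data set is to load a larger pool once with SharedArray and pick from it at random. The file name, its structure, and the host below are assumptions; the pool itself would come from a real (anonymized) export:

import { SharedArray } from 'k6/data';
import http from 'k6/http';

// Loaded once in the init context and shared across all VUs.
// products.json is an assumed export of real product IDs, not sequential test IDs.
const products = new SharedArray('products', function () {
  return JSON.parse(open('./products.json'));
});

export default function () {
  const product = products[Math.floor(Math.random() * products.length)];
  http.get(`https://shop.example.com/product/${product.id}`); // assumed host
}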
4. Temporal patterns
Real Traffic:
- Organic ramp-up over hours
- Peaks at specific times
- Variation by day of week
Synthetic Test:
- 60s ramp-up to full load
- Constant load
- No temporal variation
Impact on the System
Cache effectiveness
Real:
GET /product/12345 → cache miss
GET /product/67890 → cache miss
GET /product/11111 → cache miss
Cache hit rate: 30-40%
Synthetic:
GET /product/TEST1 → cache miss
GET /product/TEST1 → cache hit
GET /product/TEST1 → cache hit
Cache hit rate: 95%
Result: DB overloads in production
Connection patterns
Real:
- Persistent connections vary
- Long and short sessions mixed
- Frequent reconnections
Synthetic:
- Stable connection pool
- Maximum reuse
- No connection churn
Query diversity
Real:
SELECT * FROM products WHERE name LIKE '%shoes%'
SELECT * FROM products WHERE category = 'electronics'
SELECT * FROM products WHERE price < 100
→ Varied query plans, different indexes
Synthetic:
SELECT * FROM products WHERE id = 1
SELECT * FROM products WHERE id = 1
SELECT * FROM products WHERE id = 1
→ Same query plan, maximized DB cache
How to Approximate Reality
1. Analyze real traffic first
-- Endpoint distribution
SELECT
  endpoint,
  count(*) as hits,
  count(*) * 100.0 / sum(count(*)) over() as percentage
FROM access_logs
WHERE timestamp > now() - interval '7 days'
GROUP BY endpoint
ORDER BY hits DESC;

-- Most common search terms
SELECT
  search_term,
  count(*) as frequency
FROM search_logs
WHERE timestamp > now() - interval '30 days'
GROUP BY search_term
ORDER BY frequency DESC
LIMIT 1000;
2. Model real distributions
// k6 script with a realistic action mix.
// Note: randomWeighted, randomLogNormal, getRandomProduct, getRandomSearchTerm
// and the data pools are custom helpers, not k6 built-ins (sketches in this article).
import { sleep } from 'k6';

export default function () {
  // Pick one action per iteration, weighted by the shares observed in production
  const action = randomWeighted({
    browse: 0.80,
    search: 0.45,
    view_product: 0.60,
    add_cart: 0.15,
    checkout: 0.03,
  });

  // Think time with a log-normal distribution (more realistic than uniform)
  sleep(randomLogNormal(30, 2));

  // Varied data drawn from a realistic pool
  const product = getRandomProduct(productPool);
  const searchTerm = getRandomSearchTerm(searchTerms);
}
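The weighting helper above is an assumption rather than part of k6. A minimal sketch that treats the weights as relative shares and picks one action per iteration could look like this (randomLogNormal was sketched in the think-time section):

// Hypothetical helper: pick one key, treating the weights as relative shares
function randomWeighted(weights) {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  let r = Math.random() * total;
  for (const [action, weight] of Object.entries(weights)) {
    r -= weight;
    if (r <= 0) return action;
  }
  return Object.keys(weights)[0]; // fallback for floating-point edge cases
}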
3. Use production data (anonymized)
Ideal:
- Export of real product IDs
- Real search terms (no PII)
- Real category distribution
Process:
1. Query production to extract data
2. Anonymize/mask sensitive data
3. Create data pool for testing
4. Use pool with proportional distribution
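A minimal sketch of step 2, assuming a JSON export with user emails, product IDs, and search terms (the file names and fields are illustrative). Hashing identifiers keeps the distribution intact while removing PII, so the same user maps to the same pseudonymous ID across rows:

// anonymize.js - sketch, assumes an export like:
// [{ "email": "a@b.com", "productId": 12345, "searchTerm": "shoes" }, ...]
const crypto = require('crypto');
const fs = require('fs');

const rows = JSON.parse(fs.readFileSync('production_export.json', 'utf8'));

const anonymized = rows.map((row) => ({
  // Hash PII into a stable pseudonymous ID
  userId: crypto.createHash('sha256').update(row.email).digest('hex').slice(0, 16),
  productId: row.productId,   // not PII, kept for realistic distribution
  searchTerm: row.searchTerm, // assumed already free of PII
}));

fs.writeFileSync('test_data_pool.json', JSON.stringify(anonymized, null, 2));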
4. Simulate temporal patterns
// Simulate organic ramp-up
export const options = {
  scenarios: {
    organic_ramp: {
      executor: 'ramping-arrival-rate',
      startRate: 10,
      timeUnit: '1m',
      preAllocatedVUs: 50,
      maxVUs: 500,
      stages: [
        { duration: '2h', target: 100 }, // Morning
        { duration: '2h', target: 300 }, // Peak
        { duration: '2h', target: 200 }, // Afternoon
        { duration: '2h', target: 400 }, // Evening peak
        { duration: '2h', target: 50 },  // Late night
      ],
    },
  },
};
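A note on the executor choice: ramping-arrival-rate drives load as iterations per time unit (an open model), so the offered load follows the scripted curve even when responses slow down, which is closer to how real users arrive than a fixed pool of looping virtual users.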
5. Include edge-case behaviors
Don't forget:
- Users who abandon midway
- Duplicate requests (double-click)
- Client timeouts and retries
- Sessions that expire
- Old/slow browsers
- Mobile vs Desktop mix
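Several of these can be folded into the same k6 script. A sketch, where the probabilities and the shop.example.com URL are illustrative assumptions:

import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
  const res = http.get('https://shop.example.com/product/123', { timeout: '5s' });

  // Client timeout or server error followed by one retry
  if (res.status === 0 || res.status >= 500) {
    sleep(1);
    http.get('https://shop.example.com/product/123', { timeout: '5s' });
  }

  // ~2% of users double-click and fire a duplicate request
  if (Math.random() < 0.02) {
    http.get('https://shop.example.com/product/123');
  }

  // ~30% abandon the flow here and never reach checkout
  if (Math.random() < 0.30) {
    return;
  }

  // ... continue to cart/checkout steps ...
}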
Validating the Simulation
Compare metrics
Metrics to validate:
Cache:
Production hit rate: 42%
Test hit rate: 85% ← Too high, adjust
DB:
Production queries/s: 5000
Test queries/s: 1200 ← Too low
Latency distribution:
Production p99/p50 ratio: 8x
Test p99/p50 ratio: 2x ← Too uniform
Realism checklist
## Scenario Validation
### Data
- [ ] Product pool > 1000 items?
- [ ] Search terms > 500 unique?
- [ ] User IDs non-sequential?
### Behavior
- [ ] Funnel reflects real conversion?
- [ ] Think time is variable?
- [ ] Action mix reflects analytics?
### Temporal
- [ ] Ramp-up > 30 minutes?
- [ ] Load variation included?
- [ ] Minimum duration 1 hour?
### Edge cases
- [ ] Abandonment included?
- [ ] Client errors simulated?
- [ ] Device mix?
Advanced Techniques
Traffic replay
Concept:
Capture real traffic from production
Replay in test environment
Tools:
- GoReplay (goreplay)
- TCPReplay
- Custom log replay
Cautions:
- Anonymize sensitive data
- Adjust timestamps
- Scale proportionally
Shadow traffic
Concept:
Duplicate a percentage of real traffic to the test environment
Compare behavior in real time
Advantage:
100% real traffic
Disadvantage:
Requires duplicated infrastructure
Care needed with side effects (duplicate writes, emails, payments)
Production testing (with caution)
Techniques:
- Canary deployment
- Feature flags
- Gradual traffic shifting
Allows:
Validate with real traffic
At real scale
With fast rollback
Conclusion
Synthetic traffic is necessary, but needs to be realistic:
- Analyze production first - understand real behavior
- Model distributions - don't use fixed values
- Diversify data - avoid artificial cache
- Simulate temporal patterns - organic ramp-up
- Validate against production - compare metrics
The goal isn't to pass the test. It's to predict production behavior.
This article is part of the series on the OCTOPUS Performance Engineering methodology.