"Our load test passed with 10,000 users". Great. But does your test simulate the real behavior of 10,000 different people, or 10,000 robots doing the same thing? The difference between real and simulated traffic can be the difference between success and disaster in production.
A load test is only as good as its ability to simulate reality.
The Illusion of Synthetic Traffic
The problem
Typical test:
- 1000 virtual users
- All login
- All search "product X"
- All add to cart
- All checkout
Reality:
- 70% just browse
- 20% search different products
- 8% add to cart
- 2% checkout
- Each at different speeds
Why this matters
Synthetic test:
- Cache hit rate: 99% (all search "product X")
- DB connections: stable
- Result: "System handles 10K users"
Real production:
- Cache hit rate: 40% (diverse searches)
- DB connections: spikes from varied queries
- Result: System crashes at 3K users
Fundamental Differences
1. Action distribution
Real Traffic:
┌───────────────┬────────────┐
│ Action        │ % of users │
├───────────────┼────────────┤
│ Home          │ 100%       │
│ Browse        │ 80%        │
│ Search        │ 45%        │
│ Product View  │ 60%        │
│ Add to Cart   │ 15%        │
│ Checkout      │ 3%         │
│ Payment       │ 2.5%       │
└───────────────┴────────────┘
Typical Synthetic Test:
- Everyone goes through all steps
- Artificial 100% funnel
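One way to avoid the artificial 100% funnel is to split virtual users across separate k6 scenarios, one per behavior. This is a minimal sketch, assuming an 80/15/3 split and an illustrative shop.example.com host; the request bodies and durations are placeholders, not a prescribed setup:

import http from 'k6/http';
import { sleep } from 'k6';

// Illustrative split: most VUs only browse, few reach checkout
export const options = {
  scenarios: {
    browsers:   { executor: 'constant-vus', vus: 80, duration: '30m', exec: 'browse' },
    cart_users: { executor: 'constant-vus', vus: 15, duration: '30m', exec: 'addToCart' },
    buyers:     { executor: 'constant-vus', vus: 3,  duration: '30m', exec: 'checkout' },
  },
};

export function browse() {
  http.get('https://shop.example.com/'); // assumed host
  sleep(15);
}

export function addToCart() {
  browse();
  http.post('https://shop.example.com/cart', JSON.stringify({ productId: 123 }));
  sleep(10);
}

export function checkout() {
  addToCart();
  http.post('https://shop.example.com/checkout');
  sleep(5);
}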
2. Think time
Real Traffic:
- User reads page: 30s-5min
- Decides to buy: variable
- Gets distracted, comes back: common
Synthetic Test:
- Fixed think time: 5s
- No variation
- Robotic behavior
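A rough sketch of how variable think time can be generated: randomLogNormal here is a hypothetical helper, not a k6 built-in (only sleep comes from k6). A log-normal distribution produces many short pauses and the occasional very long one, which is closer to how people actually read pages than a fixed sleep(5).

import { sleep } from 'k6';

// Hypothetical helper: log-normal think time around a given median (seconds)
function randomLogNormal(median, sigma) {
  // Box-Muller transform for a standard normal sample
  const u1 = 1 - Math.random(); // avoid log(0)
  const u2 = Math.random();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return median * Math.exp(sigma * z);
}

export default function () {
  // ... perform a request ...
  sleep(randomLogNormal(20, 0.8)); // median ~20s, occasionally several minutes
}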
3. Unique data
Real Traffic:
- Millions of different products
- Thousands of search terms
- Each user has unique history
Synthetic Test:
- 10 test products
- 5 search terms
- Sequential IDs
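In k6, one way to avoid a tiny, repetitive data set is to load a larger pool once with SharedArray and pick from it at random. The file name, its structure, and the host below are assumptions; the pool itself would come from a real (anonymized) export:

import { SharedArray } from 'k6/data';
import http from 'k6/http';

// Loaded once in the init context and shared across all VUs.
// products.json is an assumed export of real product IDs, not sequential test IDs.
const products = new SharedArray('products', function () {
  return JSON.parse(open('./products.json'));
});

export default function () {
  const product = products[Math.floor(Math.random() * products.length)];
  http.get(`https://shop.example.com/product/${product.id}`); // assumed host
}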
4. Temporal patterns
Real Traffic:
- Organic ramp-up over hours
- Peaks at specific times
- Variation by day of week
Synthetic Test:
- 60s ramp-up to full load
- Constant load
- No temporal variation
Impact on the System
Cache effectiveness
Real:
GET /product/12345 → cache miss
GET /product/67890 → cache miss
GET /product/11111 → cache miss
Cache hit rate: 30-40%
Synthetic:
GET /product/TEST1 → cache miss
GET /product/TEST1 → cache hit
GET /product/TEST1 → cache hit
Cache hit rate: 95%
Result: DB overloads in production
Connection patterns
Real:
- Persistent connections vary
- Long and short sessions mixed
- Frequent reconnections
Synthetic:
- Stable connection pool
- Maximum reuse
- No connection churn
Query diversity
Real:
SELECT * FROM products WHERE name LIKE '%shoes%'
SELECT * FROM products WHERE category = 'electronics'
SELECT * FROM products WHERE price < 100
→ Varied query plans, different indexes
Synthetic:
SELECT * FROM products WHERE id = 1
SELECT * FROM products WHERE id = 1
SELECT * FROM products WHERE id = 1
→ Same query plan, maximized DB cache
How to Approximate Reality
1. Analyze real traffic first
-- Endpoint distribution
SELECT
  endpoint,
  count(*) as hits,
  count(*) * 100.0 / sum(count(*)) over() as percentage
FROM access_logs
WHERE timestamp > now() - interval '7 days'
GROUP BY endpoint
ORDER BY hits DESC;

-- Most common search terms
SELECT
  search_term,
  count(*) as frequency
FROM search_logs
WHERE timestamp > now() - interval '30 days'
GROUP BY search_term
ORDER BY frequency DESC
LIMIT 1000;
2. Model real distributions
// k6 script with a realistic action mix.
// Note: randomWeighted, randomLogNormal, getRandomProduct, getRandomSearchTerm
// and the data pools are custom helpers, not k6 built-ins (sketches in this article).
import { sleep } from 'k6';

export default function () {
  // Pick one action per iteration, weighted by the shares observed in production
  const action = randomWeighted({
    browse: 0.80,
    search: 0.45,
    view_product: 0.60,
    add_cart: 0.15,
    checkout: 0.03,
  });

  // Think time with a log-normal distribution (more realistic than uniform)
  sleep(randomLogNormal(30, 2));

  // Varied data drawn from a realistic pool
  const product = getRandomProduct(productPool);
  const searchTerm = getRandomSearchTerm(searchTerms);
}
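The weighting helper above is an assumption rather than part of k6. A minimal sketch that treats the weights as relative shares and picks one action per iteration could look like this (randomLogNormal was sketched in the think-time section):

// Hypothetical helper: pick one key, treating the weights as relative shares
function randomWeighted(weights) {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  let r = Math.random() * total;
  for (const [action, weight] of Object.entries(weights)) {
    r -= weight;
    if (r <= 0) return action;
  }
  return Object.keys(weights)[0]; // fallback for floating-point edge cases
}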
3. Use production data (anonymized)
Ideal:
- Export of real product IDs
- Real search terms (no PII)
- Real category distribution
Process:
1. Query production to extract data
2. Anonymize/mask sensitive data
3. Create data pool for testing
4. Use pool with proportional distribution
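A minimal sketch of step 2, assuming a JSON export with user emails, product IDs, and search terms (the file names and fields are illustrative). Hashing identifiers keeps the distribution intact while removing PII, so the same user maps to the same pseudonymous ID across rows:

// anonymize.js - sketch, assumes an export like:
// [{ "email": "a@b.com", "productId": 12345, "searchTerm": "shoes" }, ...]
const crypto = require('crypto');
const fs = require('fs');

const rows = JSON.parse(fs.readFileSync('production_export.json', 'utf8'));

const anonymized = rows.map((row) => ({
  // Hash PII into a stable pseudonymous ID
  userId: crypto.createHash('sha256').update(row.email).digest('hex').slice(0, 16),
  productId: row.productId,   // not PII, kept for realistic distribution
  searchTerm: row.searchTerm, // assumed already free of PII
}));

fs.writeFileSync('test_data_pool.json', JSON.stringify(anonymized, null, 2));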
4. Simulate temporal patterns
// Simulate organic ramp-up
export const options = {
  scenarios: {
    organic_ramp: {
      executor: 'ramping-arrival-rate',
      startRate: 10,
      timeUnit: '1m',
      preAllocatedVUs: 50,
      maxVUs: 500,
      stages: [
        { duration: '2h', target: 100 }, // Morning
        { duration: '2h', target: 300 }, // Peak
        { duration: '2h', target: 200 }, // Afternoon
        { duration: '2h', target: 400 }, // Evening peak
        { duration: '2h', target: 50 },  // Late night
      ],
    },
  },
};
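A note on the executor choice: ramping-arrival-rate drives load as iterations per time unit (an open model), so the offered load follows the scripted curve even when responses slow down, which is closer to how real users arrive than a fixed pool of looping virtual users.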
5. Include edge-case behaviors
Don't forget:
- Users who abandon midway
- Duplicate requests (double-click)
- Client timeouts and retries
- Sessions that expire
- Old/slow browsers
- Mobile vs Desktop mix
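Several of these can be folded into the same k6 script. A sketch, where the probabilities and the shop.example.com URL are illustrative assumptions:

import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
  const res = http.get('https://shop.example.com/product/123', { timeout: '5s' });

  // Client timeout or server error followed by one retry
  if (res.status === 0 || res.status >= 500) {
    sleep(1);
    http.get('https://shop.example.com/product/123', { timeout: '5s' });
  }

  // ~2% of users double-click and fire a duplicate request
  if (Math.random() < 0.02) {
    http.get('https://shop.example.com/product/123');
  }

  // ~30% abandon the flow here and never reach checkout
  if (Math.random() < 0.30) {
    return;
  }

  // ... continue to cart/checkout steps ...
}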
Validating the Simulation
Compare metrics
Metrics to validate:
Cache:
Production hit rate: 42%
Test hit rate: 85% ← Too high, adjust
DB:
Production queries/s: 5000
Test queries/s: 1200 ← Too low
Latency distribution:
Production p99/p50 ratio: 8x
Test p99/p50 ratio: 2x ← Too uniform
Realism checklist
## Scenario Validation
### Data
- [ ] Product pool > 1000 items?
- [ ] Search terms > 500 unique?
- [ ] User IDs non-sequential?
### Behavior
- [ ] Funnel reflects real conversion?
- [ ] Think time is variable?
- [ ] Action mix reflects analytics?
### Temporal
- [ ] Ramp-up > 30 minutes?
- [ ] Load variation included?
- [ ] Minimum duration 1 hour?
### Edge cases
- [ ] Abandonment included?
- [ ] Client errors simulated?
- [ ] Device mix?
Advanced Techniques
Traffic replay
Concept:
Capture real traffic from production
Replay in test environment
Tools:
- GoReplay (goreplay)
- TCPReplay
- Custom log replay
Cautions:
- Anonymize sensitive data
- Adjust timestamps
- Scale proportionally
Shadow traffic
Concept:
Duplicate a percentage of real traffic to the test environment
Compare behavior in real time
Advantage:
100% real traffic
Disadvantage:
Requires duplicated infrastructure
Care needed with side effects (duplicate writes, emails, payments)
Production testing (with caution)
Techniques:
- Canary deployment
- Feature flags
- Gradual traffic shifting
Allows:
Validate with real traffic
At real scale
With fast rollback
Conclusion
Synthetic traffic is necessary, but needs to be realistic:
- Analyze production first - understand real behavior
- Model distributions - don't use fixed values
- Diversify data - avoid artificial cache
- Simulate temporal patterns - organic ramp-up
- Validate against production - compare metrics
The goal isn't to pass the test. It's to predict production behavior.
This article is part of the series on the OCTOPUS Performance Engineering methodology.