When testing performance, there are two fundamental approaches: generate artificial (synthetic) load or use real traffic. Each has advantages and limitations, and choosing the wrong one can give false confidence or waste resources.
A synthetic test shows what the system can handle. A real-traffic test shows what the system actually handles.
Synthetic Tests
What they are
Load artificially generated by tools like k6, Gatling, JMeter, Locust.
// k6 example
import http from 'k6/http';

export const options = {
  vus: 100,          // concurrent virtual users
  duration: '10m',
};

export default function () {
  http.get('https://api.example.com/products');
  http.post(
    'https://api.example.com/orders',
    JSON.stringify({ product_id: 123, quantity: 1 }),
    { headers: { 'Content-Type': 'application/json' } } // JSON body needs an explicit content type
  );
}
Advantages
1. Controllable and reproducible
Test 1: 1000 VUs, 10 min → Result A
Test 2: 1000 VUs, 10 min → Result A (same)
→ Valid comparisons between versions
2. Scalable
Real traffic: 5,000 RPS
Synthetic test: 50,000 RPS
→ Can simulate future scenarios
3. Safe
- Isolated environment
- Test data
- No impact on real users
4. Specific scenarios
- 10x spike in 5 seconds
- 100% of a specific endpoint
- Only non-authenticated users
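Spike scenarios like the "10x in 5 seconds" above are easy to express as staged load. A minimal sketch in k6, reusing the hypothetical endpoint from the earlier example (all numbers are illustrative):

// Spike sketch: ramp from baseline to 10x in 5 seconds, hold, then recover
import http from 'k6/http';

export const options = {
  stages: [
    { duration: '1m', target: 100 },   // baseline
    { duration: '5s', target: 1000 },  // 10x spike in 5 seconds
    { duration: '2m', target: 1000 },  // sustain the spike
    { duration: '1m', target: 100 },   // back to baseline
  ],
};

export default function () {
  http.get('https://api.example.com/products');
}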
Limitations
1. Artificial patterns (see the randomized-load sketch after this list)
Synthetic:
- Uniformly distributed requests
- Identical payloads
- Predictable timing
Real:
- Irregular bursts
- Varied payloads
- Unpredictable human behavior
2. Test data vs production
Synthetic with 1,000 products:
Query: 5ms
Production with 10 million products:
Query: 500ms
→ Data volume changes everything
3. Mocked dependencies
Payment mock: fixed 10ms
Real gateway: variable 100-3000ms
→ Dependency latency not captured
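The first limitation can be partially mitigated by randomizing payloads and pacing, even though real human behavior is never fully reproduced. A minimal k6 sketch (endpoint, ID ranges, and timings are illustrative assumptions):

// Less artificial synthetic load: varied payloads and irregular think time
import http from 'k6/http';
import { sleep } from 'k6';

export default function () {
  // Vary the payload instead of sending identical bodies
  const payload = JSON.stringify({
    product_id: Math.floor(Math.random() * 10000) + 1,
    quantity: Math.floor(Math.random() * 5) + 1,
  });
  http.post('https://api.example.com/orders', payload, {
    headers: { 'Content-Type': 'application/json' },
  });

  // Irregular pause instead of a fixed, predictable pace
  sleep(1 + Math.random() * 4); // 1-5 s of simulated think time
}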
Real Traffic Tests
What they are
Using production traffic to validate performance.
Techniques
1. Shadow Traffic (Traffic Mirroring)
Request → Main App → Response
              ↓
        Canary App (copy)
              ↓
     Metrics (discard response)
# Istio mirroring
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
spec:
  http:
  - route:
    - destination:
        host: app-v1
    mirror:
      host: app-v2
    mirrorPercentage:
      value: 100.0
2. Canary Release
100% traffic → v1
↓
95% → v1, 5% → v2 (observe metrics)
↓
90% → v1, 10% → v2
↓
... gradually ...
↓
100% → v2
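Conceptually, each step just changes the probability that a request lands on the new version. A minimal sketch of weight-based splitting (names and weights are illustrative; in practice a load balancer or service mesh makes this decision):

// Weight-based routing decision for a single request
function pickBackend(canaryWeight) {
  // canaryWeight = percentage of requests sent to the new version
  return Math.random() * 100 < canaryWeight ? 'app-v2' : 'app-v1';
}

// During the "5% → v2" step, roughly 1 in 20 requests lands on v2
console.log(pickBackend(5));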
3. Blue-Green with test
Blue (production) ← 100% traffic
Green (new) ← Synthetic tests
After validation:
Green ← 100% traffic
Blue ← Standby
Advantages
1. Real patterns
- Real endpoint distribution
- Real user mix
- Real peak times
2. Real data
- Production volume
- Real queries
- Natural edge cases
3. Real dependencies
- Real API latency
- Real cache behavior
- Real network failures
Limitations
1. Unpredictable
Monday: 10,000 RPS
Black Friday: 100,000 RPS
→ Can't guarantee specific scenario
2. Risk
Bug in production → Affects real users
Degradation → Revenue impact
3. Not reproducible
Problem at 2:32 PM → Can't recreate exactly
Comparison
| Aspect | Synthetic | Real |
|---|---|---|
| Control | High | Low |
| Reproducibility | High | Low |
| Risk | None | High |
| Authenticity | Low | High |
| Scalability | Limited only by test infra | Limited to current traffic |
| Cost | Test infra | Minimal |
| Extreme scenarios | Easy | Difficult |
When to Use Each
Use synthetic when:
✓ Before releases (validation)
✓ Comparing versions (technical A/B)
✓ Testing limits (stress test)
✓ Specific scenarios (spike, soak)
✓ Development environment
✓ Compliance/audit (reproducible evidence)
Use real when:
✓ Final validation before complete rollout
✓ Continuous performance monitoring
✓ Edge case discovery
✓ Testing real dependencies
✓ Canary releases
✓ Validating optimizations in production
Use both when:
Complete pipeline:
1. Synthetic in CI/CD (quality gate)
2. Synthetic in staging (pre-prod validation)
3. Canary with real (final validation)
4. Production monitoring (continuous)
Recommended Hybrid Strategy
Phase 1: Development
Local synthetic:
- Quick smoke tests
- Basic endpoint validation
- Obvious regression detection
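A development-phase smoke test can be tiny. A sketch in k6 (URL and threshold are illustrative):

// Local smoke test: 1 VU, a few iterations, fail fast on obvious problems
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  vus: 1,
  iterations: 10,
  thresholds: {
    http_req_duration: ['p(95)<1000'], // catch gross latency regressions
  },
};

export default function () {
  const res = http.get('https://api.example.com/products');
  check(res, { 'status is 200': (r) => r.status === 200 });
}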
Phase 2: CI/CD
Automated synthetic:
- Load test on each PR
- Baseline comparisons
- Performance gate
# GitHub Actions example
- name: Performance Test
  run: |
    k6 run --out json=results.json tests/load.js
- name: Compare with Baseline
  run: |
    ./scripts/compare-perf.sh results.json baseline.json
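Besides an external comparison script, part of the gate can live inside the test itself via k6 thresholds, which make the run exit with a non-zero code when a limit is crossed. A sketch with illustrative limits:

// Performance gate expressed as k6 thresholds (limits are illustrative)
export const options = {
  vus: 100,
  duration: '10m',
  thresholds: {
    http_req_duration: ['p(95)<500', 'p(99)<1500'], // latency budget in ms
    http_req_failed: ['rate<0.01'],                 // less than 1% failed requests
  },
};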
Phase 3: Staging
Realistic synthetic:
- Volume close to production
- Data similar to production
- Real or realistically simulated dependencies
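To approximate production volume rather than just a VU count, k6 scenarios can target a request rate directly. A sketch aimed at the ~5,000 RPS figure mentioned earlier (numbers are illustrative and should come from real traffic metrics):

// Staging profile targeting a production-like arrival rate
export const options = {
  scenarios: {
    production_like: {
      executor: 'constant-arrival-rate',
      rate: 5000,            // iterations started per timeUnit
      timeUnit: '1s',
      duration: '30m',
      preAllocatedVUs: 500,  // VUs kept ready up front
      maxVUs: 2000,          // ceiling if responses slow down
    },
  },
};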
Phase 4: Canary
Progressive real:
- 1% real traffic
- Metrics compared with baseline
- Automatic rollback if degradation
# Argo Rollouts
spec:
  strategy:
    canary:
      steps:
      - setWeight: 1
      - pause: {duration: 10m}
      - analysis:
          templates:
          - templateName: success-rate
      - setWeight: 5
      - pause: {duration: 10m}
      # ...
Phase 5: Production
Continuous monitoring:
- Real-time SLOs
- Degradation alerts
- Performance dashboards
Common Pitfalls
1. Only synthetic, never real
❌ "Passed load test, it's ready"
→ Production has unforeseen behaviors
✅ "Passed load test, let's validate with canary"
2. Only real, never synthetic
❌ "Let's see how it behaves in production"
→ Discovers problem with affected users
✅ "Validated in staging, now gradual canary"
3. Synthetic with unrealistic data
❌ Test with 100 products, production with 10M
→ Non-representative results
✅ Test data proportional to production
4. Ignoring real variability
❌ Compare synthetic average with real average
→ Synthetic is stable, real varies a lot
✅ Compare percentiles and distributions
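As a minimal illustration of the last pitfall, compare a mean with a nearest-rank p95 over made-up latency samples:

// Averages hide the tail that real users feel (sample values are invented)
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  return sorted[Math.max(0, Math.ceil((p / 100) * sorted.length) - 1)];
}

const latencies = [80, 85, 90, 95, 100, 105, 110, 120, 900, 1200]; // ms
const mean = latencies.reduce((a, b) => a + b, 0) / latencies.length;

console.log(mean);                      // ≈288 ms — looks acceptable
console.log(percentile(latencies, 95)); // 1200 ms — the tail tells another story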
Conclusion
Synthetic and real are complementary, not exclusive:
| Synthetic | Real |
|---|---|
| Validates capacity | Validates behavior |
| Finds limits | Finds edge cases |
| Comparable | Authentic |
| Pre-production | Production |
The ideal strategy:
- Frequent synthetic in development and CI/CD
- Intensive synthetic in staging before releases
- Gradual real via canary for final validation
- Continuous real monitoring in production
The best test is the one you actually run. An imperfect synthetic test is better than no test at all.