
Network latency: the invisible cost of communication

In distributed systems, the network is a critical part of performance. Understand how network latency affects your system.

In a monolithic application, function calls are practically instantaneous: nanoseconds. In distributed systems, each call traverses the network, adding milliseconds of latency. That difference of six orders of magnitude changes everything.

This article explores how network latency affects performance and strategies to minimize its impact.

Pretending the network is free is the most expensive lie in distributed computing.

The Real Cost of the Network

Latency comparison

Operation                        Typical time
L1 cache access                  0.5 ns
RAM access                       100 ns
SSD read                         150 μs
Round-trip, same datacenter      0.5 ms
Round-trip, same region          1–5 ms
Round-trip, cross-region         50–150 ms
Round-trip, intercontinental     100–300 ms

A network call within the same datacenter (0.5 ms) is about 5,000 times slower than a RAM access, and a million times slower than an L1 cache hit.

The problem multiplies

User request
    ↓
API Gateway (1ms)
    ↓
Service A (2ms)
    ↓
Service B (2ms)
    ↓
Database (3ms)
    ↓
Response: 8ms just from network

And that's assuming everything works on the first try.

Network Latency Components

1. Propagation

Time for the physical signal to travel through the medium.

Speed of light in fiber ≈ 200,000 km/s
São Paulo → Virginia ≈ 8,000 km
Minimum time: 40ms (one way)
Round-trip: 80ms (theoretical minimum)

You can't beat physics.
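That floor is easy to check with a few lines (the distance figure is the approximation above):

```javascript
// Speed of light in optical fiber, in km/s (roughly 2/3 of c in vacuum)
const FIBER_SPEED_KM_S = 200_000;

// Theoretical minimum round-trip time in ms for a given distance in km
function minRttMs(distanceKm) {
  const oneWaySeconds = distanceKm / FIBER_SPEED_KM_S;
  return oneWaySeconds * 2 * 1000; // round trip, converted to ms
}

console.log(minRttMs(8000)); // São Paulo → Virginia ≈ 80 ms round trip
```

Any real request sits on top of this: routing detours, processing, and queuing only add to it.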

2. Transmission

Time to put all bits on the medium.

1MB payload on 1Gbps link = 8ms
1MB payload on 100Mbps link = 80ms
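The same arithmetic as a small helper (payload size and link speeds match the figures above):

```javascript
// Transmission time: payload bits divided by link bandwidth
function transmissionMs(payloadBytes, linkMbps) {
  const bits = payloadBytes * 8;
  return bits / (linkMbps * 1000); // Mbps → bits per millisecond
}

console.log(transmissionMs(1_000_000, 1000)); // 1 MB over 1 Gbps  = 8 ms
console.log(transmissionMs(1_000_000, 100));  // 1 MB over 100 Mbps = 80 ms
```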

3. Processing

Time in routers, firewalls, load balancers.

Each hop adds microseconds to milliseconds.

4. Queuing

Waiting when links are congested.

Can vary from 0 to hundreds of milliseconds.

Common Problems

Chatty protocols

Many small calls instead of few large calls.

// Bad: N network calls, one per item
for (const id of ids) {
    items.push(await fetch(`/api/items/${id}`).then(r => r.json()));
}

// Good: 1 network call for all items
const items = await fetch(`/api/items?ids=${ids.join(',')}`).then(r => r.json());

N+1 in services

The same N+1 problem from databases, but between services.

// Orders service
const orders = await getOrders(userId);                    // 1 call
for (const order of orders) {
    order.customer = await getCustomer(order.customerId);  // N calls
}
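One fix is a batch lookup, assuming the customer service exposes a batch endpoint. The sketch below is self-contained: getOrders and getCustomers are hypothetical in-memory stubs standing in for real service clients.

```javascript
// Hypothetical stubs standing in for real network calls
const getOrders = async (userId) => [
  { id: 1, customerId: 'a' },
  { id: 2, customerId: 'a' },
  { id: 3, customerId: 'b' },
];
const getCustomers = async (ids) => ids.map((id) => ({ id, name: `customer-${id}` }));

async function loadOrdersWithCustomers(userId) {
  const orders = await getOrders(userId);                   // 1 call
  const ids = [...new Set(orders.map((o) => o.customerId))]; // deduplicate IDs
  const customers = await getCustomers(ids);                 // 1 call instead of N
  const byId = new Map(customers.map((c) => [c.id, c]));
  return orders.map((o) => ({ ...o, customer: byId.get(o.customerId) }));
}
```

Two round trips total, regardless of how many orders come back.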

Synchronous chained calls

A → B → C → D → Database
   5ms  5ms  5ms  5ms  = 20ms minimum

Total latency is the sum of all calls.

Retry storms

When timeouts cause retries that cause more load that causes more timeouts.

Slow service
    ↓
Timeout (2s)
    ↓
3 retries × 1000 clients = 3000 extra requests
    ↓
Service even slower
    ↓
Collapse

Mitigation Strategies

1. Reduce calls

Batching: group operations

// Instead of
await Promise.all(ids.map(id => getItem(id)));

// Use batch API
await getItems(ids);

Prefetching: fetch data before you need it

// While processing page 1, fetch page 2
const page1 = await getPage(1);
const page2Promise = getPage(2);  // Already started
// ... process page 1 ...
const page2 = await page2Promise;

2. Parallelism

// Sequential: 15ms
const a = await serviceA();  // 5ms
const b = await serviceB();  // 5ms
const c = await serviceC();  // 5ms

// Parallel: 5ms
const [a, b, c] = await Promise.all([
    serviceA(),
    serviceB(),
    serviceC()
]);

3. Aggressive caching

Avoid network calls when possible.

const cache = new Map();

async function getUser(id) {
    if (cache.has(id)) return cache.get(id);

    const user = await userService.get(id);
    cache.set(id, user);
    return user;
}
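The Map above grows without bound and never invalidates stale entries. A minimal sketch adding a time-to-live (createTtlCache is a hypothetical helper; real deployments usually reach for Redis or an LRU library):

```javascript
// Entries older than ttlMs are treated as misses and refetched by the caller
function createTtlCache(ttlMs) {
  const entries = new Map();
  return {
    get(key) {
      const entry = entries.get(key);
      if (!entry || Date.now() - entry.at > ttlMs) return undefined; // expired or absent
      return entry.value;
    },
    set(key, value) {
      entries.set(key, { value, at: Date.now() });
    },
  };
}
```

The TTL is the knob that trades staleness against network calls saved.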

4. Compression

Reduce bytes transmitted.

JSON response: 100KB
Compressed (gzip): 15KB
Savings: 85% of bandwidth

5. Keep-alive connections

Avoid overhead of establishing connections.

New TCP connection: ~1-3ms (more with a TLS handshake)
Existing connection: ~0ms overhead

6. Locality

Keep services that communicate a lot close together.

Service A and B in same zone: 0.5ms
Service A and B in different regions: 50ms

7. Asynchronous communication

When possible, don't wait for a response.

// Synchronous: blocks
await notificationService.send(email);

// Asynchronous: doesn't block
messageQueue.publish('send-email', email);

Timeouts and Retries

Configuring timeouts

Timeout = p99 latency × 2 (or more)

Too short: false positives.
Too long: stuck resources.
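A minimal deadline wrapper can be built with Promise.race (withTimeout is a hypothetical helper, not a standard API):

```javascript
// Whichever settles first wins: the real work, or the deadline rejection
function withTimeout(promise, ms) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timeout after ${ms}ms`)), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Usage: withTimeout(fetch(url), 2000)
```

Note that the underlying operation keeps running after the timeout fires; true cancellation needs something like an AbortController.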

Retry with backoff

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithRetry(url, maxRetries = 3) {
    for (let i = 0; i < maxRetries; i++) {
        try {
            return await fetch(url);
        } catch (e) {
            if (i === maxRetries - 1) throw e;
            await sleep(Math.pow(2, i) * 100);  // Exponential backoff: 100ms, 200ms, 400ms...
        }
    }
}

Circuit breaker

Stop trying when the service is clearly having problems.

if (failureRate > 0.5) {
    // Circuit open: fail fast instead of waiting for another timeout
    throw new Error('Service unavailable');
}
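A fuller sketch of the same idea, with illustrative thresholds (createCircuitBreaker is hypothetical; production systems usually use a library such as opossum or resilience4j):

```javascript
// Opens after failureThreshold consecutive failures, fails fast while open,
// and allows a retry after cooldownMs. Thresholds here are illustrative.
function createCircuitBreaker(fn, { failureThreshold = 5, cooldownMs = 10_000 } = {}) {
  let failures = 0;
  let openedAt = 0;

  return async function guarded(...args) {
    const open = failures >= failureThreshold && Date.now() - openedAt < cooldownMs;
    if (open) throw new Error('circuit open: failing fast');
    try {
      const result = await fn(...args);
      failures = 0; // a success closes the circuit
      return result;
    } catch (e) {
      failures += 1;
      if (failures >= failureThreshold) openedAt = Date.now();
      throw e;
    }
  };
}
```

Failing fast costs one error response; waiting out a timeout costs a held connection, a thread, and another request piled onto a struggling service.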

Essential Metrics

Metric                    Why it matters
Latency (p50, p95, p99)   Real user experience
Request rate              Call volume
Error rate                Communication failures
Retry count               Indicates stability issues
Connection pool usage     Connection pressure
Bytes in/out              Data volume
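Percentiles are normally computed by your metrics backend, but the nearest-rank method is simple enough to sketch (the sample values are made up):

```javascript
// Nearest-rank percentile over a sample of latencies in ms
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

const latencies = [12, 5, 7, 9, 250, 8, 6, 11, 10, 13];
console.log(percentile(latencies, 50)); // typical request
console.log(percentile(latencies, 99)); // dominated by the 250 ms outlier
```

This is why averages mislead: the mean of that sample is pulled far above what most users experience, while p50 and p99 tell both halves of the story.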

Conclusion

The network is an inevitable reality in distributed systems. To minimize its impact:

  1. Reduce calls — batch, cache, prefetch
  2. Parallelize — when there's no dependency
  3. Compress — fewer bytes = less time
  4. Plan locality — nearby services = lower latency
  5. Use async — don't wait when you don't need to
  6. Configure timeouts — don't wait forever
  7. Monitor — you can't improve what you don't measure

Every network call is a bet. Minimize your bets.

