
Microservices and Performance: The Cost of Distribution

Microservices bring organizational benefits, but also performance challenges. Understand the trade-offs.

Microservices have become a standard choice for modern systems. They offer real benefits: independent deployment, granular scalability, and team autonomy. But those benefits come with performance costs that are often underestimated.

This article explores performance challenges in microservices architectures and strategies to mitigate them.

Microservices aren't free. You trade internal complexity for network complexity.

The Cost of Distribution

From function to network

Monolith:
userService.getUser(id)  → 0.001ms (function call)

Microservice:
HTTP GET /users/{id}     → 1-5ms (network call)

1,000x to 5,000x difference per call.
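
A minimal sketch to make that gap concrete (assumes Node 18+ for global fetch, and a hypothetical user service on localhost:3000):

const users = new Map([[123, { id: 123, name: "John" }]]);

function getUserLocal(id) {
    return users.get(id);               // in-process: microseconds at worst
}

async function getUserRemote(id) {
    const res = await fetch(`http://localhost:3000/users/${id}`);
    return res.json();                  // network: milliseconds, even on localhost
}

async function main() {
    let t = performance.now();
    getUserLocal(123);
    console.log(`local:  ${(performance.now() - t).toFixed(3)}ms`);

    t = performance.now();
    await getUserRemote(123);
    console.log(`remote: ${(performance.now() - t).toFixed(3)}ms`);
}

main();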

Accumulated latency

Monolith:
Request → Internal process → Response
          2ms total

Microservices:
Request → API Gateway → Auth → User Service → Order Service → Response
          1ms        2ms    3ms      4ms           3ms
          = 13ms total

Cascading failures

In a monolith, a failure is confined to one process and surfaces in one place. In microservices, a single slow dependency can propagate through the entire call chain:

Payment Service slow
    ↓
Order Service timeout waiting
    ↓
API Gateway accumulates connections
    ↓
All endpoints become slow
    ↓
Entire system degraded
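
Timeouts are the first line of defense against this propagation: fail fast instead of holding connections open. A minimal sketch using AbortController (the payment-service URL and fallback response are illustrative):

// Fail fast instead of waiting indefinitely on a slow dependency
async function chargeWithTimeout(order, ms = 2000) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), ms);
    try {
        const res = await fetch("http://payment-service/charge", {
            method: "POST",
            body: JSON.stringify(order),
            signal: controller.signal,
        });
        return await res.json();
    } catch (err) {
        // Timed out or failed: degrade gracefully instead of blocking callers
        return { status: "pending", reason: "payment-service unavailable" };
    } finally {
        clearTimeout(timer);
    }
}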

Common Problems

1. Over-fetching and Under-fetching

Over-fetching: fetching more data than needed

// Client needs only name
GET /users/123
{
    "id": 123,
    "name": "John",
    "email": "...",
    "address": {...},
    "preferences": {...},
    "history": [...]  // 50KB of unnecessary data
}
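
One common mitigation is a sparse fieldset, assuming the API supports it (the fields parameter here is hypothetical):

// Client asks only for what it needs
GET /users/123?fields=name
{
    "name": "John"
}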

Under-fetching: needing multiple calls

// To build a page
const user = await getUser(id);           // Call 1
const orders = await getOrders(user.id);  // Call 2
const products = await getProducts(orders.map(o => o.productId));  // Call 3
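
The chain also hides an opportunity: calls without data dependencies can run in parallel, although the round trips don't disappear. A sketch:

// getUser and getOrders both need only the id: run them in parallel
const [user, orders] = await Promise.all([getUser(id), getOrders(id)]);

// getProducts depends on orders, so it must wait: 2 round trips instead of 3
const products = await getProducts(orders.map(o => o.productId));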

2. Distributed Monolith

Tightly coupled microservices that need to be deployed together.

Service A changes interface
    ↓
Service B needs to update
    ↓
Service C depends on B
    ↓
Coordinated deploy of A, B, C
    ↓
Worst of both worlds

3. Serialization overhead

Object in memory: direct access
Object via network: serialize → transmit → deserialize

Large JSON:
Serialization: 5ms
Transmission: 10ms
Deserialization: 5ms
= 20ms overhead
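
A quick way to feel this cost locally (Node.js sketch; the payload size is arbitrary):

// Measure serialize/deserialize cost for a large object
const payload = {
    history: Array.from({ length: 50_000 }, (_, i) => ({ id: i, total: i * 10 })),
};

let t = performance.now();
const json = JSON.stringify(payload);
console.log(`serialize: ${(performance.now() - t).toFixed(1)}ms, size: ${json.length} bytes`);

t = performance.now();
JSON.parse(json);
console.log(`deserialize: ${(performance.now() - t).toFixed(1)}ms`);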

4. Service mesh overhead

Sidecars add latency to each call.

App → Sidecar → Network → Sidecar → App
     0.5ms              0.5ms

+1ms per hop (minimum)

Patterns for Better Performance

1. API Composition

Aggregate data in a single endpoint.

// Instead of 3 calls from client
GET /users/123
GET /orders?userId=123
GET /products?ids=1,2,3

// One call to an aggregator
GET /user-dashboard/123
{
    "user": {...},
    "orders": [...],
    "products": [...]
}
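
A minimal aggregator sketch with Express (the getUser/getOrders/getProducts service clients are assumed to exist):

const express = require("express");
const app = express();

// One endpoint fans out to the services and returns a single payload
app.get("/user-dashboard/:id", async (req, res) => {
    try {
        const { id } = req.params;
        const [user, orders] = await Promise.all([getUser(id), getOrders(id)]);
        const products = await getProducts(orders.map(o => o.productId));
        res.json({ user, orders, products });
    } catch (err) {
        res.status(502).json({ error: "upstream failure" });
    }
});

app.listen(3000);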

2. GraphQL

Client specifies exactly what it needs.

query {
    user(id: 123) {
        name
        orders {
            id
            total
        }
    }
}

3. CQRS (Command Query Responsibility Segregation)

Separate models for reading and writing.

Writes: normalized microservices
Reads: optimized denormalized views

User requests dashboard → Materialized view → Fast response
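
One way to keep the read view fresh is to update it whenever a write-side event arrives. A minimal sketch, assuming a Redis-backed view and a hypothetical broker.subscribe API:

// On every OrderCreated event, update the denormalized dashboard view
broker.subscribe("OrderCreated", async (event) => {
    const key = `dashboard:${event.userId}`;
    const view = JSON.parse((await redis.get(key)) ?? "{}");
    view.orders = [...(view.orders ?? []), { id: event.orderId, total: event.total }];
    await redis.set(key, JSON.stringify(view));
});

// Reads hit the view directly and never touch the write-side services
async function getDashboard(userId) {
    return JSON.parse(await redis.get(`dashboard:${userId}`));
}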

4. Event-Driven Architecture

Communicate via events instead of synchronous calls.

// Synchronous
orderService.createOrder(order)
    → inventoryService.reserve(items)    // Wait
    → paymentService.charge(amount)      // Wait
    → notificationService.send(email)    // Wait

// Asynchronous
orderService.createOrder(order)
    → publish("OrderCreated")

// Other services react to events
// No waiting, no cascade
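
A concrete sketch of the asynchronous flow, with a hypothetical broker client standing in for Kafka, RabbitMQ, or similar (saveOrder is an assumed local persistence helper):

// Producer: publish and return immediately
async function createOrder(order) {
    await saveOrder(order);                       // local write
    await broker.publish("OrderCreated", order);  // fire the event
    return order;                                 // no downstream waiting
}

// Consumers react independently, each at its own pace
broker.subscribe("OrderCreated", (order) => inventoryService.reserve(order.items));
broker.subscribe("OrderCreated", (order) => paymentService.charge(order.total));
broker.subscribe("OrderCreated", (order) => notificationService.send(order.email));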

5. Inter-service cache

// Service A caches data from Service B
// (sketch assumes ioredis; userService is Service B's client)
const Redis = require("ioredis");
const redis = new Redis();

async function getUser(id) {
    // Cache hit: skip the network entirely
    const cached = await redis.get(`user:${id}`);
    if (cached) return JSON.parse(cached);

    // Cache miss: fetch from Service B, then cache for 300 seconds
    const user = await userService.get(id);
    await redis.setex(`user:${id}`, 300, JSON.stringify(user));
    return user;
}
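The 300-second TTL is the trade-off dial: a longer TTL means fewer calls to Service B, but data that can be staler by that many seconds.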

6. Bulk APIs

Support batch operations.

// Inefficient
GET /products/1
GET /products/2
GET /products/3

// Efficient
GET /products?ids=1,2,3
// or
POST /products/batch
{"ids": [1, 2, 3]}

Distributed Observability

Distributed tracing

Essential for understanding where end-to-end latency goes.

Request ID: abc-123
├── API Gateway (2ms)
├── Auth Service (5ms)
├── User Service (15ms)  ← Bottleneck!
│   └── Database (12ms)
└── Response (total: 22ms)

Per-service metrics

Metric                       Why it matters
Latency per endpoint         Identify slow endpoints
Error rate per dependency    Identify problematic services
Request rate                 Communication volume
Timeout rate                 Integration problems

Correlated logs

Same request ID across all services.

[abc-123] API Gateway: Received request
[abc-123] Auth: Token validated
[abc-123] User: Fetching user 456
[abc-123] User: Database query took 12ms
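
Propagating the ID is usually a small middleware plus one outgoing header. An Express sketch (assumes an existing app; the x-request-id header name is a common convention):

const crypto = require("crypto");

// Accept an incoming request ID or create one, and log with it
app.use((req, res, next) => {
    req.id = req.headers["x-request-id"] ?? crypto.randomUUID();
    console.log(`[${req.id}] ${req.method} ${req.url}`);
    next();
});

// Forward the same ID on calls to downstream services
async function callDownstream(req, url) {
    return fetch(url, { headers: { "x-request-id": req.id } });
}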

When NOT to Use Microservices

Microservices aren't always the answer.

Avoid when:

  • Small team (< 10 people)
  • Domain not well understood
  • Latency is critical and every ms matters
  • No observability infrastructure
  • No capacity to operate distributed systems

A well-structured monolith can offer:

  • Simple deployment
  • Easy debugging
  • Minimal latency
  • ACID transactions

Conclusion

Microservices are a trade-off:

Gain                         Cost
Independent deployment       Operational complexity
Granular scalability         Network latency
Team autonomy                Harder observability
Resilience (if done well)    Cascading failures (if done poorly)

For performance in microservices:

  1. Minimize calls — aggregators, batch, cache
  2. Use async when possible — events, queues
  3. Invest in observability — tracing, metrics, logs
  4. Design for failure — timeouts, circuit breakers, fallbacks
  5. Question the need — not everything needs to be a microservice

Microservices are an organizational decision with technical consequences. Understand the cost before paying.

