Fallacies

Fallacies of distributed computing

List

The network is reliable.

  • assume that network calls always succeed.

  • networks can be unreliable — packets get dropped, connections time out, routers go down.

  • System my crash or hang

  • Best Practice

    • Use timeouts, retries, circuit breakers (e.g. Resilience4j).

Latency is zero.

  • assume messages travel instantly.

  • even on fast networks, latency can vary significantly

  • e.g. network hops, cloud VPC

  • If you assume zero latency, you might chain too many remote calls synchronously, leading to bad performance.

  • Best Practice

    • Design APIs with latency in mind

    • Use asynchronous messaging (e.g. Kafka) where appropriate.

Bandwidth is infinite.

  • We assume the network can handle as much data as we send.

  • In reality, bandwidth is limited and can saturate quickly.

  • A service tries to send a 1GB file to another service via REST — network congestion slows everything else down.

  • A burst of large responses might overwhelm the load balancer or the database replication stream.

  • Best Practice

    • Use streaming APIs (e.g. application/ndjson).

    • Limit payload sizes.

    • Compress data if needed.

The network is secure.

  • assume the network is private and can’t be attacked.

  • In reality, networks can be sniffed, man-in-the-middled, or spoofed.

  • Always use TLS (HTTPS).

  • Authenticate and authorize service calls (e.g. mutual TLS or OAuth2).

Topology doesn't change.

  • We assume the network structure (hosts, IPs, routing) is stable.

  • In reality, servers go down, IPs change (especially in the cloud), or services are moved.

  • Use service discovery (e.g. Eureka, Consul) instead of hard-coded IPs.

  • Expect dynamic scaling and design for it.

There is one administrator.

  • assume a single entity manages the entire system.

  • In reality, different teams or organizations might manage different parts.

  • Your team manages a microservice but relies on a shared Kafka cluster managed by another team — they might change configurations or restart the broker unexpectedly.

  • Define clear contracts and SLAs.

  • Expect partial outages of dependencies.

Transport cost is zero.

  • assume that sending data over the network is free.

  • In reality, data transfer can cost money (especially in the cloud) and CPU.

  • AWS charges for data transfer between availability zones or regions. Sending lots of logs or files can generate high bills.

  • Large payloads also cost CPU cycles on both ends.

  • Optimize data transfer.

  • Use caching and batching where appropriate.

The network is homogeneous.

  • assume all parts of the network are the same.

  • In reality, they vary: different OS, software versions, hardware, languages, protocols.

  • A Spring Boot service on Linux calls a .NET service on Windows — different serialization, case sensitivity, or character encodings might cause bugs.

  • Kafka clients might behave differently between Java and Python.

  • Use standard protocols (e.g. JSON, gRPC).

  • Test across multiple environments.

  • https://ably.com/blog/8-fallacies-of-distributed-computing

Last updated

Was this helpful?