CAP theorem is overused as an excuse and underused as a design constraint.
The CAP theorem gets cited constantly in system design interviews and architecture discussions, usually to justify a tradeoff after the fact rather than as a genuine design tool. The more useful framing for actual system design is PACELC, which adds that even when there is no partition, you still face a latency-consistency tradeoff on every request. DynamoDB, Cassandra, and Cosmos DB all expose this as a per-request tunable: a read can pay extra latency for a stronger guarantee or accept possibly stale data for a faster answer. Most teams use the defaults and never think about it again, which is usually fine, until the day they need strong consistency for a financial transaction and discover that the eventually consistent read path they have been using cannot provide it.
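To make the per-request tradeoff concrete, here is a minimal sketch of quorum reads and writes over a toy in-memory replica set. It is not the API of DynamoDB, Cassandra, or Cosmos DB; the `Cluster` and `Replica` classes and the `r` and `w` parameters are hypothetical names, and replication lag is modelled crudely as "replicas that missed the write never catch up." The point it illustrates is the standard quorum rule: a read is guaranteed to see the latest write only when r + w > n.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    # key -> (version, value); a replica that missed a write keeps the old pair
    data: dict = field(default_factory=dict)


class Cluster:
    """Toy n-replica store with per-request read (r) and write (w) quorums."""

    def __init__(self, n: int):
        self.replicas = [Replica() for _ in range(n)]
        self.version = 0

    def write(self, key, value, w: int) -> None:
        if not 1 <= w <= len(self.replicas):
            raise ValueError("w must be between 1 and n")
        # Only the first w replicas acknowledge before we return; the rest
        # are modelled as lagging (assumption: they never catch up here).
        self.version += 1
        for replica in self.replicas[:w]:
            replica.data[key] = (self.version, value)

    def read(self, key, r: int):
        if not 1 <= r <= len(self.replicas):
            raise ValueError("r must be between 1 and n")
        # Worst case for staleness: contact the last r replicas, i.e. the
        # ones least likely to have seen the latest write.
        answers = [rep.data.get(key, (0, None)) for rep in self.replicas[-r:]]
        # Return the value carrying the highest version among the answers.
        return max(answers, key=lambda pair: pair[0])[1]


cluster = Cluster(3)
cluster.write("balance", 100, w=3)   # every replica has version 1
cluster.write("balance", 70, w=2)    # the third replica misses version 2

fast = cluster.read("balance", r=1)    # r + w = 3, not > n: may be stale
strong = cluster.read("balance", r=2)  # r + w = 4 > n: overlaps the write
```

In this run `fast` is the stale value 100 and `strong` is the current value 70. The low-latency read is cheaper because it waits on one replica, which is exactly why it is the wrong choice for a balance check before a transfer.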
The distributed systems concept most teams underinvest in is failure mode analysis. Clock skew, network partitions, and split-brain scenarios all look obvious in a textbook and turn out to be genuinely tricky in production. The tools that have matured most since 2022 are chaos engineering platforms: AWS Fault Injection Simulator, Gremlin, and Netflix's Chaos Monkey all allow teams to inject failures deliberately in staging environments. Teams that practice chaos engineering regularly find failure modes in controlled experiments before their users find them in production. The ROI on a few hours of chaos testing before a major release is difficult to overstate.
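The same idea can be tried at a much smaller scale before adopting a platform. The sketch below injects failures in-process around a single dependency call and checks how a retry loop behaves. It does not reflect how AWS Fault Injection Simulator, Gremlin, or Chaos Monkey work (those act on infrastructure such as instances and networks); `FaultInjector`, `InjectedFault`, `call_with_retries`, and `fetch_price` are all hypothetical names chosen for illustration.

```python
class InjectedFault(Exception):
    """Raised by the fault injector in place of a real network error."""


class FaultInjector:
    """Wraps a callable and fails its first `failures` calls (hypothetical helper)."""

    def __init__(self, func, failures: int):
        self.func = func
        self.remaining = failures
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        if self.remaining > 0:
            self.remaining -= 1
            raise InjectedFault("simulated dependency outage")
        return self.func(*args, **kwargs)


def call_with_retries(func, attempts: int):
    """Code under test: try up to `attempts` times, then re-raise the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except InjectedFault as err:
            last_error = err
    raise last_error


def fetch_price():
    # Stand-in for a remote call; the value is arbitrary.
    return 42


# Experiment 1: two injected failures, three attempts, so the call recovers.
flaky = FaultInjector(fetch_price, failures=2)
recovered = call_with_retries(flaky, attempts=3)

# Experiment 2: the outage outlasts the retry budget, so the error surfaces.
dead = FaultInjector(fetch_price, failures=5)
try:
    call_with_retries(dead, attempts=3)
    surfaced = False
except InjectedFault:
    surfaced = True
```

The second experiment is the more useful one: it forces a decision about what the caller should do when retries run out, which is the kind of question that tends to go unasked until a real outage.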
For engineers learning distributed systems: Martin Kleppmann's Designing Data-Intensive Applications remains the clearest book on this subject written for practitioners, not academics. Chapter 8 (The Trouble with Distributed Systems) and Chapter 9 (Consistency and Consensus) are worth reading multiple times.