Mastering High-Concurrency Systems in Production

Welcome to the definitive guide on designing and implementing ultra-high-concurrency backend architectures. If you are a Software Engineer or DevOps professional looking to scale Golang services to handle millions of requests per second (C10M), this series is for you.

We dissect real-world production challenges such as the Dual-Write problem, Cache Avalanches, and Distributed Race Conditions, and explore how tech giants like Shopee and Alipay solve them.

Series Chapters

  1. Chapter 1: How Systems Handle Millions of Requests/s (C10M)?
  2. Chapter 2: The 3 Caching Vulnerabilities & Go Singleflight
  3. Chapter 3: Distributed Rate Limiting with Redis & GCRA
  4. Chapter 4: Solving the Dual-Write Problem with Transactional Outbox Pattern
  5. Chapter 5: Optimizing Golang Database Connection Pools
  6. Chapter 6: API Gateway vs Service Mesh
  7. Chapter 7: Designing Idempotency APIs for Payment Systems
  8. Chapter 8: Distributed Locking: Redlock vs ZooKeeper
  9. Chapter 9: Database Sharding & Read/Write Splitting

Chapter 9: Database Sharding & Read/Write Splitting for Billion-Record Tables

Chapter 9: Scaling the Final Database Bottleneck

When your application reaches tens of millions of users, the database becomes the ultimate bottleneck. CPU pins at 100%, RAM runs out, and queries take seconds instead of milliseconds. This is the stage where you must deploy distributed database strategies.

1. Read/Write Splitting

Answer-first: Because 80% of traffic is read-only, separate your DB into a Write Master and Read Slaves. Use GORM’s dbresolver plugin to route queries automatically without altering business logic. ...
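The routing rule behind read/write splitting can be sketched with the standard library alone. This is a minimal illustration, not GORM's dbresolver: the `Router` type, its `Pick` method, and the target names are hypothetical, and plain strings stand in for real connection handles.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Router is a hypothetical helper that chooses a database target per query.
// The strings stand in for real connection handles.
type Router struct {
	primary  string
	replicas []string
	next     atomic.Uint64 // round-robin cursor over replicas
}

// Pick returns the primary for writes and rotates across replicas for reads.
// With no replicas configured, reads fall back to the primary.
func (r *Router) Pick(isWrite bool) string {
	if isWrite || len(r.replicas) == 0 {
		return r.primary
	}
	n := r.next.Add(1) - 1
	return r.replicas[n%uint64(len(r.replicas))]
}

func main() {
	r := &Router{primary: "primary", replicas: []string{"replica-1", "replica-2"}}
	fmt.Println(r.Pick(true))  // primary
	fmt.Println(r.Pick(false)) // replica-1
	fmt.Println(r.Pick(false)) // replica-2
	fmt.Println(r.Pick(false)) // replica-1
}
```

One design caveat: replicas usually lag the primary slightly, so a read issued immediately after a write may need to be pinned to the primary.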

June 9, 2026 · 3 min · Lê Tuấn Anh

Chapter 8: Distributed Locking for Race Conditions: Redlock vs ZooKeeper

Chapter 8: Synchronizing Clusters with Distributed Locks

In a standalone Go application, sync.Mutex prevents two goroutines from overwriting the same data (a race condition). However, once your system scales out to 10 servers behind a load balancer, sync.Mutex no longer helps because it only guards memory inside a single process. You need a Distributed Lock.

1. Basic Redis Locks

Answer-first: A basic Redis lock uses SET resource id NX PX ttl (set the key only if it does not exist, with an expiry in milliseconds). It works for simple caching but has a Single Point of Failure: the lock can be lost if the Redis master crashes before replicating it. ...

June 9, 2026 · 4 min · Lê Tuấn Anh

Chapter 7: Designing Idempotency APIs for Payment Systems

Chapter 7: Fortifying Payment Systems with Idempotent APIs

In E-commerce or Fintech, the ultimate nightmare is not a system crash, but charging a customer twice for a single order. This is usually caused by network lag, an impatient user double-clicking “Pay”, or automated app retry logic. The mandatory solution for any transactional API (Payment/Order) is Idempotency.

1. What is Idempotency?

Answer-first: An operation is idempotent if executing it once or N times yields the exact same system state and outcome. While GET and PUT are idempotent by definition in HTTP, POST requires explicit engineering. ...
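One common way to give a POST-style operation that property is an idempotency key supplied by the client. The sketch below assumes an in-memory map in place of a database; `Processor`, `Charge`, and the `ch_` ID format are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Result is a hypothetical payment outcome.
type Result struct {
	ChargeID string
	Amount   int64
}

// Processor remembers the result of every idempotency key it has handled.
type Processor struct {
	mu      sync.Mutex
	seen    map[string]Result
	charges int // number of charges actually executed
}

func NewProcessor() *Processor {
	return &Processor{seen: make(map[string]Result)}
}

// Charge executes at most once per idempotency key; a replay with the same
// key returns the stored result instead of charging again.
func (p *Processor) Charge(key string, amount int64) Result {
	p.mu.Lock()
	defer p.mu.Unlock()
	if r, ok := p.seen[key]; ok {
		return r
	}
	p.charges++
	r := Result{ChargeID: fmt.Sprintf("ch_%d", p.charges), Amount: amount}
	p.seen[key] = r
	return r
}

func main() {
	p := NewProcessor()
	first := p.Charge("order-1001", 500)
	retry := p.Charge("order-1001", 500) // e.g. a double-click or client retry
	fmt.Println(first == retry)          // true
	fmt.Println(p.charges)               // 1
}
```

In a multi-server deployment the map would have to be replaced by shared storage, for example a table with a unique constraint on the key.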

June 9, 2026 · 4 min · Lê Tuấn Anh

Chapter 6: API Gateway vs Service Mesh in Microservices Architecture

Chapter 6: Clarifying the Boundaries: API Gateway vs Service Mesh

When your Golang application scales from dozens to hundreds of microservices, managing communication becomes a system-wide challenge. You will constantly encounter two closely related concepts: API Gateway and Service Mesh. Many engineers ask: “If I already deploy Istio (Service Mesh), do I still need Kong (API Gateway)?” The answer lies in the fundamental difference between North-South and East-West traffic. ...

June 9, 2026 · 3 min · Lê Tuấn Anh

Chapter 5: Optimizing Golang Database Connection Pools to Prevent Bottlenecks

Chapter 5: Unlocking Database Performance via Connection Pooling

If your Golang system processes business logic blazingly fast but chokes at the database layer, 90% of the time it is due to an incorrectly configured *sql.DB.

1. Understanding *sql.DB

Answer-first: In Golang, sql.Open() does NOT create a direct database connection. It returns a connection pool manager that is safe for concurrent use by multiple goroutines. You must initialize the db variable only once during app startup. ...
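The lazy behaviour of sql.Open() can be observed with a driver that refuses every connection. In this sketch `stubDriver`, the driver name "stub", the DSN placeholder, and the `newPool` helper are all hypothetical, and the pool limits are illustrative values rather than recommendations.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"time"
)

// stubDriver is a hypothetical driver that refuses every connection, so any
// call that succeeds below cannot have reached a real database.
type stubDriver struct{}

var errNoServer = errors.New("stub: no server")

func (stubDriver) Open(name string) (driver.Conn, error) { return nil, errNoServer }

func init() { sql.Register("stub", stubDriver{}) }

// newPool builds the pool; in a real service it would be called once at
// startup and the *sql.DB shared by all handlers.
func newPool() (*sql.DB, error) {
	db, err := sql.Open("stub", "dsn-placeholder")
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(25)                 // cap on open connections
	db.SetMaxIdleConns(25)                 // idle connections kept for reuse
	db.SetConnMaxLifetime(5 * time.Minute) // recycle long-lived connections
	return db, nil
}

func main() {
	db, err := newPool()
	fmt.Println(err == nil)                    // true: Open did not dial
	fmt.Println(db.Ping() != nil)              // true: the first dial fails
	fmt.Println(db.Stats().MaxOpenConnections) // 25
}
```

Because the first connection is only attempted on use, calling Ping during startup is a simple way to fail fast on a bad DSN.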

June 9, 2026 · 3 min · Lê Tuấn Anh

Chapter 4: Solving the Dual-Write Problem with Transactional Outbox Pattern

Chapter 4: Eliminating the Dual-Write Nightmare

When your Golang application migrates from a Monolith to Event-Driven Microservices, you will immediately face an architectural nightmare: the Dual-Write Problem.

1. What is the Dual-Write Problem?

Answer-first: Dual-Write occurs when an app attempts to write to a Database and publish to a Message Broker (Kafka) as part of the same operation. Without a distributed transaction, a network failure between the two steps can leave the two systems out of sync. ...

June 9, 2026 · 3 min · Lê Tuấn Anh

Chapter 3: Distributed Rate Limiting with Redis & GCRA Algorithm

Chapter 3: Securing APIs with Distributed Rate Limiting

If caching is the shield protecting your database, Rate Limiting is the armor guarding your API servers from DDoS attacks and resource exhaustion caused by abusive clients.

Why Local Rate Limiting Fails in Microservices

Answer-first: Local in-memory limiters fail because Load Balancers distribute traffic across multiple nodes, and each node counts only the requests it sees. A user allowed 100 req/sec can push 500 req/sec through a 5-node cluster, bypassing the intended limit. Centralized state via Redis is required. ...
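The arithmetic in that example can be checked with a small simulation. The sketch below uses a simple fixed-window counter for one second of traffic, not the GCRA algorithm the chapter covers; `Counter` and `simulate` are hypothetical names, and the shared counter stands in for state held in Redis.

```go
package main

import "fmt"

// Counter is a hypothetical fixed-window limiter covering a single window.
// It is not safe for concurrent use; the simulation is single-threaded.
type Counter struct {
	limit int
	used  int
}

// Allow admits a request while the window still has budget.
func (c *Counter) Allow() bool {
	if c.used >= c.limit {
		return false
	}
	c.used++
	return true
}

// simulate spreads requests round-robin over the nodes and reports how many
// are admitted, using either per-node counters or one shared counter.
func simulate(requests, nodes, limit int, shared bool) int {
	central := &Counter{limit: limit}
	local := make([]*Counter, nodes)
	for i := range local {
		local[i] = &Counter{limit: limit}
	}
	allowed := 0
	for i := 0; i < requests; i++ {
		c := local[i%nodes]
		if shared {
			c = central
		}
		if c.Allow() {
			allowed++
		}
	}
	return allowed
}

func main() {
	fmt.Println(simulate(500, 5, 100, false)) // 500: every node admits its 100
	fmt.Println(simulate(500, 5, 100, true))  // 100: the shared limit holds
}
```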

June 9, 2026 · 3 min · Lê Tuấn Anh

Chapter 2: The 3 Caching Vulnerabilities (Penetration, Breakdown, Avalanche) & Go Singleflight

Chapter 2: The 3 Deadliest Cache Vulnerabilities

Caching is the ultimate shield for databases in distributed systems. However, a poorly implemented cache can become the very reason your system crashes. In this chapter, we dissect three classic caching phenomena and how to defend against them using Golang.

1. Cache Penetration

Answer-first: Cache penetration occurs when attackers query non-existent IDs, so every request misses the cache and falls through to the database. Defend against it by caching NULL values or by checking an in-memory Bloom Filter before querying. ...
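The first defense, caching NULL values, can be sketched as follows. The `Cache` type, its `dbHits` counter, and the map standing in for the database are hypothetical; the sketch is single-threaded and omits expiry.

```go
package main

import "fmt"

// cached stores either a value or a remembered "does not exist" answer.
type cached struct {
	value string
	found bool // false marks a cached miss (the NULL value)
}

// Cache sits in front of a map that stands in for the database.
type Cache struct {
	entries map[string]cached
	db      map[string]string
	dbHits  int // number of lookups that reached the database
}

func NewCache(db map[string]string) *Cache {
	return &Cache{entries: make(map[string]cached), db: db}
}

// Get consults the cache first and stores misses as well as hits, so a
// repeated lookup of a non-existent ID reaches the database only once.
func (c *Cache) Get(id string) (string, bool) {
	if e, ok := c.entries[id]; ok {
		return e.value, e.found
	}
	c.dbHits++
	v, found := c.db[id]
	c.entries[id] = cached{value: v, found: found}
	return v, found
}

func main() {
	c := NewCache(map[string]string{"1": "alice"})
	for i := 0; i < 1000; i++ {
		c.Get("does-not-exist") // simulated attack on a missing ID
	}
	fmt.Println(c.dbHits) // 1
}
```

In practice cached misses are usually given a short TTL so that a record created later becomes visible, and so that a flood of distinct bogus IDs cannot fill the cache indefinitely.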

June 9, 2026 · 3 min · Lê Tuấn Anh

Chapter 1: How Systems Handle Millions of Requests/s (C10M)? Lessons from Shopee & Alipay

Chapter 1: Overcoming the C10M Barrier

To build a system capable of handling millions of Requests Per Second (RPS), the scale associated with the C10M problem (ten million concurrent connections), vertical scaling is never enough. It requires a meticulously designed Distributed Architecture.

1. The Shift from C10K to C10M

Answer-first: While C10K was solved by non-blocking I/O (like NGINX), C10M shifts the bottleneck to the OS kernel. Systems must bypass the kernel using DPDK or XDP to handle 10 million connections efficiently. ...

June 9, 2026 · 3 min · Lê Tuấn Anh

The Reality of C10M: Surviving Extreme Traffic — Exec Summary

Despite the massive advancements in cloud computing, enterprise applications facing explosive traffic growth inevitably hit a brutal wall: the Database and the Network layer. The root cause lies not in the hardware, but in the Architecture. We attempt to solve the “Millions of Requests per Second” (C10M) problem by simply throwing more servers at it (Vertical/Horizontal Scaling), only to realize that stateful bottlenecks, cache stampedes, and dual-write inconsistencies bring the entire cluster to its knees. ...

June 9, 2026 · 3 min · Lê Tuấn Anh