Go Security & API Rate Limiting — Token Bucket, Leaky Bucket & Redis Lua

Prerequisite: This is Part 11 of the System Design Masterclass. Previous parts built the core components — this part covers securing APIs and managing client traffic spikes at scale. Answer-first: API rate limiting defends backend services by restricting request volume. Security requires a layered defense: Web Application Firewalls (WAF) block edge-level volumetric spikes, API Gateways manage L7 credentials and quotas, and application middleware enforces fine-grained business limits. Client identification must rely on validated, secure IP parsing (using the PROXY protocol or rightmost X-Forwarded-For checks). ...
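The rightmost X-Forwarded-For check mentioned above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the names `clientIP` and `parsePrefixes`, the trusted range `10.0.0.0/8`, and the sample addresses are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parsePrefixes converts CIDR strings into prefixes. It panics on a bad
// entry, which is acceptable for static configuration loaded at startup.
func parsePrefixes(cidrs ...string) []netip.Prefix {
	out := make([]netip.Prefix, 0, len(cidrs))
	for _, c := range cidrs {
		out = append(out, netip.MustParsePrefix(c))
	}
	return out
}

// clientIP walks X-Forwarded-For from right to left and returns the first
// address outside the trusted proxy ranges. remoteAddr is the TCP peer in
// "ip:port" form. If the peer is not a trusted proxy, the header is
// attacker-controlled and is ignored entirely.
func clientIP(xff, remoteAddr string, trusted []netip.Prefix) (netip.Addr, error) {
	ap, err := netip.ParseAddrPort(remoteAddr)
	if err != nil {
		return netip.Addr{}, err
	}
	peer := ap.Addr()
	isTrusted := func(a netip.Addr) bool {
		for _, p := range trusted {
			if p.Contains(a) {
				return true
			}
		}
		return false
	}
	if !isTrusted(peer) {
		return peer, nil
	}
	parts := strings.Split(xff, ",")
	for i := len(parts) - 1; i >= 0; i-- {
		addr, err := netip.ParseAddr(strings.TrimSpace(parts[i]))
		if err != nil {
			break // malformed entry: trust nothing further to the left
		}
		if !isTrusted(addr) {
			return addr, nil
		}
	}
	return peer, nil
}

func main() {
	trusted := parsePrefixes("10.0.0.0/8")
	// The leftmost value (1.2.3.4) could be spoofed by the client; the
	// rightmost untrusted hop is the one our own proxy actually saw.
	ip, err := clientIP("1.2.3.4, 203.0.113.7, 10.0.0.9", "10.0.0.5:443", trusted)
	fmt.Println(ip, err) // prints: 203.0.113.7 <nil>
}
```

Reading from the right matters because each trusted proxy appends the address it observed, so only the right-hand side of the list was written by infrastructure you control.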

June 18, 2026 · 9 min · Tanh

Idempotent API Design in Go — Idempotency Key & Redis SetNX

Prerequisite: Part 7 of the System Design Masterclass. Read Part 6: Distributed Locks — concurrent duplicate request blocking relies on the same mutual exclusion primitives. Answer-first: API idempotency ensures that retrying an identical request (same Idempotency-Key) never produces additional side effects beyond the first execution. This is foundational for payment APIs where network timeouts force client retries, and a duplicate execution would mean a double charge. What Is an Idempotency Key? Answer-first: An Idempotency Key is a unique token — typically UUID v4 — generated by the client and attached as an Idempotency-Key HTTP header. The server uses this key to detect duplicate requests: if the key has been seen before, return the cached response from the first execution without re-executing the handler. ...
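As a rough sketch of the duplicate-detection flow described above: the mutex-guarded map below is a hypothetical in-memory stand-in for Redis, and the names `store`, `do`, and the sample key are assumptions for illustration. A real deployment would reserve the key with an atomic Redis SET with the NX option and a TTL.

```go
package main

import (
	"fmt"
	"sync"
)

// store is an in-memory stand-in for Redis (assumption for this sketch).
type store struct {
	mu    sync.Mutex
	saved map[string]string // Idempotency-Key -> cached response
}

func newStore() *store { return &store{saved: make(map[string]string)} }

// do runs handler at most once per key and replays the cached response on
// every later call. executed reports whether handler ran on this call.
// Holding the lock while the handler runs serializes all keys, which keeps
// the sketch short but would be too coarse for production traffic.
func (s *store) do(key string, handler func() string) (resp string, executed bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if cached, ok := s.saved[key]; ok {
		return cached, false
	}
	resp = handler()
	s.saved[key] = resp
	return resp, true
}

func main() {
	s := newStore()
	charges := 0
	pay := func() string {
		charges++ // the side effect that must not repeat
		return "201 Created"
	}
	// The client retries three times with the same (hypothetical) key.
	for i := 0; i < 3; i++ {
		resp, executed := s.do("3f2b8c1e-0000-4000-8000-000000000001", pay)
		fmt.Println(resp, executed)
	}
	fmt.Println("charges:", charges) // prints: charges: 1
}
```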

June 18, 2026 · 8 min · Tanh

Distributed Locks in Go — Redlock Math, etcd & Split-Brain

Prerequisite: Part 6 of the System Design Masterclass. Read Part 5: Kafka & Event-Driven to understand event sourcing patterns before tackling lock coordination. Answer-first: Distributed locks solve the mutual exclusion problem across independent servers — ensuring only one server can modify a shared resource at a time. Redis Redlock provides high-performance locking using majority quorum across multiple master nodes; etcd provides stronger guarantees via Raft consensus at the cost of higher latency. ...

June 18, 2026 · 8 min · Tanh

Caching Strategies in Go — Cache Stampede, XFetch & Redis LFU

Prerequisite: Part 3 of the System Design Masterclass. Read Part 2: Load Balancing L4/L7 to understand the traffic layer before diving into the caching tier. Answer-first: Effective caching strategy selection hinges on the acceptable consistency window and the read/write access pattern of the workload. Write-Through suits financial records; Write-Behind suits analytics and event counters; Cache-Aside is the default for read-heavy API responses. How Does Cache Stampede Happen? Answer-first: Cache Stampede (thundering herd) occurs when a popular cached key expires and multiple concurrent goroutines simultaneously detect a cache miss — then all query the database simultaneously. The burst of duplicate DB queries can exceed connection pool capacity and cause cascading failure. ...
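One common defense against the stampede described above is request coalescing: the first goroutine to miss loads the value, and concurrent callers for the same key wait for that result. The sketch below is a simplified, standard-library-only take on that idea, not the XFetch algorithm from the title and not the `golang.org/x/sync/singleflight` package. The names `group`, `do`, `waiting`, and `simulate` are assumptions, and the `waiting` poll exists only to make the demo deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight load for a key.
type call struct {
	done    chan struct{} // closed once val is set
	val     string
	waiters int // guarded by group.mu
}

type group struct {
	mu       sync.Mutex
	inflight map[string]*call
}

func newGroup() *group { return &group{inflight: make(map[string]*call)} }

// do runs load once per key at a time; concurrent callers share the result.
func (g *group) do(key string, load func() string) string {
	g.mu.Lock()
	if c, ok := g.inflight[key]; ok {
		c.waiters++
		g.mu.Unlock()
		<-c.done // wait for the leader instead of hitting the DB
		return c.val
	}
	c := &call{done: make(chan struct{})}
	g.inflight[key] = c
	g.mu.Unlock()

	c.val = load()

	g.mu.Lock()
	delete(g.inflight, key)
	g.mu.Unlock()
	close(c.done)
	return c.val
}

// waiting reports how many callers are blocked on key (demo helper).
func (g *group) waiting(key string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.inflight[key]; ok {
		return c.waiters
	}
	return 0
}

// simulate fires n concurrent cache misses on one key and returns how many
// times the (fake) database was queried.
func simulate(n int) int64 {
	g := newGroup()
	var dbHits atomic.Int64
	load := func() string {
		dbHits.Add(1)
		// Demo only: hold the "query" open until every other caller has
		// joined, so the outcome does not depend on scheduler timing.
		for g.waiting("user:42") < n-1 {
			time.Sleep(time.Millisecond)
		}
		return "row"
	}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.do("user:42", load)
		}()
	}
	wg.Wait()
	return dbHits.Load()
}

func main() {
	fmt.Println("db hits:", simulate(10)) // prints: db hits: 1
}
```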

June 18, 2026 · 9 min · Tanh

Part 6: Location Clustering with Uber H3 & Redis Semantic Caching

Caching on exact GPS coordinates does not work. Raw coordinates carry so many decimal places that two users standing about 10 meters apart will report different values (106.0001 vs 106.0002). If your Redis key is simply lat1,lng1:lat2,lng2, your cache hit rate will stay close to 0%. Answer-first: To survive massive scale, you must implement Semantic Caching. Instead of caching raw coordinates, use Uber H3 to “snap” coordinates into roughly 100-meter hexagonal buckets. Your cache key becomes route:{h3_origin}:{h3_dest}. This turns a compute-heavy routing problem into a fast Redis memory lookup. ...

June 15, 2026 · 4 min · Lê Tuấn Anh

Part 3: Spatial Indexing (Uber H3, PostGIS & Redis GEO)

A fatal mistake made by junior engineers building ride-hailing apps is connecting their API Gateway directly to the Routing Engine. Answer-first: GraphHopper is extremely CPU-intensive. If you ask it to calculate the ETA to all 10,000 drivers currently online in a city, your servers will melt. You must introduce Spatial Indexing (like Uber H3 or Redis GEO) as a high-speed “pre-filter”. The index quickly finds the 50 closest drivers “as the crow flies” using RAM, and only those 50 are sent to GraphHopper for heavy ETA calculations. ...

June 14, 2026 · 5 min · Lê Tuấn Anh

Chapter 8: Distributed Locking for Race Conditions: Redlock vs ZooKeeper

Chapter 8: Synchronizing Clusters with Distributed Locks In a standalone Go application, preventing two goroutines from overwriting the same data (a race condition) is achieved via sync.Mutex. However, when your system scales out to 10 servers behind a Load Balancer, sync.Mutex is useless because it only locks memory local to one process. You need a Distributed Lock. 1. Basic Redis Locks Answer-first: A basic Redis lock uses SET resource id NX PX ttl. It works for simple caching scenarios, but the Redis master is a single point of failure: if it crashes before replicating the lock, the lock is lost. ...

June 9, 2026 · 4 min · Lê Tuấn Anh

Chapter 7: Designing Idempotency APIs for Payment Systems

Chapter 7: Fortifying Payment Systems with Idempotent APIs In E-commerce or Fintech, the ultimate nightmare is not a system crash, but charging a customer twice for a single order. This is usually caused by network lag, an impatient user double-clicking “Pay”, or automated app retry logic. The mandatory solution for any transactional API (Payment/Order) is Idempotency. 1. What is Idempotency? Answer-first: An operation is idempotent if executing it once or N times yields the exact same system state and outcome. While GET and PUT are idempotent by definition in HTTP, POST requires explicit engineering. ...

June 9, 2026 · 4 min · Lê Tuấn Anh

Chapter 3: Distributed Rate Limiting with Redis & GCRA Algorithm

Chapter 3: Securing APIs with Distributed Rate Limiting If caching is the shield protecting your database, Rate Limiting is the armor guarding your API servers from DDoS attacks and resource exhaustion caused by abusive clients. Why Local Rate Limiting Fails in Microservices Answer-first: Local RAM limiters fail because Load Balancers distribute traffic across multiple nodes. A user allowed 100 req/sec can exploit a 5-node cluster by sending 500 req/sec, bypassing the intended limit. Centralized state via Redis is required. ...
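The 5-node arithmetic above can be checked with a small simulation. This is a sketch under stated assumptions: it uses a plain fixed-window counter rather than GCRA, a single shared counter stands in for Redis, and `counter` and `admitted` are hypothetical names.

```go
package main

import "fmt"

// counter is a fixed-window limiter for one window (not GCRA).
type counter struct{ limit, used int }

func (c *counter) allow() bool {
	if c.used >= c.limit {
		return false
	}
	c.used++
	return true
}

// admitted sends `requests` from one user through a round-robin load
// balancer within a single window and returns how many were accepted.
// With shared=false each node keeps its own counter; with shared=true all
// nodes consult one central counter (the role Redis plays in the text).
func admitted(requests, nodes, limit int, shared bool) int {
	central := &counter{limit: limit}
	local := make([]*counter, nodes)
	for i := range local {
		local[i] = &counter{limit: limit}
	}
	accepted := 0
	for r := 0; r < requests; r++ {
		c := local[r%nodes] // round-robin routing
		if shared {
			c = central
		}
		if c.allow() {
			accepted++
		}
	}
	return accepted
}

func main() {
	fmt.Println("local limiters: ", admitted(500, 5, 100, false)) // prints: local limiters:  500
	fmt.Println("central limiter:", admitted(500, 5, 100, true))  // prints: central limiter: 100
}
```

Each node sees only 100 of the 500 requests, so no local counter ever trips; the central counter sees all 500 and stops at the intended 100.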

June 9, 2026 · 3 min · Lê Tuấn Anh

Chapter 2: The 3 Caching Vulnerabilities (Penetration, Breakdown, Avalanche) & Go Singleflight

Chapter 2: The 3 Deadliest Cache Vulnerabilities Caching is the ultimate shield for databases in distributed systems. However, poorly implemented caches can become the exact reason your system crashes. In this chapter, we dissect three classic caching phenomena and how to defend against them using Golang. 1. Cache Penetration Answer-first: Cache penetration occurs when attackers query non-existent IDs, bypassing the cache entirely. Defend against it by caching NULL values or utilizing Bloom Filters at the memory level. ...
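The NULL-caching defense above can be sketched as follows. This is a minimal single-goroutine illustration: the maps are hypothetical stand-ins for the cache and the database, the names `entry`, `cache`, and `get` are assumptions, and a real cache would give negative entries a short TTL so that newly created IDs become visible.

```go
package main

import "fmt"

// entry records a lookup result; found=false is a cached "does not exist".
type entry struct {
	value string
	found bool
}

// cache is not safe for concurrent use; it only illustrates the idea.
type cache struct {
	entries   map[string]entry
	db        map[string]string // stand-in for the real database
	dbQueries int               // how often we fell through to the database
}

func newCache(db map[string]string) *cache {
	return &cache{entries: make(map[string]entry), db: db}
}

// get answers from the cache when possible, including negative answers, so
// repeated lookups of a non-existent ID stop reaching the database.
func (c *cache) get(id string) (string, bool) {
	if e, ok := c.entries[id]; ok {
		return e.value, e.found
	}
	c.dbQueries++
	v, found := c.db[id]
	c.entries[id] = entry{value: v, found: found}
	return v, found
}

func main() {
	c := newCache(map[string]string{"1": "alice"})
	for i := 0; i < 3; i++ {
		c.get("999") // an attacker probing an ID that does not exist
	}
	fmt.Println("db queries:", c.dbQueries) // prints: db queries: 1
}
```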

June 9, 2026 · 3 min · Lê Tuấn Anh