Rate Limiting

Go API Rate Limiting: Token Bucket & Redis Lua

Prerequisite: This is Part 11 of the System Design Masterclass. Previous parts built the core components — this part covers securing APIs and managing client traffic spikes at scale. Answer-first: API rate limiting defends backend services by restricting request volume. Security requires a layered defense: Web Application Firewalls (WAF) block edge-level volumetric spikes, API Gateways manage L7 credentials and quotas, and application middleware enforces fine-grained business limits. Client identification must rely on validated, secure IP parsing (using the PROXY protocol or rightmost X-Forwarded-For checks). ...

Load Balancing L4/L7 in Go — DSR, Rate Limiting & API Gateway

Prerequisite: Part 2 of the System Design Masterclass. Read Part 1: System Design Thinking first to understand foundational trade-off frameworks. Answer-first: L4 load balancing routes traffic by transport-layer (IP/TCP/UDP) metadata — minimal CPU overhead but limited intelligence. L7 load balancing inspects HTTP headers, paths, and cookies — enables content-based routing and advanced health checks at the cost of higher processing overhead per request. L4 vs L7 Load Balancing — The Definitive Comparison Answer-first: The fundamental difference is where in the network stack the routing decision is made. L4 (Transport Layer) routes at TCP/UDP level using IP+port tuples. L7 (Application Layer) routes at HTTP level using headers, URLs, and payloads. ...

Chapter 3: Distributed Rate Limiting with Redis & GCRA Algorithm

← Previous | Series hub | Next → Chapter 3: Securing APIs with Distributed Rate Limiting If caching is the shield protecting your database, Rate Limiting is the armor guarding your API servers from DDoS attacks and resource exhaustion caused by abusive clients. Why Local Rate Limiting Fails in Microservices Answer-first: Local RAM limiters fail because Load Balancers distribute traffic across multiple nodes. A user allowed 100 req/sec can exploit a 5-node cluster by sending 500 req/sec, bypassing the intended limit. Centralized state via Redis is required. ...

Shopee Flash Sale Architecture: Rate Limiting & Redis

Answer-first: Architecting flash sale systems requires preventing database overload using multi-stage rate limiting and Redis-based inventory pre-decisions. Traffic shields block duplicate requests, and asynchronous checkout queues decouple order submission from payment processing. What You’ll Learn That AI Won’t Tell You Multi-stage cache invalidation strategies that keep checkout fast. Handling concurrent Redis connections during million-user flash sales. At exactly midnight on 11.11, Shopee users across Southeast Asia and Taiwan simultaneously tap the same button. In the first 10 seconds of a flash sale, a single product page can receive requests from millions of concurrent sessions — all competing to purchase the same 1,000 units of inventory. One oversell, one server crash, or one database deadlock during that window results in a cascade of chargebacks, angry users, and front-page news headlines. ...