Part 7: Load Testing and Performance Tuning for Production

Load testing is the final boss of System Design. A junior engineer runs a script, sees “20,000 RPS” with 0 errors, and assumes the system is ready. A Principal Engineer knows that unless you tune the Linux kernel, correct for Coordinated Omission, and simulate realistic chaos, that number is misleading. Answer-first: Load testing a routing engine is not just about testing your Go code. It is a stress test of the Linux kernel network stack (sockets, TCP reuse, SOMAXCONN), the Go runtime scheduler, and the memory footprint of your load testing tool itself. ...

June 15, 2026 · 4 min · Lê Tuấn Anh

Part 6: Location Clustering with Uber H3 & Redis Semantic Caching

Caching on exact GPS coordinates does not work. Because coordinates are high-precision floating-point values, two users standing roughly 10 meters apart will have different coordinates (106.0001 vs 106.0002). If your Redis key is simply lat1,lng1:lat2,lng2, your Cache Hit Rate will stay close to 0%. Answer-first: To survive massive scale, you must implement Semantic Caching. Instead of caching raw coordinates, use Uber H3 to “snap” coordinates into roughly 100-meter hexagonal buckets. Your cache key becomes route:{h3_origin}:{h3_dest}. This turns a compute-heavy routing problem into a fast Redis lookup whenever both endpoints fall in cells that are already cached. ...

June 15, 2026 · 4 min · Lê Tuấn Anh
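The cache-key idea in the Part 6 summary can be sketched without any external dependency. The example below does not call the H3 library; it uses a plain square grid of 0.001 degrees (roughly 111 m of latitude) as a stand-in for an H3 cell, purely to show how nearby coordinates collapse into one key. `cellSizeDeg`, `snap`, `cellID`, and `routeKey` are hypothetical names, and the coordinates are made up.

```go
package main

import (
	"fmt"
	"math"
)

// cellSizeDeg is an assumed bucket size. A real system would use an
// H3 cell index at a chosen resolution instead of a square grid.
const cellSizeDeg = 0.001

// snap maps one coordinate component to the integer index of its bucket.
func snap(v float64) int64 {
	return int64(math.Floor(v / cellSizeDeg))
}

// cellID is a stand-in for an H3 cell ID: every point inside the same
// square bucket yields the same string.
func cellID(lat, lng float64) string {
	return fmt.Sprintf("%d_%d", snap(lat), snap(lng))
}

// routeKey builds a cache key in the route:{origin}:{dest} shape.
func routeKey(oLat, oLng, dLat, dLng float64) string {
	return fmt.Sprintf("route:%s:%s", cellID(oLat, oLng), cellID(dLat, dLng))
}

func main() {
	a := routeKey(10.7701, 106.0001, 10.8012, 106.7003)
	b := routeKey(10.7702, 106.0002, 10.8013, 106.7004)
	fmt.Println(a)
	fmt.Println("same key for nearby riders:", a == b)
}
```

One trade-off of any bucketing scheme, this grid included, is that two points on opposite sides of a cell boundary still get different keys even when they are very close together.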

Part 4: Golang API & Microservices Integration (Kratos & Dapr)

Building a simple API that calls Graphhopper via http.Get is easy. Building a Principal-level API Gateway that survives 10,000 concurrent riders requesting routes without crashing is a masterclass in Distributed Systems. Answer-first: Graphhopper is a heavily CPU-bound downstream service. If your Golang API blindly accepts traffic and forwards it, a slight slowdown in Graphhopper will cause your Goroutines to pile up, exhausting your server’s RAM and triggering a cascading failure. You must implement a “Defense in Depth” strategy using Concurrency Bounding, Circuit Breakers, and Asynchronous Pub/Sub. ...

June 14, 2026 · 4 min · Lê Tuấn Anh

Part 3: Spatial Indexing (Uber H3, PostGIS & Redis GEO)

A fatal mistake made by junior engineers building ride-hailing apps is connecting their API Gateway directly to the Routing Engine. Answer-first: Graphhopper is extremely CPU-intensive. If you ask it to calculate the ETA to all 10,000 drivers currently online in a city, your servers will melt. You must introduce Spatial Indexing (like Uber H3 or Redis GEO) as a high-speed “Pre-filter”. The index quickly finds the 50 closest drivers “as the crow flies” using RAM, and only those 50 are sent to Graphhopper for heavy ETA calculations. ...

June 14, 2026 · 5 min · Lê Tuấn Anh
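The pre-filter step from the Part 3 summary can be sketched as follows. For brevity, a brute-force scan over straight-line (haversine) distance stands in for the spatial index; a real deployment would ask H3 or Redis GEO for the candidates instead. `Driver`, `haversineKm`, and `nearest` are hypothetical names and the coordinates are invented.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Driver is a minimal driver record for the example.
type Driver struct {
	ID       string
	Lat, Lng float64
}

// haversineKm returns the great-circle distance in kilometers.
func haversineKm(lat1, lng1, lat2, lng2 float64) float64 {
	const earthRadiusKm = 6371.0
	rad := func(d float64) float64 { return d * math.Pi / 180 }
	dLat, dLng := rad(lat2-lat1), rad(lng2-lng1)
	a := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(rad(lat1))*math.Cos(rad(lat2))*math.Sin(dLng/2)*math.Sin(dLng/2)
	return 2 * earthRadiusKm * math.Asin(math.Sqrt(a))
}

// nearest returns up to k drivers closest to the rider "as the crow flies".
// Only this short list would be sent on to the routing engine for ETAs.
func nearest(drivers []Driver, lat, lng float64, k int) []Driver {
	if k < 0 {
		k = 0
	}
	out := append([]Driver(nil), drivers...) // copy so the input is not reordered
	sort.Slice(out, func(i, j int) bool {
		return haversineKm(lat, lng, out[i].Lat, out[i].Lng) <
			haversineKm(lat, lng, out[j].Lat, out[j].Lng)
	})
	if k < len(out) {
		out = out[:k]
	}
	return out
}

func main() {
	drivers := []Driver{
		{"far", 10.90, 106.80},
		{"near", 10.771, 106.701},
		{"mid", 10.80, 106.72},
	}
	for _, d := range nearest(drivers, 10.77, 106.70, 2) {
		fmt.Println(d.ID)
	}
}
```

The scan is linear in the number of drivers, which is exactly the cost a spatial index avoids; the sketch only shows the contract of the pre-filter, not its efficient implementation.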

Tech Radar (14/06/2026): Kratos & Dapr State Management

Welcome back to the Tech Radar bulletin. In modern Microservices architecture, maintaining a system capable of communicating flexibly both externally (HTTP) and internally (gRPC) is an essential requirement. Simultaneously, State Management in distributed environments demands rigorous solutions to prevent data collisions. Today, we will dissect how to combine Go’s highly acclaimed Kratos framework with Dapr v1.15 to comprehensively solve this problem.

1. Kratos Dual-Protocol: HTTP & gRPC Running in Parallel

Answer-first: The Kratos framework integrates with Dapr v1.15 State Management via the sidecar pattern, allowing HTTP and gRPC servers to run concurrently. To avoid state collisions when running dual-protocol, the system uses Dapr ETags via SaveStateWithETag for Optimistic Concurrency Control, and uses Middleware for Metadata synchronization. ...

June 14, 2026 · 4 min · Lê Tuấn Anh
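The ETag-based Optimistic Concurrency Control mentioned in the Kratos and Dapr summary follows a compare-then-write rule that can be shown with an in-memory map. This sketch deliberately does not use the Dapr SDK; `Store`, `Get`, `SaveWithETag`, and `ErrETagMismatch` are hypothetical names that only mimic the behavior of a versioned state store.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"sync"
)

// ErrETagMismatch signals that the caller's view of the key is stale.
var ErrETagMismatch = errors.New("etag mismatch")

type entry struct {
	value string
	ver   uint64
}

// Store is an in-memory stand-in for a versioned state store.
type Store struct {
	mu   sync.Mutex
	data map[string]entry
}

func NewStore() *Store { return &Store{data: map[string]entry{}} }

// Get returns the value and its current ETag; ok is false if the key is absent.
func (s *Store) Get(key string) (value, etag string, ok bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.data[key]
	if !ok {
		return "", "", false
	}
	return e.value, strconv.FormatUint(e.ver, 10), true
}

// SaveWithETag writes only if etag matches the stored version.
// An empty etag means "create only if the key does not exist yet".
func (s *Store) SaveWithETag(key, value, etag string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.data[key]
	current := ""
	if ok {
		current = strconv.FormatUint(e.ver, 10)
	}
	if etag != current {
		return ErrETagMismatch
	}
	s.data[key] = entry{value: value, ver: e.ver + 1}
	return nil
}

func main() {
	s := NewStore()
	_ = s.SaveWithETag("order:1", "created", "")

	// An HTTP handler and a gRPC handler both read the same version.
	_, httpTag, _ := s.Get("order:1")
	_, grpcTag, _ := s.Get("order:1")

	fmt.Println("http write:", s.SaveWithETag("order:1", "paid", httpTag))
	fmt.Println("grpc write:", s.SaveWithETag("order:1", "cancelled", grpcTag))
}
```

The second writer receives a mismatch error and has to re-read the key and retry, which is how a stale update is prevented from silently overwriting a newer one.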

Tech Radar (13/06/2026): Go 1.26 GC, K8s Pod Resizing & AI-Native

Welcome back to the Tech Radar bulletin, where we filter out the noise of the tech industry to uncover the genuine trends shaping future System Architecture. The second week of June 2026 witnessed three massive shifts, from core infrastructure (Go, Kubernetes) to the maturation of AI-Native architecture. From the perspective of a System Architect, these are updates you cannot ignore to optimize your High-Concurrency systems.

1. Golang 1.26: “Green Tea” GC Architecture - The Savior for RAM-Hungry Microservices

Enabled by default in Go 1.26, the Garbage Collector codenamed “Green Tea” is not just a performance patch; it is a core architectural overhaul. ...

June 13, 2026 · 4 min · Lê Tuấn Anh

Chapter 1: How Systems Handle Millions of Requests/s (C10M)? Lessons from Shopee & Alipay

Chapter 1: Overcoming the C10M Barrier

To build a system capable of handling millions of Requests Per Second (RPS), a challenge closely tied to the C10M problem of serving 10 million concurrent connections, vertical scaling is never enough. It requires a meticulously designed Distributed Architecture.

1. The Shift from C10K to C10M

Answer-first: While C10K was solved by event-driven, non-blocking I/O (as in NGINX), C10M shifts the bottleneck to the OS kernel network stack. Systems must bypass or short-circuit that stack using DPDK or XDP to handle 10 million connections efficiently. ...

June 9, 2026 · 3 min · Lê Tuấn Anh

The Reality of C10M: Surviving Extreme Traffic — Exec Summary

Despite the massive advancements in cloud computing, enterprise applications facing explosive traffic growth inevitably hit a brutal wall: the Database and the Network layer. The root cause lies not in the hardware, but in the Architecture. We attempt to solve the “Millions of Requests per Second” (C10M) problem by simply throwing bigger or more servers at it (vertical or horizontal scaling), only to realize that stateful bottlenecks, cache stampedes, and dual-write inconsistencies bring the entire cluster to its knees. ...

June 9, 2026 · 3 min · Lê Tuấn Anh

Real-Time Inventory Synchronization: Kafka, CDC & Redis for E-commerce

What Is Real-Time Inventory Synchronization?

Real-time inventory synchronization is the process of propagating stock count changes from the system of record (database) to all sales channels (web storefront, mobile app, WMS, ERP) in under a second. Instead of batch ETL jobs that run every hour, a CDC + Kafka pipeline streams every committed stock change as an event, cutting overselling and stale stock displays. Handling this during a flash sale, where thousands of users attempt to purchase a highly contested SKU simultaneously, is one of the hardest architectural challenges. Traditional synchronous database updates collapse under lock contention. ...

June 8, 2026 · 6 min · Lê Tuấn Anh

Real-Time Ride-Hailing Architecture: Uber & Grab Stack

Answer-first: How Uber and Grab handle millions of GPS pings/sec: H3 geospatial indexing, Kafka, DISCO matching engine, surge pricing, and RAMEN push notifications. The moment you open the Uber or Grab app, a cascade of real-time systems activates simultaneously: your phone begins transmitting GPS coordinates, a geospatial index updates your location, a matching engine re-evaluates nearby driver availability, a pricing model recalculates the fare based on supply-demand ratios, and a push notification pipeline prepares to deliver your match confirmation in under 3 seconds. ...

June 1, 2026 · 13 min · Lê Tuấn Anh