Modern logistics and delivery systems rely heavily on one core capability: calculating distances and travel times (a Distance Matrix) quickly and accurately.

How does Grab dispatch millions of drivers every second? How does ShopeeXpress optimize delivery routes for tens of thousands of couriers simultaneously? The secret lies in the Routing Engine and the Geospatial Indexing architecture behind it.

In this 8-part series, we will dive deep into building a complete Distance Matrix API and Routing Engine using Golang, integrated with Graphhopper, and accelerated by Redis and Uber’s H3 Indexing. This series is designed to be highly visual, starting from scratch (understanding algorithms visually) all the way to large-scale load testing architecture.

🗺️ Series Contents (8 Parts)


Q&A: Frequently Asked Questions

Is this series suitable for beginners?

Absolutely. The series is designed with a “Foundation First” philosophy. Parts 1 and 2 thoroughly explain concepts through visuals and provide step-by-step environment setup instructions (downloading OSM map data, running Docker) so anyone can follow along.

Why combine Golang and Graphhopper?

Golang provides excellent concurrency and a small footprint, making it ideal as an API Gateway. Meanwhile, Graphhopper (written in Java) is an incredibly powerful routing engine. This combination brings out the best of both worlds: Golang handles I/O and Caching, while Graphhopper handles deep algorithmic computations.

Will the source code of the Demo Repo be shared?

Yes. The entire source code, Docker Compose configuration, sample OpenStreetMap data files, and K6/JMeter test scripts will be publicly available on a companion GitHub repository.

Executive Summary — The Big Picture of Geospatial & Routing Architecture

The Engineering Challenge

Building a modern logistics platform (like food delivery, ride-hailing, or fleet management) requires computing distances and Estimated Times of Arrival (ETA) at an immense scale.

The $N^2$ Problem: If you have 1,000 drivers and 1,000 orders, calculating the distance between every possible combination requires 1,000,000 individual route calculations.

Speed: These calculations must happen in real-time (under 50ms) to ensure seamless user experiences and prevent dispatching algorithms from timing out.

Accuracy: The system must account for real-world constraints such as one-way streets, “no left turn” rules, and dynamic traffic congestion.

Standard point-to-point APIs (like basic Google Maps API calls) are too slow and too expensive for massive Distance Matrix generation. You need an internal, highly optimized Routing Engine. ...

June 14, 2026 · 3 min · Lê Tuấn Anh

Part 8: Zero-Downtime Map Updates & Multi-Region Kubernetes

Writing a fast algorithm is only half the battle. The true test of a Principal Engineer is deploying a massive, stateful Routing Engine to the Cloud without causing a single second of downtime during map updates or infrastructure failures. Answer-first: You cannot treat Graphhopper like a stateless web server. Updating the OpenStreetMap data takes 30 minutes of heavy computation. You MUST decouple the map build process using Kubernetes Jobs, inject the pre-computed 50GB cache via initContainers, and switch traffic instantly using Blue-Green Deployments. ...

June 15, 2026 · 5 min · Lê Tuấn Anh

Part 7: Load Testing and Performance Tuning for Production

Load testing is the final boss of System Design. A junior engineer runs a script, sees “20,000 RPS” with 0 errors, and assumes the system is ready. A Principal Engineer knows that unless you tune the Linux Kernel, correct for Coordinated Omission, and simulate realistic chaos, that number is a complete lie. Answer-first: Load testing a routing engine is not just about testing your Go code. It is a brutal stress test of the Linux Kernel network stack (sockets, TCP reuse, SOMAXCONN), the Go runtime scheduler, and the memory footprint of your load testing tool itself. ...

June 15, 2026 · 4 min · Lê Tuấn Anh

Part 6: Location Clustering with Uber H3 & Redis Semantic Caching

Caching on exact GPS coordinates is futile. Because coordinates are high-precision floating-point values, two users standing only about 10 meters apart will have different coordinates (106.0001 vs 106.0002). If your Redis key is simply lat1,lng1:lat2,lng2, your Cache Hit Rate will stay close to 0%. Answer-first: To survive massive scale, you must implement Semantic Caching. Instead of caching raw coordinates, use Uber H3 to “snap” coordinates into roughly 100-meter hexagonal buckets. Your cache key becomes route:{h3_origin}:{h3_dest}. For any request that falls into an already-cached pair of cells, this turns a compute-heavy routing problem into a fast Redis memory lookup. ...

June 15, 2026 · 4 min · Lê Tuấn Anh
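
To make the Part 6 cache-key idea concrete, here is a minimal sketch in Go. It is not the series' implementation and it does not call H3: a plain square-grid quantization (the hypothetical `cellID` helper, with an assumed bucket size of 0.001 degrees, roughly 100 meters) stands in for an H3 cell index so the example runs with the standard library only. The point is just that two nearby requests collapse into the same route:{origin}:{dest} key.

```go
package main

import (
	"fmt"
	"math"
)

// cellSizeDeg is an assumed bucket size: 0.001 degrees is roughly 100 meters.
// A real system would use an H3 cell index instead of this square grid.
const cellSizeDeg = 0.001

// snap maps one coordinate component to the integer index of its grid bucket.
func snap(v float64) int64 {
	return int64(math.Floor(v / cellSizeDeg))
}

// cellID is a hypothetical stand-in for an H3 cell: nearby points share one ID.
func cellID(lat, lng float64) string {
	return fmt.Sprintf("%d_%d", snap(lat), snap(lng))
}

// routeKey builds a cache key in the route:{origin}:{dest} shape.
func routeKey(oLat, oLng, dLat, dLng float64) string {
	return fmt.Sprintf("route:%s:%s", cellID(oLat, oLng), cellID(dLat, dLng))
}

func main() {
	// Two requests whose endpoints differ by only a few meters.
	a := routeKey(10.77621, 106.70011, 10.80123, 106.65034)
	b := routeKey(10.77629, 106.70018, 10.80128, 106.65039)
	fmt.Println(a)
	fmt.Println("same cache key:", a == b)
}
```

The trade-off is accuracy: every request inside a bucket receives the same cached answer, so the bucket size bounds the error in the returned distance and ETA.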

Part 5: Route Visualization UI with Mapbox & Deck.gl

Rendering a single route on Google Maps is trivial. Rendering 100,000 historical vehicle routes, Origin-Destination matrices, and dynamic H3 geofences simultaneously? That requires offloading computation from the browser’s CPU to the GPU using WebGL. Answer-first: Do not use native Mapbox GL JS to render massive, dynamic datasets. Modifying the DOM or standard Mapbox sources with thousands of updates per second will freeze the browser. The industry standard is to use deck.gl paired with MapboxOverlay. This allows Deck.gl to render raw data directly onto the GPU while perfectly synchronizing with Mapbox’s camera. ...

June 14, 2026 · 3 min · Lê Tuấn Anh

Part 4: Golang API & Microservices Integration (Kratos & Dapr)

Building a simple API that calls Graphhopper via http.Get is easy. Building a Principal-level API Gateway that survives 10,000 concurrent riders requesting routes without crashing is a masterclass in Distributed Systems. Answer-first: Graphhopper is a heavily CPU-bound downstream service. If your Golang API blindly accepts traffic and forwards it, a slight slowdown in Graphhopper will cause your Goroutines to pile up, exhausting your server’s RAM and triggering a cascading failure. You must implement a “Defense in Depth” strategy using Concurrency Bounding, Circuit Breakers, and Asynchronous Pub/Sub. ...

June 14, 2026 · 4 min · Lê Tuấn Anh

Part 3: Spatial Indexing (Uber H3, PostGIS & Redis GEO)

A fatal mistake made by junior engineers building ride-hailing apps is connecting their API Gateway directly to the Routing Engine. Answer-first: Graphhopper is extremely CPU-intensive. If you ask it to calculate the ETA to all 10,000 drivers currently online in a city, your servers will melt. You must introduce Spatial Indexing (like Uber H3 or Redis GEO) as a high-speed “Pre-filter”. The index quickly finds the 50 closest drivers “as the crow flies” using RAM, and only those 50 are sent to Graphhopper for heavy ETA calculations. ...

June 14, 2026 · 5 min · Lê Tuấn Anh

Part 2: Zero to Hero Environment Setup (Docker, OSM, Golang)

Setting up a local routing engine is notoriously difficult. Most generic tutorials offer a basic Docker command that crashes silently, leaving developers confused. In this guide, we bypass the basic “Hello World” setups. We will build a production-grade local environment integrating OpenStreetMap (OSM) data, a properly tuned Graphhopper (Java) Docker container, and a high-concurrency Golang API Gateway.

1. Downloading and Cropping Map Data

Answer-first: Download raw OpenStreetMap data in .osm.pbf format from the Geofabrik server. To save gigabytes of RAM during local development, use osmium extract to crop the massive country-level map down to a single city bounding box. ...

June 14, 2026 · 5 min · Lê Tuấn Anh

Part 1: Core Algorithms (A*, Dijkstra) Visualized - Routing Architecture Masterclass

When building a high-scale logistics or delivery system, generic algorithm tutorials often lead developers astray. They tell you that A* is universally better than Dijkstra. However, in the real world of Routing Engines and Distance Matrices, the truth is much more complex. In this first part of our masterclass, we will move beyond academic theory. We will visualize the exact lifecycle of a routing request—from snapping a GPS coordinate to the road, to bypassing traffic, and finally calculating routes in milliseconds using Contraction Hierarchies. ...

June 14, 2026 · 6 min · Lê Tuấn Anh