Caching an exact GPS coordinate is impossible. Because floating-point numbers are infinitely precise, two users standing 1 meter apart will have completely different coordinates (106.0001 vs 106.0002). If your Redis key is simply lat1,lng1:lat2,lng2, your Cache Hit Rate will forever remain at 0%.
Answer-first: To survive massive scale, you must implement Semantic Caching. Instead of caching raw coordinates, use Uber H3 to “snap” coordinates into 100-meter hexagonal buckets. Your cache key becomes route:{h3_origin}:{h3_dest}. This instantly transforms a compute-heavy routing problem into a lightning-fast Redis memory lookup.
1. The Anatomy of a Semantic Cache
By generating H3 indexes for the origin and destination, we create a deterministic string. If User A and User B are standing in the same parking lot and want to go to the same airport, they generate the exact same H3 Hexagon IDs.
Escaping the Distance Matrix Latency Trap
When generating a 10x10 Distance Matrix, your API needs to check 100 cache keys. If you use a simple for loop executing GET 100 times, you will incur 100 network roundtrips. Even if the cache is hit, the network latency will destroy performance.
The Fix: You MUST use Redis MGET (Multi-Get) or TCP Pipelining to fetch all 100 keys in a single network trip. This reduces latency from 100ms down to 2ms.
2. Dealing with the Dark Side of Redis
Redis is incredibly fast, but if mismanaged under massive load, it will cause catastrophic system failures.
The Cache Stampede (Thundering Herd)
Imagine your most popular cache key (e.g., from the Airport to Downtown) expires at exactly 5:00 PM during rush hour. Suddenly, 5,000 concurrent requests miss the cache and hit your Graphhopper engine simultaneously. Your server crashes instantly.
The Fix (XFetch): Do not use standard SETEX. Implement the XFetch (Probabilistic Early Expiration) algorithm. As the TTL approaches 0, XFetch mathematically forces exactly one random request to recompute the route in the background, while the other 4,999 requests safely consume the old cache.
The “Hot Key” Shard Melt
A massive concert ends. 100,000 users check the route home from the exact same H3 hexagon. In a Redis Cluster, this creates a “Hot Key.” All 100,000 requests map to a single hash slot, crashing one Redis node while the rest of the cluster sits idle.
The Fix (L1 Caching): You cannot scale out of a Hot Key. Your Golang API Gateway must implement an In-Memory L1 Cache (e.g., using ristretto) to absorb the Hot Key traffic before it even touches the network.
How do we verify that our API Gateway, Redis cache, and Graphhopper routing engine will survive under massive production load? Read Part 7: Load Testing and Performance Tuning for Production to simulate a 20,000 RPS load test and tune the Linux kernel.
FAQ: Production Caching Nightmares
Graphhopper just updated the map. How do I delete millions of old cached routes in Redis?
SCAN or KEYS to delete millions of records on production. Redis is single-threaded; SCAN will block the server. Instead, use Key Versioning (e.g., route:map_v2:origin:dest). When the map updates, simply increment the version variable in your API config. Old keys are instantly orphaned and gracefully expire via their TTL.An attacker is sending fake coordinates in the ocean, maxing out my CPU!
My Redis server crashed with OOM, but `used_memory` was only 5GB out of 16GB. Why?
jemalloc allocator to fragment memory. The OS sees Redis using 15GB (used_memory_rss), while Redis only holds 5GB of data. You MUST monitor mem_fragmentation_ratio and enable activedefrag yes to defragment memory without restarting.