This series dives deep into the technical architecture behind the most critical feature of ride-hailing applications: real-time capabilities.

Seeing a car move smoothly on a map might seem simple, but behind it lies a massive distributed network: from battery-optimized GPS transport protocols, map gridding algorithms using hexagons (H3), the Kafka backbone processing millions of events per second, the DISCO system for optimal ride matching, to RAMEN — Uber’s real-time notification push network.

All content is synthesized from the official engineering blogs of Uber, Grab, and Lyft.

Series Contents

Executive Summary — The Big Picture of Real-time Ride-Hailing Systems

The Engineering Challenge

Imagine you are an engineer at Uber or Grab. Your system must:

- Ingest GPS coordinates from millions of drivers every 4 seconds.
- Store and index all these positions in memory so they can be queried in under 10 ms.
- When a user requests a ride, find and rank the best drivers within a few kilometers, calculate the Estimated Time of Arrival (ETA) based on real-time traffic, and push the ride offer to the driver’s phone instantly, all within 2 seconds.
- Simultaneously, continuously calculate dynamic pricing (surge pricing) based on the supply-demand ratio in each area, updating every few seconds.

This is not a typical CRUD application. It is one of the most complex distributed systems in the world. ...

May 6, 2026 · 4 min · Tuấn Anh

Part 1 — Location Ingestion: Collecting Millions of GPS Coordinates Per Second

The Challenge: Millions of Drivers, Every 4 Seconds

Grab has approximately 5 million drivers operating in Southeast Asia. Uber has over 5 million drivers globally. If every driver sends a GPS coordinate every 4 seconds, the system must receive:

    5,000,000 drivers ÷ 4 seconds = 1,250,000 GPS packets / second

That is 1.25 million write operations per second for location data alone, not counting other requests. Traditional HTTP REST cannot handle this load efficiently. ...
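The arithmetic above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the 100-byte payload size is an assumption that does not come from the article.

```python
def location_writes_per_second(active_drivers: int, interval_seconds: float) -> float:
    """Average number of GPS packets arriving per second."""
    return active_drivers / interval_seconds


# Figures from the text: 5 million drivers, one update every 4 seconds.
rate = location_writes_per_second(5_000_000, 4)
print(f"{rate:,.0f} GPS packets / second")

# ASSUMPTION: 100 bytes per packet, chosen only to illustrate the bandwidth math.
ASSUMED_PAYLOAD_BYTES = 100
ingest_mb_per_second = rate * ASSUMED_PAYLOAD_BYTES / 1_000_000
print(f"~{ingest_mb_per_second:,.0f} MB/s of raw location payload")
```

This is an average rate. Real traffic is bursty, so a production system would need headroom above this number.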

May 6, 2026 · 6 min · Tuấn Anh

Part 2 — Geospatial Indexing: H3, S2 Geometry & Redis GEO

The Problem: Finding a Needle in a Haystack

When you tap “Book” on Grab, the system must find the most suitable driver within a radius of a few kilometers. But the system is tracking millions of drivers simultaneously. The naive approach, calculating the distance from you to every driver, does not scale.

The Naive Approach (Brute Force):

    SELECT *
    FROM drivers
    WHERE ST_Distance(driver_location, rider_location) < 2000  -- 2 km
    ORDER BY ST_Distance(driver_location, rider_location);

With 5 million drivers, every request triggers 5 million distance calculations. Latency is measured in seconds, which is unacceptable.

The solution: divide the map into a grid (Spatial Indexing) to narrow the search space from millions of drivers to just a few dozen. ...
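The idea of narrowing the search with a grid can be sketched in Python as follows. To keep the example dependency-free, it uses a plain square latitude/longitude grid rather than H3 hexagons, S2 cells, or Redis GEO. The cell size, class name, and method names are assumptions made for illustration, and the ring calculation is an approximation that is reasonable away from the poles.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000
CELL_DEG = 0.01  # ASSUMPTION: cell size in degrees (about 1.1 km of latitude)


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two lat/lng points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def cell_of(lat, lng):
    """Map a coordinate to its (row, col) grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lng / CELL_DEG))


class GridIndex:
    def __init__(self):
        self._cells = defaultdict(dict)  # cell -> {driver_id: (lat, lng)}
        self._driver_cell = {}           # driver_id -> cell currently holding the driver

    def upsert(self, driver_id, lat, lng):
        """Insert a driver or move it to the cell of its new position."""
        old, new = self._driver_cell.get(driver_id), cell_of(lat, lng)
        if old is not None and old != new:
            del self._cells[old][driver_id]
        self._cells[new][driver_id] = (lat, lng)
        self._driver_cell[driver_id] = new

    def nearby(self, lat, lng, radius_m):
        """Return driver ids within radius_m, nearest first, scanning only nearby cells."""
        cell_lat_m = math.radians(CELL_DEG) * EARTH_RADIUS_M
        cell_lng_m = cell_lat_m * math.cos(math.radians(lat))
        rings = math.ceil(radius_m / min(cell_lat_m, cell_lng_m))
        row, col = cell_of(lat, lng)
        hits = []
        for dr in range(-rings, rings + 1):
            for dc in range(-rings, rings + 1):
                bucket = self._cells.get((row + dr, col + dc), {})
                for driver_id, (dlat, dlng) in bucket.items():
                    distance = haversine_m(lat, lng, dlat, dlng)
                    if distance <= radius_m:
                        hits.append((distance, driver_id))
        return [driver_id for _, driver_id in sorted(hits)]
```

A query only computes exact distances for drivers stored in the handful of cells around the rider, so the cost depends on local driver density instead of the total fleet size.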

May 6, 2026 · 6 min · Tuấn Anh

Part 3 — Event Streaming: The Apache Kafka & Flink Backbone

Why Do We Need Event Streaming?

Millions of events occur every second in a ride-hailing system:

- Driver A updates their GPS coordinates.
- Customer B opens the app and requests a ride.
- Driver C accepts a ride offer and starts moving.
- Customer D cancels a ride.
- Surge pricing updates the multiplier in the Downtown area.

If every service called each other directly (synchronous communication), the system would become tightly coupled and fragile: one slow service would bring down the entire chain. The solution is Event Streaming: every event is pushed into a central “pipeline,” and services independently subscribe to listen to the events they care about. ...
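The decoupling described above can be illustrated with a toy in-memory event log. This is a sketch of the publish/subscribe idea only: it is not Kafka's API, and the class name, topic name, and consumer group names are invented for the example.

```python
from collections import defaultdict


class EventLog:
    """Append-only log per topic; each consumer group tracks its own read offset."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic -> list of events in arrival order
        self._offsets = defaultdict(int)   # (topic, group) -> index of next unread event

    def publish(self, topic, event):
        # The producer only appends; it does not know who will read the event.
        self._topics[topic].append(event)

    def poll(self, topic, group, max_events=100):
        # Each group advances independently, so a slow group never blocks another.
        start = self._offsets[(topic, group)]
        batch = self._topics[topic][start:start + max_events]
        self._offsets[(topic, group)] = start + len(batch)
        return batch


log = EventLog()
log.publish("driver-location", {"driver": "A", "lat": 10.78, "lng": 106.70})
log.publish("driver-location", {"driver": "C", "lat": 10.79, "lng": 106.71})

# Two hypothetical services read the same events without calling each other.
eta_batch = log.poll("driver-location", group="eta-service")
pricing_batch = log.poll("driver-location", group="pricing-service")
```

Because the log keeps the events and each group keeps its own offset, a service that falls behind simply catches up later instead of slowing down the producer.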

May 6, 2026 · 6 min · Tuấn Anh

Dispatch Algorithm & Matching Engine in Ride-Hailing

Every time you tap “Book Ride,” a system makes dozens of decisions in under two seconds: Which driver? What route? What’s the real ETA? This article breaks down how the dispatch algorithm works, from the greedy approach that fails at scale to the bipartite graphs, batched matching, and surge pricing mechanics that power Uber, Lyft, Grab, and Gojek today.

Why a Greedy Dispatch Algorithm Fails (Closest Driver Problem)

The first instinct when designing a matching system is to pair every customer with their nearest driver. However, this greedy approach causes massive losses at a system-wide scale: ...
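A tiny made-up example shows how greedy matching can lose to batched matching. The ETA numbers are invented for illustration, and the batched solver below brute-forces every assignment, which only works for very small batches. A real system would use a proper assignment algorithm. The sketch assumes an equal number of riders and drivers.

```python
from itertools import permutations


def greedy_match(eta):
    """Assign riders in arrival order, each taking the closest still-free driver."""
    taken, pairs = set(), []
    for rider, row in enumerate(eta):
        free = [d for d in range(len(row)) if d not in taken]
        driver = min(free, key=lambda d: row[d])
        taken.add(driver)
        pairs.append((rider, driver))
    return pairs


def batched_match(eta):
    """Try every rider-to-driver assignment and keep the lowest total ETA."""
    riders = range(len(eta))
    best = min(
        permutations(range(len(eta))),
        key=lambda drivers: sum(eta[r][drivers[r]] for r in riders),
    )
    return list(enumerate(best))


def total_eta(eta, pairs):
    return sum(eta[rider][driver] for rider, driver in pairs)


# HYPOTHETICAL data: eta[rider][driver] in minutes.
eta = [
    [2, 3],   # rider 0: driver 0 is slightly closer
    [3, 10],  # rider 1: driver 0 is much closer
]

print(total_eta(eta, greedy_match(eta)))   # greedy gives driver 0 to rider 0
print(total_eta(eta, batched_match(eta)))  # batching gives driver 0 to rider 1
```

In this example rider 0 waits one extra minute under batched matching, and rider 1 saves seven, so the total wait across the batch drops.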

May 6, 2026 · 14 min · Tuấn Anh

Part 5 — Surge Pricing: How Surge Rate Is Calculated in Real-Time Ride-Hailing Systems

What Is Surge Rate? (And How Is It Calculated?)

What is surge rate? Surge rate (also called surge pricing or surge multiplier) is the real-time price multiplier that ride-hailing platforms like Uber and Grab apply when demand for rides exceeds the available supply of drivers in a geographic zone. A surge rate of 2.0x means the rider pays twice the base fare.

How is surge rate calculated? The surge rate is calculated by a pricing engine that evaluates the ratio of incoming ride requests (demand) versus available drivers (supply) in a specific H3 hexagon cell over a rolling time window (typically 5 minutes). The ratio is fed into a lookup table or ML model that outputs the surge multiplier.

Why Is Surge Pricing Necessary?

On New Year’s Eve, during heavy rain, or at rush hour, the demand for rides skyrockets, but the number of available drivers remains unchanged. If prices were kept fixed: ...
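The calculation described above (a demand/supply ratio per cell over a rolling window, fed into a lookup table) might look roughly like this. The thresholds, multipliers, class name, and cell id are hypothetical values chosen for illustration. Real platforms tune these numbers or replace the table with an ML model.

```python
import bisect
from collections import defaultdict, deque

WINDOW_SECONDS = 300  # the 5-minute rolling window mentioned in the text

# HYPOTHETICAL lookup table: ratio thresholds and the multiplier for each band.
RATIO_THRESHOLDS = [1.0, 1.5, 2.0, 3.0]
MULTIPLIERS = [1.0, 1.2, 1.5, 2.0, 2.5]  # one more entry than thresholds


class CellDemandWindow:
    """Counts ride requests per cell within the last WINDOW_SECONDS."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self._window = window_seconds
        self._requests = defaultdict(deque)  # cell_id -> request timestamps, oldest first

    def record_request(self, cell_id, timestamp):
        self._requests[cell_id].append(timestamp)

    def count(self, cell_id, now):
        queue = self._requests[cell_id]
        # Drop requests that have fallen out of the rolling window.
        while queue and queue[0] <= now - self._window:
            queue.popleft()
        return len(queue)


def surge_multiplier(requests, available_drivers):
    """Map the demand/supply ratio to a multiplier via the lookup table."""
    ratio = requests / max(available_drivers, 1)  # avoid division by zero
    # bisect_left means the ratio must strictly exceed a threshold to reach the next band.
    return MULTIPLIERS[bisect.bisect_left(RATIO_THRESHOLDS, ratio)]


window = CellDemandWindow()
for ts in (0, 100, 200):
    window.record_request("cell-A", ts)  # "cell-A" stands in for an H3 cell id

demand = window.count("cell-A", now=250)
print(surge_multiplier(demand, available_drivers=2))
```

With this table, a cell where requests do not exceed available drivers stays at 1.0x, and the multiplier rises in steps as the ratio grows.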

May 6, 2026 · 8 min · Tuấn Anh

Part 6 — RAMEN & Real-time Communication: Pushing Instant Notifications to Millions of Devices

The Problem: Pushing Instant Notifications to Millions of Devices

When DISCO decides to match you with Driver John Doe, the system must:

- Send the ride offer to exactly John Doe’s phone (out of millions of connected phones).
- Deliver it in milliseconds (not seconds).
- Ensure the driver receives it even if their 4G connection is weak.
- Simultaneously push the driver’s location back to your app so you can watch the car move on the map.

There are two main approaches: Polling (asking continuously) and Push (proactively sending). ...
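The push approach can be sketched with Python's asyncio: the server keeps one open channel per connected device and writes to exactly the channel it needs. This is a minimal in-process illustration, not RAMEN's actual design. The class name, device ids, and message fields are assumptions, and each queue stands in for a long-lived network connection.

```python
import asyncio


class PushGateway:
    """Keeps one queue per connected device and routes messages by device id."""

    def __init__(self):
        self._connections = {}  # device_id -> asyncio.Queue (stand-in for a socket)

    def connect(self, device_id):
        queue = asyncio.Queue()
        self._connections[device_id] = queue
        return queue

    def push(self, device_id, message):
        """Deliver to one specific device; returns False if it is not connected."""
        queue = self._connections.get(device_id)
        if queue is None:
            return False
        queue.put_nowait(message)
        return True


async def demo():
    gateway = PushGateway()
    john_inbox = gateway.connect("driver-john-doe")
    gateway.connect("driver-someone-else")

    # The server pushes as soon as the match is made; the driver app never has to ask.
    gateway.push("driver-john-doe", {"type": "ride_offer", "pickup": "Downtown"})
    return await john_inbox.get()


print(asyncio.run(demo()))
```

With polling, the driver app would instead send a request every few seconds and usually get an empty answer, which costs battery and adds up to one polling interval of delay.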

May 6, 2026 · 8 min · Tuấn Anh