How Uber Finds Nearby Drivers: Handling 1 Million Requests/Second
Uber's Real-Time Challenge
Uber processes millions of location updates per second, matching riders with drivers in real-time. The challenge: finding the nearest available driver in milliseconds, handling a constantly moving supply.
The Core Problem
Requirements:
- Find available drivers within X radius
- Handle millions of location updates/second
- Match in < 100ms
- Handle global scale
- Accurate ETA calculation
Complexity:
- Drivers constantly moving
- Supply changes by the second
- Geographic density varies wildly
- Must handle city-specific patterns
Geospatial Indexing
Challenge:
Traditional indexes don't work well for moving objects.
Uber's Approach: Google S2 Geometry
S2 Cells:
- Divide Earth into hierarchical cells
- 30 levels of precision
- Cell IDs are 64-bit integers
- Efficient indexing and querying
How It Works:
- Earth divided into cells at various levels
- Each cell has unique ID
- Nearby cells have similar IDs (mostly)
- Efficient range queries
Cell Levels:
- Level 12: ~3.3km² (city-level)
- Level 14: ~0.8km² (neighborhood)
- Level 16: ~0.05km² (block-level)
Supply Tracking System
Architecture:
1. Driver Location Service
- Receives GPS updates from drivers
- Updates every 4 seconds
- Stores in in-memory index
- Handles millions of updates/second
2. Geospatial Index
- S2-based partitioning
- In-memory for speed
- Sharded by geographic region
- Eventually consistent
3. Availability Service
- Tracks driver status (available, busy, offline)
- Integrates with dispatch
- Real-time state machine
Finding Nearby Drivers
Query Flow:
- Rider requests ride at (lat, long)
- Convert location to S2 cell ID
- Query cell and neighboring cells
- Filter by availability
- Rank by distance/ETA
- Return top candidates
Optimizations:
1. Cell Covering
- For radius query, find all cells that intersect
- Query only relevant cells
- Minimizes false positives
2. Adaptive Precision
- Dense areas: smaller cells
- Sparse areas: larger cells
- Balances accuracy and performance
3. Caching
- Recent query results cached
- Popular pickup locations pre-computed
- Reduces computation
Dispatch System
The Matching Problem:
Not just nearest driver - optimize for:
- Pickup ETA
- Driver heading (already going your way?)
- Driver preferences
- Rider preferences
- Surge pricing zones
- Pool matching
Dispatch Algorithm:
- Identify candidate drivers
- Compute ETAs for each
- Score based on multiple factors
- Select best match
- Offer to driver
- Handle accept/reject
- Fallback to next best
ETA Calculation
Simple Approach:
- Straight-line distance / average speed
- Inaccurate in cities (roads!)
Uber's Approach:
1. Road Network Graph
- City modeled as graph
- Nodes = intersections
- Edges = road segments
- Weights = travel time
2. Historical Data
- Actual travel times collected
- Time-of-day patterns
- Day-of-week patterns
- Event-based adjustments
3. Real-Time Adjustment
- Current traffic conditions
- Active ride data
- External traffic APIs
Handling Scale
Geographic Sharding:
- World divided into regions
- Each region has dedicated infrastructure
- Minimal cross-region communication
- Handles isolation of failures
Cell-Based Partitioning:
- Within region, partition by S2 cell
- Hot cells (airports) get more resources
- Dynamic rebalancing
Redundancy:
- Multiple replicas per cell
- Leader election for writes
- Any replica can serve reads
Data Pipeline
Location Update Flow:
- Driver app sends GPS update
- Load balancer routes to regional service
- S2 cell computed
- Update written to appropriate shard
- Old location invalidated
- Index updated atomically
Volume:
- ~1M active drivers
- Updates every 4 seconds
- ~250K updates/second baseline
- Peaks at 2-3x baseline
Real-Time Optimization
Challenge:
Matching is a real-time optimization problem that's NP-hard at scale.
Solutions:
1. Greedy Matching
- First-come, first-served
- Simple, fast
- Not globally optimal
2. Batch Matching
- Collect requests over short window
- Optimize batch together
- Better global optimum
- Slight latency increase
3. Hybrid
- Most requests: greedy
- High-demand periods: batch
- Best of both worlds
Surge Pricing Integration
Dynamic Pricing:
- Demand exceeds supply? Price goes up
- Computed per geographic zone
- Updated every few minutes
- Displayed before booking
Technical Implementation:
- Supply/demand tracked per cell
- ML model predicts imbalance
- Pricing computed asynchronously
- Cached at edge
Failure Handling
What If Location Service Fails?
- Drivers fall off map
- Stale data served
- Manual dispatch possible
- Automatic recovery
Graceful Degradation:
- Show last known locations
- Increase matching radius
- Fallback to simpler algorithms
- User communication
Interview Application
When designing ride-sharing:
Key Components:
- Location tracking system
- Geospatial indexing
- Matching/dispatch service
- ETA calculation
- Surge pricing
Key Questions:
- How to index moving objects?
- How to handle millions of updates?
- How to compute optimal matches?
- How to calculate ETAs?
Trade-offs:
- Accuracy vs latency (matching)
- Global optimum vs simplicity (dispatch)
- Freshness vs stability (location)
Uber's architecture demonstrates real-time geospatial systems at massive scale - a rich area for system design interviews.
Ready to Build Your Perfect Resume?
Let IdealResume help you create ATS-optimized, tailored resumes that get results.
Get Started Free