How Uber Finds Nearby Drivers: Handling 1 Million Requests/Second
System Design

How Uber Finds Nearby Drivers: Handling 1 Million Requests/Second

IdealResume TeamJuly 11, 202510 min read
Share:

Uber's Real-Time Challenge

Uber processes millions of location updates per second, matching riders with drivers in real-time. The challenge: finding the nearest available driver in milliseconds, handling a constantly moving supply.

The Core Problem

Requirements:

  • Find available drivers within X radius
  • Handle millions of location updates/second
  • Match in < 100ms
  • Handle global scale
  • Accurate ETA calculation

Complexity:

  • Drivers constantly moving
  • Supply changes by the second
  • Geographic density varies wildly
  • Must handle city-specific patterns

Geospatial Indexing

Challenge:

Traditional indexes don't work well for moving objects.

Uber's Approach: Google S2 Geometry

S2 Cells:

  • Divide Earth into hierarchical cells
  • 30 levels of precision
  • Cell IDs are 64-bit integers
  • Efficient indexing and querying

How It Works:

  1. Earth divided into cells at various levels
  2. Each cell has unique ID
  3. Nearby cells have similar IDs (mostly)
  4. Efficient range queries

Cell Levels:

  • Level 12: ~3.3km² (city-level)
  • Level 14: ~0.8km² (neighborhood)
  • Level 16: ~0.05km² (block-level)

Supply Tracking System

Architecture:

1. Driver Location Service

  • Receives GPS updates from drivers
  • Updates every 4 seconds
  • Stores in in-memory index
  • Handles millions of updates/second

2. Geospatial Index

  • S2-based partitioning
  • In-memory for speed
  • Sharded by geographic region
  • Eventually consistent

3. Availability Service

  • Tracks driver status (available, busy, offline)
  • Integrates with dispatch
  • Real-time state machine

Finding Nearby Drivers

Query Flow:

  1. Rider requests ride at (lat, long)
  2. Convert location to S2 cell ID
  3. Query cell and neighboring cells
  4. Filter by availability
  5. Rank by distance/ETA
  6. Return top candidates

Optimizations:

1. Cell Covering

  • For radius query, find all cells that intersect
  • Query only relevant cells
  • Minimizes false positives

2. Adaptive Precision

  • Dense areas: smaller cells
  • Sparse areas: larger cells
  • Balances accuracy and performance

3. Caching

  • Recent query results cached
  • Popular pickup locations pre-computed
  • Reduces computation

Dispatch System

The Matching Problem:

Not just nearest driver - optimize for:

  • Pickup ETA
  • Driver heading (already going your way?)
  • Driver preferences
  • Rider preferences
  • Surge pricing zones
  • Pool matching

Dispatch Algorithm:

  1. Identify candidate drivers
  2. Compute ETAs for each
  3. Score based on multiple factors
  4. Select best match
  5. Offer to driver
  6. Handle accept/reject
  7. Fallback to next best

ETA Calculation

Simple Approach:

  • Straight-line distance / average speed
  • Inaccurate in cities (roads!)

Uber's Approach:

1. Road Network Graph

  • City modeled as graph
  • Nodes = intersections
  • Edges = road segments
  • Weights = travel time

2. Historical Data

  • Actual travel times collected
  • Time-of-day patterns
  • Day-of-week patterns
  • Event-based adjustments

3. Real-Time Adjustment

  • Current traffic conditions
  • Active ride data
  • External traffic APIs

Handling Scale

Geographic Sharding:

  • World divided into regions
  • Each region has dedicated infrastructure
  • Minimal cross-region communication
  • Handles isolation of failures

Cell-Based Partitioning:

  • Within region, partition by S2 cell
  • Hot cells (airports) get more resources
  • Dynamic rebalancing

Redundancy:

  • Multiple replicas per cell
  • Leader election for writes
  • Any replica can serve reads

Data Pipeline

Location Update Flow:

  1. Driver app sends GPS update
  2. Load balancer routes to regional service
  3. S2 cell computed
  4. Update written to appropriate shard
  5. Old location invalidated
  6. Index updated atomically

Volume:

  • ~1M active drivers
  • Updates every 4 seconds
  • ~250K updates/second baseline
  • Peaks at 2-3x baseline

Real-Time Optimization

Challenge:

Matching is a real-time optimization problem that's NP-hard at scale.

Solutions:

1. Greedy Matching

  • First-come, first-served
  • Simple, fast
  • Not globally optimal

2. Batch Matching

  • Collect requests over short window
  • Optimize batch together
  • Better global optimum
  • Slight latency increase

3. Hybrid

  • Most requests: greedy
  • High-demand periods: batch
  • Best of both worlds

Surge Pricing Integration

Dynamic Pricing:

  • Demand exceeds supply? Price goes up
  • Computed per geographic zone
  • Updated every few minutes
  • Displayed before booking

Technical Implementation:

  • Supply/demand tracked per cell
  • ML model predicts imbalance
  • Pricing computed asynchronously
  • Cached at edge

Failure Handling

What If Location Service Fails?

  • Drivers fall off map
  • Stale data served
  • Manual dispatch possible
  • Automatic recovery

Graceful Degradation:

  • Show last known locations
  • Increase matching radius
  • Fallback to simpler algorithms
  • User communication

Interview Application

When designing ride-sharing:

Key Components:

  • Location tracking system
  • Geospatial indexing
  • Matching/dispatch service
  • ETA calculation
  • Surge pricing

Key Questions:

  • How to index moving objects?
  • How to handle millions of updates?
  • How to compute optimal matches?
  • How to calculate ETAs?

Trade-offs:

  • Accuracy vs latency (matching)
  • Global optimum vs simplicity (dispatch)
  • Freshness vs stability (location)

Uber's architecture demonstrates real-time geospatial systems at massive scale - a rich area for system design interviews.

Ready to Build Your Perfect Resume?

Let IdealResume help you create ATS-optimized, tailored resumes that get results.

Get Started Free

Found this helpful? Share it with others who might benefit.

Share: