How Slack Works: Real-Time Messaging Architecture
Slack's Real-Time Challenge
Slack handles millions of concurrent WebSocket connections, delivering billions of messages with sub-second latency. Every message, reaction, and typing indicator must be delivered reliably and in order.
Core Requirements
Functional:
- Real-time messaging
- Channels (public, private)
- Direct messages
- File sharing
- Search
- Integrations
Non-Functional:
- Sub-second message delivery
- Message ordering guarantee
- Offline support
- 99.99% availability
Architecture Overview
Key Components:
1. Gateway Service
- WebSocket termination
- Connection management
- Message routing
- Presence tracking
2. Messaging Service
- Message storage
- Channel management
- Permission checking
- Event publishing
3. Search Service
- Full-text search
- Real-time indexing
- Relevance ranking
4. File Service
- Upload handling
- Storage management
- Thumbnail generation
WebSocket Architecture
Connection Handling:
- Millions of concurrent connections
- Each connection carries state the server must maintain
- Long-lived connections (hours)
- Heartbeats for health checking
Scaling WebSockets:
1. Connection Servers
- Dedicated servers for WebSocket termination
- Horizontally scaled
- Sticky sessions via load balancer
2. Pub/Sub Layer
- Redis Pub/Sub or similar
- Route messages to correct connection server
- Decouple message handling from delivery
3. Channel Subscriptions
- Each client subscribes to channels
- Messages published to channel
- Routed to all subscribers
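The routing idea above can be sketched as a small in-memory registry. This is only an illustration, assuming a hypothetical `PubSubRouter` class and made-up server names such as `gw-a`; a production system would use Redis Pub/Sub or a similar broker, and nothing here reflects Slack's actual code.

```python
from collections import defaultdict


class PubSubRouter:
    """In-memory stand-in for the pub/sub layer (hypothetical, for illustration)."""

    def __init__(self):
        # channel_id -> {server_id -> set of client_ids connected to that server}
        self._subs = defaultdict(lambda: defaultdict(set))

    def subscribe(self, channel_id, server_id, client_id):
        self._subs[channel_id][server_id].add(client_id)

    def unsubscribe(self, channel_id, server_id, client_id):
        servers = self._subs.get(channel_id, {})
        clients = servers.get(server_id, set())
        clients.discard(client_id)
        if not clients:
            # No subscribers left on this server, so stop routing to it.
            servers.pop(server_id, None)

    def publish(self, channel_id, message):
        """Return one delivery list per connection server that has subscribers."""
        deliveries = {}
        for server_id, clients in self._subs.get(channel_id, {}).items():
            deliveries[server_id] = [(c, message) for c in sorted(clients)]
        return deliveries
```

Grouping deliveries by connection server means the event crosses the network once per server rather than once per client, which is the decoupling the list above describes.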
Message Flow
Sending a Message:
- Client sends via WebSocket
- Gateway validates and authenticates
- Message stored in database
- Event published to pub/sub
- Gateway servers receive event
- Delivered to subscribed clients
- Acknowledgment to sender
Ordering Guarantees:
- Messages ordered per channel
- Sequence numbers ensure ordering
- Client-side reordering if needed
- Gaps detected and filled
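A minimal sketch of client-side reordering with per-channel sequence numbers follows. The class name `ChannelReorderBuffer` and the choice to start sequences at 1 are assumptions made for the example, not details from Slack.

```python
class ChannelReorderBuffer:
    """Releases messages for one channel strictly in sequence order (illustrative)."""

    def __init__(self, next_seq=1):
        self.next_seq = next_seq  # next sequence number we expect to deliver
        self._pending = {}        # seq -> payload, for messages that arrived early

    def receive(self, seq, payload):
        """Accept a message and return the payloads that are now deliverable."""
        if seq < self.next_seq or seq in self._pending:
            return []  # duplicate or already delivered
        self._pending[seq] = payload
        ready = []
        while self.next_seq in self._pending:
            ready.append(self._pending.pop(self.next_seq))
            self.next_seq += 1
        return ready

    def missing(self):
        """Sequence numbers the client should request to fill gaps."""
        if not self._pending:
            return []
        highest = max(self._pending)
        return [s for s in range(self.next_seq, highest) if s not in self._pending]
```

When `missing()` returns a non-empty list, the client would ask the server for those sequence numbers; once they arrive, `receive` releases the buffered messages behind them.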
Data Model
Channels:
- channel_id
- workspace_id
- name, type (public/private)
- member_ids
Messages:
- message_id (time-ordered)
- channel_id
- user_id
- content
- timestamp
- thread_ts (for threads)
Presence:
- user_id
- status (active, away, DND)
- last_activity
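The three records above might be written as dataclasses, as in the sketch below. Field types are guesses, and `make_message_id` shows just one assumed way to get time-ordered IDs (timestamp in the high bits, a counter in the low bits); the source does not say how Slack generates them.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Channel:
    channel_id: str
    workspace_id: str
    name: str
    type: str  # "public" or "private"
    member_ids: Set[str] = field(default_factory=set)


@dataclass
class Message:
    message_id: int  # time-ordered: larger IDs were created later
    channel_id: str
    user_id: str
    content: str
    timestamp: float
    thread_ts: Optional[float] = None  # set only for replies in a thread


@dataclass
class Presence:
    user_id: str
    status: str  # "active", "away", or "DND"
    last_activity: float


def make_message_id(timestamp_ms: int, counter: int) -> int:
    """Assumed layout: millisecond timestamp shifted left, 16-bit counter below."""
    return (timestamp_ms << 16) | (counter & 0xFFFF)
```

Because the timestamp occupies the high bits, sorting by `message_id` also sorts by creation time, which keeps range scans over a channel's history cheap.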
Message Storage
Challenges:
- High write volume
- Random read patterns
- Long retention
- Full-text search
Solution: Vitess + MySQL
- Sharded MySQL via Vitess
- Sharding key: workspace_id
- Keeps all of a workspace's channels on one shard, so channel-based queries hit a single shard
- Proven at scale
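To illustrate sharding by workspace_id, here is a simplified routing function. Vitess maps keys to shards through its own configuration, so the hash-and-modulo scheme and the `shard_for_workspace` name are stand-ins chosen for the example.

```python
import hashlib


def shard_for_workspace(workspace_id: str, num_shards: int) -> int:
    """Map a workspace to a shard index (simplified stand-in for Vitess routing)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # A stable hash, so every process picks the same shard for a workspace.
    digest = hashlib.sha256(workspace_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Every message in a workspace carries the same workspace_id, so all of its channels resolve to the same shard and a channel history query never has to fan out.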
Storage Tiers:
- Hot: Recent messages (SSD)
- Warm: Older messages
- Cold: Archive (object storage)
Search Implementation
Real-Time Search:
- Elasticsearch cluster
- Index updated on each message
- Near real-time (seconds)
Search Features:
- Full-text search
- Filters (channel, user, date)
- Relevance ranking
- Permission-aware results
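Permission-aware search can be illustrated with a toy inverted index. This is not Elasticsearch and not its API; `TinySearchIndex` is a hypothetical class that only shows where the permission filter sits relative to term matching.

```python
from collections import defaultdict


class TinySearchIndex:
    """Toy inverted index with a channel-level permission filter (illustrative)."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> message_ids containing it
        self._channel_of = {}              # message_id -> channel_id

    def index(self, message_id, channel_id, text):
        self._channel_of[message_id] = channel_id
        for term in text.lower().split():
            self._postings[term].add(message_id)

    def search(self, query, allowed_channels):
        """Return IDs of messages matching every term, in channels the user can read."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(m for m in hits if self._channel_of[m] in allowed_channels)
```

Filtering on `allowed_channels` inside the search call, rather than after results are rendered, keeps messages from private channels out of both the result list and the hit counts.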
Presence System
Challenge:
Show who's online in real-time without overwhelming the system.
Approach:
1. Heartbeat-Based
- Clients send heartbeats
- Server tracks last activity
- Status computed from activity
2. Sampling and Batching
- Don't broadcast every status change
- Aggregate and batch updates
- Reduce message volume
3. Lazy Loading
- Presence fetched on demand
- Cached with short TTL
- Accurate when needed
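The heartbeat and lazy-loading ideas can be combined in a short sketch. The thresholds, the TTL, the extra "offline" state, and the class names are all assumptions made for the example; DND is left out because it is set by the user rather than derived from activity.

```python
class PresenceTracker:
    """Derives status from the time of the last heartbeat (illustrative thresholds)."""

    def __init__(self, away_after=300.0, offline_after=900.0):
        self.away_after = away_after
        self.offline_after = offline_after
        self._last_activity = {}  # user_id -> time of last heartbeat

    def heartbeat(self, user_id, now):
        self._last_activity[user_id] = now

    def status(self, user_id, now):
        last = self._last_activity.get(user_id)
        if last is None or now - last >= self.offline_after:
            return "offline"
        if now - last >= self.away_after:
            return "away"
        return "active"


class CachedPresence:
    """Fetches presence on demand and caches it for a short TTL."""

    def __init__(self, tracker, ttl=10.0):
        self._tracker = tracker
        self._ttl = ttl
        self._cache = {}  # user_id -> (status, fetched_at)

    def get(self, user_id, now):
        cached = self._cache.get(user_id)
        if cached is not None and now - cached[1] < self._ttl:
            return cached[0]  # may be slightly stale, by design
        status = self._tracker.status(user_id, now)
        self._cache[user_id] = (status, now)
        return status
```

The cache trades a few seconds of staleness for far fewer lookups, which is the accuracy versus efficiency trade-off that presence systems make.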
File Handling
Upload Flow:
- Client requests upload URL
- Direct upload to S3
- Server notified on completion
- Thumbnail/preview generated
- Message with file reference created
Optimizations:
- Chunked uploads for large files
- Resumable uploads
- Client-side compression
- CDN for delivery
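Chunked and resumable uploads come down to splitting a file into byte ranges and tracking which ranges have already arrived. The helpers below are a hypothetical sketch of that bookkeeping only; they do not call S3 or any real upload API.

```python
def plan_chunks(file_size, chunk_size):
    """Split a file of file_size bytes into (offset, length) chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


def remaining_chunks(chunks, uploaded_offsets):
    """Chunks still to send after an interrupted upload resumes."""
    return [chunk for chunk in chunks if chunk[0] not in uploaded_offsets]
```

On reconnect, the client would learn which offsets the storage service already holds and send only the output of `remaining_chunks`.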
Offline Support
Message Queue:
- Messages queued if client offline
- Delivered on reconnect
- Gap fill for missed messages
Client Cache:
- Recent messages cached locally
- Enables offline viewing
- Synced on reconnect
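One way to picture sync on reconnect: the client reports the last sequence number it saw in each channel, and the server returns everything after it. `ChannelLog` and `sync_on_reconnect` are hypothetical names, and the in-memory list stands in for the message store.

```python
class ChannelLog:
    """Append-only message log for one channel; sequence numbers start at 1."""

    def __init__(self):
        self._messages = []

    def append(self, payload):
        self._messages.append(payload)
        return len(self._messages)  # sequence number assigned to this message

    def since(self, last_seen_seq):
        """Messages with a sequence number greater than last_seen_seq."""
        missed = self._messages[last_seen_seq:]
        return list(enumerate(missed, start=last_seen_seq + 1))


def sync_on_reconnect(logs, last_seen):
    """Return {channel_id: [(seq, payload), ...]} for channels with missed messages."""
    result = {}
    for channel_id, log in logs.items():
        missed = log.since(last_seen.get(channel_id, 0))
        if missed:
            result[channel_id] = missed
    return result
```

Channels the client has never seen default to sequence 0, so a fresh client receives their full history through the same code path.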
Thread Architecture
Slack Threads:
- Messages can have replies
- Threads don't clutter main channel
- Separate notification controls
Data Model:
- thread_ts: timestamp of the parent message, which doubles as its ID
- Replies reference parent
- Efficient thread retrieval
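A small sketch of thread retrieval using thread_ts follows. It assumes each message is a dict with a comparable `ts` key and an optional `thread_ts` key, and it treats a message whose `thread_ts` equals its own `ts` as the parent; `group_threads` is a hypothetical helper.

```python
from collections import defaultdict


def group_threads(messages):
    """Split messages into top-level messages and replies keyed by parent ts."""
    top_level = []
    replies = defaultdict(list)
    for msg in sorted(messages, key=lambda m: m["ts"]):
        parent_ts = msg.get("thread_ts")
        if parent_ts is None or parent_ts == msg["ts"]:
            top_level.append(msg)  # ordinary message or thread parent
        else:
            replies[parent_ts].append(msg)
    return top_level, dict(replies)
```

Keeping replies out of `top_level` is what lets the main channel view stay uncluttered, while a thread view reads `replies[parent_ts]` directly.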
Scaling Considerations
Per-Workspace Isolation:
- Workspaces are natural sharding boundary
- Most queries stay within a single workspace
- Cross-workspace queries rare
Hot Channels:
- Very active channels need special handling
- Rate limiting
- Message batching
- Horizontal scaling
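Rate limiting for a hot channel is often done with a token bucket, sketched below. The source does not say which algorithm Slack uses, so this is one common choice, and the rate and capacity values are arbitrary.

```python
class TokenBucket:
    """Token bucket limiter: allows bursts up to capacity, then a steady rate."""

    def __init__(self, rate, capacity):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.tokens = float(capacity) # start full
        self.updated_at = 0.0

    def allow(self, now):
        """Return True if one message may be sent at time now (in seconds)."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = max(self.updated_at, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Keeping one bucket per channel would let a busy channel absorb short bursts while capping its sustained throughput, leaving quieter channels unaffected.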
Key Optimizations
1. Connection Coalescing
- Multiple workspaces per connection
- Reduces connection count
- Efficient multiplexing
2. Message Batching
- Batch rapid messages
- Reduces UI updates
- Smoother experience
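Message batching can be sketched as a buffer that flushes when it fills or when its time window has elapsed. The `MessageBatcher` name, the window, and the size limit are illustrative assumptions; a real client would also flush from a timer so that a lone message is not held indefinitely.

```python
class MessageBatcher:
    """Groups rapid messages so the UI updates once per batch (illustrative)."""

    def __init__(self, window=0.1, max_size=50):
        self.window = window      # seconds a batch may stay open
        self.max_size = max_size  # flush early once this many are buffered
        self._buffer = []
        self._opened_at = None

    def add(self, message, now):
        """Buffer a message; return a batch to render, or None to keep waiting."""
        if not self._buffer:
            self._opened_at = now
        self._buffer.append(message)
        full = len(self._buffer) >= self.max_size
        expired = now - self._opened_at >= self.window
        if full or expired:
            return self.flush()
        return None

    def flush(self):
        """Return the buffered messages and start a new batch."""
        batch, self._buffer = self._buffer, []
        self._opened_at = None
        return batch
```

Rendering one batch instead of many individual messages is what reduces the number of UI updates in a fast-moving channel.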
3. Lazy Loading
- Load messages on scroll
- Infinite scroll with virtualization
- Minimal initial payload
Interview Application
When designing chat systems:
Core Components:
- Real-time messaging (WebSocket)
- Message storage
- Channel management
- Presence system
- Search
Key Challenges:
- WebSocket scaling
- Message ordering
- Offline support
- Search indexing
Trade-offs:
- Consistency vs latency
- Storage vs computation (indexing)
- Accuracy vs efficiency (presence)
Slack's architecture shows how to build reliable, real-time communication systems that scale to enterprise needs.