How Slack Works: Real-Time Messaging Architecture

IdealResume Team, July 6, 2025 (8 min read)

Slack's Real-Time Challenge

Slack handles millions of concurrent WebSocket connections and delivers billions of messages with sub-second latency. Messages and reactions must arrive reliably and in order within a channel, while ephemeral events such as typing indicators mainly need to arrive quickly.

Core Requirements

Functional:

  • Real-time messaging
  • Channels (public, private)
  • Direct messages
  • File sharing
  • Search
  • Integrations

Non-Functional:

  • Sub-second message delivery
  • Message ordering guarantee
  • Offline support
  • 99.99% availability (about 53 minutes of downtime per year)

Architecture Overview

Key Components:

1. Gateway Service

  • WebSocket termination
  • Connection management
  • Message routing
  • Presence tracking

2. Messaging Service

  • Message storage
  • Channel management
  • Permission checking
  • Event publishing

3. Search Service

  • Full-text search
  • Real-time indexing
  • Relevance ranking

4. File Service

  • Upload handling
  • Storage management
  • Thumbnail generation

WebSocket Architecture

Connection Handling:

  • Millions of concurrent connections
  • Each connection = state to maintain
  • Long-lived connections (hours)
  • Heartbeats for health checking

Scaling WebSockets:

1. Connection Servers

  • Dedicated servers for WebSocket termination
  • Horizontally scaled
  • Sticky sessions via load balancer

2. Pub/Sub Layer

  • Redis Pub/Sub or similar
  • Route messages to correct connection server
  • Decouple message handling from delivery

3. Channel Subscriptions

  • Each client subscribes to channels
  • Messages published to channel
  • Routed to all subscribers
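
The routing described above can be sketched with an in-memory stand-in for the pub/sub layer. This is a minimal illustration under assumptions, not Slack's implementation: `PubSubRouter`, `subscribe`, and `publish` are hypothetical names, and a real deployment would put Redis Pub/Sub or a similar broker between separate processes.

```python
class PubSubRouter:
    """In-memory stand-in for the pub/sub layer (hypothetical, single process)."""

    def __init__(self):
        # channel_id -> set of connection-server ids with at least one subscriber
        self._servers_by_channel = {}

    def subscribe(self, channel_id, server_id):
        self._servers_by_channel.setdefault(channel_id, set()).add(server_id)

    def unsubscribe(self, channel_id, server_id):
        self._servers_by_channel.get(channel_id, set()).discard(server_id)

    def publish(self, channel_id, event):
        """Return (server_id, event) pairs, one per interested connection server."""
        servers = self._servers_by_channel.get(channel_id, set())
        return [(server_id, event) for server_id in sorted(servers)]


router = PubSubRouter()
router.subscribe("C1", "gw-a")
router.subscribe("C1", "gw-b")
router.subscribe("C2", "gw-a")
deliveries = router.publish("C1", {"type": "message", "text": "hi"})
```

Only the connection servers that hold a subscriber for the channel receive the event, which is what keeps message handling decoupled from delivery.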

Message Flow

Sending a Message:

  1. Client sends via WebSocket
  2. Gateway validates and authenticates
  3. Message stored in database
  4. Event published to pub/sub
  5. Gateway servers receive event
  6. Delivered to subscribed clients
  7. Acknowledgment to sender
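
As a rough sketch of steps 2 through 4 and step 7, the handler below authenticates, stores, publishes, and acknowledges in that order. Everything here is an assumption made for illustration: `InMemoryStore`, `handle_send`, the token table, and the error strings are hypothetical, and a plain list stands in for the pub/sub layer.

```python
import itertools


class InMemoryStore:
    """Hypothetical stand-in for the message database."""

    def __init__(self):
        self.rows = []
        self._ids = itertools.count(1)

    def insert(self, channel_id, user_id, content):
        row = {"message_id": next(self._ids), "channel_id": channel_id,
               "user_id": user_id, "content": content}
        self.rows.append(row)
        return row


def handle_send(store, published, valid_tokens, token, channel_id, content):
    """Authenticate, validate, store, publish, then acknowledge."""
    user_id = valid_tokens.get(token)
    if user_id is None:
        return {"ok": False, "error": "not_authenticated"}
    if not content.strip():
        return {"ok": False, "error": "empty_message"}
    row = store.insert(channel_id, user_id, content)  # step 3: persist first
    published.append(row)                             # step 4: then fan out
    return {"ok": True, "message_id": row["message_id"]}  # step 7: ack


store, published = InMemoryStore(), []
tokens = {"tok-alice": "U1"}
ack = handle_send(store, published, tokens, "tok-alice", "C1", "hello")
rejected = handle_send(store, published, tokens, "bad-token", "C1", "hello")
```

Storing before publishing means a subscriber never sees a message that a later history fetch cannot return.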

Ordering Guarantees:

  • Messages ordered per channel
  • Sequence numbers ensure ordering
  • Client-side reordering if needed
  • Gaps detected and filled
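
One way to picture client-side reordering and gap detection is a small per-channel buffer keyed by sequence number. This is a sketch that assumes sequence numbers are consecutive integers starting at 1 within a channel; `ChannelReorderBuffer` and its methods are hypothetical names.

```python
class ChannelReorderBuffer:
    """Releases messages for one channel in sequence order and reports gaps."""

    def __init__(self, next_seq=1):
        self.next_seq = next_seq  # next sequence number that may be displayed
        self._pending = {}        # seq -> message that arrived early

    def receive(self, seq, message):
        """Return the messages that are now safe to display, in order."""
        if seq < self.next_seq or seq in self._pending:
            return []  # duplicate delivery, ignore it
        self._pending[seq] = message
        ready = []
        while self.next_seq in self._pending:
            ready.append(self._pending.pop(self.next_seq))
            self.next_seq += 1
        return ready

    def missing(self):
        """Sequence numbers to request from the server to fill the gap."""
        if not self._pending:
            return []
        return [s for s in range(self.next_seq, max(self._pending))
                if s not in self._pending]


buf = ChannelReorderBuffer()
first = buf.receive(1, "a")    # in order, shown immediately
early = buf.receive(3, "c")    # held back until 2 arrives
gap = buf.missing()            # what the client would ask the server for
filled = buf.receive(2, "b")   # releases both held messages
```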

Data Model

Channels:

  • channel_id
  • workspace_id
  • name, type (public/private)
  • member_ids

Messages:

  • message_id (time-ordered)
  • channel_id
  • user_id
  • content
  • timestamp
  • thread_ts (for threads)

Presence:

  • user_id
  • status (active, away, DND)
  • last_activity
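
The three record types above could be written down as plain dataclasses. The field names come from the lists; the Python types and the example values are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Channel:
    channel_id: str
    workspace_id: str
    name: str
    type: str  # "public" or "private"
    member_ids: set = field(default_factory=set)


@dataclass
class Message:
    message_id: int  # time-ordered, so sorting by id also sorts by time
    channel_id: str
    user_id: str
    content: str
    timestamp: float
    thread_ts: Optional[float] = None  # set only on thread replies


@dataclass
class Presence:
    user_id: str
    status: str  # "active", "away", or "DND"
    last_activity: float


general = Channel("C1", "T1", "general", "public", {"U1", "U2"})
parent = Message(1, "C1", "U1", "Deploy at 3pm", timestamp=1000.0)
reply = Message(2, "C1", "U2", "Ack", timestamp=1005.0, thread_ts=parent.timestamp)
alice = Presence("U1", "active", last_activity=1005.0)
```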

Message Storage

Challenges:

  • High write volume
  • Random read patterns
  • Long retention
  • Full-text search

Solution: Vitess + MySQL

  • Sharded MySQL via Vitess
  • Sharding key: workspace_id
  • Keeps a workspace's channel queries on a single shard
  • Proven at scale
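
To illustrate what a workspace_id sharding key buys, the sketch below maps a workspace to a shard with a stable hash. It is a simplification: Vitess maps rows to shards through its own vindex mechanism, and `shard_for`, the shard count, and the modulo scheme are assumptions made only for this example.

```python
import hashlib


def shard_for(workspace_id: str, num_shards: int) -> int:
    """Map a workspace to a shard index with a hash that is stable across processes."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(workspace_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Every query for workspace "T123" lands on the same shard.
shard = shard_for("T123", 16)
```

Because the shard depends only on the workspace, all channels and messages of one workspace live together, and a query scoped to that workspace touches one shard.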

Storage Tiers:

  • Hot: Recent messages (SSD)
  • Warm: Older messages
  • Cold: Archive (object storage)

Search Implementation

Real-Time Search:

  • Elasticsearch cluster
  • Index updated on each message
  • Near real-time (seconds)

Search Features:

  • Full-text search
  • Filters (channel, user, date)
  • Relevance ranking
  • Permission-aware results
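
Permission-aware results mean the access filter is applied before anything is ranked or returned. The toy function below shows the idea over an in-memory list. It is not an Elasticsearch query: `search`, the term-count scoring, and the message shape are assumptions for illustration.

```python
def search(messages, query, user_channels):
    """Rank messages by term frequency, restricted to channels the user can see."""
    terms = query.lower().split()
    hits = []
    for m in messages:
        if m["channel_id"] not in user_channels:
            continue  # permission filter runs before scoring
        words = m["content"].lower().split()
        score = sum(words.count(t) for t in terms)
        if score:
            hits.append((score, m))
    hits.sort(key=lambda h: h[0], reverse=True)  # stable sort keeps ties in input order
    return [m for _, m in hits]


messages = [
    {"channel_id": "C1", "content": "deploy failed again"},
    {"channel_id": "C2", "content": "deploy deploy secret"},
    {"channel_id": "C1", "content": "lunch?"},
]
visible = search(messages, "deploy", {"C1"})
everything = search(messages, "deploy", {"C1", "C2"})
```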

Presence System

Challenge:

Show who's online in real-time without overwhelming the system.

Approach:

1. Heartbeat-Based

  • Clients send heartbeats
  • Server tracks last activity
  • Status computed from activity

2. Sampling

  • Don't broadcast every status change
  • Aggregate and batch updates
  • Reduce message volume

3. Lazy Loading

  • Presence fetched on demand
  • Cached with short TTL
  • Accurate when needed
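
The heartbeat and lazy-loading ideas can be combined in a small tracker that computes status from the last heartbeat and caches the answer briefly. This is a sketch: the 60-second activity window, the 5-second TTL, and the `PresenceTracker` name are all assumed values chosen for the example.

```python
import time

ACTIVE_WINDOW_S = 60  # assumed: a heartbeat within this window means "active"
CACHE_TTL_S = 5       # assumed short TTL for cached lookups


class PresenceTracker:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_activity = {}  # user_id -> time of last heartbeat
        self._cache = {}          # user_id -> (status, expires_at)

    def heartbeat(self, user_id):
        self._last_activity[user_id] = self._clock()

    def status(self, user_id):
        """Compute status on demand, reusing a cached answer until it expires."""
        now = self._clock()
        cached = self._cache.get(user_id)
        if cached and cached[1] > now:
            return cached[0]
        last = self._last_activity.get(user_id)
        is_active = last is not None and now - last <= ACTIVE_WINDOW_S
        computed = "active" if is_active else "away"
        self._cache[user_id] = (computed, now + CACHE_TTL_S)
        return computed


now = [0.0]  # fake clock so the example is deterministic
tracker = PresenceTracker(clock=lambda: now[0])
tracker.heartbeat("U1")
status_fresh = tracker.status("U1")
now[0] = 120.0
status_stale = tracker.status("U1")
```

The cache trades a few seconds of staleness for far fewer recomputations, which is the accuracy-versus-efficiency trade-off presence systems usually accept.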

File Handling

Upload Flow:

  1. Client requests upload URL
  2. Direct upload to S3
  3. Server notified on completion
  4. Thumbnail/preview generated
  5. Message with file reference created

Optimizations:

  • Chunked uploads for large files
  • Resumable uploads
  • Client-side compression
  • CDN for delivery
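
Chunked and resumable uploads come down to splitting a file into byte ranges and skipping the ranges the server has already confirmed. The helpers below sketch that bookkeeping only. `plan_chunks` and `remaining_chunks` are hypothetical names, and no real storage API is called.

```python
def plan_chunks(file_size, chunk_size):
    """Split a file into (offset, length) byte ranges for a chunked upload."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    return [(offset, min(chunk_size, file_size - offset))
            for offset in range(0, file_size, chunk_size)]


def remaining_chunks(chunks, uploaded_offsets):
    """Resume support: skip chunks whose offsets the server already confirmed."""
    return [c for c in chunks if c[0] not in uploaded_offsets]


chunks = plan_chunks(file_size=10, chunk_size=4)
to_resume = remaining_chunks(chunks, uploaded_offsets={0, 4})
```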

Offline Support

Message Queue:

  • Messages queued if client offline
  • Delivered on reconnect
  • Gap fill for missed messages

Client Cache:

  • Recent messages cached locally
  • Enables offline viewing
  • Synced on reconnect
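
A reconnect sync can be sketched as "ask for everything after the last sequence number I have". The functions below assume each message carries a per-channel `seq` field and that the local cache is kept in sequence order; `messages_since` and `sync_on_reconnect` are hypothetical names.

```python
def messages_since(history, last_seen_seq, limit=100):
    """Server side of gap fill: messages after the client's last seen sequence."""
    newer = [m for m in history if m["seq"] > last_seen_seq]
    newer.sort(key=lambda m: m["seq"])
    return newer[:limit]


def sync_on_reconnect(local_cache, history):
    """Client side: append whatever was missed while offline to the local cache."""
    last_seen = local_cache[-1]["seq"] if local_cache else 0
    missed = messages_since(history, last_seen)
    local_cache.extend(missed)
    return missed


history = [{"seq": n, "text": f"msg {n}"} for n in range(1, 6)]
cache = history[:2]  # the client saw messages 1 and 2 before going offline
missed = sync_on_reconnect(cache, history)
```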

Thread Architecture

Slack Threads:

  • Messages can have replies
  • Threads don't clutter main channel
  • Separate notification controls

Data Model:

  • thread_ts: timestamp of the parent message, which also serves as its ID
  • Replies reference parent
  • Efficient thread retrieval
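
Given that replies point at their parent through thread_ts, separating a channel's main timeline from its threads is a single pass. In this sketch the dictionary message shape and the `group_threads` name are assumptions; a message whose thread_ts equals its own timestamp is treated as the parent.

```python
from collections import defaultdict


def group_threads(messages):
    """Split messages into top-level posts and replies keyed by parent thread_ts."""
    top_level, replies = [], defaultdict(list)
    for m in sorted(messages, key=lambda m: m["timestamp"]):
        thread_ts = m.get("thread_ts")
        if thread_ts is not None and thread_ts != m["timestamp"]:
            replies[thread_ts].append(m)  # reply: stays out of the main timeline
        else:
            top_level.append(m)
    return top_level, dict(replies)


messages = [
    {"timestamp": 2.0, "thread_ts": 1.0, "content": "reply"},
    {"timestamp": 1.0, "content": "parent"},
    {"timestamp": 3.0, "content": "unrelated"},
]
timeline, threads = group_threads(messages)
```

In a database the same lookup would typically be an indexed query on (channel_id, thread_ts), which is what makes thread retrieval cheap.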

Scaling Considerations

Per-Workspace Isolation:

  • Workspaces are natural sharding boundary
  • Most queries within workspace
  • Cross-workspace queries rare

Hot Channels:

  • Very active channels need special handling
  • Rate limiting
  • Message batching
  • Horizontal scaling
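
Rate limiting a hot channel is often done with a token bucket, shown here as one possible approach and not necessarily what Slack uses. `TokenBucket`, its rate, and its capacity are assumed values, and time is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Allows short bursts up to `capacity`, then a steady `rate_per_s`."""

    def __init__(self, rate_per_s, capacity, now=0.0):
        self.rate = rate_per_s
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        """Refill based on elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_s=1.0, capacity=2)
results = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

One bucket per (channel, sender) pair would throttle a single noisy sender without slowing everyone else in the channel.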

Key Optimizations

1. Connection Coalescing

  • Multiple workspaces per connection
  • Reduces connection count
  • Efficient multiplexing

2. Message Batching

  • Batch rapid messages
  • Reduces UI updates
  • Smoother experience
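
Batching rapid messages can be sketched as grouping events whose timestamps fall within a short window of the batch's first event, so the UI renders once per batch. The 100 ms window and the `batch_by_window` name are assumptions for this example.

```python
def batch_by_window(events, window_s):
    """Group (timestamp, payload) events so each batch spans at most window_s seconds."""
    batches = []  # list of (first_timestamp, [payloads])
    for ts, payload in sorted(events, key=lambda e: e[0]):
        if batches and ts - batches[-1][0] <= window_s:
            batches[-1][1].append(payload)
        else:
            batches.append((ts, [payload]))
    return [payloads for _, payloads in batches]


events = [(0.00, "a"), (0.05, "b"), (0.50, "c")]
batched = batch_by_window(events, window_s=0.1)
```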

3. Lazy Loading

  • Load messages on scroll
  • Infinite scroll with virtualization
  • Minimal initial payload
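
Loading messages on scroll usually relies on cursor pagination: the client asks for messages older than the oldest one it already holds. The sketch below assumes time-ordered integer message IDs; `fetch_page` and the page size are hypothetical.

```python
def fetch_page(messages, before_id=None, limit=50):
    """Return up to `limit` newest messages older than `before_id`, plus the next cursor."""
    older = [m for m in messages
             if before_id is None or m["message_id"] < before_id]
    older.sort(key=lambda m: m["message_id"], reverse=True)
    page = older[:limit]
    # A cursor is returned only when more messages remain beyond this page.
    next_cursor = page[-1]["message_id"] if len(older) > limit else None
    return page, next_cursor


messages = [{"message_id": n} for n in range(1, 6)]
page1, cursor1 = fetch_page(messages, limit=2)                     # initial load
page2, cursor2 = fetch_page(messages, before_id=cursor1, limit=2)  # user scrolls up
page3, cursor3 = fetch_page(messages, before_id=cursor2, limit=2)  # reaches the start
```

Using the message ID as the cursor keeps pages stable even while new messages arrive at the bottom of the channel.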

Interview Application

When designing chat systems:

Core Components:

  • Real-time messaging (WebSocket)
  • Message storage
  • Channel management
  • Presence system
  • Search

Key Challenges:

  • WebSocket scaling
  • Message ordering
  • Offline support
  • Search indexing

Trade-offs:

  • Consistency vs latency
  • Storage vs computation (indexing)
  • Accuracy vs efficiency (presence)

Slack's architecture shows how to build reliable, real-time communication systems that scale to enterprise needs.
