Design Spotify: A System Design Interview Conversation

IdealResume Team | September 30, 2025 | 12 min read
The Interview Scenario

You're in a 45-minute system design interview. The interviewer asks: "Design Spotify."

---

Interviewer: "Design Spotify - a music streaming service. You have 45 minutes."

Candidate: "Great, I'd love to tackle this. Let me start by clarifying requirements to make sure we're aligned on scope."

---

Phase 1: Requirements Clarification (5 minutes)

Candidate: "I have several questions:

  1. **Core features** - Are we focusing on music playback, playlists, search, or recommendations?
  2. **User types** - Free tier with ads vs premium?
  3. **Offline mode** - Should users be able to download music?
  4. **Social features** - Following friends, collaborative playlists?
  5. **Scale** - How many users are we targeting?"

Interviewer: "Let's design:

  • Music streaming and playback
  • Playlist creation and management
  • Search functionality
  • Basic recommendations
  • Assume 500 million users, 200 million daily active
  • Both free (with ads) and premium tiers
  • Mobile and web clients"

Candidate: "Perfect. For non-functional requirements:

  • **Low latency** - songs should start within 200ms
  • **High availability** - 99.99% uptime
  • **Seamless playback** - no buffering during songs
  • **Offline sync** - for premium users
  • **Personalization** - recommendations should improve over time

Sound right?"

Interviewer: "Yes, let's proceed."

---

Phase 2: Back-of-Envelope Calculations (5 minutes)

Candidate: "Let me estimate the scale:

Storage:

  • Assume 100 million songs in catalog
  • Average song: 3 minutes at 320kbps = ~7MB (high quality)
  • Store multiple qualities (64, 128, 256, 320 kbps)
  • Total per song: ~15MB summed across the four qualities (the exact sum is closer to 17MB; 15MB keeps the arithmetic round)
  • Total catalog: 100M × 15MB = **1.5 PB**
  • Plus metadata, user data, playlists: add 20% = **~1.8 PB total**

Bandwidth:

  • 200M DAU, average 1 hour listening/day (about 20 three-minute songs)
  • 20 songs/day × 7MB = 140MB per user per day (an upper bound: assumes every stream is 320 kbps)
  • Daily bandwidth: 200M × 140MB = **28 PB/day**

Concurrent streams:

  • Peak hours: 20% of DAU online simultaneously
  • 200M × 20% = **40 million concurrent streams**

QPS for metadata:

  • Each song play = 1 metadata lookup
  • 200M users × 20 songs = 4 billion lookups/day
  • ~46,000 QPS average, ~100,000 QPS peak (assuming peak is roughly 2× average)"
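
The arithmetic above can be checked with a short script. This is a minimal sketch: the variable names are illustrative, and every input is one of the interview's stated assumptions rather than real Spotify data. Exact arithmetic comes out slightly higher than the rounded figures (7.2MB per song instead of 7MB, so 28.8 PB/day instead of 28 PB/day).

```python
# Back-of-envelope estimates using the interview's stated assumptions.
SECONDS_PER_DAY = 86_400

catalog_songs = 100_000_000
song_seconds = 3 * 60
bitrates_kbps = [64, 128, 256, 320]
dau = 200_000_000
songs_per_user_per_day = 20
peak_concurrency_ratio = 0.20


def song_megabytes(kbps: int, seconds: int) -> float:
    """Size of one encoded track in MB (1 MB = 10**6 bytes)."""
    return kbps * 1000 * seconds / 8 / 1e6


top_quality_mb = song_megabytes(320, song_seconds)
all_qualities_mb = sum(song_megabytes(b, song_seconds) for b in bitrates_kbps)

# 1 PB = 10**9 MB
catalog_pb = catalog_songs * all_qualities_mb / 1e9
daily_egress_pb = dau * songs_per_user_per_day * top_quality_mb / 1e9
concurrent_streams = dau * peak_concurrency_ratio
metadata_qps = dau * songs_per_user_per_day / SECONDS_PER_DAY

print(f"catalog: {catalog_pb:.2f} PB")          # catalog: 1.73 PB
print(f"egress:  {daily_egress_pb:.1f} PB/day")  # egress:  28.8 PB/day
print(f"qps:     {metadata_qps:,.0f}")           # qps:     46,296
```

Keeping the inputs as named variables makes it easy to rerun the estimate if the interviewer changes an assumption, such as the DAU count or the average bitrate.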

Interviewer: "How do these numbers affect your design?"

Candidate: "Key insights:

  1. **CDN is critical** - 28 PB/day needs edge caching
  2. **Predictable storage** - 1.5 PB is large but manageable with object storage
  3. **High concurrent connections** - Need efficient connection handling
  4. **Caching metadata** - 100K QPS needs heavy caching layer"

---

Phase 3: High-Level Design (10 minutes)

Candidate: "Here's the high-level architecture:"

![Spotify High Level Architecture](/images/blog/spotify-architecture.svg)

Interviewer: "Walk me through what happens when a user presses play."

Candidate: "Here's the playback flow:

  1. **Client requests song** - Sends song_id to Playback Service
  2. **Auth check** - Verify user has access (premium or free tier)
  3. **Get audio URL** - Fetch CDN URL for requested quality
  4. **Metadata fetch** - Get song details from Redis cache (or DB on miss)
  5. **Client receives manifest** - Contains URLs for audio segments
  6. **Streaming begins** - Client fetches segments from CDN
  7. **Buffering ahead** - Client pre-buffers next 30 seconds
  8. **Track listening event** - Async event sent for analytics/royalties

Key optimizations:

  • Predictive pre-fetching of next song in queue
  • Quality adaptation based on network conditions
  • Gapless playback - load next song before current ends"
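
The first five steps of that flow can be sketched in a few functions. This is an illustration under stated assumptions, not Spotify's actual API: the CDN host, the URL layout, the 10-second segment length, and the 128 kbps cap for free users are all hypothetical, and plain dictionaries stand in for Redis and the metadata database.

```python
from dataclasses import dataclass

CDN_BASE = "https://cdn.example.com"  # hypothetical CDN host
SEGMENT_SECONDS = 10                  # assumed segment length

# Stand-ins for the Redis cache and the metadata database.
metadata_cache: dict[str, dict] = {}
metadata_db = {"song-1": {"title": "Example Track", "duration_s": 185}}


@dataclass
class User:
    user_id: str
    premium: bool


def get_metadata(song_id: str) -> dict:
    """Cache-aside lookup: try the cache, fall back to the DB, then fill the cache."""
    cached = metadata_cache.get(song_id)
    if cached is not None:
        return cached
    record = metadata_db[song_id]  # raises KeyError for unknown songs
    metadata_cache[song_id] = record
    return record


def allowed_bitrate(user: User, requested_kbps: int) -> int:
    """Auth check: free users are capped at 128 kbps (an assumed policy)."""
    cap = 320 if user.premium else 128
    return min(requested_kbps, cap)


def build_manifest(user: User, song_id: str, requested_kbps: int) -> dict:
    """Return the segment URLs the client should fetch from the CDN."""
    meta = get_metadata(song_id)
    kbps = allowed_bitrate(user, requested_kbps)
    count = -(-meta["duration_s"] // SEGMENT_SECONDS)  # ceiling division
    urls = [f"{CDN_BASE}/{song_id}/{kbps}/{i}.ogg" for i in range(count)]
    return {"song_id": song_id, "title": meta["title"], "kbps": kbps, "segments": urls}


manifest = build_manifest(User("u1", premium=False), "song-1", 320)
print(manifest["kbps"], len(manifest["segments"]))  # 128 19
```

The Playback Service only hands out URLs. The audio bytes themselves never pass through it, which is what lets the CDN absorb the streaming bandwidth.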

Interviewer: "Why did you separate Playback Service from Playlist Service?"

Candidate: "Separation of concerns and scaling:

  1. **Different scaling patterns** - Playback is a read-heavy, steady load; Playlist sees more writes during peak creation times
  2. **Different data stores** - Playback needs fast key-value lookups; Playlists need flexible queries
  3. **Failure isolation** - If playlist service goes down, users can still play their current queue
  4. **Team ownership** - Different teams can own and deploy independently"

---

Phase 4: Deep Dive - Audio Streaming (10 minutes)

Interviewer: "Let's go deeper on audio delivery. How do you ensure smooth playback?"

Candidate: "Audio streaming has unique challenges compared to video:"

Audio File Preparation:

```
Original Audio (FLAC/WAV)
          │
          ▼
┌─────────────────────────────────────┐
│ Transcoding Pipeline                │
│ • 64 kbps (low quality/mobile)      │
│ • 128 kbps (normal)                 │
│ • 256 kbps (high quality)           │
│ • 320 kbps (premium)                │
│ • FLAC (lossless - premium only)    │
└─────────────────────────────────────┘
          │
          ▼
Ogg Vorbis / AAC Format
(Segmented for streaming)
```

Adaptive Streaming:

  • Songs divided into 5-10 second segments
  • Client monitors download speed
  • Dynamically switches quality mid-song if needed
  • Buffer threshold: switch down if buffer < 10 seconds

Interviewer: "How do you achieve 200ms playback start?"

Candidate: "Several optimizations:

  1. **Pre-fetch on hover** - When user hovers over a song, start loading first segment
  2. **Predictive loading** - Queue next song while current plays
  3. **Edge caching** - Popular songs cached within 50ms of users
  4. **Small first segment** - First segment is 2 seconds (fast to download)
  5. **Connection keep-alive** - Maintain persistent connections to CDN
  6. **DNS pre-resolution** - Resolve CDN domains on app startup

Latency breakdown:

  • DNS: 0ms (pre-resolved)
  • Request round trip: 50ms (connection already open, so no new TCP/TLS handshake)
  • First segment download: 100ms (from edge, 2-second segment)
  • Audio decode: 20ms
  • Total: ~170ms to first sound"
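
A short sketch shows why the small first segment matters. The 2-second first segment is from the list above; the 10-second regular segment, the 10 Mbps link, and the function names are assumptions for illustration. The transfer estimate ignores round trips and TCP slow start, so treat it as a lower bound.

```python
FIRST_SEGMENT_S = 2     # from the text
REGULAR_SEGMENT_S = 10  # assumed length for all later segments

# The latency budget quoted above, in milliseconds.
LATENCY_BUDGET_MS = {"dns": 0, "connection": 50, "first_segment": 100, "decode": 20}


def segment_plan(duration_s: int) -> list[tuple[int, int]]:
    """Return (start, end) offsets in seconds: a short first segment, then regular ones."""
    plan = []
    start = 0
    length = FIRST_SEGMENT_S
    while start < duration_s:
        end = min(start + length, duration_s)
        plan.append((start, end))
        start = end
        length = REGULAR_SEGMENT_S
    return plan


def transfer_ms(segment_s: int, bitrate_kbps: int, link_mbps: float) -> float:
    """Time to move one segment over a link, ignoring round trips and slow start."""
    segment_kbit = segment_s * bitrate_kbps
    return segment_kbit / (link_mbps * 1000) * 1000


print(segment_plan(25))                 # [(0, 2), (2, 12), (12, 22), (22, 25)]
print(sum(LATENCY_BUDGET_MS.values()))  # 170
print(round(transfer_ms(2, 320, 10)))   # 64
print(round(transfer_ms(10, 320, 10)))  # 320
```

On the assumed 10 Mbps link, a 10-second first segment at 320 kbps would by itself exceed the 200ms target, while the 2-second segment fits inside the 100ms allotted to it.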

Interviewer: "How do you handle offline mode?"

Candidate: "Offline is premium-only:

  1. **Download manager** - Background service downloads songs
  2. **Encrypted storage** - Songs stored with device-specific key
  3. **License validation** - Check license expiry periodically (every 30 days requires online check)
  4. **Sync service** - When back online, sync listening history
  5. **Storage management** - Auto-remove least-played downloaded songs when space low

DRM approach:

  • Encrypt files with user-specific key
  • Key tied to device ID + user ID
  • Prevents sharing downloaded files between accounts"
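
The key binding and the 30-day license check can be sketched with the standard library. This is a simplified illustration, not a real DRM scheme: `master_secret` is a hypothetical server-held secret, the context string format is an assumption (it presumes IDs never contain `|`), and the actual file encryption, which would need a cryptography library, is not shown.

```python
import hashlib
import hmac
from datetime import datetime, timedelta, timezone

LICENSE_TTL = timedelta(days=30)  # from the text: online check every 30 days


def derive_file_key(master_secret: bytes, user_id: str, device_id: str) -> bytes:
    """Derive a 32-byte key bound to one user on one device (HMAC-SHA256)."""
    context = f"offline-key|{user_id}|{device_id}".encode()
    return hmac.new(master_secret, context, hashlib.sha256).digest()


def license_valid(last_online_check: datetime, now: datetime) -> bool:
    """Downloads stay playable until LICENSE_TTL has passed since the last online check."""
    return now - last_online_check < LICENSE_TTL


secret = b"demo-only-secret"  # hypothetical; never hard-code a real secret
key_a = derive_file_key(secret, "user-1", "phone-1")
key_b = derive_file_key(secret, "user-2", "phone-1")
print(key_a != key_b)  # True: another account on the same device gets a different key

checked = datetime(2025, 9, 1, tzinfo=timezone.utc)
print(license_valid(checked, checked + timedelta(days=29)))  # True
print(license_valid(checked, checked + timedelta(days=31)))  # False
```

Because the key depends on both IDs, a file copied to another account or another device decrypts to noise, which is the property the DRM bullets describe.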

---

Phase 5: Playlist and Recommendations (8 minutes)

Interviewer: "Tell me about playlist storage and recommendations."

Candidate: "Playlists have interesting requirements:"

Playlist Data Model:

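```
// Cassandra schema - chosen for write scalability
CREATE TABLE playlists (
    user_id     UUID,       // partition key: a user's playlists live together
    playlist_id UUID,       // clustering key
    name        TEXT,
    created_at  TIMESTAMP,
    PRIMARY KEY (user_id, playlist_id)
);

CREATE TABLE playlist_tracks (
    playlist_id UUID,       // partition key: one partition per playlist
    position    INT,        // clustering key: rows are stored in play order
    track_id    UUID,
    added_at    TIMESTAMP,
    PRIMARY KEY (playlist_id, position)
);
```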

Why Cassandra:

  • Handles millions of playlists with ease
  • Fast writes for playlist updates
  • Partition by user_id for data locality
  • Scales horizontally

Collaborative playlists:

  • Use CRDT (Conflict-free Replicated Data Types)
  • Each client can add/remove independently
  • Conflicts auto-resolve (duplicate adds ignored, removes win)
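
One of the simplest CRDTs with these semantics is a two-phase set, sketched below. It is a swapped-in minimal example rather than what any production system necessarily uses: it ignores track ordering, and once a track is removed it can never be re-added, which a real collaborative playlist would need to handle. The class and method names are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class PlaylistSet:
    """Two-phase set: a track is present if it was added and never removed."""
    added: set[str] = field(default_factory=set)
    removed: set[str] = field(default_factory=set)  # tombstones

    def add(self, track_id: str) -> None:
        self.added.add(track_id)    # duplicate adds are no-ops

    def remove(self, track_id: str) -> None:
        self.removed.add(track_id)  # a remove beats any concurrent add

    def tracks(self) -> set[str]:
        return self.added - self.removed

    def merge(self, other: "PlaylistSet") -> "PlaylistSet":
        """Union both halves; commutative, associative, and idempotent."""
        return PlaylistSet(self.added | other.added, self.removed | other.removed)


# Two clients edit the same playlist while disconnected from each other.
phone, laptop = PlaylistSet(), PlaylistSet()
phone.add("t1")
phone.add("t2")
laptop.add("t2")     # duplicate add
laptop.remove("t1")  # concurrent remove

print(phone.merge(laptop).tracks())                # {'t2'}
print(phone.merge(laptop) == laptop.merge(phone))  # True
```

Since merge is just a set union, replicas converge to the same playlist regardless of the order in which they exchange updates.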

Interviewer: "How does the recommendation system work?"

Candidate: "Spotify's recommendations use multiple signals:

Data Collection:

```
┌─────────────────┐
│ Listening Events│ ─▶ Kafka ─▶ Event Processing
└─────────────────┘

User Behavior:
• Songs played
• Songs skipped
• Playlist additions
• Search queries
• Time of day patterns
```

Recommendation Models:

  1. **Collaborative Filtering** - "Users like you also listen to..."
  2. **Content-based** - Audio features analysis (tempo, key, energy)
  3. **NLP on metadata** - Genre, mood, lyrics analysis
  4. **Sequential** - What songs follow each other in playlists

Serving Recommendations:

  • Pre-compute daily for each user
  • Store in feature store (Redis/DynamoDB)
  • Real-time adjustments based on current session
  • Blend multiple models with learned weights

Interview tip: I'd implement a simpler version first (collaborative filtering with item-item similarity) and add complexity incrementally."
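
The simpler starting point mentioned in that tip, item-item collaborative filtering, fits in a few lines. This is a toy sketch on made-up play data using cosine similarity over shared listeners. The function names and data are illustrative, and a real system would compute this offline over far larger, sparser data.

```python
from collections import defaultdict
from math import sqrt

# user -> set of track IDs they played (toy implicit-feedback data)
plays = {
    "alice": {"a", "b", "c"},
    "bob": {"a", "b"},
    "carol": {"b", "c", "d"},
    "dave": {"a", "d"},
}


def item_similarity(plays: dict[str, set[str]]) -> dict[str, dict[str, float]]:
    """Cosine similarity between tracks, based on how many listeners they share."""
    listeners = defaultdict(set)
    for user, tracks in plays.items():
        for track in tracks:
            listeners[track].add(user)
    sim = defaultdict(dict)
    for i in listeners:
        for j in listeners:
            if i != j:
                overlap = len(listeners[i] & listeners[j])
                if overlap:
                    sim[i][j] = overlap / sqrt(len(listeners[i]) * len(listeners[j]))
    return sim


def recommend(user: str, plays, sim, k: int = 3) -> list[str]:
    """Score unplayed tracks by their summed similarity to the user's played tracks."""
    scores = defaultdict(float)
    for played in plays[user]:
        for other, score in sim[played].items():
            if other not in plays[user]:
                scores[other] += score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


sim = item_similarity(plays)
print(recommend("bob", plays, sim))  # ['c', 'd']
```

Track "c" ranks first for bob because it shares two listeners with "b", which bob has played. The similarity table is the part that would be pre-computed daily, while `recommend` is cheap enough to run at request time.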

---

Phase 6: Trade-offs Discussion (5 minutes)

Interviewer: "What are the key trade-offs?"

Candidate: "Several important decisions:

1. Audio Format:

  • **Chose:** Ogg Vorbis / AAC
  • **Alternative:** MP3
  • **Trade-off:** Better compression vs wider compatibility
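  • **Why:** Ogg Vorbis generally gives better perceived quality than MP3 at the same bitrate (the 20% figure is a rough estimate); we control the client, so compatibility matters less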

2. Playlist Database:

  • **Chose:** Cassandra
  • **Alternative:** PostgreSQL
  • **Trade-off:** Eventual consistency vs strong consistency
  • **Why:** Playlist operations can tolerate slight delays; need write scale

3. Pre-computed vs Real-time Recommendations:

  • **Chose:** Pre-computed with real-time adjustments
  • **Alternative:** Fully real-time
  • **Trade-off:** Freshness vs latency
  • **Why:** Complex ML models take seconds to run; pre-compute overnight, adjust in real-time

4. CDN Strategy:

  • **Chose:** Cache only popular songs at edge
  • **Alternative:** Cache everything
  • **Trade-off:** Hit rate vs storage costs
  • **Why:** Roughly 80% of plays come from 20% of songs (a rule of thumb, not a measured figure); optimize for the common case"
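
The 80/20 intuition behind the CDN choice can be sanity-checked with a simple popularity model. The sketch below assumes plays follow a Zipf distribution with exponent 1, which is a common modeling assumption and not measured Spotify data; the function name and the catalog size of 1,000 are illustrative.

```python
def top_share(catalog_size: int, cached_fraction: float, exponent: float = 1.0) -> float:
    """Fraction of plays served by the most popular tracks under a Zipf popularity model."""
    weights = [1 / rank ** exponent for rank in range(1, catalog_size + 1)]
    cached = round(catalog_size * cached_fraction)
    return sum(weights[:cached]) / sum(weights)


# Share of plays covered by caching the top 20% of a 1,000-track catalog.
print(f"{top_share(1000, 0.2):.0%}")
```

Under this model, caching the top fifth of the catalog serves a little under 80% of plays, which matches the rule of thumb. A flatter popularity curve (a smaller exponent) would lower the hit rate and shift the trade-off toward caching more.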

Interviewer: "What would you improve with more time?"

Candidate: "Three areas:

  1. **Social features** - Friend activity feed, listening parties
  2. **Audio quality** - Lossless streaming for premium
  3. **Podcasts** - Different requirements (longer files, chapters, variable bitrate)"

---

Key Interview Takeaways

  1. **Audio vs Video** - Audio has simpler segmentation but stricter latency needs
  2. **Predictive loading** - Pre-fetch is crucial for instant playback
  3. **Offline requires DRM** - Important for premium differentiation
  4. **Recommendations** - Start simple (collaborative filtering) before going complex
  5. **Cassandra for playlists** - High write throughput, partition by user
