How Bluesky Works: Decentralized Social Media Architecture

IdealResume Team | August 16, 2025 | 8 min read
Bluesky's Vision

Bluesky is building a decentralized social network using the AT Protocol (Authenticated Transfer Protocol). Unlike traditional social networks, Bluesky separates the data layer from the application layer, enabling user data portability and algorithmic choice.

The AT Protocol Foundation

Core Concepts:

1. Personal Data Servers (PDS)

  • Each user's data is stored on a PDS
  • A PDS can be self-hosted or run by a hosting provider
  • It holds the user's posts, follows, likes, and profile

2. Decentralized Identifiers (DIDs)

  • Unique, portable user identities
  • Not tied to any single server
  • Supports identity recovery

3. Lexicons

  • Schema definitions for data types
  • Enables interoperability
  • Versioned for evolution
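
To make the schema idea concrete, here is a minimal sketch of a Lexicon-style record definition and a toy validator. The schema is a trimmed-down approximation of the shape used for `app.bsky.feed.post`, not the full official definition, and `validate_record` is a hypothetical helper that checks only required fields and basic types.

```python
# Simplified, illustrative Lexicon-style schema (an assumption, not the official file).
POST_LEXICON = {
    "lexicon": 1,
    "id": "app.bsky.feed.post",
    "defs": {
        "main": {
            "type": "record",
            "record": {
                "type": "object",
                "required": ["text", "createdAt"],
                "properties": {
                    "text": {"type": "string"},
                    "createdAt": {"type": "string"},
                },
            },
        }
    },
}

# Maps schema type names to Python types (only a small subset is covered).
_PY_TYPES = {"string": str, "integer": int, "boolean": bool}


def validate_record(lexicon, record):
    """Hypothetical helper: return a list of problems (empty means valid)."""
    schema = lexicon["defs"]["main"]["record"]
    problems = []
    for name in schema.get("required", []):
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        expected = _PY_TYPES.get(spec["type"])
        if name in record and expected and not isinstance(record[name], expected):
            problems.append(f"field {name} should be {spec['type']}")
    return problems
```

Because every service validates against the same published schema, a record written by one application can be read safely by another.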

Architecture Overview

The Bluesky Network:

  1. **Personal Data Servers**: Store user data
  2. **Relays (formerly called Big Graph Services, or BGS)**: Aggregate data for discovery
  3. **App Views**: Process data for applications
  4. **Feed Generators**: Create custom feeds
  5. **Labelers**: Provide moderation signals

Data Model

Records and Repositories:

  • Each user has a data repository
  • Repositories contain records (posts, likes, etc.)
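  • Each repository commit is signed with the account's signing key, which covers every record in the repository
  • A Merkle tree structure (a Merkle Search Tree) lets anyone verify records against the signed commit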
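
The verification idea can be illustrated with a plain binary Merkle tree. This is a simplified sketch under loud assumptions: real AT Protocol repositories use a Merkle Search Tree over DAG-CBOR-encoded records, whereas this example hashes JSON with SHA-256 and folds the hashes pairwise. The function names are hypothetical.

```python
import hashlib
import json


def leaf_hash(record):
    # JSON with sorted keys stands in for a canonical encoding (assumption;
    # the real protocol uses DAG-CBOR, not JSON).
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).digest()


def merkle_root(records):
    """Fold leaf hashes pairwise until a single root hash remains."""
    level = [leaf_hash(r) for r in records]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

Changing any record changes the root, so a signature over the root is enough to detect tampering anywhere in the repository.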

Example Post Record:

  • Created timestamp
  • Text content
  • Reply references
  • Embedded media
  • Cryptographic integrity: covered by the signed repository commit rather than by a signature field inside the record
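
As a rough illustration, a post record can be pictured as a small JSON object. The sketch below builds one as a Python dict; the field names follow the `app.bsky.feed.post` shape as far as the text, timestamp, and reply fields go, while `make_post` and the placeholder reply references are hypothetical.

```python
from datetime import datetime, timezone


def make_post(text, reply_to=None):
    """Build a dict shaped like a simplified app.bsky.feed.post record."""
    record = {
        "$type": "app.bsky.feed.post",
        "text": text,
        # ISO 8601 timestamp in UTC with a trailing "Z"
        "createdAt": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    }
    if reply_to is not None:
        # A reply points at both the thread root and the immediate parent.
        record["reply"] = {"root": reply_to["root"], "parent": reply_to["parent"]}
    return record
```

Embedded media would be attached under a separate `embed` field, which this sketch leaves out.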

Federation Model

How Data Flows:

  1. User creates post on their PDS
  2. PDS signs and stores the record
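  3. Relays subscribe to PDS instances and receive their repository updates
  4. Relays merge those updates into a network-wide event stream (the firehose)
  5. App Views consume the firehose and build queryable indexes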
  6. Clients display data from App Views
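
The flow above can be modeled with a toy in-memory sketch. All three classes are hypothetical stand-ins: there is no networking, signing, or persistence, and events are pushed through plain callbacks purely to show which component hands data to which.

```python
from collections import defaultdict


class PDS:
    """Stores records and notifies subscribers about new ones."""

    def __init__(self):
        self.records = {}
        self.subscribers = []

    def create_record(self, did, rkey, record):
        uri = f"at://{did}/app.bsky.feed.post/{rkey}"
        self.records[uri] = record
        for callback in self.subscribers:
            callback({"uri": uri, "did": did, "record": record})
        return uri


class Relay:
    """Merges events from many PDSes into one sequenced stream."""

    def __init__(self):
        self.seq = 0
        self.consumers = []

    def follow(self, pds):
        pds.subscribers.append(self.on_event)

    def on_event(self, event):
        self.seq += 1
        for callback in self.consumers:
            callback({**event, "seq": self.seq})


class AppView:
    """Indexes the relay stream so clients can query it."""

    def __init__(self, relay):
        self.posts_by_author = defaultdict(list)
        relay.consumers.append(self.index)

    def index(self, event):
        self.posts_by_author[event["did"]].append(event["uri"])

    def author_feed(self, did):
        return list(self.posts_by_author[did])
```

A client would call something like `author_feed` on the App View and never talk to the relay directly.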

Benefits:

  • Data portability (move your data)
  • No single point of control
  • Algorithmic choice
  • Censorship resistance

Feed Generation

Custom Feeds:

Unlike traditional social networks with a single algorithm, Bluesky enables:

  • Third-party feed generators
  • User-selected feeds
  • Algorithms that operators can publish for inspection
  • Multiple feeds per user

Feed Generator Architecture:

  1. Subscribe to relay firehose
  2. Filter and rank posts
  3. Serve feed via API
  4. Users subscribe in-app
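
Steps 2 and 3 can be sketched as a single function that filters, ranks, and returns a feed skeleton: a list of post URIs that the App View later hydrates into full posts. The `likes` count on each event is an assumption for illustration (a real generator would have to tally likes from the firehose itself), and `build_feed_skeleton` is a hypothetical name.

```python
def build_feed_skeleton(events, keyword, limit=10):
    """Filter posts by keyword, rank by like count, return a feed skeleton."""
    matches = [
        e for e in events
        if keyword.lower() in e["record"]["text"].lower()
    ]
    # Rank: most liked first, newest first as the tie-breaker.
    matches.sort(
        key=lambda e: (e["likes"], e["record"]["createdAt"]),
        reverse=True,
    )
    return {"feed": [{"post": e["uri"]} for e in matches[:limit]]}
```

Returning only URIs keeps the generator small: it decides what to show and in which order, while the App View supplies the post content.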

Moderation in Decentralized Systems

The Challenge:

Moderation in federated systems is complex because there is no central authority to set and enforce a single policy.

Bluesky's Approach:

1. Labelers

  • Independent services that label content
  • Users choose which labelers to trust
  • Labels: spam, adult content, misinformation, etc.

2. Composable Moderation

  • Multiple moderation layers
  • Users customize their experience
  • Community-driven standards

3. Block and Mute

  • User-level controls
  • Moderation lists (shareable lists of accounts that subscribers can mute or block)
  • Server-level policies
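
One way to picture the labeler and composable moderation ideas is a function that combines labels from the labelers a user trusts with that user's own preferences. This is a sketch under assumptions: the label dicts use `src`, `uri`, and `val` keys, while the three-level `ignore`/`warn`/`hide` preference scheme and the `moderate` function are hypothetical.

```python
SEVERITY = {"ignore": 0, "warn": 1, "hide": 2}


def moderate(post_uri, labels, subscribed_labelers, preferences):
    """Return "show", "warn", or "hide" for a post.

    labels: list of {"src": labeler DID, "uri": subject, "val": label value}
    preferences: maps a label value to "ignore", "warn", or "hide"
    """
    decision = "ignore"
    for label in labels:
        if label["uri"] != post_uri or label["src"] not in subscribed_labelers:
            continue  # skip labels from labelers the user has not chosen
        action = preferences.get(label["val"], "ignore")
        if SEVERITY[action] > SEVERITY[decision]:
            decision = action  # the strictest applicable preference wins
    return "show" if decision == "ignore" else decision
```

Two users looking at the same post can get different outcomes, depending on which labelers they subscribe to and how they map each label to an action.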

Scaling Considerations

Challenges:

  • Global data consistency
  • Real-time updates across federation
  • Search across distributed data
  • Spam prevention

Solutions:

1. Relay Architecture

  • Relays aggregate data for efficiency
  • Multiple relay operators possible
  • Eventually consistent model

2. Cursor-based Pagination

  • Efficient traversal of large datasets
  • Works across federated data
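
A minimal sketch of cursor-based pagination, assuming records are sorted by key and the cursor is simply the last key the caller has seen. Both functions are hypothetical; a real endpoint would return an opaque cursor string with the same contract.

```python
def list_records(records, limit=2, cursor=None):
    """Return one page of records plus a cursor for the next page.

    records: list of (rkey, value) tuples sorted by rkey.
    cursor: the last rkey the caller has already seen, or None to start.
    """
    if cursor is not None:
        records = [r for r in records if r[0] > cursor]
    page = records[:limit]
    # Only hand back a cursor when more data remains after this page.
    next_cursor = page[-1][0] if len(records) > limit else None
    return {"records": page, "cursor": next_cursor}


def fetch_all(records, limit=2):
    """Follow cursors until the server stops returning one."""
    collected, cursor = [], None
    while True:
        response = list_records(records, limit=limit, cursor=cursor)
        collected.extend(response["records"])
        cursor = response["cursor"]
        if cursor is None:
            return collected
```

Unlike offset-based paging, the cursor stays valid when new records are appended between requests, which matters when the data keeps growing.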

3. WebSocket Subscriptions

  • Real-time updates via streaming
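  • Consumers filter the stream down to the collections they care about (some services, such as Jetstream, can filter server-side by collection or account)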
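
The consumer side of a streaming subscription can be sketched without any networking. The example below assumes each event carries a monotonically increasing `seq` number and a `collection` name; this event shape is simplified for illustration and `FirehoseConsumer` is a hypothetical class. It tracks the last sequence number so that a reconnect can resume without reprocessing.

```python
class FirehoseConsumer:
    """Tracks the last sequence number seen and keeps matching events."""

    def __init__(self, wanted_collections):
        self.wanted = set(wanted_collections)
        self.cursor = 0  # last seq processed; used to resume after a reconnect
        self.handled = []

    def on_event(self, event):
        if event["seq"] <= self.cursor:
            return  # already processed, e.g. replayed after a reconnect
        self.cursor = event["seq"]
        if event["collection"] in self.wanted:
            self.handled.append(event)
```

In a real client, `on_event` would be called from the WebSocket message loop, and `cursor` would be persisted and sent when reconnecting.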

Identity and Portability

DID System:

  • did:plc method (Bluesky's default)
  • did:web for domain-based identity
  • Recovery mechanisms built-in
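
Both DID methods resolve to a DID document fetched over HTTPS. The sketch below maps a DID to the URL where its document would normally be found, assuming `did:plc` identifiers are looked up at the public `plc.directory` service and `did:web` identifiers name a bare hostname with no port or path. The function name is hypothetical.

```python
def did_resolution_url(did):
    """Return the HTTPS URL where the DID document can be fetched."""
    if did.startswith("did:plc:"):
        # Assumes the public PLC directory; other directories are possible.
        return f"https://plc.directory/{did}"
    if did.startswith("did:web:"):
        # Assumes a hostname-only did:web (no port or path segments).
        host = did[len("did:web:"):]
        return f"https://{host}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {did}")
```

For example, `did_resolution_url("did:web:example.com")` yields `https://example.com/.well-known/did.json`.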

Switching Providers:

  1. Export data from old PDS
  2. Import to new PDS
  3. Update DID document
  4. Followers carry over automatically, because follows reference the user's DID rather than a server address
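
Step 3 is what makes the move visible to the rest of the network. A hedged sketch: the DID document lists a service entry for the user's PDS, and migrating means pointing that entry at the new host. The document below is trimmed to the relevant field, the hostnames and DID are placeholders, and both helper functions are hypothetical. A real `did:plc` update is a signed operation submitted to the directory, which this example does not model.

```python
import copy


def get_pds_endpoint(did_doc):
    """Return the PDS URL listed in a DID document, or None."""
    for service in did_doc.get("service", []):
        if service.get("id", "").endswith("#atproto_pds"):
            return service["serviceEndpoint"]
    return None


def migrate(did_doc, new_endpoint):
    """Return a copy of the DID document that points at a new PDS."""
    updated = copy.deepcopy(did_doc)
    for service in updated.get("service", []):
        if service.get("id", "").endswith("#atproto_pds"):
            service["serviceEndpoint"] = new_endpoint
    return updated


# Placeholder document for illustration only.
EXAMPLE_DOC = {
    "id": "did:plc:exampleuser",
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://old-pds.example.com",
        }
    ],
}
```

The `id` field never changes during the move, which is why references held by other accounts keep working.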

Comparison with ActivityPub (Mastodon)

AT Protocol vs ActivityPub:

| Aspect | AT Protocol | ActivityPub |
|--------|-------------|-------------|
| Identity | DIDs (portable) | Server-based |
| Data | User-controlled repos | Server-controlled |
| Feeds | Custom algorithms | Chronological |
| Discovery | Global via relays | Server-centric |

Key Technical Innovations

1. Signed Data

All user data is cryptographically signed, enabling verification.

2. Schema-first Design

Lexicons define data types, enabling evolution and interop.

3. Separation of Concerns

Data, algorithms, and moderation are separate services.

Interview Application

When discussing decentralized systems:

Key Topics:

  • Federation vs decentralization
  • Identity management
  • Data consistency models
  • Moderation challenges
  • Protocol design

Trade-offs:

  • Decentralization vs performance
  • User control vs simplicity
  • Federation vs fragmentation

Bluesky demonstrates how to build social networks that prioritize user agency and data portability while maintaining usability.
