How Amazon S3 Works: Building the Internet's Storage

IdealResume Team · August 5, 2025 · 9 min read

S3's Unprecedented Scale

Amazon S3 (Simple Storage Service) stores over 350 trillion objects and handles millions of requests per second. Its "11 nines" durability promise (99.999999999%) means that if you store 10 million objects, you can expect to lose one, on average, once every 10,000 years.

Design Principles

1. Durability First

Data loss is unacceptable. Every design decision prioritizes durability.

2. Simplicity at Scale

Simple primitives (PUT, GET, DELETE) that scale to any workload; see the sketch after these principles.

3. Eventual Consistency (Historically)

Trade immediate consistency for availability and partition tolerance. (Note: since December 2020, S3 offers strong consistency.)

4. Pay for What You Use

No provisioning - truly elastic storage.
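
The whole public surface really is this small. A minimal sketch using boto3, the AWS SDK for Python; the bucket and key names are placeholders:

    import boto3

    # A minimal tour of the three primitives. Assumes AWS credentials are
    # configured; bucket and key names are illustrative.
    s3 = boto3.client("s3")

    # PUT: store bytes under a key
    s3.put_object(Bucket="example-bucket", Key="reports/q1.csv",
                  Body=b"date,revenue\n2025-01-01,100\n")

    # GET: read them back
    obj = s3.get_object(Bucket="example-bucket", Key="reports/q1.csv")
    assert obj["Body"].read() == b"date,revenue\n2025-01-01,100\n"

    # DELETE: remove the object
    s3.delete_object(Bucket="example-bucket", Key="reports/q1.csv")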

Architecture Overview

Key Components:

1. Front-end Layer

  • Load balancers
  • Request routing
  • Authentication
  • Rate limiting

2. Metadata Layer

  • Index of all objects
  • Bucket and object metadata
  • Distributed across multiple systems

3. Storage Layer

  • Actual object data
  • Distributed across data centers
  • Erasure coding for efficiency

Data Durability

How S3 Achieves 11 9s:

1. Replication

  • Objects replicated across multiple Availability Zones
  • Each AZ is an isolated data center
  • Minimum 3 AZ replication

2. Erasure Coding

  • Data split into fragments
  • Parity fragments added
  • Can reconstruct from subset of fragments
  • More storage efficient than full replication
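
A toy illustration of the principle, using a single XOR parity fragment. Production systems use Reed-Solomon-style codes with several parity shards, and S3's exact parameters are not public; this sketch only shows why reconstruction from a subset works:

    from functools import reduce

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    # Four equal-sized data fragments plus one XOR parity fragment.
    data = [b"frag", b"ment", b"s of", b" obj"]
    parity = reduce(xor, data)

    # Lose fragment 2: XOR the three survivors with the parity to rebuild it.
    rebuilt = reduce(xor, [data[0], data[1], data[3], parity])
    assert rebuilt == data[2]

    # Overhead: 3x replication stores 300% of the data size; a hypothetical
    # 10-data + 4-parity code stores 140% yet tolerates any 4 lost shards.

That overhead gap is the whole motivation: the same durability target at a fraction of the raw storage cost.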

3. Checksums

  • MD5/SHA checksums on upload
  • Continuous background verification
  • Automatic repair of detected corruption
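
Clients can participate in this verification at upload time. A boto3 sketch; the bucket and key are placeholders:

    import base64
    import hashlib
    import boto3

    s3 = boto3.client("s3")
    body = b"payload bytes"

    # Send an MD5 with the upload; S3 rejects the PUT if the bytes it
    # receives hash differently (catches in-transit corruption).
    s3.put_object(Bucket="example-bucket", Key="data.bin", Body=body,
                  ContentMD5=base64.b64encode(hashlib.md5(body).digest()).decode())

    # Or let the SDK attach a stronger checksum end to end.
    s3.put_object(Bucket="example-bucket", Key="data.bin", Body=body,
                  ChecksumAlgorithm="SHA256")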

4. Geographic Distribution

  • Data spread across physically separate locations
  • Survives facility-level failures
  • No single point of failure

Storage Classes

S3 Standard:

  • Frequently accessed data
  • Highest performance
  • 3+ AZ replication

S3 Intelligent-Tiering:

  • Automatic cost optimization
  • Moves data between tiers based on access patterns
  • No retrieval fees

S3 Glacier:

  • Archive storage
  • Lower cost, higher latency
  • Minutes to hours for retrieval

S3 Glacier Deep Archive:

  • Lowest cost
  • 12-48 hour retrieval
  • Compliance and archival use cases
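
Objects can be written directly into a class (the StorageClass parameter on PUT) or tiered down automatically with lifecycle rules. A minimal boto3 sketch; the bucket name, prefix, and day counts are illustrative:

    import boto3

    s3 = boto3.client("s3")

    # Tier logs/ down automatically: Glacier after 90 days, Deep Archive
    # after a year.
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-bucket",
        LifecycleConfiguration={"Rules": [{
            "ID": "archive-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 90, "StorageClass": "GLACIER"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }]},
    )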

Consistency Model

Historically (Before December 2020):

  • Read-after-write for new objects
  • Eventual consistency for overwrites/deletes
  • Could read stale data briefly

Current (Strong Consistency):

  • All operations strongly consistent
  • Read-after-write for all objects
  • List operations immediately reflect changes
  • No additional cost

How Strong Consistency Works:

  • Distributed consensus for metadata
  • Careful ordering of operations
  • Significant engineering investment
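
The guarantee is easy to see from the client side. A minimal boto3 sketch; bucket and key are placeholders:

    import boto3

    s3 = boto3.client("s3")

    # A read issued immediately after a successful write now always sees
    # the new bytes; before December 2020 an overwrite could briefly
    # return the previous version.
    s3.put_object(Bucket="example-bucket", Key="config.json", Body=b'{"v": 2}')
    latest = s3.get_object(Bucket="example-bucket", Key="config.json")
    assert latest["Body"].read() == b'{"v": 2}'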

Performance Optimization

1. Prefix Partitioning

  • S3 partitions data by key prefix
  • Each prefix sustains at least 3,500 writes and 5,500 reads per second
  • Spread keys across prefixes for higher aggregate throughput
  • Avoid long runs of sequential names (see the sketch below)
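
One common trick is to derive the prefix from a hash of the name. A small sketch; the two-hex-character shard width (256 shards) is an arbitrary illustrative choice:

    import hashlib

    # Sequential names ("img0001", "img0002", ...) concentrate load on one
    # partition; a short hash shard spreads it across many.
    def partitioned_key(name: str) -> str:
        shard = hashlib.md5(name.encode()).hexdigest()[:2]
        return f"{shard}/{name}"

    print(partitioned_key("img0001"))  # e.g. "24/img0001"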

2. Transfer Acceleration

  • Use CloudFront edge locations
  • Optimized network paths
  • 50-500% faster long-distance uploads
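
Acceleration is a per-bucket setting plus a client-side endpoint switch. A boto3 sketch; bucket and file names are placeholders:

    import boto3
    from botocore.config import Config

    s3 = boto3.client("s3")

    # Acceleration is enabled once per bucket...
    s3.put_bucket_accelerate_configuration(
        Bucket="example-bucket",
        AccelerateConfiguration={"Status": "Enabled"},
    )

    # ...then clients opt into the accelerate endpoint, which enters
    # AWS's network at the nearest edge location.
    fast = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
    fast.upload_file("big_dataset.tar", "example-bucket", "uploads/big_dataset.tar")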

3. Multipart Upload

  • Large objects split into parts
  • Parallel upload of parts
  • Resume failed uploads
  • Required for objects > 5 GB (the single-PUT limit); maximum object size is 5 TB
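
boto3's high-level transfer manager handles the part bookkeeping automatically. A sketch; the size and concurrency settings are illustrative:

    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client("s3")

    # upload_file switches to multipart past the threshold, uploads parts
    # in parallel, and retries individual failed parts.
    cfg = TransferConfig(
        multipart_threshold=64 * 1024 * 1024,  # go multipart above 64 MB
        multipart_chunksize=16 * 1024 * 1024,  # 16 MB parts
        max_concurrency=8,                     # parallel part uploads
    )
    s3.upload_file("backup.tar", "example-bucket", "backups/backup.tar", Config=cfg)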

4. Byte-Range Requests

  • Fetch only needed bytes
  • Enable partial file reads
  • Critical for large file processing
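
Range reads map directly onto the HTTP Range header. A minimal sketch; bucket and key are placeholders:

    import boto3

    s3 = boto3.client("s3")

    # Fetch only the first kilobyte of a large object, e.g. to read a
    # file header without downloading the rest.
    resp = s3.get_object(Bucket="example-bucket",
                         Key="datasets/huge.parquet",
                         Range="bytes=0-1023")
    header = resp["Body"].read()  # exactly 1,024 bytes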

Access Control

IAM Policies:

  • User and role-based access
  • Fine-grained permissions
  • Cross-account access

Bucket Policies:

  • Resource-based policies
  • Public access controls
  • IP restrictions
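
Bucket policies are ordinary IAM policy documents attached to the bucket. A sketch of an IP restriction; the bucket name and CIDR block are placeholders:

    import json
    import boto3

    s3 = boto3.client("s3")

    # Deny every S3 action on the bucket unless the request comes from
    # one office CIDR block.
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyOutsideOffice",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": ["arn:aws:s3:::example-bucket",
                         "arn:aws:s3:::example-bucket/*"],
            "Condition": {"NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
        }],
    }
    s3.put_bucket_policy(Bucket="example-bucket", Policy=json.dumps(policy))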

Access Points:

  • Named endpoints with specific permissions
  • Simplify access management at scale
  • VPC-only access options

Security Features

Encryption:

  • SSE-S3: S3-managed keys
  • SSE-KMS: Keys managed in AWS KMS (including customer-managed keys)
  • SSE-C: Customer-provided keys
  • Client-side encryption
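
Server-side encryption is selected per request (or via a bucket default). A boto3 sketch; the KMS key alias is a placeholder:

    import boto3

    s3 = boto3.client("s3")

    # SSE-S3: S3 holds and rotates the keys (AES-256).
    s3.put_object(Bucket="example-bucket", Key="a.txt", Body=b"data",
                  ServerSideEncryption="AES256")

    # SSE-KMS: encrypt under a key in AWS KMS.
    s3.put_object(Bucket="example-bucket", Key="b.txt", Body=b"data",
                  ServerSideEncryption="aws:kms",
                  SSEKMSKeyId="alias/my-app-key")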

Object Lock:

  • Write-once-read-many (WORM)
  • Compliance mode (nobody, including the root user, can delete during retention)
  • Governance mode (deletion allowed only with special permissions)
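
Retention can be stamped onto an object at write time; the bucket must have Object Lock enabled at creation. A sketch with placeholder names and an illustrative retention date:

    from datetime import datetime, timezone
    import boto3

    s3 = boto3.client("s3")

    # Until the retention date passes, this object cannot be deleted or
    # overwritten, even by the root user (compliance mode).
    s3.put_object(
        Bucket="example-bucket",
        Key="audit/2025-q1.log",
        Body=b"immutable audit trail",
        ObjectLockMode="COMPLIANCE",
        ObjectLockRetainUntilDate=datetime(2032, 1, 1, tzinfo=timezone.utc),
    )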

Common Patterns

Static Website Hosting:

  • Enable website configuration
  • Set index and error documents
  • Use with CloudFront for HTTPS
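
The website configuration itself is just two documents. A minimal boto3 sketch; the bucket name is a placeholder:

    import boto3

    s3 = boto3.client("s3")

    # The index is served for directory requests; the error page is
    # returned for 4xx responses.
    s3.put_bucket_website(
        Bucket="example-bucket",
        WebsiteConfiguration={
            "IndexDocument": {"Suffix": "index.html"},
            "ErrorDocument": {"Key": "error.html"},
        },
    )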

Data Lake:

  • Store raw data in S3
  • Query with Athena, Redshift Spectrum
  • Transform with Glue
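
Querying in place means no load step: Athena reads the raw files where they sit. A sketch; the database, table, and results bucket are placeholders:

    import boto3

    athena = boto3.client("athena")

    # Standard SQL directly over files in S3.
    athena.start_query_execution(
        QueryString="SELECT user_id, COUNT(*) FROM events GROUP BY user_id",
        QueryExecutionContext={"Database": "analytics"},
        ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
    )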

Backup Target:

  • Database backups
  • Application state
  • Disaster recovery

Interview Application

When designing object storage:

Key Features:

  • Basic operations (PUT, GET, DELETE, LIST)
  • Metadata management
  • Access control
  • Durability guarantees

Key Questions:

  • How to achieve high durability
  • Handling large files
  • Consistency requirements
  • Cost optimization

Design Considerations:

  • Replication vs erasure coding
  • Metadata storage architecture
  • Handling "hot" objects
  • Multi-region replication

S3's architecture demonstrates how to build massively scalable, highly durable storage with simple interfaces - a masterclass in distributed systems design.
