Skip to content
QuizMaker logoQuizMaker
Activity
System Design: The Complete Guide
2. Intermediate Architecture
1. Introduction to System Design
2. Vertical vs Horizontal Scaling
3. Load Balancing
4. Caching Strategies
5. CDNs (Content Delivery Networks)
6. SQL vs NoSQL
7. Database Sharding & Partitioning
8. The CAP Theorem
9. Microservices Architecture
10. Message Queues & Event Streaming
12. Design BookMyShow (Ticket Booking)
14. Design Dropbox (Cloud File Storage)
15. How to Approach Any System Design Interview
16. Back-of-the-Envelope Estimation
17. Consistent Hashing
18. Bloom Filters & Probabilistic Data Structures
19. Database Replication
20. Leader Election & Consensus (Raft & Paxos)
21. Distributed Transactions (Saga, 2PC, Outbox)
22. Event Sourcing & CQRS
23. Unique ID Generation at Scale
24. Rate Limiting Algorithms
25. Circuit Breakers & Bulkhead Pattern
26. API Gateway, Proxies & Service Mesh
27. Real-Time Communication
28. Observability (Tracing, Logging, SLOs)
30. Design a Chat System (WhatsApp)
31. Design YouTube (Video Streaming)
32. Design a Web Crawler
CONTENTS

6. SQL vs NoSQL

Choosing the right database for the right job.

Feb 22, 202638 views0 likes0 fires
18px

[!NOTE] Choosing a database is a high-leverage decision. You can change a load balancer in a day; migrating primary data stores can take months of dual writes, backfills, and careful cutovers. Choose wisely—this decision will haunt (or help) you for years.

Start with Access Patterns, Not Hype

Before picking "SQL vs NoSQL," write down your top reads and top writes:

  • What are the 3 most important queries your app runs?
  • What must be strongly consistent (can never be wrong)?
  • What can be eventually consistent (slightly stale is fine)?
  • Do you need full-text search, time-series analytics, or graph traversal?

The biggest mistake junior engineers make is picking a database based on popularity ("MongoDB is trendy!") instead of access patterns.

SQL (Relational Databases)

Examples: PostgreSQL, MySQL, Oracle, SQL Server.

Relational databases store data in normalized tables with defined schemas and strong integrity constraints. SQL shines when correctness and complex querying matter.

Real-World Example: Stripe

Stripe processes hundreds of billions of dollars in payments annually. They run on PostgreSQL because financial transactions demand ACID guarantees. When you transfer $1,000, the debit from Account A and credit to Account B must either both succeed or both fail—there is zero tolerance for partial updates.

What You Get with SQL

  • ACID transactions: Multi-row correctness (payments, inventory, accounting).
  • Joins and rich querying: Relationships, aggregations, window functions.
  • Constraints: Unique keys, foreign keys, NOT NULL checks—guardrails against bad data.
  • Mature tooling: Decades of optimization, indexing strategies, and battle-tested ORMs.

Scaling SQL (Common Patterns)

  • Read replicas: Scale reads by replicating a primary to followers. Instagram uses dozens of PostgreSQL read replicas for their feed.
  • Connection pooling: PgBouncer protects the DB from connection exhaustion during traffic spikes.
  • Partitioning: Split large tables by time or tenant. Shopify partitions merchant data to keep indexes small.
  • Caching: Cache hot reads in Redis—protects the DB from spikes.
  • Sharding: Last resort. Vitess (used by YouTube) makes MySQL sharding manageable.

Where SQL Gets Painful

  • Write bottlenecks: A single primary (leader) limits write throughput.
  • Cross-shard operations: Joins and transactions across shards become expensive or impossible.
  • Schema migrations at scale: Altering a 500GB table with zero downtime requires tools like pg_repack or gh-ost.

NoSQL (A Family of Datastores)

NoSQL is not one database—it is an entire category of datastores, each optimized for specific access patterns.

NoSQL Families (with Real-World Users)

  • Key-Value (Redis, DynamoDB): Lightning-fast get/put by key. Amazon built DynamoDB to power their shopping cart—it handles millions of requests per second during Prime Day.
  • Document (MongoDB, Firestore): JSON-like documents with flexible schemas. Great for content management and mobile app backends.
  • Wide-Column (Cassandra, ScyllaDB): Extreme write throughput. Discord migrated from MongoDB to Cassandra (then ScyllaDB) to store billions of chat messages.
  • Graph (Neo4j): Efficient relationship traversal. LinkedIn uses graph databases to power "People You May Know" across 900+ million members.
  • Search (Elasticsearch): Full-text search. Wikipedia uses Elasticsearch to search 60+ million articles in 300+ languages.

Consistency and Indexing Tradeoffs

  • Consistency models: Many NoSQL systems are eventually consistent. Your app must handle retries and idempotency.
  • Secondary indexes: Can be limited or expensive—design around the primary key first.
  • Denormalization: Store data in query-friendly shapes and keep them updated via events or jobs.

Polyglot Persistence (How Uber Does It)

Most production systems use multiple databases, each for what it does best:

  • MySQL (Vitess): System of record for trips, riders, drivers.
  • Redis: Caching, session management, geospatial indexing for nearby drivers.
  • Cassandra: Billions of GPS trace points with massive write throughput.
  • Elasticsearch: Powering search across restaurants (Uber Eats).
  • Apache Pinot: Real-time analytics dashboards for operations teams.

A Practical Decision Checklist

  • Multi-entity transactions and strong constraints? PostgreSQL or MySQL.
  • Massive scale with simple key lookups? DynamoDB or Redis.
  • Rapidly evolving nested objects? MongoDB or Firestore.
  • Extreme write volume with predictable queries? Cassandra or ScyllaDB.
  • Full-text search? Add Elasticsearch alongside your primary DB.
  • Relationship-heavy traversals? Neo4j or Amazon Neptune.

[!IMPORTANT] Pick the database that makes your top 3 reads and top 3 writes simple. Then design everything else (indexes, partitions, caches, queues) around those access patterns. If you start with the wrong database, no amount of caching or indexing can fix it.

Share this article

Share on TwitterShare on LinkedInShare on FacebookShare on WhatsAppShare on Email

Test your knowledge

Take a quick quiz based on this chapter.

mediumSystem Design
Quiz: Databases
7 questions5 min

Continue Learning

7. Database Sharding & Partitioning

Intermediate
14 min

8. The CAP Theorem

Intermediate
16 min

9. Microservices Architecture

Intermediate
14 min
Lesson 1 of 5 in 2. Intermediate Architecture
Next in 2. Intermediate Architecture
7. Database Sharding & Partitioning
← Back to System Design: The Complete Guide
Back to System Design: The Complete GuideAll Categories