Scaling Databases ✨


Clustering involves connecting multiple servers so that they work together as a single unit to improve database performance, scalability, and fault tolerance.

Key concepts to note:

  • Load Balancing: Distributing incoming queries across multiple nodes in the cluster

  • High Availability: If one node fails, others can take over to ensure uptime

  • Horizontal Scaling: Easily add more nodes to handle increased load

  • Distributed Data: Data can be partitioned or duplicated across nodes for optimal performance

  • Consistency Challenges: Ensuring data consistency across nodes may require consensus algorithms like Paxos or Raft
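To make the load balancing, high availability, and horizontal scaling ideas concrete, here is a minimal round-robin sketch in Python. The `RoundRobinBalancer` class and the node names are hypothetical and exist only for illustration; a real cluster would rely on health checks and a dedicated proxy or driver rather than an in-memory list.

```python
class RoundRobinBalancer:
    """Toy balancer (hypothetical): rotates queries across healthy nodes."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._next = 0  # position of the next node to try

    def mark_down(self, node):
        # High availability: a failed node is skipped, the others take over.
        self.healthy.discard(node)

    def add_node(self, node):
        # Horizontal scaling: a new node simply joins the rotation.
        self.nodes.append(node)
        self.healthy.add(node)

    def pick(self):
        # Try each node at most once per call, starting at the rotation point.
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next % len(self.nodes)]
            self._next += 1
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy nodes available")


lb = RoundRobinBalancer(["db1", "db2", "db3"])
print([lb.pick() for _ in range(4)])  # ['db1', 'db2', 'db3', 'db1']
lb.mark_down("db2")
print([lb.pick() for _ in range(2)])  # ['db3', 'db1']
```

Round-robin is only one possible policy; it is used here because it is the simplest one to read.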


Replication involves creating and maintaining copies of the same database on multiple servers to ensure redundancy and performance.

Types of Replication:

  • Master-Slave Replication:

    • One master server handles writes; replicas (slaves) handle reads

    • Reduces load on the master

  • Master-Master Replication:

    • All servers can handle reads and writes

    • Used for high availability and geographical distribution

  • Synchronous Replication:

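    • Data is written to all replicas as part of the same write, which is acknowledged only once the replicas confirm it

    • Ensures data consistency but is slower, because every write waits for the replicas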

  • Asynchronous Replication:

    • Data is written to replicas with a delay

    • Faster, but risks inconsistency: replicas can serve stale data until they catch up
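The difference between synchronous and asynchronous replication can be sketched with a toy in-memory model. The `Primary` and `Replica` classes below are hypothetical and ignore networking, failures, and ordering; they only show when a replica sees a write.

```python
from collections import deque


class Replica:
    """Hypothetical replica: just a dictionary of key/value pairs."""

    def __init__(self):
        self.data = {}


class Primary:
    """Hypothetical primary (master) that ships writes to its replicas."""

    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = deque()  # changes not yet shipped (async mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: every replica applies the change before we return.
            for replica in self.replicas:
                replica.data[key] = value
        else:
            # Asynchronous: return immediately, ship the change later.
            self.pending.append((key, value))

    def flush(self):
        # Simulates the delayed shipment of queued changes.
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.data[key] = value


replica = Replica()
primary = Primary([replica], synchronous=False)
primary.write("x", 1)
print(replica.data.get("x"))  # None: the replica is still stale
primary.flush()
print(replica.data.get("x"))  # 1
```

In the asynchronous case the window between `write` and `flush` is where a read from the replica returns stale data.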

Consensus v/s Quorum


Consensus:

1) A process in distributed systems where all nodes agree on a single value or decision

2) Ensures consistency across nodes, even in the presence of failures or communication issues

3) Examples: Algorithms like Paxos, Raft, and ZooKeeper’s Zab Protocol achieve consensus

4) Focus: Reaching agreement among distributed nodes


Quorum:

1) A subset of nodes (a majority or a specific minimum) that must take part in an operation (like a read or write) for it to proceed

2) Purpose: Ensures a minimum number of nodes are involved in an operation to maintain consistency and reliability

3) Example: A quorum might require over half the nodes to approve a write operation for it to be valid

4) Focus: Determining the minimum number of nodes needed for consensus-related decisions or operations

Aspect         | Consensus                     | Quorum
Purpose        | Agreement on a value/decision | Define participation threshold
Scope          | Entire system                 | Subset of nodes
Implementation | Algorithms like Raft          | Used in quorum reads/writes
Example        | Electing a leader             | Allowing a write if 3/5 nodes agree
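The quorum arithmetic is small enough to write down directly. The helper names below are hypothetical. The `r + w > n` check is the commonly cited overlap rule for quorum reads and writes; it is not stated in the text above and is included as an assumption about how quorum reads/writes are usually configured.

```python
def majority(total_nodes):
    """Smallest number of nodes that is more than half of the cluster."""
    return total_nodes // 2 + 1


def write_accepted(acks, total_nodes):
    """A write is valid once a majority of nodes has approved it."""
    return acks >= majority(total_nodes)


def quorums_overlap(total_nodes, read_quorum, write_quorum):
    """True when every read quorum shares at least one node with every
    write quorum (the r + w > n rule, an assumption added here)."""
    return read_quorum + write_quorum > total_nodes


print(majority(5))                # 3
print(write_accepted(3, 5))       # True: 3/5 nodes agree
print(write_accepted(2, 5))       # False
print(quorums_overlap(5, 3, 3))   # True
print(quorums_overlap(5, 2, 3))   # False
```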

Partitioning & Sharding

Partitioning involves dividing a database into smaller, manageable pieces (partitions) to improve performance and scalability.

Key concepts:

  1. Horizontal Partitioning (Sharding):

    • Splits rows of a table across different database servers

    • Example: Users with IDs 1-1000 go to one partition, and IDs 1001-2000 go to another

  2. Vertical Partitioning:

    • Splits columns of a table across different servers

    • Example: User profile data on one server and user purchase history on another

  3. Range-Based Partitioning:

    • Data is partitioned based on ranges of a key (e.g. dates, IDs)

  4. Hash-Based Partitioning:

    • Uses a hash function to evenly distribute rows across partitions
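As a rough sketch of how range-based and hash-based partitioning pick a partition, the two hypothetical functions below map a key to a partition index. The bounds `[1000, 2000]` mirror the ID example above; the choice of SHA-256 is an assumption made only to get a hash that is stable across runs.

```python
import bisect
import hashlib


def range_partition(key, upper_bounds):
    """Range-based: upper_bounds is a sorted list of inclusive upper limits,
    one per partition. Returns the index of the partition holding key."""
    index = bisect.bisect_left(upper_bounds, key)
    if index == len(upper_bounds):
        raise ValueError(f"key {key!r} is above the last partition's range")
    return index


def hash_partition(key, num_partitions):
    """Hash-based: a stable hash of the key, reduced modulo the partition count."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


bounds = [1000, 2000]               # partition 0: IDs up to 1000, partition 1: 1001-2000
print(range_partition(1000, bounds))  # 0
print(range_partition(1001, bounds))  # 1
```

Range partitioning keeps neighbouring keys together, which helps range scans; hash partitioning trades that away for a more even spread.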

Sharding is a form of horizontal partitioning where data is distributed across multiple databases (shards). Each shard operates independently.

Key Concepts:

  1. Shard Key:

    • A field used to determine which shard data belongs to.

    • Example: UserID % TotalShards.

  2. Scalability:

    • Adding more shards increases the system’s capacity.

  3. Fault Isolation:

    • Failure of one shard doesn’t affect others.


  • Sharding adds complexity in managing queries, rebalancing shards, and ensuring consistency.
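The `UserID % TotalShards` rule and the cost of rebalancing can be illustrated in a few lines. The function names are hypothetical, and the example assumes integer user IDs from 0 to 999.

```python
def shard_for(user_id, total_shards):
    """Shard key routing: UserID % TotalShards."""
    return user_id % total_shards


def keys_moved(user_ids, old_total, new_total):
    """Counts the IDs whose shard changes when the shard count changes."""
    return sum(
        1
        for user_id in user_ids
        if shard_for(user_id, old_total) != shard_for(user_id, new_total)
    )


print(shard_for(1001, 4))               # 1
print(keys_moved(range(1000), 4, 5))    # 800
```

Going from 4 to 5 shards relocates 800 of the 1000 IDs under plain modulo routing, which is one reason rebalancing is listed as a source of complexity.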

CAP Theorem

The CAP Theorem states that in a distributed system, it is impossible to simultaneously guarantee all three of the following properties:

  1. Consistency: All nodes in the system see the same data at the same time.

For example: When data is updated on one node, the update is immediately reflected on all the other nodes

  2. Availability: The system remains operational and responds to every request, even if some nodes fail

For example: Every request receives a response, though it may not be the most recent data

  3. Partition Tolerance: The system continues to operate even when network failures prevent some nodes from communicating with each other

Let’s learn more about Partition Tolerance

Partition tolerance addresses the scenario where network communication failures occur between nodes in a distributed system. In a partitioned system:

  • Nodes cannot communicate with each other due to a network failure.

  • Each partition might operate independently for a period.

  • Partition tolerance ensures that the system does not completely fail because of these network partitions. Instead, it continues to function—perhaps in a limited capacity—until the partition is resolved.

3 Types of DBs

  • Consistency + Availability (No Partition Tolerance):

    • Suitable for systems with reliable networks where partitioning is rare.

    • Example: Traditional single-site relational databases with strong consistency.

  • Consistency + Partition Tolerance (No Availability):

    • Prioritizes accuracy of data but sacrifices availability during network partitions.

    • Example: Distributed databases like MongoDB when configured for strong consistency (for example, with majority write and read concerns).

  • Availability + Partition Tolerance (No Strong Consistency, only Eventual Consistency):

    • Prioritizes uptime but may serve stale or inconsistent data.

    • Example: Eventually consistent stores like DynamoDB or Cassandra.
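The choice between consistency and availability during a partition can be sketched with a toy node. The `Node` class and its `reachable` counter are hypothetical; they stand in for whatever membership or heartbeat mechanism a real database uses.

```python
class Node:
    """Toy node (hypothetical) that reacts to a network partition
    according to its mode: "CP" or "AP"."""

    def __init__(self, mode, cluster_size):
        self.mode = mode
        self.cluster_size = cluster_size
        self.reachable = cluster_size  # nodes this node can reach, itself included
        self.value = None

    def write(self, value):
        has_majority = self.reachable > self.cluster_size // 2
        if self.mode == "CP" and not has_majority:
            # CP: refuse the write rather than risk diverging copies.
            raise RuntimeError("unavailable: cannot reach a majority of nodes")
        # AP (or a healthy CP node): accept the write locally; an AP system
        # reconciles with the other side once the partition heals.
        self.value = value
        return "ok"


ap_node = Node("AP", cluster_size=5)
ap_node.reachable = 2            # partition: only 2 of 5 nodes are reachable
print(ap_node.write("v1"))       # ok

cp_node = Node("CP", cluster_size=5)
cp_node.reachable = 2
try:
    cp_node.write("v1")
except RuntimeError as error:
    print(error)                 # unavailable: cannot reach a majority of nodes
```

The AP node stays available but its value may differ from the other side of the partition; the CP node stays consistent by rejecting the request.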

Different Types of DBs

Type                 | Example
Hierarchical DB      | IBM IMS
Object-Oriented DB   | ObjectDB
Column-Family DB     | Cassandra
Graph DB             | Neo4j
Time Series DB       | InfluxDB
Distributed DB       | Google Spanner
Cloud DB             | Amazon RDS
Key-value (cache) DB | Redis
