Clustering
Clustering involves connecting multiple servers to work together as a single unit to imprve database performance, scalability, and fault tolerance
Key Concepts to not on :-
Load Balancing:= Distributing incoming queries across multiple nodes in the cluser
High Availability:= If one node fails, others can take over to ensure uptime
Horizontal Scaling:= Easily add more nodes to handle increased load
Distributed Data:= Data can be partitioneed or duplicated across nodes for optimal performance
Consistency Challenges:= Ensuring data consistency across nodes may require consensus algorithms like Paxos or Raft
Replication
Replication involves creating and maintaining copies of the same database on multiple servers to ensure redundancy and performance.
Types of Replication:
Master-Slave Replication:
one master server handles writes, relicas (slaves) handle reads
Reduces load on the master
Master-Master Replication:
All servers can handle reads and writes
Used for high availability and geographical distribution
Synchronous Replication:
Data is written to all replicas simultaneously
Ensures data consistency but slower
Asynchronous Replication
Data is written to replicas with a delay
Faster but risks inconsistency
Consensus v/s Quorum
Consensus
1) A process in distributed systems where all nodes agree on a single value or decision
2) Ensures consistency across nods, even in the presence of failures or communicatino issues
3) Examples: Algorithms like Paxos, Raft, and ZooKeeper’s Zab Protocol achieve consensus
4) Focus: Reaching agreement among distributed nodes
Quorum
1) A subset of nodes (majority or a specific minimum) required to perform an operation (like read/write)
2) Purpose: Ensures a minimum number of nodes are involved in an operation to maintain consistency and reliability
3) Exampes: A quorum might require over half the nodes to approve a write operation for it to be a valid
4) Determining the minimum number of nodes needed for a consensus-related decisions or operations
Aspect | Consensus | Quorum |
Purpose | Agreement on a value/decision | Define participation threshold |
Scope | Entire system | Subset of nodes |
Implementation | Algorithms like Raft | Used in quorum reads/writes |
Example | Electing a leader | Allowing a write if 3/5 nodes agree |
Partioning & Sharding
Partitioning involves dividing a database into smaller, manageable pieces (partitions) to improve performance and scalability.
Key concepts:
Horizontal Partitioning(Sharding)
Splits rows of a table across different database servers
Examples: Users with IDs 1-1000 go to one partition, and IDs 1001-2000 go to another
Vertical Partitioning:
Splits columns of a table across different servers
Examples: User profile data in one server and user purchase history in another
Range Based Partitioning:
- Data is partitioned based on ranges of a key(eg dates, IDs)
Hash-Based Partitioning
- Uses a hash function to evenly distribute rows across partitions
Sharding is a form of horizontal partitioning where data is distributed across multiple databases (shards). Each shard operates independently.
Key Concepts:
Shard Key:
A field used to determine which shard data belongs to.
Example:
UserID % TotalShards
.
Scalability:
- Adding more shards increases the system’s capacity.
Fault Isolation:
- Failure of one shard doesn’t affect others.
Complexity:
- Sharding adds complexity in managing queries, rebalancing shards, and ensuring consistency.
CAP Theorem
The CAP Theorem states that in a distributed system, it is impossible to simulataneously guarantee all three of the following properties
Consistency
: all nodes in the system see the same data at the same time.
For Eg:= When data is updated on one node, the update is immediately reflected on all the other nodes
Availability
: The system remains operational and responds to every request, even if some nodes fail
For Eg:= Every request receieves a response, though it may not be the most recent data
Partition Tolerance
: The system remains operational and responds to every request, even if some node fail
Let’s learn more about Partition Tolerance
Partition tolerance addresses the scenario where network communication failures occur between nodes in a distributed system. In a partitioned system:
Nodes cannot communicate with each other due to a network failure.
Each partition might operate independently for a period.
Partition tolerance ensures that the system does not completely fail because of these network partitions. Instead, it continues to function—perhaps in a limited capacity—until the partition is resolved.
3 Types of DBs
Consistency + Availability (No Partition Tolerance):
Suitable for systems with reliable networks where partitioning is rare.
Example: Relational databases with strong consistency.
Consistency + Partition Tolerance (No Availability):
Prioritizes accuracy of data but sacrifices availability during network partitions.
Example: Distributed databases like MongoDB in "strict consistency" mode.
Availability + Partition Tolerance (No Consistency) (Eventual Consistency):
Prioritizes uptime but may serve stale or inconsistent data.
Example: Eventual consistency models like DynamoDB or Cassandra.
Different Types of DBs
Type | Example |
RDBMS | MySQL |
NoSQL | MongoDB |
Hierarchial | IBM IMS |
Object Oriented DB | Object DB |
Column-Family DB | Cassandra |
Graph DB | Neo4j |
Time Series DB | Influx DB |
Distributed DB | Google Spanner |
Cloud DB | Amazon RDS |
Key-value(cache) DB | Redis |