Some concepts on Distributed Databases & System design

LiveRunGrow
9 min read · Oct 3, 2020

CAP Theorem

  • CAP theorem does not apply if there is only a single master for each shard (without any type of replication).
  • Consistency: every read receives the most recent write or an error.
  • Availability: every request receives a response that is not an error.
  • Partition tolerance: the system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.
  • CAP theorem implies that in the presence of a network partition, one has to choose between consistency and availability.
  • From a Quora Answer: CAP is frequently misunderstood as if one has to choose to abandon one of the three guarantees at all times. In fact, the choice is really between consistency and availability only when a network partition or failure happens; at all other times, no trade-off has to be made.
  • ACID databases (RDBMSs) choose consistency over availability.
  • BASE systems (NoSQL) choose availability over consistency.

Today, NoSQL databases are classified based on the two CAP characteristics they support:

  • CP database: A CP database delivers consistency and partition tolerance at the expense of availability. When a partition occurs between any two nodes, the system has to shut down the non-consistent node (i.e., make it unavailable) until the partition is resolved.
  • AP database: An AP database delivers availability and partition tolerance at the expense of consistency. When a partition occurs, all nodes remain available but those at the wrong end of a partition might return an older version of data than others. (When the partition is resolved, the AP databases typically resync the nodes to repair all inconsistencies in the system.)
  • CA database: A CA database delivers consistency and availability across all nodes. It can’t do this if there is a partition between any two nodes in the system, however, and therefore can’t tolerate network partitions.

Caching

  • Take advantage of the locality of reference principle: recently requested data is likely to be requested again.
  • Caching exists at all levels of the architecture, but is often found at the level nearest to the front end.
  • Problems with caching: thrashing and consistency.

Application Server Cache

  • Cache placed on a request layer node => I am guessing this means the first server node that a request reaches from the front-end application.
  • If there are many nodes, the same request could reach different server nodes depending on the load balancer's assignment. This means there might be a lot of cache misses.
  • Possible solutions: Global Caches, Distributed Caches.

Distributed Caches

  • The entire cache of the system is shared among and spans multiple servers.
  • The entire cache is divided up using a consistent hashing function.
  • Pros: Distributed caches are especially useful in environments with high data volume and load. The distributed architecture allows incremental expansion/scaling by adding more computers to the cluster, allowing the cache to grow in step with the data growth.
  • Cons: a missing node leads to lost cache entries.
  • We can set different time-outs for different types of data, depending on their nature.

Global Cache

  • A server that is accessible by all request layer nodes.

Content Delivery Network (CDN)

  • For sites serving large amounts of static media.

Process

  • A request first asks the CDN for a piece of static media.
  • The CDN serves that content if it is locally available.
  • If the content isn’t available, the CDN will query the back-end servers for the file, cache it locally, and serve it to the requesting user.
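This flow is essentially the cache-aside (read-through) pattern. A minimal sketch in Python, where the helper names are hypothetical stand-ins (fetch_from_origin represents the call back to the origin servers, not a real CDN API):

```python
# Cache-aside sketch of the CDN flow above: serve from the local cache if
# present, otherwise fetch from the origin, cache it, then serve it.
cdn_cache = {}

def fetch_from_origin(path):
    # Placeholder for the call back to the origin (back-end) servers.
    return f"<contents of {path}>"

def serve_static(path):
    if path in cdn_cache:              # content is locally available
        return cdn_cache[path]
    content = fetch_from_origin(path)  # cache miss: query the back-end
    cdn_cache[path] = content          # cache locally
    return content                     # serve to the requesting user

print(serve_static("/img/logo.png"))   # miss, fetched from origin
print(serve_static("/img/logo.png"))   # hit, served from the CDN cache
```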

Cache Invalidation

  • Cache invalidation is the process by which web cache proxies declare cached content invalid, meaning it will no longer be served as the most current piece of content when it is requested.
  • The ultimate purpose, of course, is to ensure that the next time a client queries, they get the most recent result.

The most common strategies for cache invalidation are:

  • Expiration time, where the application knows how long the data will be valid. After this time, the data should be removed from the cache, causing a “cache miss” on a subsequent request (a short sketch of this approach follows the list);
  • Freshness caching verification, where the application executes a lightweight procedure to determine whether the data is still valid every time it is retrieved. The downside of this alternative is that it produces some execution overhead;
  • Active application invalidation, where the application actively invalidates the data in the cache, normally when some state change is identified.
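A minimal sketch of the expiration-time strategy, assuming an in-process cache (the class and method names are illustrative, not a specific library):

```python
import time

# Expiration-time invalidation: each entry stores a deadline, and an expired
# entry behaves like a cache miss.
class TTLCache:
    def __init__(self, default_ttl_seconds=60):
        self.default_ttl = default_ttl_seconds
        self.store = {}  # key -> (value, expires_at)

    def set(self, key, value, ttl=None):
        ttl = self.default_ttl if ttl is None else ttl
        self.store[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None                 # cache miss
        value, expires_at = entry
        if time.time() >= expires_at:   # expired: invalidate, treat as a miss
            del self.store[key]
            return None
        return value

cache = TTLCache(default_ttl_seconds=30)
cache.set("user:42", {"name": "Alice"}, ttl=5)  # different TTLs per data type
print(cache.get("user:42"))
```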

Cache eviction policies

  • FIFO: first in first out
  • LIFO: last in first out
  • LRU: least recently used (see the sketch after this list)
  • MRU: most recently used
  • LFU: least frequently used
  • RR: random replacement
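As a concrete example of one of these policies, here is a minimal LRU cache sketch (capacity and keys are arbitrary):

```python
from collections import OrderedDict

# LRU eviction: every access moves the key to the "most recent" end; when
# capacity is exceeded, the entry at the "least recent" end is evicted.
class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.put("c", 3)      # evicts "b"
print(cache.get("b"))  # None
```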

Client Server Communication

  • Standard HTTP Web request.
  • Ajax Polling (alternative to WebSockets): the client repeatedly polls the server for data and waits for the server to respond. If there is no data, the server returns an empty message.
  • HTTP Long Polling (alternative to WebSockets): the client makes an initial request using HTTP and then waits for a response. The server delays its reply until an update is available or a timeout occurs (a minimal client sketch follows this list).
  • Web Sockets (bidirectional)
  • WebSockets were created to avoid repeating the connection handshake for every exchange; the port is kept open between the two parties.
  • Used in gaming websites, chat applications, etc.
  • Different from polling, where a new socket is created with every request; here the connection is created once and the channel remains open.
  • Good for a news feed, where you don’t want the user to keep clicking a button to get the latest information. The client and server can talk in real time without making repeated requests.
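A minimal long-polling client sketch, assuming a hypothetical /updates endpoint that holds each request open until new data arrives or it times out (the endpoint, parameters, and response shape are illustrative):

```python
import requests  # third-party HTTP client, assumed installed

def long_poll(url, last_seen_id=None):
    # The client immediately re-issues the request after each response.
    while True:
        try:
            resp = requests.get(
                url,
                params={"since": last_seen_id},
                timeout=35,  # slightly longer than the server's hold time
            )
            if resp.status_code == 200 and resp.content:
                update = resp.json()
                last_seen_id = update.get("id", last_seen_id)
                print("got update:", update)
            # on an empty response there was simply no new data; poll again
        except requests.exceptions.Timeout:
            pass  # expected when no update arrived in time; just re-poll

# long_poll("https://example.com/updates")  # endpoint name is illustrative
```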

WebSockets are more efficient than REST because they do not incur the HTTP request/response overhead for each message sent and received; headers are only sent once.

https://blogs.windows.com/windowsdeveloper/2016/03/14/when-to-use-a-http-call-instead-of-a-websocket-or-http-2-0/

  • Server-Sent Events: the client requests data from a server using regular HTTP. The requested webpage opens a connection to the server, and the server pushes data to the client whenever new information is available. Usually used when real-time traffic from server to client is needed.

Consistent Hashing

  • Derived from simple hashing. The problems of a simple hashing function, hash(key) % n where n is the number of servers: it is not horizontally scalable (adding or removing a server remaps almost every key), and it may not be evenly balanced, so some servers might get more data and requests to process than others.

Key idea:

  • First, we hash every key and allocate it a position on the ring: hash(key).
  • A key is simply the data that we want to store.
  • We do the same for the servers: hash(server’s IP address).
  • Each key is assigned to the first server found moving clockwise from the key’s position on the ring; that server handles its requests.
  • Requests could still be unevenly distributed amongst servers; some servers might have to handle more load than others and become bottlenecks.
  • Solution: give each server multiple virtual IPs (virtual nodes) and spread them around the ring (a minimal sketch follows).
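A minimal consistent-hash ring sketch with virtual nodes; the hash function choice, number of virtual nodes, and server addresses are arbitrary here:

```python
import bisect
import hashlib

def ring_hash(value):
    # Map any string to a position on the ring.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, servers, vnodes=100):
        self.ring = []  # sorted list of (position, server)
        for server in servers:
            for i in range(vnodes):               # virtual nodes smooth the load
                pos = ring_hash(f"{server}#{i}")
                bisect.insort(self.ring, (pos, server))

    def get_server(self, key):
        pos = ring_hash(key)
        idx = bisect.bisect(self.ring, (pos, ""))  # first server clockwise
        return self.ring[idx % len(self.ring)][1]  # wrap around the ring

ring = ConsistentHashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(ring.get_server("user:42"))  # the same key always maps to the same server
```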

Indexes

B+ trees, hash indexes.

Key Characteristics of Distributed Systems

Scalability

  • The capability of a system to grow and manage increased demand.
  • A system that can continuously evolve to support a growing amount of work is scalable.
  • Horizontal scaling: by adding more servers into the pool of resources.
  • Vertical scaling: by adding more resource (CPU, RAM, storage, etc) to an existing server. This approach comes with downtime and an upper limit.

Reliability

  • Reliability is the probability that a system performs its required function without failure over a given period.
  • A distributed system is reliable if it keeps delivering its service even when one or multiple components fail.
  • Reliability is achieved through redundancy of components and data (remove every single point of failure).

Availability

  • Availability is the time a system remains operational to perform its required function in a specific period.
  • Measured by the percentage of time that a system remains operational under normal conditions.
  • A reliable system is available.
  • An available system is not necessarily reliable.
  • A system with a security hole is available when there is no security attack.

Efficiency

  • Latency: response time, the delay to obtain the first piece of data.
  • Bandwidth: throughput, amount of data delivered in a given time.

Serviceability / Manageability

  • Easiness to operate and maintain the system.
  • Simplicity and speed with which a system can be repaired or maintained.

Load Balancing

  • Sits between the user and the web servers.
  • Spreads the workload across multiple servers (a minimal round-robin sketch follows this list).
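A minimal round-robin load balancer sketch; the server addresses are placeholders, and real load balancers also handle health checks, sessions, and so on:

```python
import itertools

class RoundRobinBalancer:
    # Spread requests across a pool of backend servers in turn.
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick_server(self):
        return next(self._cycle)

lb = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
for _ in range(4):
    print("route request to", lb.pick_server())
```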

Proxies

  • A proxy server is an intermediary piece of hardware / software sitting between client and backend server.
  • Filter requests
  • Log requests
  • Transform requests (encryption, compression, etc)
  • Cache

Queues

  • A queue is a line of things waiting to be handled — in sequential order starting at the beginning of the line.
  • Queues are used to effectively manage requests in a large-scale distributed system, in which different components of the system may need to work in an asynchronous way. Asynchronous processing allows a task to call a service, and move on to the next task while the service processes the request at its own pace.
  • It is an abstraction between the client’s request and the actual work performed to service it. The client can continue operating without interruption when the server is not ready.
  • The basic architecture of a message queue is simple; there are client applications called producers that create messages and deliver them to the message queue. Another application, called a consumer, connects to the queue and gets the messages to be processed. Messages placed onto the queue are stored until the consumer retrieves them.
  • Examples of queues and streaming systems: Kafka, RabbitMQ, Heron (real-time streaming).
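An in-process stand-in for the producer/consumer pattern described above, using Python's built-in queue; a real deployment would use a broker such as Kafka or RabbitMQ:

```python
import queue
import threading

message_queue = queue.Queue()

def producer():
    # The producer enqueues messages and keeps going; it does not wait for
    # the work to be done.
    for i in range(5):
        message_queue.put({"task_id": i})
    message_queue.put(None)  # sentinel: no more messages

def consumer():
    # The consumer processes messages asynchronously, at its own pace.
    while True:
        msg = message_queue.get()
        if msg is None:
            break
        print("processing", msg)  # the actual work happens here

t = threading.Thread(target=consumer)
t.start()
producer()
t.join()
```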

Redundancy

  • Duplication of critical data or services with the intention of increasing the reliability of the system.

Sharding (Data Partitioning)

  • Horizontal Partitioning: splitting rows across servers, e.g. range-based sharding or hash-based sharding (consistent hashing). A routing sketch follows this list.
  • Vertical Partitioning: splitting a database’s tables/columns by feature across servers.
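A sketch of the two common horizontal-partitioning schemes, routing a record to a shard by range or by hash; the shard names and ranges here are made up:

```python
import hashlib

RANGE_SHARDS = [(0, "shard-0"), (1_000_000, "shard-1"), (2_000_000, "shard-2")]

def route_by_range(user_id):
    # Pick the last shard whose lower bound is <= user_id.
    chosen = RANGE_SHARDS[0][1]
    for lower_bound, shard in RANGE_SHARDS:
        if user_id >= lower_bound:
            chosen = shard
    return chosen

def route_by_hash(user_id, num_shards=4):
    # The shard depends on the hash of the key, not on its value range.
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return f"shard-{int(digest, 16) % num_shards}"

print(route_by_range(1_500_000))  # -> shard-1
print(route_by_hash(1_500_000))   # -> whichever shard the hash lands on
```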

SQL VS No SQL

Storage

  • SQL: store data in tables.
  • NoSQL: different data storage models, e.g. key-value stores, document databases, wide-column stores (Cassandra), graph databases.

Schema

  • SQL: each record has a fixed schema.
  • NoSQL: schemas are dynamic.

Scalability

  • SQL: vertically scalable; scaling horizontally is harder.
  • NoSQL: horizontally scalable and cheap.

ACID

  • SQL: ACID compliant, with transaction guarantees.
  • NoSQL: sacrifices ACID guarantees for performance and scalability. Even Cassandra is not ACID: it does not delete items in place, it only marks them with tombstones, and its writes are append-only. It is good for write-heavy applications (a sketch of the tombstone idea follows).
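An illustrative append-only store with tombstones, loosely modelled on the Cassandra behaviour described above (this is a conceptual sketch, not Cassandra's actual implementation):

```python
# Writes and deletes are both appends; the newest entry for a key wins, and a
# tombstone marks the key as deleted without removing the earlier data.
log = []  # list of (key, value, is_tombstone)

def write(key, value):
    log.append((key, value, False))

def delete(key):
    log.append((key, None, True))  # append a tombstone instead of removing

def read(key):
    for k, value, is_tombstone in reversed(log):  # newest entry wins
        if k == key:
            return None if is_tombstone else value
    return None

write("user:1", "Alice")
delete("user:1")
print(read("user:1"))  # None: the tombstone shadows the earlier write
```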

When to use what

SQL

  • Ensure ACID compliance.
  • Reduce anomalies.
  • Protect database integrity.
  • Data is structured and unchanging.

NoSQL

  • Data has little or no structure.
  • Make the most of cloud computing and storage.
  • Cloud-based storage requires data to be easily spread across multiple servers to scale up.
  • Rapid development.
  • Frequent updates to the data structure.

Compare SQL and CQL

  • A CQL table is a set of partitions, where each partition can be just a single row (when you don’t have a clustering key) or multiple rows.
  • In CQL, you can have collection type columns — set, list, map.
  • A column can contain a user-defined type, which can be reused.
  • CQL does not support joins which are available in SQL.
  • Data Modelling.

How does MongoDB work?

Index

MongoDB uses multikey indexes to index the content stored in arrays. If you index a field that holds an array value, MongoDB creates separate index entries for every element of the array. These multikey indexes allow queries to select documents that contain arrays by matching on element or elements of the arrays.

MongoDB creates a unique index on the _id field during the creation of a collection.

Users can also define compound indexes.
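A short sketch of both index types using the PyMongo driver; it assumes a locally running mongod, and the blog.posts collection and its fields are made up:

```python
from pymongo import MongoClient, ASCENDING, DESCENDING

client = MongoClient("mongodb://localhost:27017")  # assumes a local mongod
posts = client["blog"]["posts"]

# Index a field that holds an array ("tags"): MongoDB builds a multikey index
# with one entry per array element.
posts.insert_one({"title": "CAP theorem", "tags": ["databases", "distributed"]})
posts.create_index("tags")

# A compound index on two fields.
posts.create_index([("author", ASCENDING), ("created_at", DESCENDING)])

# Queries can now match individual array elements via the multikey index.
print(posts.find_one({"tags": "distributed"}))
```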

Storage

Instead of tables, a MongoDB database stores its data in collections. A collection holds one or more BSON documents. Documents are analogous to records or rows in a relational database table. Each document has one or more fields; fields are similar to the columns in a relational database table.

Sharding

https://www.mongodb.com/basics/sharding

Sharding is a method for distributing or partitioning data across multiple machines.

MongoDB shards at the collection level. You choose which collection(s) you want to shard. MongoDB uses the shard key to distribute a collection’s documents across shards. MongoDB splits the data into “chunks”, by dividing the span of shard key values into non-overlapping ranges. MongoDB then attempts to distribute those chunks evenly among the shards in the cluster.

  • Shard keys are based on fields inside each document. The values in those fields decide which shard a document will reside on, according to the shard ranges and the number of chunks. This data is stored and kept in the config server replica set.

MongoDB supports two sharding strategies for distributing data across sharded clusters:

  • Ranged Sharding
  • Hashed Sharding
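A hedged sketch of enabling each strategy with the standard admin commands via PyMongo; it assumes an already-running sharded cluster reachable through a mongos router, and the database, collection, and field names are made up:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumes a mongos endpoint
client.admin.command("enableSharding", "blog")

# Ranged sharding: chunks cover contiguous, non-overlapping shard-key ranges.
client.admin.command("shardCollection", "blog.posts", key={"created_at": 1})

# Hashed sharding: documents are distributed by a hash of the shard key.
client.admin.command("shardCollection", "blog.events", key={"user_id": "hashed"})
```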

The End for now :)
