System Design

35 min read · Sep 7, 2020

This article is based on the system design section of the Cracking the Coding Interview book and the following two sections from the GitHub repository below:

  1. Load Balancer VS Reverse Proxy
  2. Object Oriented Design interview questions with Solutions

Load Balancer VS Reverse Proxy

  • Load balancers are useful when we have multiple servers. Often, load balancers route traffic to a set of servers having the same function.

Load balancers distribute incoming client requests to computing resources such as application servers and databases. In each case, the load balancer returns the response from the computing resource to the appropriate client. Load balancers are effective at:

  • Preventing requests from going to unhealthy servers
  • Preventing overloading resources
  • Helping to eliminate a single point of failure

How to remember? UOE

Load balancers can be implemented with hardware (expensive) or with software such as HAProxy.

To protect against failures, it’s common to set up multiple load balancers, either in active-passive or active-active mode.

Layer 4 load balancing

Layer 4 load balancers look at info at the transport layer to decide how to distribute requests. Generally, this involves the source and destination IP addresses and ports in the header, but not the contents of the packet. Layer 4 load balancers forward network packets to and from the upstream server, performing Network Address Translation (NAT): they rewrite the destination IP address to that of the chosen backend server.

The idea of NAT is to allow multiple devices to access the Internet through a single public address. To achieve this, private IP addresses must be translated to a public IP address. Network Address Translation (NAT) is a process in which one or more local IP addresses are translated into one or more global IP addresses, and vice versa, in order to provide Internet access to local hosts. NAT also translates port numbers: it masks the host's port number with another port number in the packet that is routed to the destination, and records the corresponding IP address and port number entries in the NAT table. NAT generally runs on a router or firewall.
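As a rough illustration of the NAT table described above, the toy class below (all names are invented for this sketch) allocates a public port for each private ip:port pair on the way out and reverses the lookup for inbound replies:

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of a NAT translation table: outbound packets from a private
// ip:port are rewritten to use the router's public IP and an assigned port;
// inbound replies on that port are translated back. Illustrative only.
public class NatTable {
    private final Map<String, Integer> outbound = new HashMap<>(); // "privIp:privPort" -> publicPort
    private final Map<Integer, String> inbound  = new HashMap<>(); // publicPort -> "privIp:privPort"
    private int nextPublicPort = 40000;

    // Called when a private host sends a packet out: allocate (or reuse) a public port.
    public int translateOutbound(String privateIp, int privatePort) {
        String key = privateIp + ":" + privatePort;
        return outbound.computeIfAbsent(key, k -> {
            int port = nextPublicPort++;
            inbound.put(port, k);
            return port;
        });
    }

    // Called when a reply arrives on a public port: find the private host it belongs to.
    public String translateInbound(int publicPort) {
        return inbound.get(publicPort);
    }
}
```

The same (ip, port) pair always reuses its assigned public port, which is what lets replies find their way back to the right private host.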


Pros:

  • Simpler load balancing
  • Efficient: no data lookup, only looks at addresses
  • One TCP connection


Cons:

  • No smart load balancing, because you can’t look at the data
  • Not applicable to microservices, since you won’t know which endpoint or service to forward to

Layer 7 load balancing

Layer 7 load balancers look at the application layer to decide how to distribute requests. This can involve the contents of the header, the message, and cookies. A layer 7 load balancer terminates network traffic, reads the message, makes a load-balancing decision, then opens a connection to the selected server. For example, a layer 7 load balancer can direct video traffic to servers that host videos while directing more sensitive user billing traffic to security-hardened servers.

Here, you have two TCP connections: client to load balancer, and load balancer to server.


Pros:

  • Smart load balancing
  • Caching (you can see the data)
  • Good for microservices


Cons:

  • Expensive (looks at the data)
  • Decrypts (terminates TLS)
  • Two TCP connections

At the cost of flexibility, layer 4 load balancing requires less time and computing resources than Layer 7, although the performance impact can be minimal on modern commodity hardware.

A load balancer only makes sense with at least two servers behind it.

Horizontal scaling (Use more Servers)

Load balancers can also help with horizontal scaling, improving performance and availability. Scaling out using commodity machines is more cost efficient and results in higher availability than scaling up a single server on more expensive hardware, called Vertical Scaling.

Disadvantage(s): horizontal scaling

  • Scaling horizontally introduces complexity and involves cloning servers
  • Servers should be stateless: they should not contain any user-related data like sessions or profile pictures
  • Sessions can be stored in a centralized data store such as a database (SQL, NoSQL) or a persistent cache (Redis, Memcached)
  • Downstream servers such as caches and databases need to handle more simultaneous connections as upstream servers scale out

Disadvantage(s): load balancer

  • The load balancer can become a performance bottleneck if it does not have enough resources or if it is not configured properly.
  • Introducing a load balancer to help eliminate a single point of failure results in increased complexity.
  • A single load balancer is itself a single point of failure; configuring multiple load balancers further increases complexity.

Reverse Proxy

A reverse proxy is a web server that centralizes internal services and provides unified interfaces to the public. Requests from clients are forwarded to a server that can fulfill them before the reverse proxy returns the server’s response to the client.

  • Deploying a load balancer is useful when you have multiple servers. Often, load balancers route traffic to a set of servers serving the same function.
  • Reverse proxies can be useful even with just one web server or application server, opening up the benefits described in the previous section.
  • Increased security — No information about your backend servers is visible outside your internal network, so malicious clients cannot access them directly to exploit any vulnerabilities. Many reverse proxy servers include features that help protect backend servers from distributed denial-of-service (DDoS) attacks, for example by rejecting traffic from particular client IP addresses (blacklisting), or limiting the number of connections accepted from each client.
  • Increased scalability and flexibility — Because clients see only the reverse proxy’s IP address, you are free to change the configuration of your backend infrastructure. This is particularly useful in a load-balanced environment, where you can scale the number of servers up and down to match fluctuations in traffic volume.

A load balancer needs at least two servers behind it. A reverse proxy, on the other hand, can sit in front of one server or more.


ChatGPT on the differences:

A load balancer is a more specialized server than a reverse proxy.

A load balancer is a networking device that distributes incoming network traffic across multiple servers to ensure that no single server is overwhelmed with requests. It can help improve the availability, scalability, and reliability of web applications by spreading the workload across multiple servers.

On the other hand, a reverse proxy is a server that sits between client devices and web servers, acting as an intermediary to handle client requests and distribute them to the appropriate backend servers. It can provide several benefits, including improved security by masking the origin server’s IP address, caching static content to reduce server load, and SSL/TLS encryption for incoming requests.

While both load balancers and reverse proxies are used to distribute incoming traffic to multiple servers, the main difference is that a load balancer focuses on balancing the workload across multiple servers to improve performance, while a reverse proxy handles client requests and provides additional features such as caching and security.

Object Oriented Design interview questions with Solutions

Design a Hash Table

Design a LRU Cache

  • The Cache class contains two data structures: (1) a linked list and (2) a dictionary / hash map.
  • (1) Linked list: tracks the ordering of the queries, so we know which entry is the most recently used and which is the least recently used.
  • (2) Dictionary: maps each key to its node in the linked list, giving O(1) access to a query’s result and O(1) reordering.
import java.util.LinkedHashMap;
import java.util.Map;

public class Cache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public Cache(int capacity) {
        super(capacity + 1, 1.1f, true); // accessOrder = true gives LRU ordering
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        Cache<Integer, String> cache = new Cache<>(2);
        cache.put(1, "one");
        cache.put(2, "two");
        System.out.println(cache); // prints {1=one, 2=two}
        cache.put(3, "three");     // evicts key 1, the least recently used
        System.out.println(cache); // prints {2=two, 3=three}
        cache.get(2);              // accessing 2 makes it the most recently used
        System.out.println(cache); // prints {3=three, 2=two}
    }
}

The removeEldestEntry method is an overridden method from the LinkedHashMap class that determines whether the eldest (i.e. least recently used) entry should be removed from the cache. The method is called by the put and putAll methods after inserting a new entry into the map.

The default implementation of removeEldestEntry returns false, which means that the eldest entry is never removed automatically. However, in the Cache class above, we override this method to return true if the size of the cache exceeds its maximum capacity, which triggers the removal of the eldest entry.

If the size of the cache is greater than its capacity, then removeEldestEntry returns true, indicating that the eldest entry should be removed. Otherwise, it returns false, indicating that the eldest entry should be retained.

By overriding removeEldestEntry in this way, we ensure that the cache never exceeds its maximum capacity and removes the least recently used entries to make room for new entries.

Cracking the coding interview — System Design

1) It is important to scope the problem and know what exactly you are designing.

  • E.g., design TinyURL. You can ask questions like: Will people be able to specify their own short URL, or will it be auto-generated? Will you need to keep track of any stats on the clicks? Should the URLs stay alive forever or have a timeout?
  • You can also make a list of all the major features or use cases.
  • For eg, Shortening a URL to a TinyURL, Analytics for a URL, Retrieving the URL associated with a TinyURL, User accounts and link management.

2) Make Reasonable Assumptions

  • E.g., can data be stale by a maximum of 10 minutes?

3) Draw the major components

  • How the servers and the frontend work together.

4) Identify the key issues

  • What are the bottlenecks?
  • For example, some URLs will be infrequently accessed, while others can suddenly peak; this might happen if a URL is posted on Reddit. You don’t necessarily want to constantly hit the database.

5) Redesign for the Key Issues

  • Update your diagram…

Some key concepts (from CS4224 and CS4225)

  • Horizontal VS Vertical scaling
  • Load balancer
  • Database Denormalisation and NoSQL. Joins in a relational database such as SQL can get very slow as the system gets bigger. To avoid this, we have to do denormalisation. It means adding redundant information into a database to speed up reads. -> Like Cassandra query-based data modelling.
  • Database Partitioning (Sharding). Sharding means splitting the data across multiple machines while ensuring you have a way of figuring out which data is on which machine. Some example of sharding/partitioning: Vertical partitioning, Hash-Based partitioning (Allocate N servers, based on some part of the data maybe ID, put the data on mod(key, n)), Directory-Based partitioning (look-up table to find your data).
  • Caching. In-memory cache is a simple key-value pairing and sits between your application layer and data store.
  • Asynchronous processing & Queues: Some slow processes can be done asynchronously.
  • MapReduce
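The hash-based partitioning scheme above (put the data on mod(key, n)) can be sketched in a few lines of Java; the class and method names here are illustrative, not from any particular system:

```java
// Minimal sketch of hash-based partitioning: a record with a given key
// always lands on the same one of n servers, computed as hash(key) mod n.
public class HashPartitioner {
    private final int numServers;

    public HashPartitioner(int numServers) {
        this.numServers = numServers;
    }

    // floorMod keeps the result non-negative even for negative hash codes.
    public int serverFor(Object key) {
        return Math.floorMod(key.hashCode(), numServers);
    }
}
```

The weakness of this scheme is that changing the number of servers remaps almost every key, which is exactly the problem consistent hashing is designed to avoid.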

Example 1

Given a list of millions of documents, how would you find all documents that contain a list of words? The words can appear in any order, but they must appear as complete words.

  • Before we start solving, we need to understand whether this is a one-time operation or one that will be called repeatedly. Let’s assume the operation will be run many times, so we can accept the burden of preprocessing.
  • The first step is to pretend we just have a few dozen documents. How would we implement the operation? One way is to pre-process each document and create a hash table index. This hash table maps each word to a list of the documents that contain it.
  • “books” -> {doc2, doc3, doc6, doc8}
  • “many” -> {doc1, doc3, doc7, doc8, doc9}
  • To search for “many books”, we would simply do an intersection on the values for “books” and “many” and return {doc3, doc8} as the result.
  • Step 2: Going back to the original problem with millions of documents, what are the possible problems? We will need to divide the documents across many machines; due to the possible number of words, we may not be able to fit the hash table on one machine.
  • Then, we could divide up our hash table by keyword, or we could divide by document, such that a machine contains the keyword mapping for only a subset of the documents. We would also need to link up the results of one machine with another.
  • Step 3: Here we try to find the solution to each of these issues. One solution is to divide up the words alphabetically by keyword, such that each machine controls a range of words. We can implement a simple algorithm in which we iterate through the keywords alphabetically. When the machine is full, we move on to the next.
  • The advantage of this approach is that the lookup table is small and simple (since it must only specify a range of values), and each machine can store a copy of the lookup table. However, the disadvantage is that if new documents or words are added, we may need to perform an expensive shift of keywords.
  • To find all the documents that match a list of strings, we would first sort the list and then send each machine a lookup request for the strings that the machine owns. For example, if our string is “after builds boat amaze banana”, machine 1 would get a lookup request for {“after”, “amaze”}. Machine 1 looks up the documents containing “after” and “amaze”, and performs an intersection on these document lists. Machine 3 does the same for {“banana”, “boat”, “builds”}, and intersects their lists. In the final step, the initial machine would do an intersection on the results from Machine 1 and Machine 3.
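The single-machine index from the first step might look like the following sketch (class and method names are illustrative):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the pre-processing step: build an inverted index from word to
// the set of documents containing it, then intersect the sets per query word.
public class InvertedIndex {
    private final Map<String, Set<String>> index = new HashMap<>();

    public void addDocument(String docId, String text) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                index.computeIfAbsent(word, w -> new HashSet<>()).add(docId);
            }
        }
    }

    // Returns the documents containing every query word (set intersection).
    public Set<String> search(String... words) {
        Set<String> result = null;
        for (String word : words) {
            Set<String> docs = index.getOrDefault(word.toLowerCase(), Collections.emptySet());
            if (result == null) result = new HashSet<>(docs);
            else result.retainAll(docs);
        }
        return result == null ? Collections.emptySet() : result;
    }
}
```

In the distributed version, each machine would hold the index entries for its own range of keywords and perform its local intersections before the initial machine combines the results.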

9.1 solution

9.2 Solution



The next idea is based on partitioning using hash functions (from CS3223). We can use a hash function whose output indicates the machine a URL will be hashed to. Then, within each machine, we can detect which URLs are similar, since identical URLs will end up on the same server.


This problem seems to be related to the “return K max items” problem.


Tiny URL

Scope of the system: to return a shortened URL from a given input long URL.


Functional requirement:

Inactive URLs will be invalidated. Can the user choose a timeout?

Must store the mapping from the old URL to the new shortened URL.

Must not return duplicate shortened URLs: a one-to-one mapping.

Must the new shortened URL be similar to the old URL?

Can the user choose the new shortened URL?

The URL produced must be unique and must not clash with any other URLs generated elsewhere.

Non-Functional Requirements:

The system must be highly available. Otherwise, if the service fails, all the short links will be unavailable.

URL redirection should be fast, which means we need to be able to retrieve the mappings quickly.

Potential loopholes/bottlenecks? We need to keep track of all mappings and check for duplicates so that we do not generate repeated URLs.

Some URLs might be accessed more frequently than others -> use a cache.


How is the shortened URL actually served? When the user types the short URL into the browser, DNS resolves the TinyURL domain to the TinyURL servers, which look up the mapping and respond with an HTTP redirect to the original long URL.

Major Components:

This system will be read heavy.

We can use a NoSQL DB, since there are no relationships between the records and it allows for easier horizontal scaling.

For the schema, we need one table for storing information about the URL mappings and another for data about the users who created the short links.

First solution: a hash table with a hash function that depends on the entire input string, so that no duplicate hash value is returned. Since there could be many requests, we need more than one server to store all the mappings. We could have a centralised server in charge of holding the hash function. Based on the output of this hash function, the new shortened URL is produced. The new-to-old URL mapping is then stored on another server, chosen according to the shortened URL string (e.g., server A is in charge of URLs starting with A-C and server B of those starting with D-F). We can adopt consistent hashing.

If a collision arises, we need to do collision resolution. This could become a bottleneck.
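Consistent hashing, mentioned in the first solution, can be sketched as a sorted ring: each server is hashed to a position, and a URL is stored on the first server clockwise from its own hash. This is a minimal, illustrative version (class names are made up; real systems add virtual nodes per server to balance load):

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hashing ring: servers sit at hash positions on a ring,
// and each key is assigned to the first server at or after its own hash.
// Adding or removing a server only remaps the keys in its neighbouring arc.
public class ConsistentHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addServer(String serverName) {
        ring.put(hash(serverName), serverName);
    }

    public void removeServer(String serverName) {
        ring.remove(hash(serverName));
    }

    public String serverFor(String key) {
        // First server clockwise from the key's position, wrapping around.
        SortedMap<Integer, String> tail = ring.tailMap(hash(key));
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private int hash(String s) {
        return Math.floorMod(s.hashCode(), Integer.MAX_VALUE); // toy hash, non-negative
    }
}
```

Compared with mod(key, n) partitioning, adding or removing a server here only moves the keys in one arc of the ring instead of reshuffling nearly everything.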

Second Solution:

NOTE THAT: A URL character can be one of the following
1) A lower case alphabet [‘a’ to ‘z’], total 26 characters
2) An upper case alphabet [‘A’ to ‘Z’], total 26 characters
3) A digit [‘0’ to ‘9’], total 10 characters

There are total 26 + 26 + 10 = 62 possible characters.

So the task is to convert a decimal number to base 62 number.

A Better Solution is to use the integer id stored in the database and convert the integer to a character string that is at most 6 characters long.

  • Using base64 encoding, a six-letter key would result in 64^6 ≈ 68.7 billion possible strings.
  • Using base64 encoding, an eight-letter key would result in 64^8 ≈ 281 trillion possible strings.
  • With 68.7 billion unique strings, let’s assume six-letter keys would suffice for our system.
  • Eg, you have to convert 125 (base 10) to X62 (base 62).
  • 125 (10) = 2×62^1 + 1×62^0 = [2,1]
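The decimal-to-base-62 conversion above is a standard change-of-base loop; a minimal sketch (the mapping of digit values to characters is one arbitrary choice among many):

```java
// Convert a database integer id into a short base-62 string using the
// 62 characters [a-z][A-Z][0-9] listed above, and decode it back.
public class Base62 {
    private static final String ALPHABET =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    public static String encode(long id) {
        if (id == 0) return "a"; // digit value 0 maps to 'a'
        StringBuilder sb = new StringBuilder();
        while (id > 0) {
            sb.append(ALPHABET.charAt((int) (id % 62))); // least-significant digit first
            id /= 62;
        }
        return sb.reverse().toString();
    }

    public static long decode(String s) {
        long id = 0;
        for (char c : s.toCharArray()) {
            id = id * 62 + ALPHABET.indexOf(c);
        }
        return id;
    }
}
```

Following the worked example, 125 = 2×62 + 1, i.e. digits [2, 1], which this alphabet renders as "cb".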

We can use a cache, because some URLs will be accessed more frequently than others.

Other random questions

Implementation of Recommender System

  • There are two approaches. The first is content filtering (uses information about the items involved but does not really consider relationships). The second is collaborative filtering (a better approach, where you recommend items to users based on the user’s similarity to other users).
  • Collaborative filtering: based on the similarity between items, calculated using people’s ratings of those items. It has two major parts:
  1. Analyses the user-item matrix to identify relationships between different items. Compute the similarity between items.
  2. Use these relationships to indirectly compute recommendations for users.

Assume that data is given in the form of USER, ITEM, SCORE.

As we read in the rows of data for many different users, we build the history matrix which is a table whereby each row contains information about which items that particular user (represented by each row) bought.

From this history matrix, we then construct the co-occurrence matrix which contains the count in which each pair of items were bought together by a user.

The next table we want to construct is the score matrix. Each row represents an item. Each column represents a user.


Now, with the co-occurrence matrix and score matrix, we want to do matrix multiplication.

In the output table, each row represents each item. Each column represents the user.
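The multiplication step described above can be sketched directly; the matrices in the usage below are toy values, not real ratings. For one user, multiplying the item-item co-occurrence matrix by that user's score vector yields a recommendation score per item:

```java
// Sketch of the final recommendation step: multiply the item-item
// co-occurrence matrix (items x items) by one user's score vector (items)
// to get a recommendation score for each item for that user.
public class Recommender {
    public static double[] recommend(int[][] cooccurrence, double[] userScores) {
        int items = cooccurrence.length;
        double[] result = new double[items];
        for (int i = 0; i < items; i++) {
            for (int j = 0; j < items; j++) {
                result[i] += cooccurrence[i][j] * userScores[j];
            }
        }
        return result;
    }
}
```

Items the user has not rated but that frequently co-occur with highly rated items end up with high scores, which is exactly what we want to recommend.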

Implementation of Image Uploading. It can range from user avatars to uploadable inventory pictures.

Some use cases:

Display to users, what they see:

  • Display on frontend?
  • Display image variants, such as banners and thumbnails.

Workflow of users, how they interact with the system:

  • Who can upload image? Who can download them?
  • Any restrictions on the type of images uploaded? Size limit?
  • The front-end server is an extension of the back-end server and is designed to provide scalability. Multiple FEs can be connected to a BE and each FE can have multiple clients connected to it in a distributed setup. The main function performed by the FE is to channel the requests received from the clients. The FE takes care of generating the database views through database read operations using a completely stateless architecture. This helps in providing the accurate information at all times.

Database/server storage concerns, how do we store the images?:

  • How to store the image? Associate them with the records that you are uploading them for? Is it for user avatar or campaign banners.
  • For the case of facilitating users in displaying all sorts of possible image variants, how do we do image processing? Perform cropping.
  • How do we optimise the process of uploading the image to the server in a scalable way?

Tradeoffs, Potential bottlenecks:


Image uploads can cause servers to crash or time out, since they consume a lot of resources. A request handler might be tied up for the entire time it takes to upload a 10-megabyte file.

Hence, the architecture should focus on reducing the amount of time your server spends handling an image-upload request to almost nothing, offloading the actual upload to another service (such as S3 or an in-house service dedicated to uploads).


Possibility of a denial-of-service attack, where a massive amount of resources is consumed.

Our system architecture needs to provide integrity of user data as well.

Walk through the flow of a secure file upload implementation:

=> How to handle scanning the file

=> How to identify the type of file

In the context of web applications, we call “upload” the process in which a user sends data/files from a local computer to a remote computer.

(Refer to drawing)

How to handle user authentication in a login feature

(1) Session-based authentication. The HTTP protocol is stateless, so with every new request you would have to log in again for the application to know it is you. Cookie-based authentication makes the authentication process stateful: an authentication record, or session, is kept on both the server and client side.

In the browser, the user enters their username and password, and the request goes from the client application to the server. The server checks for the user, authenticates it and sends a session ID to the user’s client application. It also saves it in memory. When the client browser receives it, it stores the session ID in a session cookie.

This cookie is then sent with every subsequent request. At each request, the server looks at the session cookie to read the session ID. If it matches the one stored in its memory, it sends a response to the browser.

However, there are problems with this system. Whenever someone is authenticated, the server has to create a new session ID and store it as a record on the server, which adds to the server’s workload. It also has scalability issues: if the server is replicated, the session records have to be available everywhere (though you can solve this by having a single dedicated server that checks the session ID and then forwards requests to the other servers).
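The server-side bookkeeping described above can be sketched as a small store keyed by a random session ID (the cookie value). `SessionStore` is an illustrative name; this in-process version is exactly what causes the replication problem noted above, which is why real deployments keep sessions in a shared store such as Redis:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Toy in-memory session store: on login the server creates a random session
// ID, remembers which user it belongs to, and sends the ID back as a cookie.
// Each later request is looked up by that ID.
public class SessionStore {
    private final Map<String, String> sessions = new ConcurrentHashMap<>();

    // Called after the username/password check succeeds.
    public String createSession(String username) {
        String sessionId = UUID.randomUUID().toString();
        sessions.put(sessionId, username);
        return sessionId; // sent to the browser in a Set-Cookie header
    }

    // Called on every subsequent request carrying the session cookie.
    public String userFor(String sessionId) {
        return sessions.get(sessionId); // null means not authenticated
    }

    public void logout(String sessionId) {
        sessions.remove(sessionId);
    }
}
```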

(2) Therefore, introducing Token-based authentication.

The tokens used for token-based authentication are JSON Web Tokens (JWT). This implementation is stateless: we do not store any information about the user on the server.

The user first enters their login credentials, which are verified by the server. The server returns a signed token (the JWT), which contains all the information needed to allow or deny any given request to an API.

A JSON Web Token is a long encoded text string composed of three parts separated by a dot sign.

  • Header: contains information about the token itself, namely the algorithm used to sign it and the key used to validate it.
  • Payload: contains information (claims) about the client.
  • Signature: a hash generated from the previous two parts.

This JWT is stored on the client browser side. Subsequent requests to the server include the token, generally as an Authorization header in the form of a Bearer token. It can also be sent in the body of a POST request or as a query parameter.

The server will decode the JWT and if it is valid, it processes the request.

This approach is stateless, which is good. It also allows you to store additional data inside the JWT, such as the user’s permission level, saving yourself additional lookup calls to get and process the requested data.

(3) Passwordless

Instead of giving a user an email/username and password, they enter only their email address. The application will send them a one-time use link to that email, which the user clicks on to be automatically logged in to your website or application.

Alternatively, a code or one-time password can be sent through SMS.

However, if someone has access to the user’s email account, they would then have access to their account on your website. But this is not your responsibility.

(4) Single Sign-On (SSO)

To put it simply, it’s the enterprise equivalent of the “Login with Google” or “Login with Facebook” buttons we see in apps around the internet. We register an account initially in Google or Facebook etc and use that account to login to other apps like Spotify, Netflix, Zoom etc. We do this to avoid maintaining multiple username/passwords. Similarly, enterprises maintain a single user management system and employees use their corporate account to login to third-party services like Salesforce, Workday, Expensify etc without creating separate accounts or remembering multiple passwords.

There are many implementations of this, such as SAML, Facebook Connect, and OAuth.

A common analogy: OAuth is a protocol for authorization; it ensures Bob goes to the right parking lot. In contrast, Security Assertion Markup Language (SAML) is a protocol for authentication, allowing Bob to get past the guardhouse.

OAuth 2.0: If you’ve ever signed up to a new application and agreed to let it automatically source new contacts via Facebook or your phone contacts, then you’ve likely used OAuth 2.0. This standard provides secure delegated access. That means an application can take actions or access resources from a server on behalf of the user, without them having to share their credentials. It does this by allowing the identity provider (IdP) to issue tokens to third-party applications with the user’s approval.

OpenID Connect: If you’ve used your Google account to sign in to applications like YouTube, or Facebook to log into an online shopping cart, then you’re familiar with this authentication option. OpenID Connect is an open standard that organizations use to authenticate users. IdPs use this so that users can sign in to the IdP, and then access other websites and apps without having to log in or share their sign-in information.

SAML: You’ve more likely experienced SAML authentication in action in the work environment. For example, it enables you to log into your corporate intranet or IdP and then access numerous additional services, such as Salesforce, Box, or Workday, without having to re-enter your credentials. SAML is an XML-based standard for exchanging authentication and authorization data between IdPs and service providers to verify the user’s identity and permissions, then grant or deny their access to services.

Enterprises rely on web frameworks and protocols like OAuth 2.0, OpenID, and SAML to bring structure and security to federated identity. Knowing when to use each is a key step towards protecting your organization’s data from the ground up.

SAML (Security Assertion Markup Language) (Authentication)

There are 3 main participants involved in the SAML authentication flow:

Identity Provider (IdP)

This is the centralised user management system that we talked about earlier. This server is responsible for authenticating the user and passing the user details such as email address, name, department etc to the Service Provider. Popular identity providers are Azure AD, Auth0, Onelogin, Okta, G Suite etc.

Service Provider (SP)

This is the application that trusts the IdP and wants to use it for authentication.


Principal (User)

This is the user who’s trying to log into the SP via the IdP.

SP initiated Login workflow:

  1. The user goes to the SP’s website. If the user is not logged in, it shows a “Login with SSO” button.
  2. Upon clicking the login button, the user is redirected to the IdP’s website, where they’re asked to submit their credentials. Before this, the SP generates an XML message called an “AuthnRequest” with details about who is sending the request (Issuer), where to redirect to after the user is authenticated (Assertion Consumer Service URL), and security measures (ID, IssueInstant). This XML is encoded into a URL-safe string, embedded as a query param in a request to the IdP, and the user is redirected to this IdP URL.
  3. The IdP then performs the authentication. Once the user is successfully authenticated, the IdP sends an XML message called a “SAML Assertion” to the SP’s Assertion Consumer Service URL. This contains the user’s details such as name, email, department, etc., and is also digitally signed so that the SP can trust that the message is from the IdP and log the user in.

OAuth (Authorisation)

When trying to understand OAuth, it can be helpful to remember that OAuth scenarios almost always represent two unrelated sites or services trying to accomplish something on behalf of users or their software.

With OAuth, a user can sign in on one platform and then be authorized to perform actions and view data on another platform. OAuth makes it possible to pass authorization from one application to another regardless of what the two applications are.

Suppose Alice wants to access her company’s cloud file storage application. She has already signed into her company’s SSO, but she has not yet accessed the file storage application that day. When she opens up the file storage application, instead of simply letting her in, the application requests authorization for Alice from her SSO.

In response, the SSO sends an OAuth authorization token to the application. The token contains information about what privileges Alice should have within the application. The token will also have a time limit: after a certain amount of time, the token expires and Alice will have to sign in to her SSO again.

OAuth tokens are typically sent using HTTPS, meaning they are encrypted. They are sent at layer 7 of the OSI model.

Explain SQL injection attack

SQL injection is a web security vulnerability that allows an attacker to interfere with the queries that an application makes to its database.

It generally allows an attacker to view data that they are not normally able to retrieve. This might include data belonging to other users, or any other data that the application itself is able to access. In many cases, an attacker can modify or delete this data, causing persistent changes to the application’s content or behavior.

In some situations, an attacker can escalate an SQL injection attack to compromise the underlying server or other back-end infrastructure, or perform a denial-of-service attack.

SQL injection examples

There are a wide variety of SQL injection vulnerabilities, attacks, and techniques, which arise in different situations. Some common SQL injection examples include:

Consider a shopping application that displays products in different categories. When the user clicks on the Gifts category, their browser requests a URL such as:

https://insecure-website.com/products?category=Gifts

This causes the application to make an SQL query to retrieve details of the relevant products from the database:

SELECT * FROM products WHERE category = ‘Gifts’ AND released = 1

In the query above, the restriction released = 1 is being used to hide products that are not released.

The application doesn’t implement any defenses against SQL injection attacks, so an attacker can construct an attack by submitting the category Gifts'--

This results in the SQL query:

SELECT * FROM products WHERE category = 'Gifts'--' AND released = 1

The key thing here is that the double-dash sequence -- is a comment indicator in SQL, and means that the rest of the query is interpreted as a comment. This means that all products are displayed, including unreleased products.

Consider an application that lets users log in with a username and password. If a user submits the username wiener and the password bluecheese, the application checks the credentials by performing the following SQL query:

SELECT * FROM users WHERE username = 'wiener' AND password = 'bluecheese'

If the query returns the details of a user, then the login is successful. Otherwise, it is rejected.

Here, an attacker can log in as any user without a password simply by using the SQL comment sequence -- to remove the password check from the WHERE clause of the query. For example, submitting the username administrator'-- and a blank password results in the following query:

SELECT * FROM users WHERE username = 'administrator'--' AND password = ''

This query returns the user whose username is administrator and successfully logs the attacker in as that user.

  • UNION attacks, where you can retrieve data from different database tables.

For example, if an application executes the following query containing the user input “Gifts”:

SELECT name, description FROM products WHERE category = 'Gifts'

then an attacker can submit the input:

' UNION SELECT username, password FROM users--

This will cause the application to return all usernames and passwords along with the names and descriptions of products.

You can query the version details for the database. The way that this is done depends on the database type, so you can infer the database type from whichever technique works. For example, on Oracle you can execute:

SELECT * FROM v$version

You can also determine what database tables exist, and which columns they contain. For example, on most databases you can execute the following query to list the tables:

SELECT * FROM information_schema.tables

  • Blind SQL injection, where the results of a query you control are not returned in the application’s responses.

Many instances of SQL injection are blind vulnerabilities. Blind SQL injection arises when an application is vulnerable to SQL injection, but its HTTP responses do not contain the results of the relevant SQL query or the details of any database errors.
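To make the idea concrete, here is a minimal, self-contained sketch in Python using an in-memory SQLite database. Everything here is hypothetical: the page_loads function stands in for a vulnerable endpoint whose only observable signal is whether the response matched a row, and the attacker recovers a secret one character at a time using a boolean condition.

```python
import sqlite3
import string

# Toy database standing in for the vulnerable application's backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('administrator', 's3cret')")

def page_loads(user_input: str) -> bool:
    """Simulates a vulnerable endpoint: the attacker only learns whether
    the injected query matched at least one row, never the data itself."""
    query = "SELECT username FROM users WHERE username = '" + user_input + "'"
    try:
        return conn.execute(query).fetchone() is not None
    except sqlite3.Error:
        return False

# Extract the password one character at a time via a boolean condition.
recovered = ""
while True:
    for ch in string.ascii_lowercase + string.digits:
        candidate = recovered + ch
        payload = "administrator' AND substr(password, 1, {}) = '{}".format(
            len(candidate), candidate)
        if page_loads(payload):
            recovered = candidate
            break
    else:
        break  # no character extended the prefix: extraction complete

print(recovered)  # prints: s3cret
```

The attacker never sees query results directly; the true/false behaviour of the page is enough to reconstruct the secret, which is exactly why blind SQL injection is still exploitable.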

How to prevent SQL Injection?

Most instances of SQL injection can be prevented by using parameterized queries (also known as prepared statements) instead of string concatenation within the query.

The following code is vulnerable to SQL injection because the user input is concatenated directly into the query:

String query = "SELECT * FROM products WHERE category = '" + input + "'";
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery(query);

We can rewrite it to

PreparedStatement statement = connection.prepareStatement("SELECT * FROM products WHERE category = ?");
statement.setString(1, input);
ResultSet resultSet = statement.executeQuery();

The key difference is how the data reaches the executeQuery(...) method. In the vulnerable version, the full SQL statement is constructed before the driver is invoked, so maliciously crafted parameters become part of the query itself. In the parameterized version, the query string and the parameters are passed to the database separately, which allows the driver to interpret them correctly.

To protect a web site from SQL injection, you can use SQL parameters.

SQL parameters are values that are added to an SQL query at execution time, in a controlled manner.

For a parameterized query to be effective in preventing SQL injection, the string that is used in the query must always be a hard-coded constant, and must never contain any variable data from any origin.
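A runnable illustration of the difference, sketched in Python with the standard library's sqlite3 module (the table, payload, and column names are invented for the demo): string concatenation lets a comment-sequence payload strip the released = 1 check, while the parameterized version treats the same payload as an ordinary string literal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT, released INTEGER)")
conn.executemany("INSERT INTO products VALUES (?, ?, ?)",
                 [("Mug", "Gifts", 1), ("Secret Gadget", "Gifts", 0)])

payload = "Gifts' OR 1=1--"

# Vulnerable: user input is concatenated into the SQL text.
vulnerable_sql = ("SELECT name FROM products WHERE category = '"
                  + payload + "' AND released = 1")
# The comment sequence -- removes the released = 1 check, leaking everything.
leaked = [row[0] for row in conn.execute(vulnerable_sql)]

# Safe: the query text is a constant; the input travels as a bound parameter.
safe = [row[0] for row in conn.execute(
    "SELECT name FROM products WHERE category = ? AND released = 1", (payload,))]

print(leaked)  # ['Mug', 'Secret Gadget']
print(safe)    # [] -- no category is literally named "Gifts' OR 1=1--"
```

The same payload that dumps every product through concatenation matches nothing when bound as a parameter, because the driver never lets it alter the query structure.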

Design a peer-to-peer payment system.

(My initial thoughts)

What is the purpose of the system: Payment from one person to another

Possible Requirements:

  • User Interact with the system?

=> (to friend) Input handphone number + transfer payment amount

=> (to store) Scan QR?

=> (pay utilities/gas)?

=> Transfer bank account cash to E-Wallet option

=> Login, Logout

=> Set transfer limit

  • What will the user see?

=> E Wallet cash amount?

=> Past Transactions. Information about inflow and outflow of cash and to whom?

=> Budget tracker feature?

=> Notifications

DB storage concerns

=> How do I link each user to the bank? Probably has some bank API to connect phone number to bank account id?

Potential loopholes/bottlenecks?

=> Security… There are usually two types of attacks that pose threats to any peer-to-peer money transfer system: DoS and storage flooding.

=> Too many requests made?


Draw Major components?

(After reference from Linkedin article)

The problem is as follows: design a payment system in which users authorize payments (typically when buying a product), and the system interfaces with external payment processors like Visa to complete the payment.

A queue-based architecture

Payment systems demand high reliability, because the consequences of not making a payment the user requested, or of charging the user multiple times, are too severe. To achieve this reliability, the system puts a “payment task” onto a queue that is processed asynchronously.

  1. The front-end service accepts a user’s payment request. The first order of business is to set up a record in a database that says the user’s request has been received and payment is pending. The record is what will be used to communicate with the user what the status of the payment is.
  2. The front-end service also puts a message on the payment queue with the following information: a reference to the newly-created database record, and any details about the payment (which credit card to use, the amount, etc.). If either this step or the previous one fails, the user is immediately notified.
  3. An asynchronous processor reads from the queue, processing individual payment tasks. The processing is idempotent and crash-tolerant, as described in the referenced article. The primary objective of the processing is to make the payment using the external payment service (like Visa) and update the user’s payment database record based on the result.
  4. Once the payment goes through or fails in a non-recoverable way (e.g. the external service rejects the payment due to insufficient funds), the asynchronous processor notifies the user. Typically, this would be by sending an email to the user.
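The four steps above can be sketched as follows. This is a toy, single-process model with invented names (payments_db, payment_queue, charge_external_provider); a real system would use a durable queue and database, but the idempotency check is the same idea.

```python
import queue
import uuid

# Hypothetical in-memory stand-ins for the database and the payment queue.
payments_db = {}            # record_id -> payment record
payment_queue = queue.Queue()

def submit_payment(user: str, amount_cents: int) -> str:
    """Steps 1-2: record the request as PENDING, then enqueue a task."""
    record_id = str(uuid.uuid4())
    payments_db[record_id] = {"user": user, "amount": amount_cents,
                              "status": "PENDING"}
    payment_queue.put({"record_id": record_id, "amount": amount_cents})
    return record_id

def charge_external_provider(task: dict) -> str:
    """Stand-in for the external processor (e.g. Visa); assumed to succeed."""
    return "SUCCESS"

def process_next() -> None:
    """Steps 3-4: an asynchronous worker drains the queue idempotently."""
    task = payment_queue.get()
    record = payments_db[task["record_id"]]
    if record["status"] != "PENDING":
        return  # already processed: replaying the task is a no-op (idempotent)
    result = charge_external_provider(task)
    record["status"] = "PAID" if result == "SUCCESS" else "FAILED"
    # ...notify the user here, e.g. by email...

rid = submit_payment("alice", 25_00)
process_next()
print(payments_db[rid]["status"])  # PAID
```

The status check before charging is what makes a crashed-and-retried task safe: replaying it can never charge the user a second time.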

Important things to note


  • Important to be able to scale horizontally.


  • Consistency is a key concern in highly available systems. A system is consistent if all nodes see and return the same data at the same time. To make sure that each node has the same information, nodes need to send messages to each other to keep themselves in sync. However, messages can fail to be delivered or get lost, and some of the nodes might be unavailable.
  • There are several consistency models; the most common ones used in distributed systems are strong consistency, weak consistency, and eventual consistency.
  • Why did consistency matter when building a large payments system? Data in the system needed to be consistent. But just how consistent? For some parts of the system, only strongly consistent data would do. For example, knowing whether a payment has been initiated needed to be stored in a strongly consistent way. For other, less mission-critical parts, eventual consistency was considered a reasonable tradeoff. A good example is listing recent transactions: these could be implemented with eventual consistency (meaning the latest transaction might only show up in parts of the system after a while; in return, the operation returns with lower latency or is less resource-intensive).


  • Durability means that once data is successfully added to a data store, it will be available going forward. Some form of replication is usually used to increase durability: if the data is stored on multiple nodes, then even if one or more nodes go down, the data will still be available.
  • Why did data durability matter when building a payments system? For many parts of the system, no data could be lost, payments being critical. The distributed data stores we built on needed to support cluster-level data durability, so even if instances crashed, completed transactions would persist. These days, most distributed data stores, such as Cassandra, MongoDB, HDFS, and DynamoDB, support durability at various levels and can all be configured to provide cluster-level durability.

Managing Counters / Generating Unique Identifiers with Apache ZooKeeper

We use Apache ZooKeeper to coordinate the counter among the different servers: ZooKeeper allocates a disjoint range of counter values to each server.

Each server owns a range of counters. When it receives a request from the load balancer, it encodes the current counter value with Base64 (or Base62). Encoding is deterministic and injective: different counter values always produce different outputs, so collisions cannot occur, since the counters themselves never repeat. The encoded output forms the new identifier, which can then be stored in the database.

This scheme is commonly used for generating unique short URLs.
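A sketch of the encoding step, assuming Base62 (the same idea works with Base64) and a hypothetical counter range handed to this server by ZooKeeper:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(counter: int) -> str:
    """Encode a counter value as a short Base62 string. Distinct counters
    always yield distinct strings, so no collision handling is needed."""
    if counter == 0:
        return ALPHABET[0]
    digits = []
    while counter:
        counter, rem = divmod(counter, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

# Hypothetical range allocated to this server by ZooKeeper: [100000, 200000).
range_start, range_end = 100_000, 200_000
ids = [encode_base62(c) for c in range(range_start, range_start + 3)]
print(ids)
```

Because each server consumes counters only from its own range, servers can generate identifiers independently without talking to ZooKeeper on every request.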

More about ZooKeeper: ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical namespace which is organized similarly to a standard file system.

The namespace consists of data registers, called znodes in ZooKeeper parlance, and these are similar to files and directories. Unlike a typical file system, which is designed for storage, ZooKeeper data is kept in-memory, which means ZooKeeper can achieve high throughput and low latency.

How does Google Docs work?


RDBMS ensures ACID properties -> Atomicity, Consistency, Isolation and Durability.

Atomicity: All actions in the Xact happen, or none happens

Consistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent.

Isolation: Execution of one Xact is isolated from the other Xacts.

Durability: If a Xact commits, its effect persists.
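Atomicity in particular can be demonstrated with a short sketch using Python's sqlite3 as a stand-in RDBMS (the accounts table and transfer helper are invented for the demo): a transfer that fails partway is rolled back, leaving no partial debit behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(frm: str, to: str, amount: int) -> None:
    """All statements in the transaction commit together, or none do."""
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, frm))
        new_balance = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (frm,)).fetchone()[0]
        if new_balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, to))
        conn.commit()
    except Exception:
        conn.rollback()  # atomicity: the partial debit is undone

transfer("alice", "bob", 500)  # fails: alice only has 100
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'alice': 100, 'bob': 50} -- unchanged
```

Without the rollback, a crash between the debit and the credit would silently destroy money, which is exactly the failure mode atomicity rules out.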

Cons: Scaling an RDBMS is hard because it may involve consistent hashing to balance the traffic, and joins and DB queries must then span shards (CS4224 material). This increases complexity, especially with heavy reads and writes from a large user base of, say, 30 million users.

Application code is developed to distribute the data, distribute queries, and aggregate the results of data across all of the database instances. Additional code must be developed to handle resource failures, to perform joins across the different databases, for data rebalancing, replication, and other requirements.


Pros: High availability and easy scaling; just keep adding nodes when we want to expand.

NoSQL is developed for modern applications that are constantly changing (e.g., new volumes of data types). It allows for easier scaling and is much more flexible compared to SQL databases. NoSQL databases usually support auto-sharding, meaning that they natively and automatically spread data across an arbitrary number of servers, without requiring the application to even be aware of the composition of the server pool. Data and query load are automatically balanced across servers, and when a server goes down, it can be quickly and transparently replaced with no application disruption.
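A toy sketch of the sharding idea in Python (hash-based routing with invented shard names; production systems additionally handle rebalancing and replica placement, which this omits):

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]

def shard_for(key: str) -> str:
    """Route a key to a shard by hashing it. The application never needs
    to know which physical server ends up holding the data."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SHARDS[digest % len(SHARDS)]

# The same key always routes to the same shard, so reads find their writes.
placement = {user: shard_for(user) for user in ["alice", "bob", "carol"]}
print(placement)
```

Naive modulo hashing like this reshuffles most keys when a shard is added, which is why real systems prefer consistent hashing for the routing step.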

Cons: Eventual consistency

Design a system for logging

Over time your system will collect a lot of data. Most of this data is extremely useful. It can give you a view of the health of your system, its performance and problems. It can also give you valuable insight into who uses your system, how they use it, how often, which parts get used more or less, and so on.

Remember that logging is a sequence of consecutive events, which means the data becomes time-series data.

What are the requirements?

  • Build the system to collect logs
  • Monitoring: a dashboard to monitor and analyse the logs, presenting charts in a human-readable way
  • Alerting: alert users of significant events
  • High availability
  • Minimal data loss
  • Horizontal scalability -> continue working when traffic is heavy
  • Low latency -> an event should be available for consumption shortly after it is logged
  • Minimal operational overhead -> the system should stay up and running with minimal human intervention


- PutLogEntry: Client -> Log entry -> Service -> Store the log entry in a data store (e.g. Kafka)
- GetLog: Client -> GetLog(filters) -> Service -> Query the data store

Types of Logs:

  • Warning, Error, Fatal, Info, Function Level
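Most of these severity levels map directly onto typical logging libraries. A small sketch with Python's standard logging module, where messages below the configured threshold are dropped (the logger name and messages are invented for the demo):

```python
import io
import logging

# Capture log output in a buffer so we can inspect it.
buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

log = logging.getLogger("demo")
log.addHandler(handler)
log.setLevel(logging.WARNING)  # threshold: INFO and below are dropped

log.info("cache warmed")          # filtered out
log.warning("disk 85% full")      # logged
log.error("payment API timeout")  # logged

print(buf.getvalue())
```

Raising the threshold in production (say, to WARNING) is a cheap way to cut log volume without touching call sites.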

Potential Problems or bottlenecks:

  • On a processor, threads may run concurrently. We need to make sure that one thread fully prints its entire log message before the next thread prints.
  • To guarantee the right logging sequence, the Log Distributor must enter a critical section (or similar facility to serialize access to shared data) when passing the message to the observer, which is done indirectly via the message queue. Another thread trying to log is blocked until the former message is distributed.
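A minimal sketch of that critical section in Python: a lock serializes each thread's multi-part message, so lines from different threads never interleave (the log store is an invented in-memory list standing in for the distributor's message queue).

```python
import threading

log_lines = []
log_lock = threading.Lock()

def write_log(thread_name: str, parts) -> None:
    """The lock is the critical section: one thread's multi-part message
    is fully written before another thread may log."""
    with log_lock:
        for part in parts:
            log_lines.append(f"[{thread_name}] {part}")

threads = [
    threading.Thread(target=write_log, args=(f"t{i}", ["start", "work", "done"]))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread's three lines appear contiguously, never interleaved.
for i in range(0, len(log_lines), 3):
    chunk = log_lines[i:i + 3]
    assert len({line.split("]")[0] for line in chunk}) == 1
print(len(log_lines))  # 12
```

A blocked logger is the cost of this guarantee; higher-throughput designs hand each message to a queue and let a single consumer do the ordering instead.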

Taking example from Pinterest


We have many apps, and each of them sends messages to Kafka, an open-source event streaming platform. Since we cannot store everything on Kafka, we need to transfer the data to permanent storage; Pinterest uses cloud storage (S3).

Kafka is the central hub for messages.

Singer/Secor/Merced/Sanitizer: in-house products by Pinterest.

Singer (logging agent): reads the logs from the app servers and uploads them to Kafka.

Secor/Merced: data transport services (two versions exist). They move data in volume from Kafka to S3.

Sanitizer: sanitises data before processing. The whole pipeline supports message transportation, but along the way there can be duplicated or corrupted messages; the Sanitizer helps to filter these out.

Pinterest uses Spark Streaming for data processing.

Singer, logging agent requirements

  • Reliable, high throughput, low computational resource usage; supports various log file formats (text, Thrift, etc.); fair scheduling, so every logging agent can upload its logs
  • Auditing: know which logs are missing
  • Heartbeat mechanism: useful when there is more than one Singer instance, so as not to delay log uploads to Kafka

For more on Kafka:

  • Kafka is an event streaming platform which provides the following three functionalities:
  1. To publish (write) and subscribe to (read) streams of events, including continuous import/export of your data from other systems.
  2. To store streams of events durably and reliably for as long as you want.
  3. To process streams of events as they occur or retrospectively.

Kafka is a distributed system consisting of servers and clients that communicate via a high-performance TCP network protocol.

Servers: Kafka is run as a cluster of one or more servers that can span multiple datacenters or cloud regions. Some of these servers form the storage layer, called the brokers. Other servers run Kafka Connect to continuously import and export data as event streams to integrate Kafka with your existing systems such as relational databases as well as other Kafka clusters.

Clients: They allow you to write distributed applications and microservices that read, write, and process streams of events in parallel, at scale, and in a fault-tolerant manner even in the case of network problems or machine failures. Kafka ships with some such clients included, which are augmented by dozens of clients provided by the Kafka community: clients are available for Java and Scala including the higher-level Kafka Streams library, for Go, Python, C/C++, and many other programming languages as well as REST APIs.

Producers are those client applications that publish (write) events to Kafka, and consumers are those that subscribe to (read and process) these events.
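A toy in-memory model of those two roles (this is not the Kafka API; a real deployment uses a client library against a running broker, but the produce/consume-from-offset shape is the same):

```python
from collections import defaultdict

class ToyBroker:
    """Stores events per topic in order; consumers track their own offsets,
    so the same events can be read independently by many consumers."""
    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic: str, event: dict) -> None:   # producer role
        self.topics[topic].append(event)

    def consume(self, topic: str, offset: int) -> list:   # consumer role
        return self.topics[topic][offset:]

broker = ToyBroker()
broker.publish("payments", {"user": "alice", "amount": 2500})
broker.publish("payments", {"user": "bob", "amount": 900})

# Two independent consumers each read the full stream from offset 0.
billing = broker.consume("payments", 0)
audit = broker.consume("payments", 0)
print(len(billing), len(audit))  # 2 2
```

Because consuming does not remove events, adding a new downstream system (a fraud checker, say) is just another consumer starting at offset 0.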

The end, for now :)



