One of my life goals is to have a renowned architect like Jean Nouvel design my home using modern architecture while staying within my limited budget.
Hiring an architect gives you access to a good design with a creative solution. They will help you choose the right materials and finishes, and produce a design that stands the test of time, both structurally and aesthetically.
This post is not about architects. It shows that having an architect on a house construction team offers advantages over not having one. Similarly, having System Design Engineers on an organization's application team can bring the same benefits.
Welcome to this new issue, as we continue to learn about the basics of software engineering. This week's focus is on tracking back a little to understand the importance of system design. Understanding the role system design plays as a significant component of software engineering will help put our future issues in perspective as we continue to build upon these basics.
We will explore the following topics:
What is system design?
Why take it that seriously?
In the end, we will design a URL shortener service as an example to emphasize the point
With Artificial Intelligence and Machine Learning steadily being integrated into our contemporary world, there has been no better time to learn a common denominator: something that is fundamental to every complex and sophisticated application. The current moment reportedly resembles the early days of cloud computing, popularized by Amazon Web Services (AWS) in 2006.
It’s almost certain that if you fail to improve on learning and mastering the core aspects of software engineering, like system design, and continue to leverage your skills in the direction of these fundamentals, you will be left significantly behind.
Let’s talk about it.
What Is System Design?
A system is a group of interconnected or interdependent components or parts that work together to achieve a common purpose or goal. A system can be thought of as a set of elements or entities that interact with each other in some way to produce a desired outcome.
System design is the ability to plan and devise a strategy for creating a system or application. System design is a pretty large topic that covers everything from understanding the end goal of what you and your company want to achieve, to figuring out how to make that happen, to breaking that down into the parts of a larger plan so that the system can be built.
This could cover all elements, such as:
Architecture: The conceptual model that defines the structure, behavior, and other views of a system. We can use flowcharts to represent and illustrate the architecture.
Modules: Units that each handle one specific task in a system. A combination of modules makes up the system.
Components: These provide a particular function or group of related functions, and are made up of modules.
Interfaces: The shared boundary across which the components of the system exchange information.
Data: The management of information and data flow.
Why Should We Learn System Design?
If you are a freelance developer building online stores for small to medium businesses with third-party platforms like WordPress or Webflow, you might think you will never have a reason to learn how to build distributed systems (a term used interchangeably with system design in this issue).
No, scratch that; even a WordPress developer would still, at the very least, need to understand the technical know-how of how a distributed system works because the online store will eventually interact with and rely on some distributed systems to build a reliable product for their clients.
The online stores would one way or another rely on third parties like Stripe or Twilio to process online payments or send text messages to customers, respectively.
Likewise, for a front-end developer, your work never happens in isolation. Even when you are “just” rendering a web page, you are interacting with several different distributed systems through APIs, especially when it's a third-party platform.
A common example is Google's home page. You will find a plethora of tutorials teaching you how to build a clone of the Google search page. Literally, any decent front-end developer can build it. After all, it's just a search box with a logo.
However, what is not remotely easy to clone is the fact that, between the first character typed in the search box and the search results you get after hitting the Enter button, the browser is interacting with a very large number of servers through distributed systems.
System design knowledge is essential for engineers of all types and levels. Understanding how things work under the hood enables you to create solutions that are tailored to your system's needs and limitations, leading to greater success in your career.
Some more reasons…
There are even more practical and compelling reasons. Let’s explore them.
FAANG interviews include system design.
Learning system design is crucial. FAANG companies commonly include a system design interview in their hiring process, in some cases even at entry level. This demonstrates the importance of understanding system design.
If you're applying for a technical role, like front-end, you still need to understand system design. You might be wondering why this is important.
FAANG companies look for individuals who possess both top-notch skills and a deep understanding of key concepts, in addition to the ability to build technology capable of handling high traffic volumes.
Understanding how your project works as a whole is crucial to building it successfully. This is because if you don't have a comprehensive understanding of the different elements and how they interact with each other, it's easy to make mistakes that could cause the project to fail.
Therefore, it's important to invest time in learning about system design and how it applies to your project. By doing so, you'll be able to build in a way that is robust and sustainable, ensuring your project's success in the long term. Additionally, some roles pay well for this knowledge, so it's a valuable skill to have.
Or for even any Senior positions…
If you don't care too much about working at a big tech company, you might be interested in finding a high-level job at your current workplace or another company.
Due to the increasing demand for knowledge and expertise in distributed systems, many technology companies now require a system design interview. During this interview, you may be asked to design various things, such as Facebook's News Feed, a rideshare app like Uber or Lyft, or a URL shortener like TinyURL or LinkedIn's in-house URL shortener. We will be demonstrating how to design a URL shortener in this issue.
During that time, they also want you to cover everything that’s involved with that system, such as how it's built, how it's connected, how it stores information, how it shares information, how it handles mistakes, and more.
Many engineers may not know how to make the right decisions about certain concepts. This is why some companies use this as a way to distinguish between a senior and a junior engineer.
Let's design a URL shortener like TinyURL, or even LinkedIn's URL shortener.
Designing a URL Shortener Service
URL shortener services like Rebrandly and Bit.ly are very popular for generating shorter aliases for long URLs.
You need to design a kind of web service where if a user gives a long URL, the service returns a short URL, and if the user gives a short URL, it returns the original long URL.
We could use a brute-force approach. When a user gives a long URL, we convert it into a short URL and simply store the pair in the database. When a user later requests the short URL, we look it up in the database, get the long URL, and redirect the user to the original URL. Easy.
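The lookup-and-redirect half of this flow can be sketched as follows. This is a minimal illustration, not a full web service: the dictionary stands in for the database, the function name `handle_short_link` is hypothetical, and the choice of a 302 status code (rather than 301) is an assumption made for the example.

```python
def handle_short_link(path, db):
    """Return (status_code, headers) for a request to a short link.

    `db` is a plain dict standing in for the database (an assumption).
    """
    alias = path.lstrip("/")
    long_url = db.get(alias)
    if long_url is None:
        # Unknown alias: nothing to redirect to.
        return 404, {}
    # Known alias: tell the browser where the original URL lives.
    return 302, {"Location": long_url}


db = {"waTx4p1": "https://example.org/some/very/long/path?with=query"}
print(handle_short_link("/waTx4p1", db))
print(handle_short_link("/unknown", db))
```

Everything hard about the design lives behind that `db.get` call: how the alias was generated, and how the store keeps up with the traffic.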
This seems simple until you have to consider the scalability of the service. What happens when we start to have 10 million new URL shortening requests per month? What happens when our little server, probably on a shared hosting platform, starts to give way, and response times slow down because of the large number of database writes per second?
Well, let’s start by talking about some of the requirements.
Requirements
Always ask questions to identify the scope of the system you are attempting to design. Here are some of the requirements we will be working with in this example.
Given a long URL, the service should generate a shorter and unique alias for it.
When the user hits a short link, the service should redirect to the original link.
Links will expire after a default time span.
The system should be highly available. This is really important to consider because if the service goes down, all the URL redirection will start failing.
URL redirection should happen in real-time with minimal latency.
Shortened links should not be predictable.
Traffic
Let's assume our service has 20 million new URL shortenings per month, and that we store every long URL and its associated short link for 5 years. Over this period, the service will generate about 1.2 billion records.
20 million * 5 years * 12 months = 1.2 Billion
URL Length
Let’s consider that we are using 7 characters to generate a short URL. These characters are a combination of 62 alphanumeric characters [A-Z, a-z, 0–9], something like:
http://example.com/waTx4p1
Data Capacity Modeling
Next, we estimate the storage capacity of the system. We need to understand how much data we might have to insert into our system, which columns or attributes will be stored in our database, and how much storage that adds up to over the next five years.
Let’s make the assumptions given below for different attributes:
An average long URL size of 2 KB, i.e., 2,048 characters at one byte per character
A short URL size of 17 bytes, i.e., 17 characters
A creation timestamp of 7 bytes
The above assumptions give a total of about 2.024 KB per shortened URL entry in the database. If we calculate the total storage for 20 million new URLs per month, then the total size is:
20,000,000 * 2.024 KB = 40,480,000 KB = 40.48 GB per month
40.48 GB * 12 months = 485.76 GB ≈ 0.486 TB per year
40.48 GB * 12 months * 5 years = 2,428.8 GB ≈ 2.43 TB
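The estimate is easy to reproduce in a few lines. The sketch below works directly in bytes with decimal units (1 GB = 10^9 bytes), using the per-field sizes assumed in this section; the constant and function names are made up for illustration. Worked this way, it comes out at about 41 GB per month and about 2.5 TB over five years. Small differences from hand-rounded figures come from mixing 1,000-based and 1,024-based conversions.

```python
# Assumed per-record sizes in bytes, taken from the assumptions in this section.
LONG_URL_BYTES = 2048
SHORT_URL_BYTES = 17
TIMESTAMP_BYTES = 7
NEW_URLS_PER_MONTH = 20_000_000

RECORD_BYTES = LONG_URL_BYTES + SHORT_URL_BYTES + TIMESTAMP_BYTES


def storage_estimate(months):
    """Estimated storage in bytes after the given number of months."""
    return RECORD_BYTES * NEW_URLS_PER_MONTH * months


print(f"per record: {RECORD_BYTES} bytes")
print(f"per month:  {storage_estimate(1) / 1e9:.2f} GB")
print(f"per year:   {storage_estimate(12) / 1e12:.3f} TB")
print(f"5 years:    {storage_estimate(60) / 1e12:.3f} TB")
```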
We need to think about the writes and reads that will happen on our system for this amount of data. This will decide what kind of database, whether RDBMS or NoSQL, we need to use.
URL Shortening Logic—Encoding
To convert a long URL into a unique short URL, we can use techniques such as Base62 encoding or MD5 hashing. In this issue, we will discuss only the Base62 approach.
Base62 Encoding: The Base62 encoder uses a combination of letters and digits, for a total of 62 characters: A-Z, a-z, 0-9 (26 + 26 + 10 = 62).
So for our assumption of a 7-character short URL, we can have 62 to the power of 7, i.e., 62^7, where 62 is the base and 7 is the exponent. The calculation is simply 62 multiplied by itself 7 times.
Therefore, 62^7 = 62 × 62 × 62 × 62 × 62 × 62 × 62 = 3,521,614,606,208, which is about 3,500 billion, or 3.5 trillion.
That is roughly 3.5 trillion potential URLs that can be generated from the Base62 combination. The same calculation works for any other base you might be more comfortable using, e.g., Base64.
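A minimal Base62 encoder and decoder might look like the sketch below. The character order (digits, then lowercase, then uppercase) is an assumption; any fixed ordering of the 62 characters works as long as the encoder and decoder agree on it.

```python
import string

# Assumed character order: 0-9, a-z, A-Z. Any fixed order of the 62 symbols works.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def base62_encode(number):
    """Encode a non-negative integer as a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(chars))


def base62_decode(text):
    """Decode a Base62 string back into an integer."""
    number = 0
    for char in text:
        number = number * BASE + ALPHABET.index(char)
    return number


print(BASE ** 7)                     # size of the 7-character keyspace
print(base62_encode(BASE ** 7 - 1))  # largest value that fits in 7 characters
```

The largest number that still fits in 7 characters is 62^7 - 1; one more and the encoded string grows to 8 characters.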
I think Base62 is quite enough, especially in comparison to base 10: with only the digits 0–9, a 7-character alias gives just 10 million combinations.
With Base62, and assuming the service generates 1,000 short URLs per second, it will take about 110 years to exhaust the combinations calculated above. Let's calculate:
1,000 URLs/sec * 86,400 sec/day = 86,400,000 URLs / day
86,400,000 URLs/day * 7 days = 604,800,000 URLs / week
86,400,000 URLs/day * 365 days = 31,536,000,000 URLs / year
31,536,000,000 URLs/year * 110 years = 3,468,960,000,000 URLs in 110 years
Clearly, the keyspace is large enough to give every short URL a unique combination for roughly the next 110 years at this rate.
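The same arithmetic as a short script, in case you want to try other alias lengths or request rates. The function name is made up for illustration, and a year is taken as 365 days.

```python
KEYSPACE = 62 ** 7                # 7-character Base62 aliases
URLS_PER_SECOND = 1000            # assumed generation rate
SECONDS_PER_YEAR = 86_400 * 365   # ignoring leap years


def years_to_exhaust(keyspace, rate_per_second):
    """Years until every combination has been handed out at a constant rate."""
    return keyspace / (rate_per_second * SECONDS_PER_YEAR)


print(round(years_to_exhaust(KEYSPACE, URLS_PER_SECOND), 1))
# An 8-character alias multiplies the keyspace, and the time, by 62.
print(round(years_to_exhaust(62 ** 8, URLS_PER_SECOND)))
```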
Database
We may use an RDBMS that provides ACID guarantees, but we will face scalability issues with relational databases. If you think you can use sharding to resolve the scalability issue, then you can decide to use it, but it will increase the complexity of the system.
With 20 million new URLs per month, there will be a lot of short URL resolutions and redirections. Both reads and writes will be heavy at this volume, so scaling a relational database using shards will definitely increase the design complexity.
In my opinion, a NoSQL database is better suited for this design. NoSQL databases are generally built for high availability and are easier to scale horizontally. The main drawback, however, is the eventual consistency model. When we write something to the database, it takes a little time to replicate it to the other nodes. Nodes here are the machines we add to expand storage capacity when we need to scale the application.
Mapping Long URL Into Short URL In Database
To generate a shorter alias for a long URL, we can produce a candidate short URL with Base62 and store it in the database. The following are the steps to implement this technique:
Check if the short URL already exists in the database. If it does, generate a new unique short URL.
If the short URL does not exist in the database, store the long URL and short URL in the database, along with other relevant information.
By following these steps, we can ensure that each short URL is unique and avoid any collisions.
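The check-then-insert steps above can be sketched as follows. How the candidate alias is derived is an assumption here: the sketch hashes the long URL together with a retry counter using MD5 and Base62-encodes the result, but any scheme that produces a 7-character candidate would do. A dictionary stands in for the database.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
ALIAS_LENGTH = 7


def base62_encode(number):
    """Encode a non-negative integer as a Base62 string."""
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def candidate_alias(long_url, attempt):
    """Hypothetical candidate generator: hash URL plus attempt number."""
    digest = hashlib.md5(f"{long_url}#{attempt}".encode()).digest()
    number = int.from_bytes(digest, "big") % (62 ** ALIAS_LENGTH)
    # Left-pad so every alias has exactly ALIAS_LENGTH characters.
    return base62_encode(number).rjust(ALIAS_LENGTH, ALPHABET[0])


def shorten(long_url, db):
    """Check the store for the candidate; on a clash, generate a new one."""
    attempt = 0
    while True:
        alias = candidate_alias(long_url, attempt)
        if alias not in db:        # step 1: does the short URL already exist?
            db[alias] = long_url   # step 2: store the mapping
            return alias
        attempt += 1               # taken: try a different candidate


db = {}
print(shorten("https://example.org/a/very/long/path", db))
```

Note that the check and the insert are two separate operations, which is fine for a single process with a local dictionary.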
The problem with this method is that it only works reliably when you have a single server. When you have multiple servers, which is often the case at a scale of 20 million new URLs per month, this method can cause a race condition.
This means that several servers could create the same short URL for different long URLs, even if each of them checked the database before writing to it. One mapping would then overwrite the other, corrupting our data.
One way to work around this problem is to use a counter-based method. This approach suits a scalable solution like this because a counter only ever moves forward, so every new request gets a new value.
When a designated server receives a new request to shorten a URL, it communicates with the counter. The counter returns a unique number and increments its internal value. This process repeats for each subsequent request.
The unique number is used to generate the short URL.
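A single-process sketch of the counter idea is shown below. The `IdCounter` class is a stand-in for the counter server (an assumption made for illustration), with a lock playing the role of the server handling one request at a time.

```python
import string
import threading

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(number):
    """Encode a non-negative integer as a Base62 string."""
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


class IdCounter:
    """Stand-in for the counter server: hands out each number exactly once."""

    def __init__(self, start=1):
        self._value = start
        self._lock = threading.Lock()

    def next(self):
        # The lock makes "read, then increment" a single atomic step.
        with self._lock:
            value = self._value
            self._value += 1
            return value


def shorten(long_url, counter, db):
    """No existence check needed: the counter never repeats a number."""
    alias = base62_encode(counter.next())
    db[alias] = long_url
    return alias


counter, db = IdCounter(), {}
print(shorten("https://example.org/first", counter, db))
print(shorten("https://example.org/second", counter, db))
```

One thing to keep in mind: consecutive numbers produce consecutive aliases, so on its own this does not satisfy the requirement that shortened links be unpredictable. Some extra scrambling step would be needed, which this sketch leaves out.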
The next issue is the creation of a single point of failure if the counter server goes down for any period of time. This problem persists even if we attempt to create multiple servers for a single counter. It's challenging to maintain a single counter and return the output across all associated servers.
So we can instead use multiple internal counters across multiple servers, with each server owning a different, non-overlapping counter range. For example, server 1 handles 1 to 1 million, server 2 handles 1,000,001 to 2 million, and so on.
We are making some progress, but we still have a problem with failure. If one of the servers for the counters goes down, it will be hard to access the range that the counter is handling, and we will lose that range.
Also, if one counter reaches its maximum limit, resetting the counter will be difficult because there is no single host available to coordinate them all. There is no master-slave relationship between the counter servers, so nothing coordinates and synchronizes them.
To solve this problem, a distributed coordination service like ZooKeeper can be used to take over these tedious tasks and overcome various challenges of a distributed system, such as race conditions, deadlocks, or partial failures. ZooKeeper is essentially a distributed coordination service that manages a large set of hosts. Kafka, for example, has traditionally used it for tasks such as leader election.
This system keeps track of various elements such as server names, active and inactive servers, and configuration information for all hosts.
It provides coordination and maintains synchronization between the multiple servers, which makes it a good fit for our current problem.
Let’s take the first two billion combinations from our available 3.5 trillion combinations. We will divide the 2 billion combinations into 1000 ranges of 2 million each.
range 1 = 1 - 2,000,000
range 2 = 2,000,001 - 4,000,000
...
range 1000 = 1,998,000,001 - 2,000,000,000
When a server starts handling requests, it asks ZooKeeper for an unused range. For example, if a server (let's call it S1) is assigned range 1, S1 generates each short URL by incrementing its counter within that range and encoding the resulting number. Because ranges never overlap, every number is unique, so there is no possibility of a collision. It also eliminates the need to check the database to see if the short URL already exists. Instead, we can directly insert the mapping of a long URL and a short URL into the database.
In the worst scenario, if one server stops working, we will have at most two million combinations in ZooKeeper that we can't use. Some solutions suggest reducing the range to one or half a million. Either way, the goal is to lose a small enough range that it won't matter much in the bigger picture, which is to have high availability.
If one server reaches the end of its range, it can get a new range from ZooKeeper. Adding a new server is simple: ZooKeeper will give an unused number range to the new server. If we use up all the numbers in the first 2 billion, we'll move on to the next 2 billion to keep going.
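The range hand-out can be illustrated without a real coordination service. In the sketch below, `RangeAllocator` is an in-memory stand-in for the role ZooKeeper plays in this design; it does not use ZooKeeper's actual API, and the class and method names are hypothetical.

```python
class RangeAllocator:
    """In-memory stand-in for the coordinator that hands out unused ranges."""

    def __init__(self, range_size=2_000_000, total=2_000_000_000):
        self.range_size = range_size
        self.total = total
        self._next_start = 1

    def lease(self):
        """Return the next unused (start, end) range, inclusive on both ends."""
        if self._next_start > self.total:
            raise RuntimeError("all ranges have been handed out")
        start = self._next_start
        end = start + self.range_size - 1
        self._next_start = end + 1
        return start, end


class ShortenerServer:
    """Application server that counts locally inside its leased range."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.current, self.end = allocator.lease()

    def next_id(self):
        # Range used up: ask the coordinator for a fresh one.
        if self.current > self.end:
            self.current, self.end = self.allocator.lease()
        value = self.current
        self.current += 1
        return value


allocator = RangeAllocator()
s1, s2 = ShortenerServer(allocator), ShortenerServer(allocator)
print(s1.next_id(), s2.next_id())  # each server starts in its own range
```

Each server only talks to the coordinator once per range, not once per request, so the coordinator is not on the hot path. If a server dies mid-range, the rest of its range is simply never used.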
This is the point where we stop with the design in this issue. There are definitely many more topics we have not mentioned, including higher-level concepts that add to the service's efficiency: caching, redirection, analytics, load balancing, and the security of the system.
Rounding up…
Learning system design is a crucial aspect of software engineering that cannot be overlooked. It is not something that can be mastered overnight or in a few weeks. It requires a step-by-step approach and consistent practice over time to become ingrained in one's subconscious.
Developers who invest their time in mastering system design not only increase their chances of landing a job at top tech companies but also enhance their ability to build robust and sustainable systems that meet the needs of their users.
It is essential to approach system design with a mindset of continuous improvement and to keep up with the latest trends and technologies. By doing so, one can develop a strong foundation in software engineering and enjoy a fulfilling career in the tech industry.
We are dedicated to teaching you software engineering concepts, including system design, step by step. We hope you find this issue educational and informative, and we plan to continue offering valuable content to help you progress in your career.
We love reading your comments and feedback! Don't hesitate to give us a like and leave a comment.
Cheers
Happy learning!