Ethan's Blog


  • Home

  • Archives

  • Tags

  • Search

A Framework for System Design Interviews

Posted on 2023-10-15

You have just landed a coveted on-site interview at your dream company. The hiring coordinator sends you a schedule for that day. Scanning down the list, you feel pretty good about it until your eyes land on this interview session - System Design Interview.

System design interviews are often intimidating. It could be as vague as “designing a well-known product X?”. The questions are ambiguous and seem unreasonably broad. Your weariness is understandable. After all, how could anyone design a popular product in an hour that has taken hundreds if not thousands of engineers to build?

The good news is that no one expects you to. Real-world system design is extremely complicated. For example, Google search is deceptively simple; however, the amount of technology that underpins that simplicity is truly astonishing. If no one expects you to design a real-world system in an hour, what is the benefit of a system design interview?

The system design interview simulates real-life problem solving where two co-workers collaborate on an ambiguous problem and come up with a solution that meets their goals. The problem is open-ended, and there is no perfect answer. The final design is less important compared to the work you put in the design process. This allows you to demonstrate your design skill, defend your design choices, and respond to feedback in a constructive manner.

Read more »

Back-of-the-envelope Estimation

Posted on 2023-10-14

In a system design interview, sometimes you are asked to estimate system capacity or performance requirements using a back-of-the-envelope estimation. According to Jeff Dean, Google Senior Fellow, “back-of-the-envelope calculations are estimates you create using a combination of thought experiments and common performance numbers to get a good feel for which designs will meet your requirements” [1].

You need to have a good sense of scalability basics to effectively carry out back-of-the-envelope estimation. The following concepts should be well understood: power of two [2], latency numbers every programmer should know, and availability numbers.

Power of two

Although data volume can become enormous when dealing with distributed systems, calculation all boils down to the basics. To obtain correct calculations, it is critical to know the data volume unit using the power of 2. A byte is a sequence of 8 bits. An ASCII character uses one byte of memory (8 bits). Below is a table explaining the data volume unit (Table 1):

Power Approximate value Full name Short name
10 1 Thousand 1 Kilobyte 1 KB
20 1 Million 1 Megabyte 1 MB
30 1 Billion 1 Gigabyte 1 GB
40 1 Trillion 1 Terabyte 1 TB
50 1 Quadrillion 1 Petabyte 1 PB
Read more »

Scale from Zero to Millions of Users

Posted on 2023-10-13

Designing a system that supports millions of users is challenging, and it is a journey that requires continuous refinement and endless improvement. In this chapter, we build a system that supports a single user and gradually scale it up to serve millions of users. After reading this chapter, you will master a handful of techniques that will help you to crack the system design interview questions.

Single server setup

A journey of a thousand miles begins with a single step, and building a complex system is no different. To start with something simple, everything is running on a single server. Figure 1 shows the illustration of a single server setup where everything is running on one server: web app, database, cache, etc.

Read more »

更加复杂的数据结构

Posted on 2023-04-15

引言

目前为止,我们接触了大量的数据结构,包括利用指针实现的三剑客和 C++ 自带的 STL 等。对于一些题目,我们不仅需要利用多个数据结果解决问题,还需要把这些数据结构进行嵌套和联动,进行更为复杂、更为快速的操作。

并查集

并查集 (Union-Fnd) 可以动态地连通两个点,并且可以非常快速地判断两个点是否连通。假设存在 n 个节点,我们先将所有节点的父亲标为自己;每次要连接节点 i 和 j 时,我们可以将 i 的父亲标为 j;每次要查询两个节点是否相连时,我们可以查找 i 和 j 的祖先是否最终为同一个人:

其中 union 操作可以将两个集合连在一起,find 操作可以查找给定节点的祖先,并且如果可以的话,将集合的层数/高度降低。

684. Redundant Connection
因为需要判断是否两个节点被重复连通,所以我们可以使用并查集来解决此类问题。为了加速查找,我们可以使用路径压缩和按秩合并来优化并查集。

Read more »

指针三剑客之三:图

Posted on 2023-04-13

数据结构介绍

作为指针三剑客之三,图是树的升级版。图通常分为有向 (Directed) 或无向 (Undirected),有循环 (Cyclic) 或无循环 (Acyclic),所有节点相连 (Connected) 或不相连 (Disconnected)。树即是一个相连的无向无环图,而另一种很常见的图是有向无环图 (DAG):

假设图中一共有 n 个节点、m 条边,图通常有两种表示方法:

  • 邻接矩阵 (Adjacency Matrix):我们可以建立一个 n×n 的矩阵 G,如果第 i 个节点连向第 j 个节点,则 G[i][j]=1,反之为 0;如果图是无向的,则这个矩阵一定是对称矩阵,即 G[i][j]=G[j][i];
  • 邻接链表 (Adjacency List):我们可以建立一个大小为 n 的数组,每个位置 i 储存一个数组或者链表,表示第 i 个节点连向的其它节点;

邻接矩阵空间开销比邻接链表大,但是邻接链表不支持快速查找 i 和 j 是否相连,因此两种表示方法可以根据题目需要适当选择。除此之外,我们也可以直接用一个 m×2 的矩阵储存所有的边。

Read more »
1…345…55
necusjz

necusjz

271 posts
16 tags
© 2016 - 2026 necusjz
Powered by Hexo
Theme - NexT.Mist