16 system design concepts I wish I knew before my interview

Tram Ho

1. Domain Name System (DNS)

1.1 What is DNS?

DNS is a fundamental part of the Internet that translates human-readable domain names (such as www.example.com) into computer-readable IP addresses (such as 192.0.2.1).

1.2 How DNS Works

  • DNS Query: When a user types a URL, the browser sends a DNS query to a DNS resolver.
  • Root Servers: If the resolver does not already have the answer cached, it forwards the query to a DNS root server, which points it to the appropriate TLD server.
  • TLD Server: The TLD server directs the resolver to the authoritative server responsible for the domain in question.
  • Authoritative Servers: The authoritative server returns the IP address for the requested domain to the resolver.
  • Response: The resolver returns the IP address to the browser, and the browser connects to the desired server.
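To see the end result of the resolution chain above, here is a minimal sketch using Python's standard socket module. The hostname example.com is only an illustration; the root, TLD, and authoritative steps all happen behind the operating system's stub resolver.

```python
import socket

# Ask the system's resolver to translate a hostname into IP addresses.
# Behind this single call, the resolver walks the chain described above:
# root server -> TLD server -> authoritative server.
def resolve(hostname: str) -> list[str]:
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    print(resolve("example.com"))
```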

2. Load balancer

2.1 What is a load balancer?

A load balancer is a hardware device or software component that distributes incoming traffic across multiple servers. This improves application performance, availability, and reliability.

2.2 Types of load balancing

  • Round Robin: Requests are evenly distributed among servers.
  • Least Connections: Requests are sent to the server with the fewest active connections.
  • IP Hash: Requests are assigned based on a hash of the client’s IP address.
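The three strategies above can be sketched in a few lines of plain Python. The backend addresses are hypothetical, and a real load balancer would of course track connections and health checks itself.

```python
import hashlib
from itertools import cycle

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical backends

# Round Robin: hand out servers in a fixed rotating order.
round_robin = cycle(servers)

def pick_round_robin() -> str:
    return next(round_robin)

# Least Connections: track active connections and pick the least-loaded server.
active_connections = {s: 0 for s in servers}

def pick_least_connections() -> str:
    return min(active_connections, key=active_connections.get)

# IP Hash: the same client IP always maps to the same server.
def pick_ip_hash(client_ip: str) -> str:
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

if __name__ == "__main__":
    print(pick_round_robin(), pick_least_connections(), pick_ip_hash("203.0.113.7"))
```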

3. API Gateway

3.1 What is an API Gateway?

An API gateway is a server that sits between clients and microservices. API gateways route API requests to the right microservices, hiding system complexity.

3.2 Functions of API Gateway

  • Request routing: Forward requests from clients to the appropriate microservice.
  • Authentication: Validate client access rights and enhance security.
  • Rate limiting: Limit the number of requests to the system to prevent overload.
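The three functions above can be sketched as a tiny in-memory gateway. The routing table, the API key, and the token-bucket parameters are assumptions for illustration only, not the API of any real gateway product.

```python
import time

# Hypothetical routing table: path prefix -> backend microservice address.
ROUTES = {
    "/users": "http://user-service:8080",
    "/orders": "http://order-service:8080",
}

class TokenBucket:
    """Rate limiter: allow about `rate` requests per second, with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = float(capacity), time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

limiter = TokenBucket(rate=5, capacity=10)

def handle(path: str, api_key: str) -> str:
    # Authentication: validate the client's credentials (a static key set here for illustration).
    if api_key not in {"demo-key"}:
        return "401 Unauthorized"
    # Rate limiting: reject requests once the client exceeds its quota.
    if not limiter.allow():
        return "429 Too Many Requests"
    # Request routing: forward to the microservice that owns the path prefix.
    for prefix, backend in ROUTES.items():
        if path.startswith(prefix):
            return f"forward to {backend}{path}"
    return "404 Not Found"

if __name__ == "__main__":
    print(handle("/users/42", "demo-key"))
```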

4. Content Delivery Network (CDN)

4.1 What is a CDN?

A CDN is a network service that speeds up the delivery of web content by caching the content in data centers around the world.

4.2 How CDNs work

  • Edge Servers: CDNs install edge servers in data centers around the world to cache content closer to users.
  • Request routing: When a user accesses content, the CDN selects the closest edge server to route the request.
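Request routing boils down to "pick the edge server closest to the client". Real CDNs do this with DNS or anycast; the edge locations and coordinates below are made up for illustration.

```python
import math

# Hypothetical edge locations: name -> (latitude, longitude).
EDGE_SERVERS = {
    "frankfurt": (50.1, 8.7),
    "virginia": (38.9, -77.0),
    "singapore": (1.35, 103.8),
}

def distance(a, b):
    # Rough planar distance; precise geodesy is not the point of the sketch.
    return math.hypot(a[0] - b[0], a[1] - b[1])

def route_request(client_location):
    """Return the edge server nearest to the client; it serves cached content or fetches from origin on a miss."""
    return min(EDGE_SERVERS, key=lambda name: distance(EDGE_SERVERS[name], client_location))

if __name__ == "__main__":
    print(route_request((48.8, 2.3)))  # a client near Paris -> "frankfurt"
```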

5. Forward and reverse proxies

5.1 Forward proxy

A forward proxy is a server that sits between clients and the Internet and forwards client requests to target servers on their behalf. Forward proxies can bypass network restrictions, improve security, and use caching to save bandwidth.

5.2 Reverse Proxy

A reverse proxy is a server that sits between the Internet and backend servers and forwards client requests to the appropriate backend. Reverse proxies can provide load balancing, caching, increased security, and faster delivery of content.
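A minimal reverse-proxy sketch using only the Python standard library. The backend address is an assumption, and production proxies such as nginx or HAProxy add caching, TLS termination, and load balancing on top of this basic forwarding.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

BACKEND = "http://127.0.0.1:9000"  # hypothetical backend server

class ReverseProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Forward the client's request to the backend and relay the response.
        with urlopen(BACKEND + self.path) as upstream:
            body = upstream.read()
            self.send_response(upstream.status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), ReverseProxy).serve_forever()
```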

6. Caching

6.1 What is caching?

Caching is a technique for efficiently storing and serving frequently accessed data. By keeping a copy of the data closer to the user, caching improves response times.

6.2 Cache types

  • Browser cache: Content cached in the client-side web browser.
  • CDN Cache: Content cached on edge servers.
  • In-memory cache: Data cached in the server’s memory. Examples: Redis, Memcached.
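A minimal in-memory cache sketch with a time-to-live, in the spirit of what Redis or Memcached provide as a shared service (here just a local dict, and the "slow database query" is only pretended).

```python
import time

class TTLCache:
    """Store values for a limited time; expired entries are treated as misses."""
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]          # cache hit
        self.store.pop(key, None)    # expired or missing -> cache miss
        return None

    def set(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

cache = TTLCache(ttl_seconds=60)

def get_user_profile(user_id: int) -> dict:
    profile = cache.get(user_id)
    if profile is None:
        profile = {"id": user_id, "name": "Ann"}  # pretend this is a slow database query
        cache.set(user_id, profile)
    return profile
```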

7. Data partitioning

7.1 What is data partitioning?

Data partitioning is the technique of dividing data into multiple parts to improve processing speed and availability.

7.2 Types of partitioning

  • Horizontal partitioning: Splits data by rows. Each partition contains a subset of the overall dataset.
  • Vertical partitioning: Splits data by columns. Each partition contains specific columns of the dataset.
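The difference between the two can be sketched on a small in-memory table; the user records below are invented for illustration.

```python
# A tiny "table" of user rows: each dict is one row.
users = [
    {"id": 1, "name": "Ann", "email": "ann@example.com", "country": "DE"},
    {"id": 2, "name": "Bob", "email": "bob@example.com", "country": "US"},
    {"id": 3, "name": "Chi", "email": "chi@example.com", "country": "VN"},
]

# Horizontal partitioning: split by rows, e.g. one partition per country.
horizontal = {}
for row in users:
    horizontal.setdefault(row["country"], []).append(row)

# Vertical partitioning: split by columns, e.g. identity columns vs. contact columns.
identity_part = [{"id": r["id"], "name": r["name"]} for r in users]
contact_part = [{"id": r["id"], "email": r["email"]} for r in users]

print(horizontal["DE"])   # all rows in the DE partition
print(identity_part[0])   # only the identity columns of row 1
```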

8. Database replication

8.1 What is database replication?

Database replication is the process of copying data to multiple database servers to improve data availability, fault tolerance, and performance.

8.2 Types of replication

  • Master-Slave Replication: A master server makes data changes and a slave server copies those changes.
  • Master-Master Replication: All servers make data changes and replicate the changes to other servers.
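A toy master-slave sketch: the master applies writes, records them, and ships them to replicas, which apply the same changes in order. Real systems do this with write-ahead logs or binlogs over the network; this only shows the shape of the idea.

```python
class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, change):
        key, value = change
        self.data[key] = value

class Master:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.change_log = []

    def write(self, key, value):
        # The master makes the change, records it, and replicates it to the slaves.
        self.data[key] = value
        change = (key, value)
        self.change_log.append(change)
        for replica in self.replicas:
            replica.apply(change)

replicas = [Replica(), Replica()]
master = Master(replicas)
master.write("user:1", {"name": "Ann"})
print(replicas[0].data == master.data)  # True: reads can be served from any replica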

9. Distributed messaging system

9.1 What is a distributed messaging system?

A distributed messaging system is a system for efficiently exchanging messages between applications. This results in looser coupling between components, which improves scalability and fault tolerance.

9.2 Common distributed messaging systems

  • Apache Kafka: A distributed streaming platform with high throughput, fault tolerance, and scalability.
  • RabbitMQ: A highly scalable and reliable message broker.
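The decoupling idea can be sketched with an in-process queue: the producer only knows about the queue, not about the consumer. Kafka and RabbitMQ provide the same pattern as durable, distributed services through their own client libraries; the event payloads below are invented.

```python
import queue
import threading

# The "broker": a queue that sits between producer and consumer.
messages = queue.Queue()

def producer():
    for i in range(3):
        messages.put({"event": "order_created", "order_id": i})
    messages.put(None)  # sentinel telling the consumer to stop

def consumer():
    while True:
        msg = messages.get()
        if msg is None:
            break
        print("processing", msg)  # the consumer works at its own pace

threading.Thread(target=producer).start()
consumer()
```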

10. Microservices

10.1 What are microservices?

Microservices is an architectural style that divides an application into small, independently deployable services. This makes development, deployment, and scaling easier.

10.2 Advantages of Microservices

  • Easier development: Small teams can develop features independently.
  • Scalability: Each service can be scaled independently.
  • Fault isolation: A failure affects only the individual service, not the entire system.

11. NoSQL databases

11.1 What is a NoSQL database?

NoSQL databases have a different data model than traditional relational databases, offering greater scalability and flexibility.

11.2 Types of NoSQL databases

  • Key-value store: A database that stores simple key-value pairs. Examples: Redis, Amazon DynamoDB.
  • Document store: A database that stores data in document formats such as JSON and BSON. Examples: MongoDB, Couchbase.
  • Column Family Store: A database that stores data by column family. Examples: Apache Cassandra, HBase.
  • Graph database: A database that stores data in a graph structure. Examples: Neo4j, Amazon Neptune.
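The same user record looks different in each model. Here is a sketch using plain Python structures; real stores such as Redis, MongoDB, Cassandra, and Neo4j expose these models through their own APIs, and the data is invented.

```python
# Key-value store: an opaque value behind a key (Redis-style).
kv_store = {"user:1": '{"name": "Ann", "city": "Berlin"}'}

# Document store: the value is a structured, queryable document (MongoDB-style).
doc_store = [{"_id": 1, "name": "Ann", "address": {"city": "Berlin"}}]

# Column family store: rows grouped into families of related columns (Cassandra-style).
column_family = {
    "user:1": {
        "profile": {"name": "Ann"},
        "activity": {"last_login": "2024-01-01"},
    }
}

# Graph database: nodes and edges (Neo4j-style).
nodes = {1: {"label": "User", "name": "Ann"}, 2: {"label": "User", "name": "Bob"}}
edges = [(1, "FOLLOWS", 2)]
```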

12. Database index

12.1 What is a database index?

A database index is a structure that enables efficient searching of data in a database. Using indexes can greatly improve query performance.

12.2 Index types

  • B-Tree index: A common index type, widely used in relational databases.
  • Bitmap index: Effective for low-cardinality data (columns with few distinct values).
  • Hash index: Fast for equality lookups, but not suitable for range lookups.
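The trade-off between hash and B-Tree-style indexes can be sketched with a dict versus a sorted list, where bisect stands in for the ordered structure a B-Tree maintains; the rows are invented.

```python
import bisect

rows = [(17, "Ann"), (3, "Bob"), (42, "Chi"), (8, "Dan")]  # (id, name) pairs

# Hash index: O(1) equality lookups, but no notion of order.
hash_index = {key: name for key, name in rows}
print(hash_index[42])  # fast lookup: id == 42

# Sorted index (B-Tree-like): supports range scans as well as point lookups.
sorted_keys = sorted(key for key, _ in rows)
lo = bisect.bisect_left(sorted_keys, 5)
hi = bisect.bisect_right(sorted_keys, 20)
print(sorted_keys[lo:hi])  # range scan: all ids in 5..20 -> [8, 17]
```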

13. Distributed Transactions

13.1 What is a distributed transaction?

A distributed transaction is a transaction that runs across multiple nodes. The goal is to execute transactions in a distributed system while maintaining data consistency.

13.2 Two Phase Commit (2PC)

Two-phase commit is a protocol for performing distributed transactions and consists of two phases:

  • Prepare Phase: Ask all participants if they are ready to commit the transaction.
  • Commit/Abort Phase: Commit the transaction if all participants are able to commit. Otherwise, abort the transaction.
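A coordinator-side sketch of the two phases. The participants here are plain objects; real systems exchange these messages over the network and must also handle timeouts and coordinator failure.

```python
class Participant:
    def __init__(self, name, healthy=True):
        self.name, self.healthy = name, healthy

    def prepare(self) -> bool:
        # Phase 1: vote yes only if this node can durably commit the change.
        return self.healthy

    def commit(self):
        print(f"{self.name}: committed")

    def abort(self):
        print(f"{self.name}: aborted")

def two_phase_commit(participants) -> bool:
    # Prepare phase: ask every participant whether it is ready to commit.
    votes = [p.prepare() for p in participants]
    ready = all(votes)
    # Commit/abort phase: commit only if everyone voted yes, otherwise abort everywhere.
    for p in participants:
        p.commit() if ready else p.abort()
    return ready

print(two_phase_commit([Participant("orders-db"), Participant("payments-db")]))
```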

14. Sharding

14.1 What is sharding?

Sharding is the process of dividing a database into multiple instances, each managing a portion of the dataset. This improves database scalability.

14.2 Sharding methods

  • Hash-based sharding: Assigns data to shards based on a hash of the key.
  • Range-based sharding: Assigns data to shards based on key ranges.
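Both methods boil down to a function from key to shard. A sketch with hypothetical shard names and id ranges:

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]

# Hash-based sharding: spreads keys evenly, but related key ranges scatter across shards.
def hash_shard(key: str) -> str:
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SHARDS[digest % len(SHARDS)]

# Range-based sharding: keeps key ranges together, e.g. users grouped by id range.
RANGES = [
    (0, 1_000_000, "shard-0"),
    (1_000_000, 2_000_000, "shard-1"),
    (2_000_000, float("inf"), "shard-2"),
]

def range_shard(user_id: int) -> str:
    for low, high, shard in RANGES:
        if low <= user_id < high:
            return shard

print(hash_shard("user:12345"), range_shard(1_500_000))
```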

15. Data Center

15.1 What is a data center?

A data center is a dedicated facility that hosts information technology (IT) infrastructure such as computer systems, communication systems, and storage systems. Data centers store, process, and manage data and connect to the Internet.

15.2 Characteristics of data centers

  • Redundancy: Redundant power, cooling, and communication links are provided to handle system failures.
  • Security: Physical and logical security measures are in place.
  • Energy Efficiency: Data centers use energy-saving designs and cooling techniques to reduce power consumption.

16. Load Balancing

16.1 What is load balancing?

Load balancing is the process of evenly distributing the load on your system across multiple servers. This improves system availability, scalability, and performance.

16.2 Methods of Load Balancing

  • Round Robin: Assigns requests to servers in order.
  • Least Connections: Assigns requests to the server that currently has the fewest connections.
  • IP Hash: Assigns requests to servers based on the client’s IP address.

These are the basic components and concepts of distributed systems. By understanding these components and concepts and applying them appropriately, you can build distributed systems with high availability, scalability, and performance.

Conclusion

Thank you for reading this far. I hope you enjoyed this article and learned something new.

See you in the next article! If you liked this article, please hit “LIKE” and subscribe to support me. Thank you very much.


Source: Viblo