Is the system slow? Apply these techniques right away (Part 2)

Monday, 05/06/2023

Tram Ho

Walking lessons

A large system constantly faces growing user traffic while also having to manage an ever-increasing volume of data. Eventually the signs become obvious: the system stalls, responses slow down, and the user experience suffers.

Following Part 1, this Part 2 introduces useful techniques for optimizing performance on the server side, so the system works more efficiently. Let’s go!

App/Server

1. Hardware

Upgrading RAM, CPU, and disk is the first option and should not be overlooked. The system needs enough hardware resources to handle its processing tasks.

When I was a student, my mother gave me 10 million to buy a computer. In my first and second years everything ran smoothly. Then in my third year I installed Microsoft’s Visual Studio to learn C#, and disaster struck: my laptop could not even open the application, let alone let me worry about optimized, clean code.

The fact is, no matter how well you optimize the source code, if you do not pay for hardware upgrades, the system will not keep up with growing demand.

2. Cache

The server cache, like the client cache, is temporary data storage. However, the cache on the client is very limited.

Caching on the server reduces disk and database access time. You can cache as much or as little as you want; what matters is having an effective caching strategy.

There are two types of cache: distributed cache and in-memory cache.

Distributed cache : The cache runs on its own server, separate from your web server
In-memory cache : The data is cached directly in the memory of your web server

RAM is finite; if you cache without limits, you will eventually run out of memory. For an effective cache implementation, you need a good caching strategy. Here are some common caching strategies:

LRU Cache (Least Recently Used) : Keeps the most recently accessed data. When the limit is reached, the entry that has gone unused the longest is removed from the cache.
FIFO Cache (First In First Out) : New data is added to the cache. When the limit is reached, the oldest entry is removed first, regardless of how often it is accessed.
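
To make the LRU strategy concrete, here is a minimal sketch in Python. The class name `LRUCache` and its `get`/`put` methods are my own illustration, not taken from any particular caching library, and a production cache would also need expiry and thread safety.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used key
cache.put("c", 3)  # capacity exceeded, so "b" is evicted
```

A FIFO cache would look the same except that `get` would not call `move_to_end`, so eviction order depends only on insertion order.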

3. Reverse proxy and load balancer

A reverse proxy is a server that stands in front of the main web server and accepts requests from clients. Instead of client requests going directly to the web server, the reverse proxy receives them and forwards them on.

Rate limiting is one of the important functions of a reverse proxy: it caps the number of requests a user can make in a given period of time. For example, one IP may send only 10 requests per minute. This helps protect the web server from being overloaded and from attacks.
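
The "10 requests per minute per IP" rule can be sketched as a fixed-window counter. This is only an illustration under my own assumptions: the class `FixedWindowRateLimiter` and the `now` parameter (added so the behavior can be checked without waiting) are hypothetical, and real reverse proxies ship their own rate-limiting features.

```python
import time
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per IP in each fixed time window."""

    def __init__(self, limit=10, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        # (ip, window index) -> request count; old windows are never purged in this sketch
        self._counts = defaultdict(int)

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)
        key = (ip, window)
        if self._counts[key] >= self.limit:
            return False  # over the limit for this window
        self._counts[key] += 1
        return True


limiter = FixedWindowRateLimiter(limit=10, window_seconds=60)
# Eleven requests from the same IP inside the same minute
results = [limiter.allow("203.0.113.7", now=0.0) for _ in range(11)]
```

The first ten calls are allowed and the eleventh is rejected; once the next window starts, the same IP is allowed again.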

Load balancer: Distributes requests across multiple web servers or nodes, so that no single web server has to handle too many tasks.
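
One simple way to distribute requests is round robin, where each new request goes to the next server in the list. The sketch below assumes three placeholder backend addresses; real load balancers also track server health and current load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends one after another, wrapping around at the end."""

    def __init__(self, backends):
        self._backends = cycle(list(backends))

    def next_backend(self):
        return next(self._backends)


# Placeholder addresses, for illustration only
balancer = RoundRobinBalancer(["app-1:8000", "app-2:8000", "app-3:8000"])
picks = [balancer.next_backend() for _ in range(4)]
```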

Combining a reverse proxy with a load balancer improves both the security and the performance of the system.

4. Optimize code

Use the right data structures and algorithms
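
As a small illustration of how much the choice of data structure matters, the two hypothetical functions below return the same result, but the second one uses a set for membership checks instead of scanning a list every time.

```python
def find_banned_slow(user_ids, banned_ids):
    # banned_ids is a list: every "in" check scans it, O(n * m) overall
    return [u for u in user_ids if u in banned_ids]


def find_banned_fast(user_ids, banned_ids):
    banned = set(banned_ids)  # built once, then average O(1) per lookup
    return [u for u in user_ids if u in banned]


user_ids = list(range(0, 2000))
banned_ids = list(range(0, 2000, 7))
slow = find_banned_slow(user_ids, banned_ids)
fast = find_banned_fast(user_ids, banned_ids)
```

With a few thousand items the difference is small, but it grows quickly as both lists get larger.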

Remove redundant code : Review the source code and remove code that is no longer used. Unused code not only increases the size of the program but can also affect performance.

Loop optimization : Make sure loops in the source code are written as efficiently as possible, for example by moving work that does not change between iterations out of the loop.

Code reuse : Check whether previously written code can be reused instead of rewriting the same functionality.

5. Microservices

Microservices are a software architecture in which an application is decomposed into small, independent, self-managed services. Each microservice runs separately and performs one specific function of the application.

A design based on microservices architecture helps increase scalability, gives each service independence and reusability, improves manageability, and allows different technologies to be used for different services.

6. Async & Concurrency

Async and concurrency are two important techniques for handling many tasks at once and increasing system performance.

Async is used to process tasks without blocking the main thread of the program. Instead of waiting for one task to complete before continuing with the next, async allows the program to continue executing other tasks.

Concurrency is the system’s ability to handle multiple tasks over the same period of time. Tasks can overlap, and on multi-core hardware they can run in parallel, which uses system resources more efficiently.
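
Here is a minimal sketch of the idea using Python's `asyncio`. The `fetch` function is a stand-in I made up for a network or database call; the three calls wait at the same time, so the total time is close to one delay rather than three.

```python
import asyncio
import time


async def fetch(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for a network or database call
    return f"{name} done"


async def main() -> list:
    # Start all three tasks and wait for them together
    return await asyncio.gather(
        fetch("users", 0.2),
        fetch("orders", 0.2),
        fetch("stock", 0.2),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start  # roughly 0.2s, not 0.6s
```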

7. Handle errors

Handle errors correctly to avoid program crashes or sudden shutdowns. In addition, set up an automatic restart mechanism for the application to ensure high availability.
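
One common pattern is to wrap each request handler so that an unexpected exception becomes an error response instead of taking the whole process down. The names `safe_handle` and `parse_quantity` below are hypothetical, and the status codes are just one reasonable choice. Automatic restarts are usually left to a process manager or orchestrator rather than written by hand.

```python
import logging

logger = logging.getLogger("app")


def safe_handle(handler, request):
    """Run a handler and turn errors into (status, body) responses."""
    try:
        return 200, handler(request)
    except ValueError as exc:
        # Expected, caller-side problem: report it without logging a stack trace
        return 400, f"bad request: {exc}"
    except Exception:
        # Unexpected problem: log the stack trace, keep the server running
        logger.exception("unhandled error while processing request")
        return 500, "internal server error"


def parse_quantity(request):
    return int(request["quantity"])


ok = safe_handle(parse_quantity, {"quantity": "3"})
bad_input = safe_handle(parse_quantity, {"quantity": "abc"})
missing_field = safe_handle(parse_quantity, {})
```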

8. Compress data

Data compression is an important way to reduce data size and speed up transfer. Compress data before storing it or returning it to the client.
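
As a quick illustration, the sketch below compresses a JSON payload with gzip from the Python standard library. The payload is sample data I made up; repetitive data like this compresses well, while already-compressed data such as images gains little.

```python
import gzip
import json

# Sample payload: 1000 small, repetitive records
payload = json.dumps(
    [{"id": i, "status": "active"} for i in range(1000)]
).encode("utf-8")

compressed = gzip.compress(payload)      # what you would store or send
restored = gzip.decompress(compressed)   # what the receiver gets back
```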

9. Stream

Streaming processes data continuously by breaking it into smaller pieces and transmitting them as a continuous stream. Instead of waiting for all the data to be processed before sending it back to the client, streaming lets you send each part as soon as it is ready.
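
In Python, a generator is a simple way to sketch this. The function `stream_chunks` and the 64 KB chunk size are my own choices for illustration; an in-memory buffer stands in for a large file.

```python
import io


def stream_chunks(fileobj, chunk_size=64 * 1024):
    """Yield the content piece by piece instead of loading it all at once."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


source = io.BytesIO(b"x" * 150_000)  # stand-in for a large file
sizes = [len(chunk) for chunk in stream_chunks(source)]
```

Each chunk can be sent to the client as soon as it is read, so memory use stays around one chunk no matter how large the source is.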

Database

1. Design database

Sound database design, including normalization to avoid redundant and repeated data, is an important part of ensuring system integrity and performance.

2. Index & Create view

Indexing : Create indexes on important data fields in the database. Indexes speed up queries, especially on fields that are queried frequently.
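
To see an index change how a query runs, the sketch below uses SQLite (bundled with Python) and its `EXPLAIN QUERY PLAN` statement. The table, the index name `idx_users_email`, and the helper `query_plan` are made up for illustration; other databases have their own way to inspect query plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(sql, params=()):
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text


lookup = "SELECT * FROM users WHERE email = ?"
plan_before = query_plan(lookup, ("user42@example.com",))  # full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(lookup, ("user42@example.com",))   # search via the index
```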

Create table view : A view is a virtual table defined from one or more original tables in the database. It contains only the fields needed for a particular query. By creating views, you can reduce the number of fields returned and limit the data to be processed; if the database stores the view’s results (a materialized view), repeated queries can also run faster.
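
Here is a minimal sketch of a view, again using SQLite with a made-up `orders` table. Note that an ordinary SQLite view like this one is a stored query, not a stored copy of the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY, customer TEXT, total REAL, status TEXT, notes TEXT
    );
    INSERT INTO orders (customer, total, status, notes) VALUES
        ('alice', 120.0, 'paid', 'gift wrap'),
        ('bob', 35.5, 'pending', ''),
        ('alice', 60.0, 'paid', '');
    -- The view exposes only the columns and rows this report needs.
    CREATE VIEW paid_orders AS
        SELECT id, customer, total FROM orders WHERE status = 'paid';
""")

rows = conn.execute(
    "SELECT customer, total FROM paid_orders ORDER BY id"
).fetchall()
```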

However, note that both techniques are a trade-off: indexes, and views whose results are stored, make reads faster but take extra storage and add work to every write.

3. Query

Using the appropriate query method : Based on the requirements of the problem, choose the query method that fits.

For example, with millions of records, use aggregate() in Mongoose so the database does the filtering and grouping, instead of loading the documents with find() and processing them in application code.

Avoid N + 1 queries : A common problem when you need related data from several tables.

Example: You query 100 users from the User table, and each user has a role in the Role table. If you loop over the 100 users and fetch each user’s role separately, you run 1 query for the users plus 100 queries for the roles.

When you need related data across tables, use populate in the Mongoose library or a join in SQL. A SQL join returns everything in a single query, and populate fetches the related documents in one additional batched query, so both avoid the N + 1 pattern.
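
The users and roles example can be sketched with SQLite. The schema and the two functions below are my own illustration: the first reproduces the N + 1 pattern and counts its queries, the second gets the same rows with one join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users (
        id INTEGER PRIMARY KEY, name TEXT, role_id INTEGER REFERENCES roles(id)
    );
    INSERT INTO roles (id, name) VALUES (1, 'admin'), (2, 'member');
""")
conn.executemany(
    "INSERT INTO users (name, role_id) VALUES (?, ?)",
    [(f"user{i}", 1 if i % 10 == 0 else 2) for i in range(100)],
)


def roles_n_plus_one():
    """1 query for the users, then 1 more query per user: 101 in total."""
    queries = 1
    result = []
    users = conn.execute("SELECT name, role_id FROM users ORDER BY id").fetchall()
    for name, role_id in users:
        role = conn.execute(
            "SELECT name FROM roles WHERE id = ?", (role_id,)
        ).fetchone()[0]
        queries += 1
        result.append((name, role))
    return result, queries


def roles_with_join():
    """A single JOIN returns every user together with its role."""
    rows = conn.execute(
        "SELECT u.name, r.name FROM users u "
        "JOIN roles r ON r.id = u.role_id ORDER BY u.id"
    ).fetchall()
    return rows, 1


slow_rows, slow_queries = roles_n_plus_one()
fast_rows, fast_queries = roles_with_join()
```

With an in-memory database the difference is hard to feel, but when every query is a network round trip to the database server, 101 round trips versus 1 adds up quickly.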

Select only necessary fields : Select only the data fields you need, to reduce the amount of data returned and speed up the query. This is especially important when querying large data sets or when some fields hold large values.

Query order : The order of operations in a query matters; the wrong order can noticeably increase query time.

For example: We need to find the 10 best-selling products of category A. Should we filter by category A first and then sort by number of sales, or sort by number of sales and then filter by category A? The answer is to filter by category A first and then sort, because sorting only the products in category A is much less work than sorting the entire table.
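
The same idea in plain Python, as a sketch with made-up product data: narrow the data down to the category first, then rank only what is left.

```python
import heapq

# Made-up sample data
products = [
    {"name": "kettle", "category": "A", "sold": 120},
    {"name": "lamp", "category": "B", "sold": 900},
    {"name": "mug", "category": "A", "sold": 340},
    {"name": "desk", "category": "B", "sold": 75},
    {"name": "pen", "category": "A", "sold": 15},
    {"name": "fan", "category": "A", "sold": 210},
]


def top_sellers(products, category, n=10):
    # Filter first so the ranking step only sees one category
    in_category = (p for p in products if p["category"] == category)
    # Then pick the n largest by sales from the smaller set
    return heapq.nlargest(n, in_category, key=lambda p: p["sold"])


top_two = [p["name"] for p in top_sellers(products, "A", n=2)]
```

In a database you would normally express this as a filter followed by a sort and a limit, and let an index on the category field do the narrowing.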

4. Cluster/Data replication

Cluster/Data replication: Replicating data to different servers is an advanced technique that needs to be applied to large systems. Benefits include:

High Availability: If a node or server fails, the system can continue to operate through other copies of data.

Fault tolerance: When a node or server fails, the system can automatically switch to other copies of data without disrupting operations.

Increased performance: Distributing data across multiple nodes increases processing capacity and allows parallel access. You can write to the primary node and read from the secondary nodes.

Data backup and restore: Data copies can be used to backup and restore data as needed.
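
The "write on the primary, read on the secondaries" idea can be sketched as a small router. The class `ReplicaRouter` and the server names are hypothetical, and the prefix check is deliberately naive. Keep in mind that reads from a secondary may be slightly behind the primary because replication takes time.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and spread reads across secondaries."""

    WRITE_PREFIXES = ("INSERT", "UPDATE", "DELETE")

    def __init__(self, primary, secondaries):
        self.primary = primary
        self._secondaries = itertools.cycle(secondaries) if secondaries else None

    def route(self, sql: str) -> str:
        is_write = sql.lstrip().upper().startswith(self.WRITE_PREFIXES)
        if is_write or self._secondaries is None:
            return self.primary
        return next(self._secondaries)  # rotate reads across the secondaries


# Placeholder server names, for illustration only
router = ReplicaRouter("db-primary", ["db-replica-1", "db-replica-2"])
targets = [
    router.route(q)
    for q in (
        "INSERT INTO users VALUES (1)",
        "SELECT * FROM users",
        "select * from users",
        "UPDATE users SET id = 2",
    )
]
```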

5. Partitioning/Sharding Data

Partitioning/Sharding Data: Increases scalability and performance by dividing data into smaller pieces (partitions) and, with sharding, spreading those pieces across different servers.
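
A simple way to decide which server holds a record is to hash its key. The sketch below assumes four hypothetical shard names and a made-up `shard_for` function. One known drawback of plain modulo hashing is that most keys move when the number of shards changes, which is why larger systems often use consistent hashing instead.

```python
import hashlib

# Hypothetical shard names, for illustration only
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Map a key to a shard; the same key always lands on the same shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


placements = {f"user:{i}": shard_for(f"user:{i}") for i in range(1000)}
```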

6. Choose database

Choosing the right database is an important factor in keeping the system running efficiently. There are different types of databases, such as NoSQL and SQL, each with its own advantages and limitations.

NoSQL Databases : Include MongoDB, DynamoDB, and many other NoSQL database management systems

SQL Databases : Include MySQL, SQL Server, PostgreSQL, and many other SQL database management systems

End

I hope the knowledge shared in Part 1 and Part 2 helps you optimize your system and improve its performance. Thank you for reading.

Source : Viblo