Rate Limiter in System Design. Part 2 – Commonly used algorithms

Tram Ho

In Part 1 , I gave an overview at the conceptual level. In this part 2, I will cover the most commonly used algorithms when building a Rate Limiter system. Each algorithm has its own advantages and disadvantages, let’s find out

Leaky Bucket

Conceptually, the Leaky Bucket works as follows. The server uses a fixed size bucket and the bottom contains an output hole to hold a certain amount of tokens. The token here represents the requests sent to the server.

Tokens are sent to the server by the user (client) and the server is the agent that allows the token to enter/exit the bucket.

To make it easier to visualize, people can see the illustration below:

leaky bucket sample.png

The Leaky Bucket algorithm is not too complicated to deploy or maintain on a server or load balancer, memory management is also simple because everything is configured from the beginning. Because of its high accuracy while using resources efficiently, many large systems use Leaky Bucket as a Rate Limiter, such as NGINX

Although there are many advantages, this algorithm also has some disadvantages as follows:

  1. Burstness: One drawback of the Leaky Bucket algorithm is that it doesn’t handle sudden or bursty request payloads well. When a large number of requests come in at the same time, the algorithm will have to process them all in the same amount of time. This can lead to overload and decrease in system performance.
  2. Slow in response: The Leaky Bucket algorithm can lead to slow responses to access requests made after a full “interval” period during which no requests were made. presently. In this case, the “leak” of requests in the Bucket occurs only when the next interval begins, resulting in a longer wait time for the request to be processed.
  3. Lack of flexibility in handling priority or urgent requests.
  4. Algorithm implementation error: if an error occurs and the bucket is not handled properly, some requests may be accepted and processed even though the percentage of requests has exceeded the established limit. This can allow a large number of requests to flood the system, causing overload and possibly causing the server to become unstable or slow.

Fixed Window Counter

The Fixed Window Counter algorithm (I will call it FWC for short) is different from the Leaky Bucket (using the allocated capacity to control), which is based on time. FWC divides time into equal intervals, and each interval has its own counter. When a request arrives, based on the time, the request will be allocated to the predefined portion of the FWC, and the corresponding counter will be active (e.g. incremented by 1).

When the counter reaches a certain threshold, the request will be refused to execute, must wait or retry at the next interval.

fixed window counter sample.png

With the Fixed Window Counter algorithm, we have the following complexity:

  • Space Complexity: O(1) for storing counters for each Window
  • Time Complexity: O(1) Get time and simple ++ operator implementation

FWC offers many advantages:

  • Easy to deploy
  • Smaller memory capacity, only used to store Counter
  • Can take advantage of built-in concurrency technologies (Redis)

However, there are still quite a few disadvantages:

  • Inflexible: because it works according to a predetermined configuration, it will not be able to scale-up itself in case of need. Developers need to manually adjust the configuration before each campaign with a spike in requests
  • Inheriting the error of the previous Window: in general, it is still flexible, if the previous window has too many requests beyond the threshold, the following windows can recognize and adjust their capacity accordingly.
  • Unable to distinguish priority or urgent requests to prioritize processing

Sliding Window Log & Sliding Window Counter

In the above two algorithms, it can be seen that although it is easy to implement, each algorithm still has certain disadvantages. Sliding Window Log & Counter is used to alleviate those difficulties.

Sliding Window Log – Counter (SWLC) algorithm stores and behaves based on the execution agent, here when a request is sent to the server, the system will identify the request based on certain criteria (token, IP). ,…)

SWLC often uses SLA table, which will log all user actions, simply key-value includes key is (token, IP) and value is requested data that user sends.

Sliding Window Log and Counter Algorithm.png

As the picture describes the Sliding Window Log & Counter algorithm above. For each window, there is always a maximum of 2 requests by a user, the counter will identify and reject all third requests of a user in the same time period.

At the most basic level, Sliding Window Log & Counter has the following complexity:

  • Space Complexity: O(number of requests the server reads in a window). To know which user/IP the request belongs to, the server must definitely do this.
  • Time Complexity ( Sliding Window Log ): O(number of requests read by server in 1 window)
  • Time Complexity ( Sliding Window Counter ): O(1). This is also the difference between Sliding Window Counter (SWC) and Sliding Window Log (SWL). SWC stores only the current window (when there is no need to “look-back” in previous windows)

The advantage of this algorithm is to solve the disadvantages of the two algorithms I mentioned above, which is also a combination of the three factors of quantity, agent, and time that I mentioned in Part 1. So the disadvantages of this algorithm. What is this algorithm?

  • Difficult to implement: combining all 3 is certainly not simple
  • Resource consuming: consumes a lot of memory for storing all users in a window
  • Complexity increases based on the number of requests


Above are some of the most commonly used algorithms when building a Rate Limiter system. Is there any other better solution, please let me and everyone know more!

Thanks everyone.

see more

Part 1 : Rate Limiter in System Design. Part 1 – Concepts and applications

Share the news now

Source : Viblo