What is API Rate Limiting and some of the more commonly used algorithms

Tram Ho

What is Rate Limiting?

Rate Limit means to limit the number of requests sent / received (rate) to the system. It only sounds so simple to say. In fact, one must use a number of algorithms to ensure fast and accurate running with less memory consumption. Suppose our system receives thousands of requests, but of them only can handle hundreds of requests / s, for example, and the rest of the requests fail (because the system CPU is overloaded can not handle it ).

To solve this problem, Rate Limiting mechanism was born. Its purpose is to only allow receiving a certain number of requests in 1 unit of time. If too, an error response will be returned.

Some common examples in Rate Limiting are:

  • Each IP address can only create 3 accounts per day.
  • Each user we only allow to send 200 requests / s. If exceeded, an error response will be returned.
  • Each user is only allowed to enter incorrect credit card 3 times per day.

Why does Api need Rate Limiting?

Rate Limiting will help us to limit some of the following:

  • Brute force credit card information (sweep scan)
  • Attack DOS (Denial of Service) to the system
  • Brute force password in the system (sweep type scan)

These attacks seem to come from real users, but they are actually generated by bots. Therefore, it is very difficult to detect and can easily cause our service to crash.

In addition, Rate Limiting also brings us some of the following benefits:

  • Security: Do not allow wrong password input too many times
  • Revenue: For each plan, there will be a different rate limit. If you want to use more, you need to buy a more expensive plan.

Some algorithms used?

Leaky bucket (the bucket)

Your bear does not want you to drink much, so pour all of the milk tea into a bucket and make holes in it. Even if you buy a few dozen glasses, they must be poured into a bucket, each day only about 1 cup. You are only allowed to drink the tea flowing from the bucket.

This way helps the system not afraid of overload when there are many people accessing, because the bucket will block it.

Token Bucket (Token Bucket)

Instead of putting milk tea in a bucket, your bear will put 60 tokens in a bucket every day. If you drink sugar-free tea, you will be able to buy 4 glasses, while drinking black sugar pearl with cheese cream is only 1 cup.

At the end of the day, if the token is out, your bear will refill more. This method ensures you only drink up to 60 tokens a day.

In fact, many requests will run longer, take more resources (create reports, save more data). If using Lua Lua Lua, we can only guarantee the number of requests to the server. With this algorithm, the more resource intensive algorithm will need more tokens, we can easily control the amount of load to the server.

Window Algorithm

Instead of using buckets, these algorithms rely on a timing window to limit the number of requests.

Fixed window

Now no more buying buckets, your bear decided that you can only drink up to 3 drinks a week. One week (from 0am to 12pm Sunday) this is fixed window, limit is 3 glasses.

This algorithm is easy to understand, easy to code, large systems when providing APIs are also limited like this. However, it has one weakness … it can be used to burst the number of requests in excess of the limit.

For example, your bear offers this limit because he wants you to drink 1 drink every 2 days. However, you are too bloody so you should wait 11:30 PM Sun drink 3 glasses, then 1am on Monday morning drink 3 more glasses!

So in 2 hours you drink 6 glasses, the way up, wrong, on the road, it is always!

Therefore, people often use Sliding Window Log / Sliding Window if you want to limit the above.

Sliding Window Log

Instead of using the fixed time, you decided to change the algorithm. Now, when you drink bad luck, the bear will log it. Then, the bear calculates 1 week back from the time you drink.

Suppose you drink 1 cup of milk tea at 12 noon this Wednesday, the bear will check log from 12 noon last Wednesday to 12 noon this Wednesday. This is the Sliding Window.

After that, the bear will count during this time, if you drink less than 3 glasses, then give it to you, otherwise …

This way to ensure the amount of milk tea you drink in 1 week does not exceed 3 glasses. If you drink all 3 glasses at once, you will have to wait… exactly 7 days to drink 1 drop of milk tea.

This method is very accurate, however, it also comes with some disadvantages:

  • Takes up a lot of memory because of having to log all requests
  • Takes up resources because you have to log each request
  • Running can be slow, because each time there is a request, we have to query and count the number of requests in a certain amount of time to see whether or not to receive the request.
  • Imagine a system of about 1 million users, every day they send 1000 requests, we have to log nearly 1 billion events per day. Storing and querying is always worn out!

Sliding Window

Sliding Window is an improved algorithm from Sliding Window Log. We trade accuracy for speed and memory (less storage, less query). We do not log on each request, but just save the amount per time period.

Back to the bear. Because the bear sees the log every time he drinks so tired, now he only logs 1 time on Sunday, which is how many cups of milk tea you drink in that week.

Suppose at week 1 you drink 3 glasses. At 12 noon on Wednesday of the 2nd week, you plan to have 1 more drink. Sliding Window will be from 12 noon last Wednesday to 12 o’clock this week. This Sliding window has 4.5 days a week 1 (Wednesday afternoon to Sunday evening), 2.5 days a week 2 (Monday morning to Wednesday afternoon).

Since there is no specific log, we will estimate the amount of milk tea you drink during this time using the following formula.

This may not be very accurate, as the amount of milk tea you drink will not spread evenly over the week. However, it is not as fraudulent as the fixed window, it consumes less resources and memory, so it is used quite a lot.

Summary

Above is the article I have learned about Rate Limiting api, I feel quite good and the author is very easy to understand and copy from the source https://toidicodedao.com/2020/03/17/rate-limiting-chong-ddos -p1 / .

You can find out more in the two links below: https://cloud.google.com/solutions/rate-limiting-strategies-techniques#techniques-enforcing-rate-limits https://hechao.li/ 2018/06/25 / Rate-Limiter-Part1 /

Share the news now

Source : Viblo