Question:
Let's start with an example. Imagine you are a software engineer at a company whose marketing department wants a tool to support email marketing campaigns. You are asked to design a function that filters the users who meet certain conditions and sends them an email with the given content. A first solution might simply go through the matching users and send the emails one by one while the request is being handled.
With this approach, whenever someone uses your software to send email, the server processes every email to the end and only then returns the result. If sending each email takes 0.2 seconds and you need to reach 10,000 customers, the request takes 2,000 seconds, or about 33 minutes, before the result comes back. That is far too long, and the connection between the client and the server will time out well before then.
An improved solution is to send the emails in batches. Assuming each batch can send 1,000 emails at the same time and still takes 0.2 seconds, we need 10 batches, which brings the total down to 2 seconds. But what if the input grows from 10,000 customers to 1,000,000? The same calculation gives 1,000 batches and a processing time of 200 seconds.
With such a long processing time for a single request, the UX of the application will be very bad. After pressing the button, the user has to wait hundreds of seconds while the server is working, and if they reload the page or close and reopen it in the meantime, everything becomes very troublesome. The situation becomes hard to control when many users use the feature at the same time: the server is overloaded because it has to handle many heavy tasks at once. This is the problem with synchronous processing, where the response is returned only after the task has completed.
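To make the arithmetic above concrete, here is a minimal sketch. The function names (`sequential_seconds`, `batched_seconds`, `send_campaign_sync`) and the `matches` / `send_email` callables are hypothetical and only illustrate the synchronous approach described above; the 0.2 second figure and the batch size of 1,000 come from the example.

```python
import math

SECONDS_PER_SEND = 0.2  # cost of one send call, as assumed in the example


def sequential_seconds(recipients: int) -> float:
    """Estimated time when emails are sent one at a time."""
    return recipients * SECONDS_PER_SEND


def batched_seconds(recipients: int, batch_size: int = 1000) -> float:
    """Estimated time when each batch of `batch_size` emails takes one call."""
    return math.ceil(recipients / batch_size) * SECONDS_PER_SEND


def send_campaign_sync(users, matches, send_email, content):
    """Hypothetical synchronous handler: the caller is blocked until
    every matching user has been emailed."""
    sent = 0
    for user in users:
        if matches(user):
            send_email(user["email"], content)
            sent += 1
    return sent


if __name__ == "__main__":
    print(sequential_seconds(10_000) / 60)  # minutes for 10,000 recipients
    print(batched_seconds(10_000))          # seconds with batches of 1,000
    print(batched_seconds(1_000_000))       # seconds for 1,000,000 recipients
```

Whatever the batch size, the request still blocks for the full duration, which is the limitation the rest of the article addresses.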
There should be a more effective approach to this problem: asynchronous processing.
Solution
For the email problem, after the user submits the task, the system does not send the emails immediately. Instead, it returns a message saying that the job has been received and will be processed shortly. In the meantime, the user can continue with other work, and they receive a notification when the status changes or the job has finished. The actual sending is done in the background. A task queue is one such asynchronous processing technique. In this article we will look at the components of a task queue system and how it operates.
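The flow described above can be sketched with Python's standard library alone. This is an illustration under assumptions, not a production design: the in-memory `queue.Queue` stands in for a real task queue, the `jobs` dictionary stands in for a job store, and `submit_campaign`, `worker` and `demo` are hypothetical names.

```python
import queue
import threading
import uuid

jobs = {}                # job_id -> status; stands in for a real job store
pending = queue.Queue()  # in-memory stand-in for the task queue


def submit_campaign(recipients, content):
    """Accept the job and return at once; no email is sent here."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = "received"
    pending.put((job_id, recipients, content))
    return job_id


def worker(send_email):
    """Background loop that does the actual sending."""
    while True:
        item = pending.get()
        if item is None:       # sentinel: stop the worker
            pending.task_done()
            break
        job_id, recipients, content = item
        jobs[job_id] = "processing"
        for address in recipients:
            send_email(address, content)
        jobs[job_id] = "done"  # a real system would notify the user here
        pending.task_done()


def demo():
    outbox = []
    thread = threading.Thread(
        target=worker,
        args=(lambda to, body: outbox.append(to),),
        daemon=True,
    )
    thread.start()
    job_id = submit_campaign(["a@example.com", "b@example.com"], "Hello")
    pending.join()     # wait only so the demo can inspect the result
    pending.put(None)  # ask the worker to stop
    thread.join()
    return jobs[job_id], outbox
```

The point of the sketch is that `submit_campaign` returns a job id immediately, while the time-consuming loop runs in `worker`.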
Definition
A task queue is a system that lets an application perform tasks asynchronously, outside the lifetime of a normal user request. In large systems, receiving and processing a user request or task can take a long time, so the synchronous request-response model is no longer appropriate and an asynchronous processing model is used instead. Task queues are often used for long-running tasks, tasks whose processing needs to be delayed (delayed tasks) and tasks that require complex calculations (compute-intensive tasks). For a clearer view, let's look at the architecture and operating mechanism of a task queue system.
Architecture and operating mechanism
Basically, an asynchronous system that uses a task queue has the following components:
- Producer: the service that sends tasks to the queue.
- Message broker: this is the queue that we refer to throughout this article. The broker receives tasks from the producer, stores them in the queue and delivers them to the corresponding consumer, acting as an intermediary that manages the tasks and as a bridge between producer and consumer. Commonly used message brokers include RabbitMQ, Redis and Kafka.
- Consumer: the service that receives tasks from the queue and processes them.
Typically, the producer and the consumer are two separate components, as the examples below will describe. However, a single service can also be both producer and consumer: it sends the task to the broker, the broker delivers it back to the same service, and the task is processed there. Very flexible, right?
In addition, for producers and consumers to communicate with each other, they usually agree on a common message format, which can be JSON, pickle or another format. The producer serializes the task into this format and sends it to the broker, the broker stores the task in the queue in that format, and the consumer then deserializes the task and executes it.
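The producer, broker and consumer roles, together with the shared message format, can be sketched as follows. This is a simplified illustration: an in-memory `queue.Queue` plays the broker instead of RabbitMQ, Redis or Kafka, JSON is the agreed format, and the message layout (`task` and `kwargs` keys) and the names `produce`, `consume_one` and `HANDLERS` are assumptions made for the example.

```python
import json
import queue

broker = queue.Queue()  # in-memory stand-in for a real message broker


def send_email(to, content):
    """Example task; it only reports what it would do."""
    return f"sent {content!r} to {to}"


# Consumer side: task names map to the functions that execute them.
HANDLERS = {"send_email": send_email}


def produce(task_name, **kwargs):
    """Producer: serialize the task to JSON and hand it to the broker."""
    message = json.dumps({"task": task_name, "kwargs": kwargs})
    broker.put(message)


def consume_one():
    """Consumer: take one message, deserialize it, run the matching handler."""
    message = broker.get()
    payload = json.loads(message)
    handler = HANDLERS[payload["task"]]
    return handler(**payload["kwargs"])


if __name__ == "__main__":
    produce("send_email", to="a@example.com", content="Hi")
    print(consume_one())
```

Because only a task name and plain arguments cross the broker, the producer and the consumer can live in separate services as long as both understand the format.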
Cases that use task queues and actual examples
By now you should have a rough idea of how a task queue works in general and how to set one up using Celery with Python in particular. So the question is: in which cases should we use a task queue? Here are a few common use cases:
- Compute-intensive tasks. When executed, these tasks can consume a lot of system resources (CPU, memory and so on) and easily overload the server. A task queue helps distribute them in line with the processing capacity of the workers. An easy example is uploading a video to YouTube or Facebook. Have you noticed that after a video is uploaded successfully, Facebook or YouTube takes a few minutes to post-process it? During that time, the video is checked by various complex moderation algorithms. Instead of running one after another, the algorithms are put into the task queue and processed from there. The moderation process is over when you receive a notification that the video is ready.
- Long-running tasks. This is the example given at the beginning: using a task queue shortens the time taken to handle the request, so the system responds faster and provides a better user experience. Another example can be found in car booking applications. When you send a booking request, the system needs some time to find a nearby driver, forward the request to the driver and wait for the driver to respond. This process is not continuous and takes a long time because it depends on the driver's response, so a task queue can be used to handle it efficiently.
- Delayed tasks. Sometimes we want to postpone a task until a certain time in the future. We can put it into the task queue, and when the scheduled time arrives the task is taken out for processing. An example is handling the time limit of a test: when the test starts, we put a grading task into the task queue with a delay equal to the maximum time the user is allowed. When the time runs out, the system grades the test automatically (if it has not already been submitted) without waiting for any action from the user.
- Low-priority tasks. Tasks that are on the sidelines of the main task and do not require immediate execution can also be put into the queue, which reduces processing time and shortens the user's wait. An example is sending a confirmation email after a successful account registration. This task clearly has a lower priority than creating the account in the system, so it can be put into the task queue. That is why, in many systems, you receive the confirmation email only a few seconds after your registration succeeds.
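Of the use cases above, the delayed task is the least obvious to implement, so here is a minimal sketch of one way to do it with a heap ordered by due time. The `DelayedQueue` class is hypothetical, and the current time is passed in explicitly to keep the example deterministic; a real system would read a clock and would normally rely on the delay feature of its task queue library.

```python
import heapq
import itertools


class DelayedQueue:
    """Tasks become available only once their scheduled time has passed."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker for equal due times

    def put(self, task, now, delay):
        """Schedule `task` to become due `delay` seconds after `now`."""
        heapq.heappush(self._heap, (now + delay, next(self._order), task))

    def pop_due(self, now):
        """Return every task whose due time is <= now, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, task = heapq.heappop(self._heap)
            due.append(task)
        return due


if __name__ == "__main__":
    q = DelayedQueue()
    # The test starts at t=0 and lasts 3600 seconds.
    q.put("grade test 42", now=0, delay=3600)
    print(q.pop_due(now=1800))  # still inside the time limit
    print(q.pop_due(now=3600))  # time is up, the grading task is released
```

A worker would call `pop_due` periodically and execute whatever it returns.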
Evaluation and conclusions
As you can see, a task queue brings many benefits when used correctly. By handling tasks asynchronously, it unblocks requests and improves response time, which provides a better user experience.
Putting tasks into the queue and processing them later also gives us more control over how they are handled. Tasks are pulled from the queue according to the capacity of the workers, so when too many tasks need to run at the same time, the task queue helps avoid overloading the system. Because the workers that handle the tasks can be split out into a separate service, they are also easy to scale up as needed without scaling the entire system.
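As a rough illustration of workers pulling tasks at their own pace, the sketch below runs a configurable number of threads against one shared queue. The `run_workers` function is hypothetical, and threads in one process only stand in for what would usually be separate worker processes or machines.

```python
import queue
import threading


def run_workers(tasks, handler, worker_count):
    """Process `tasks` with `worker_count` threads pulling from one queue."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to pull
            outcome = handler(task)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Scaling up means changing worker_count; the producer side is untouched.
    print(sorted(run_workers(range(10), lambda n: n * n, worker_count=3)))
```

Each worker takes a new task only when it has finished the previous one, so a burst of tasks waits in the queue instead of overwhelming the workers.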
However, overusing task queues where they are not needed increases the complexity of the system and therefore its operating costs. The application itself also has to handle more complex cases, such as tracking the status of each task and notifying the user when necessary, handling failed tasks, implementing a retry mechanism and so on.
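To give a feel for that extra bookkeeping, here is a minimal sketch of status tracking combined with a simple retry loop. The `run_with_retry` function, the status strings and the `statuses` dictionary are all hypothetical; real task queue libraries provide their own retry and state mechanisms.

```python
def run_with_retry(task_id, func, statuses, max_attempts=3):
    """Run func(), recording the task status and retrying on failure."""
    statuses[task_id] = "pending"
    last_error = None
    for attempt in range(1, max_attempts + 1):
        statuses[task_id] = f"running (attempt {attempt})"
        try:
            result = func()
        except Exception as exc:  # a real worker would catch narrower errors
            last_error = exc
            continue
        statuses[task_id] = "succeeded"
        return result
    statuses[task_id] = f"failed: {last_error}"
    return None


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky_send():
        # Fails on the first two calls, then succeeds.
        calls["count"] += 1
        if calls["count"] < 3:
            raise ConnectionError("mail server unavailable")
        return "sent"

    statuses = {}
    print(run_with_retry("job-1", flaky_send, statuses))
    print(statuses["job-1"])
```

Even this small version shows that the application must decide how many attempts to allow and what status to report after the final failure.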
I hope this article has helped you understand the task queue and its applications. Happy coding!
Reference links
https://medium.com/@thao_gotit