Create API server on Node.js with Express and MongoDB (part 2)
- Tram Ho
Illustration: when the collection has 200,000 documents
Hello everyone, continuing from the previous post: Create API with Express.js + MongoDB. I have also received requests to write a tutorial that covers both building the API and a complete front-end interface; however, I have not written that guide yet.
There is also an issue that I (and perhaps some of you) ran into with the previous post: paginating the posts list. For applications that store little data, only a few hundred thousand documents in a collection, everything still looks fine. But once the collection grows past 1 million documents, you can see the difference: response time goes up and the application feels slow. And it is indeed true:
Illustration: when the collection has 1 million documents
Illustration: when the collection reaches 2 million documents
Oh my, a request that takes up to 16 seconds to respond. Unbelievable. So in this article, we will dig in together to understand and resolve the issue above.
Impact
So, how does this affect our system?
- It directly hurts the user experience of any feature that calls the API above.
- It also indirectly affects other services that depend on this one, because of the long response time.
- More seriously, if the number of documents grows to 3, 5, or 10 million, before long the API will exceed its timeout and return no response at all. Under heavy load, memory usage also climbs and the number of requests that can be processed drops. Once the limit is reached, requests start being rejected due to overload.
Cause
First, let’s take a look at the old code that fetches the posts list:
```javascript
router.get('/posts', async (req, res) => {
  try {
    const options = {
      sort: { _id: -1 },
      limit: parseInt(req.query.limit || 20, 10),
      page: parseInt(req.query.page || 1, 10)
    }
    const posts = await Post.paginate({}, options)
    return res.send(posts)
  } catch (e) {
    return handlePageError(res, e)
  }
})
```
In the previous article, the API code that fetches the list of documents in the posts collection was quite short and simple. It only used the paginate method provided by the mongoose plugin mongoose-paginate-v2. So this issue is most likely caused by the pagination logic inside the plugin. It could also be that there is a problem in how the collection was declared, making data queries slow.
Investigate and fix
To isolate the problem, let’s try another way to fetch the data for a page. Use the traditional approach with find and limit and see how the query speed compares:
```javascript
// ....
const posts = await Post.find().limit(20)
return res.send(posts)
```
The query is very fast, responding in under 30 ms. Now we can confirm that the problem lies in the plugin’s pagination. The simplest fix is to implement pagination by hand, without the plugin, so the application runs smoothly again. You could also open an issue and a pull request to fix the bug in the plugin itself.
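To compare the two approaches yourself, it helps to measure elapsed time consistently. Here is a minimal sketch of a timing helper (a hypothetical utility, not part of the original code) that wraps any async call, such as the `Post.find().limit(20)` query above:

```javascript
// Hypothetical helper: runs an async function, logs how long it took,
// and returns both the result and the elapsed time in milliseconds.
async function timed(label, fn) {
  const start = Date.now()
  const result = await fn()
  const elapsedMs = Date.now() - start
  console.log(`${label}: ${elapsedMs} ms`)
  return { result, elapsedMs }
}
```

For example, `await timed('find+limit', () => Post.find().limit(20))` would print the query duration, which is roughly how the ~30 ms figure above can be checked.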
I will implement the pagination by hand. There are a couple of efficient ways to paginate a collection quickly:
Option 1: find with skip and limit
Use the classic pagination technique with skip and limit, where:
- limit returns exactly the number of documents in one page;
- skip jumps over the documents belonging to the pages before the current one.
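The two rules above can be sketched in memory with a plain array (hypothetical data, no database involved), which also makes the offset arithmetic explicit: page numbers are 1-based, so page 1 skips 0 documents, page 2 skips perPage documents, and so on.

```javascript
// In-memory sketch of skip/limit pagination over an array of documents.
// Note that in MongoDB, skip still has to walk past every skipped
// document, which is why very deep pages remain relatively expensive.
function paginate(docs, page, perPage) {
  const skip = (page - 1) * perPage
  return docs.slice(skip, skip + perPage)
}
```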
At this point, the code will become as follows:
```javascript
router.get('/posts', async (req, res) => {
  try {
    const perPage = parseInt(req.query.limit || 20, 10)
    const page = parseInt(req.query.page || 1, 10)

    const data = await Post.find()
      .sort({ _id: -1 })
      .skip((page - 1) * perPage)
      .limit(perPage)

    const totalDocuments = await Post.countDocuments()
    const totalPage = Math.ceil(totalDocuments / perPage)

    return res.send({
      data,
      meta: {
        page,
        perPage,
        totalDocuments,
        totalPage,
      }
    })
  } catch (e) {
    return handlePageError(res, e)
  }
})
```
In the code above, an extra query is added to count the total number of documents. If you don’t need the total, you can drop this query.
Option 2: find and limit with the last document’s id
Alternatively, we can stop relying on the current page number and instead use the id of the last document from the previous request. The logic reads like this: “Get the next 20 documents after the document whose id is xxx”.
The implementation looks like this:
```javascript
router.get('/posts', async (req, res) => {
  try {
    const lastDocumentId = req.query.last_doc_id
    const perPage = parseInt(req.query.limit || 20, 10)

    // On the very first request there is no last id yet, so no filter
    // is applied. With a descending sort on _id, the "next" documents
    // are the ones with smaller ids, hence $lt rather than $gt.
    const filter = lastDocumentId ? { _id: { $lt: lastDocumentId } } : {}

    const data = await Post.find(filter)
      .sort({ _id: -1 })
      .limit(perPage)

    return res.send({ data })
  } catch (e) {
    return handlePageError(res, e)
  }
})
```
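The “next 20 documents after id xxx” logic can be simulated in memory (hypothetical data, ids standing in for ObjectIds). Since the list is sorted by id descending, “after” the last seen document means documents with smaller ids; this is why this keyset style stays fast at any depth, as it never has to walk skipped documents:

```javascript
// In-memory sketch of keyset ("last document id") pagination.
// lastDocId === undefined means the first page: no filter is applied.
function nextPage(docs, lastDocId, perPage) {
  return docs
    .filter((d) => lastDocId === undefined || d.id < lastDocId)
    .sort((a, b) => b.id - a.id) // descending, like sort({ _id: -1 })
    .slice(0, perPage)
}
```

A client would call the endpoint with no `last_doc_id` first, then pass the id of the last document it received to fetch each subsequent page.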
Summary
After applying either of the two methods above, the request responds in about 40 ms, as fast as in the beginning. With that, we have resolved the response-time issue from part 1. Thank you very much for your interest in reading this article!
Source : viblo