Hello everyone, I just got to know the Window Functions
topic, I personally feel quite good and I appreciate it as an advanced part. Because few people know it, Window Functions
finds it very rarely used, instead of lengthy subquery like the message to a crush , and others understand that the query is as short as the answer of the crush “yes” . Now I will introduce to everyone what Window Functions
are, and which problems apply.
I. What is Window Functions?
Mysql supports window functions, ie, for each row from the query, perform calculations using the rows associated with that row. Most of the Aggregate Functions
(the functions we often use like COUNT, SUM, MAX, MIN,..) Can also be used as Window Functions
when combined with OVER()
Hmm … still a bit difficult understand, let’s go into a simple example to visualize the face of Window Functions
like.
Suppose there is 1 table of products
:
Use Aggregate Functions
– SUM as usual, to get the total number of products in stock by category
1 2 3 4 5 6 | <span class="token punctuation">(</span> Query1 <span class="token punctuation">)</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> select category <span class="token punctuation">,</span> <span class="token builtin">sum</span> <span class="token punctuation">(</span> quantity_warehouse <span class="token punctuation">)</span> <span class="token keyword">as</span> total_warehouse <span class="token keyword">from</span> products group by category |
But what if I want to get the details of the products with the total number of products in stock by category
. Normally, we will do a subquery and then join is done, right?
1 2 3 4 5 6 7 8 9 10 11 | <span class="token punctuation">(</span> Query2 <span class="token punctuation">)</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> select p <span class="token punctuation">.</span> <span class="token operator">*</span> <span class="token punctuation">,</span> tmp <span class="token punctuation">.</span> total_warehouse <span class="token keyword">from</span> products <span class="token keyword">as</span> p join <span class="token punctuation">(</span> select category <span class="token punctuation">,</span> <span class="token builtin">sum</span> <span class="token punctuation">(</span> quantity_warehouse <span class="token punctuation">)</span> <span class="token keyword">as</span> total_warehouse <span class="token keyword">from</span> products group by category <span class="token punctuation">)</span> <span class="token keyword">as</span> tmp on p <span class="token punctuation">.</span> category <span class="token operator">=</span> tmp <span class="token punctuation">.</span> category order by category |
Pretty wordy, but let’s see if you use the Window Functions
.
1 2 3 4 5 | <span class="token punctuation">(</span> Query3 <span class="token punctuation">)</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> <span class="token operator">-</span> select <span class="token operator">*</span> <span class="token punctuation">,</span> <span class="token builtin">sum</span> <span class="token punctuation">(</span> quantity_warehouse <span class="token punctuation">)</span> over <span class="token punctuation">(</span> partition by category <span class="token punctuation">)</span> <span class="token keyword">as</span> total_warehouse <span class="token keyword">from</span> products |
If you pay attention, you will see that Query1 and Query3 differ only OVER () – what it is, please calm down. But itself helps MYSQL distinguish
Window functions
andAggregate Functions
, it will create a window including rows related to it based onpartititon
to SUM.
Both ways will give the same result, but you can see that using Window Functions
is more convenient, right. If you understand the meaning and usage of Window Function
, I bet you will choose it instead of writing the long subquery
like a pump.
Note: MySQL currently only supports using Window Functions since version 8.0
II. Analyze the Window Functions component
A Window Function is defined when there is an OVER()
clause following the function call.
1 2 3 4 5 6 7 8 | OVER <span class="token punctuation">(</span> <span class="token punctuation">[</span> partition_clause <span class="token punctuation">]</span> <span class="token punctuation">[</span> order_clause <span class="token punctuation">]</span> <span class="token punctuation">[</span> frame_clause <span class="token punctuation">]</span> <span class="token punctuation">)</span> ví dụ <span class="token punctuation">:</span> OVER <span class="token punctuation">(</span> PARTITION BY category ORDER BY <span class="token builtin">id</span> ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING <span class="token punctuation">)</span> |
- partiiton_clause: For
Aggregate Functions
when we want to group related rows together, we use GROUP BY , inWindow Functions
also, that is PARTITION BY , the partition will group rows related to the current row into 1 window to perform calculations. As an example in Query3 I have used [partition by category] - order_clause: in the same way as normal ORDER BY, sorts the rows in the workbook
- frame_clouse: in case you want to limit rows in partition -> Window Frame12345678910111213141516171819202122232425frame_clause <span class="token punctuation">:</span>frame_units frame_extentframe_units <span class="token punctuation">:</span><span class="token punctuation">{</span> ROWS <span class="token operator">|</span> RANGE <span class="token punctuation">}</span><span class="token operator">-</span> ROWS <span class="token punctuation">:</span> Giá trị chênh lệch là sự khác biệt về số hàng so với số hàng hiện tại <span class="token punctuation">.</span><span class="token operator">-</span> RANGE <span class="token punctuation">:</span> Giá trị chênh lệch là sự khác biệt về giá trị hàng so với giá trị hàng hiện tại <span class="token punctuation">.</span>frame_extent <span class="token punctuation">:</span><span class="token punctuation">{</span> frame_start <span class="token operator">|</span> frame_between <span class="token punctuation">}</span>frame_between <span class="token punctuation">:</span>BETWEEN frame_start AND frame_endframe_start <span class="token punctuation">,</span> frame_end <span class="token punctuation">:</span> <span class="token punctuation">{</span>CURRENT ROW<span class="token operator">|</span> UNBOUNDED PRECEDING<span class="token operator">|</span> UNBOUNDED FOLLOWING<span class="token operator">|</span> expr PRECEDING<span class="token operator">|</span> expr FOLLOWING<span class="token punctuation">}</span><span class="token operator">-</span> CURRENT ROW <span class="token punctuation">:</span> row hiện tại<span class="token operator">-</span> UNBOUNDED PRECEDING <span class="token punctuation">:</span> tất cả các row trước <span class="token punctuation">,</span> tính cả row hiện tại<span class="token operator">-</span> UNBOUNDED FOLLOWING <span class="token punctuation">:</span> tất cả các row sau <span class="token punctuation">,</span> tính cả row hiện tại<span class="token operator">-</span> expr PRECEDING <span class="token operator">|</span> FOLLOWING <span class="token punctuation">:</span> expr row trước <span class="token operator">|</span> sau <span class="token punctuation">,</span> tính cả row hiện tại
Note: when using frame_clouse, it is imperative to use order_clause otherwise MYSQL will not know the position of the rows to create the Frame.
In the case of EVER empty, what about: OVER () -> all data of the table is considered 1 block – window This paragraph I learn it takes a lot of time to understand how it works. So I will demo an example for everyone to easily grasp. For details, everyone can see the doc here .
1 2 3 4 5 6 7 | select <span class="token operator">*</span> <span class="token punctuation">,</span> <span class="token builtin">sum</span> <span class="token punctuation">(</span> quantity_warehouse <span class="token punctuation">)</span> OVER <span class="token punctuation">(</span> ORDER BY quantity_warehouse DESC ROWS BETWEEN <span class="token number">2</span> PRECEDING AND <span class="token number">2</span> FOLLOWING <span class="token punctuation">)</span> <span class="token keyword">as</span> total <span class="token keyword">from</span> product |
In column total
, each resulting row will be the sum of the current row, the previous 2 rows and the 2 rows after. Conversely, if you replace ROWS
with RANGE
, each resulting row will be the sum of the current row for all rows with a quantity_warehouse +/- 2.
1 2 3 4 5 6 7 | select <span class="token operator">*</span> <span class="token punctuation">,</span> <span class="token builtin">sum</span> <span class="token punctuation">(</span> quantity_warehouse <span class="token punctuation">)</span> OVER <span class="token punctuation">(</span> ORDER BY quantity_warehouse DESC RANGE BETWEEN <span class="token number">2</span> PRECEDING AND <span class="token number">2</span> FOLLOWING <span class="token punctuation">)</span> <span class="token keyword">as</span> total <span class="token keyword">from</span> product |
III. Several of the Window Functions are advanced and extremely useful.
Function name | description |
---|---|
cume_dist () | (Number of records in window <= current row) / total records |
dense_rank () | Ranking of current row, with no gaps |
rank () | Rank the current row in the window based on order by, with a space |
row_number () | Numbering order in the window |
first_value (expr) | Get the first value in the window |
last_value (expr) | Get the final value in the window |
nth_value (expr, n) | Get the value of n in the window |
lag (exp, n, default) | Get the value before n units (n, default: option nullable) |
lead (exp, n, default) | Get the value then n units (n, default: option nullable) |
ntile () | Divide the window into n groups -> returns the group number |
percent_rank () | (rank – 1) / (rows – 1) |
Again : partiiton_clause groups rows related to the current row into one block – the window. There are also Aggregate Functions
when combined with OVER (). To find out the details of Window Functions
‘ operations, please see Part II in the article (Soon ^^).
The next part only has practice, so everyone try to master the knowledge so that they do not get eaten later, in the next part, we will have some extreme problems, solved after going to the interview x10 salary =)) Wait! …
Window Functions in MySQL, Advanced and extremely useful (Part II)
Thank you to everyone who followed my post! Have a nice working day!
Related document
https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html .
https://dev.mysql.com/doc/refman/8.0/en/window-functions-usage.html .
https://dev.mysql.com/doc/refman/8.0/en/window-functions-frames.html .