最近在工作上為了產報表,需要寫sql去撈每個group的top N筆紀錄。研究了一會兒後最後用了以下的解法。
假設今天有個payment
的資料表記錄了每筆交易所涉及的用戶、幣別、金額。
交易id(id) | 用戶id(uid) | 交易幣別(currency) | 金額(amount) |
---|---|---|---|
1 | 11 | NTD | 800 |
2 | 6 | NTD | 590 |
3 | 7 | NTD | 1020 |
4 | 55 | USD | 24 |
5 | 33 | USD | 27 |
6 | 2 | USD | 25 |
7 | 13 | JPY | 3870 |
8 | 10 | JPY | 2980 |
9 | 41 | JPY | 3780 |
我們想要取每個幣別的前2筆最大金額的交易記錄,我們可以這樣寫:
SELECT id, uid, currency, amount FROM ( SELECT id, uid, currency, amount, @counter := IF(@current_currency = currency, @counter + 1, 1) AS counter, @current_currency := currency AS current_currency FROM payment ORDER BY currency, amount DESC ) ordered_result WHERE counter <= 2;
細節講解
我們先從內層的子查詢看起。
SELECT id, uid, currency, amount FROM ( SELECT id, uid, currency, amount, @counter := IF(@current_currency = currency, @counter + 1, 1) AS counter, @current_currency := currency AS current_currency FROM payment ORDER BY currency, amount DESC ) ordered_result WHERE counter <= 2;
DB會先對payment
表根據幣別currency
以及交易金額amount
進行排序,其中amount
是descending order由大排到小,如下。
交易id(id) | 用戶id(uid) | 交易幣別(currency) | 金額(amount) |
---|---|---|---|
7 | 13 | JPY | 3870 |
9 | 41 | JPY | 3780 |
8 | 10 | JPY | 2980 |
3 | 7 | NTD | 1020 |
1 | 11 | NTD | 800 |
2 | 6 | NTD | 590 |
5 | 33 | USD | 27 |
6 | 2 | USD | 25 |
4 | 55 | USD | 24 |
當排序好後,子查詢的第二步DB會select抓出id
、uid
、currency
、amount
,其中還多了變數@counter
及@current_currency
。
SELECT id, uid, currency, amount FROM ( SELECT id, uid, currency, amount, @counter := IF(@current_currency = currency, @counter + 1, 1) AS counter, @current_currency := currency AS current_currency FROM payment ORDER BY currency, amount DESC ) ordered_result WHERE counter <= 2;
那兩個變數是做什麼用的勒,在這裡@counter
被我們拿來從1
開始記數,由於我們剛剛已經對currency
以及amount
排序過了,所以第一筆資料的@counter
值就會是1
,第二筆資料的@counter
值就會是2
,依此類推。
不過光靠@counter
還是不夠的,我們希望的是對每個group計數,因此還要另一個變數協助@counter
,當計數到新的group時@counter
自動回歸到1
,這就是我們引入第二個變數@current_currency
的用途。
有了這兩個變數後,我們來解釋第4行第5行變數的賦值邏輯:
一開始當SELECT語句排序完踏入第一筆紀錄時,@counter
及@current_currency
沒有給定預設值,所以default是NULL
。這會使得第4行的判斷式@current_currency = currency
為false
(NULL
!= JPY
),並將@counter
回歸至1
。第5行將目前的@current_currency
設定為JPY
。
接著進入第二筆紀錄,這次第4行的判斷式@current_currency = currency
為true
(JPY
= JPY
),所以@counter
= @counter
+ 1
= 2
。第5行依然將目前的@current_currency
設定為JPY
。
第三筆紀錄也是一樣的邏輯。
當到第四筆紀錄時,這次第4行的判斷式@current_currency = currency
是false
(JPY
!= NTD
),此時@counter
被回歸到1
。第5行將目前的@current_currency
設定為NTD
。
之後的紀錄也都是用這個邏輯一直到最後一筆。整個子查詢出來的東西會長這樣。
交易id(id) | 用戶id(uid) | 交易幣別(currency) | 金額(amount) | counter | current_currency |
---|---|---|---|---|---|
7 | 13 | JPY | 3870 | 1 | JPY |
9 | 41 | JPY | 3780 | 2 | JPY |
8 | 10 | JPY | 2980 | 3 | JPY |
3 | 7 | NTD | 1020 | 1 | NTD |
1 | 11 | NTD | 800 | 2 | NTD |
2 | 6 | NTD | 590 | 3 | NTD |
5 | 33 | USD | 27 | 1 | USD |
6 | 2 | USD | 25 | 2 | USD |
4 | 55 | USD | 24 | 3 | USD |
接下來我們來看主查詢的部份。
SELECT id, uid, currency, amount FROM ( SELECT id, uid, currency, amount, @counter := IF(@current_currency = currency, @counter + 1, 1) AS counter, @current_currency := currency AS current_currency FROM payment ORDER BY currency, amount DESC ) ordered_result WHERE counter <= 2;
主查詢就單純了許多,由於剛剛已經在子查詢對每個幣別的交易記錄金額由大到小計數過了,這時如果我們今天想要每個幣別的前2筆最大金額的交易記錄時,只要將第9行的條件counter <= 2
,就可以輕鬆撈出來了:)
交易id(id) | 用戶id(uid) | 交易幣別(currency) | 金額(amount) |
---|---|---|---|
7 | 13 | JPY | 3870 |
9 | 41 | JPY | 3780 |
3 | 7 | NTD | 1020 |
1 | 11 | NTD | 800 |
5 | 33 | USD | 27 |
6 | 2 | USD | 25 |
Hello,
Do you speak English? I would like to talk about sponsored post. Can we talk in English?