最近在工作上為了產報表,需要寫sql去撈每個group的top N筆紀錄。研究了一會兒後最後用了以下的解法。
假設今天有個payment的資料表記錄了每筆交易所涉及的用戶、幣別、金額。
| 交易id(id) | 用戶id(uid) | 交易幣別(currency) | 金額(amount) |
|---|---|---|---|
| 1 | 11 | NTD | 800 |
| 2 | 6 | NTD | 590 |
| 3 | 7 | NTD | 1020 |
| 4 | 55 | USD | 24 |
| 5 | 33 | USD | 27 |
| 6 | 2 | USD | 25 |
| 7 | 13 | JPY | 3870 |
| 8 | 10 | JPY | 2980 |
| 9 | 41 | JPY | 3780 |
我們想要取每個幣別的前2筆最大金額的交易記錄,我們可以這樣寫:
SELECT id, uid, currency, amount
FROM (
SELECT id, uid, currency, amount,
@counter := IF(@current_currency = currency, @counter + 1, 1) AS counter,
@current_currency := currency AS current_currency
FROM payment
ORDER BY currency, amount DESC
) ordered_result
WHERE counter <= 2;
細節講解
我們先從內層的子查詢看起。
SELECT id, uid, currency, amount
FROM (
SELECT id, uid, currency, amount,
@counter := IF(@current_currency = currency, @counter + 1, 1) AS counter,
@current_currency := currency AS current_currency
FROM payment
ORDER BY currency, amount DESC
) ordered_result
WHERE counter <= 2;
DB會先對payment表根據幣別currency以及交易金額amount進行排序,其中amount是descending order由大排到小,如下。
| 交易id(id) | 用戶id(uid) | 交易幣別(currency) | 金額(amount) |
|---|---|---|---|
| 7 | 13 | JPY | 3870 |
| 9 | 41 | JPY | 3780 |
| 8 | 10 | JPY | 2980 |
| 3 | 7 | NTD | 1020 |
| 1 | 11 | NTD | 800 |
| 2 | 6 | NTD | 590 |
| 5 | 33 | USD | 27 |
| 6 | 2 | USD | 25 |
| 4 | 55 | USD | 24 |
當排序好後,子查詢的第二步DB會select抓出id、uid、currency、amount,其中還多了變數@counter及@current_currency。
SELECT id, uid, currency, amount
FROM (
SELECT id, uid, currency, amount,
@counter := IF(@current_currency = currency, @counter + 1, 1) AS counter,
@current_currency := currency AS current_currency
FROM payment
ORDER BY currency, amount DESC
) ordered_result
WHERE counter <= 2;
那兩個變數是做什麼用的勒,在這裡@counter被我們拿來從1開始記數,由於我們剛剛已經對currency以及amount排序過了,所以第一筆資料的@counter值就會是1,第二筆資料的@counter值就會是2,依此類推。
不過光靠@counter還是不夠的,我們希望的是對每個group計數,因此還要另一個變數協助@counter,當計數到新的group時@counter自動回歸到1,這就是我們引入第二個變數@current_currency的用途。
有了這兩個變數後,我們來解釋第4行第5行變數的賦值邏輯:
一開始當SELECT語句排序完踏入第一筆紀錄時,@counter及@current_currency沒有給定預設值,所以default是NULL。這會使得第4行的判斷式@current_currency = currency為false(NULL != JPY),並將@counter回歸至1。第5行將目前的@current_currency設定為JPY。
接著進入第二筆紀錄,這次第4行的判斷式@current_currency = currency為true(JPY = JPY),所以@counter = @counter + 1 = 2。第5行依然將目前的@current_currency設定為JPY。
第三筆紀錄也是一樣的邏輯。
當到第四筆紀錄時,這次第4行的判斷式@current_currency = currency是false(JPY != NTD),此時@counter被回歸到1。第5行將目前的@current_currency設定為NTD。
之後的紀錄也都是用這個邏輯一直到最後一筆。整個子查詢出來的東西會長這樣。
| 交易id(id) | 用戶id(uid) | 交易幣別(currency) | 金額(amount) | counter | current_currency |
|---|---|---|---|---|---|
| 7 | 13 | JPY | 3870 | 1 | JPY |
| 9 | 41 | JPY | 3780 | 2 | JPY |
| 8 | 10 | JPY | 2980 | 3 | JPY |
| 3 | 7 | NTD | 1020 | 1 | NTD |
| 1 | 11 | NTD | 800 | 2 | NTD |
| 2 | 6 | NTD | 590 | 3 | NTD |
| 5 | 33 | USD | 27 | 1 | USD |
| 6 | 2 | USD | 25 | 2 | USD |
| 4 | 55 | USD | 24 | 3 | USD |
接下來我們來看主查詢的部份。
SELECT id, uid, currency, amount
FROM (
SELECT id, uid, currency, amount,
@counter := IF(@current_currency = currency, @counter + 1, 1) AS counter,
@current_currency := currency AS current_currency
FROM payment
ORDER BY currency, amount DESC
) ordered_result
WHERE counter <= 2;
主查詢就單純了許多,由於剛剛已經在子查詢對每個幣別的交易記錄金額由大到小計數過了,這時如果我們今天想要每個幣別的前2筆最大金額的交易記錄時,只要將第9行的條件counter <= 2,就可以輕鬆撈出來了:)
| 交易id(id) | 用戶id(uid) | 交易幣別(currency) | 金額(amount) |
|---|---|---|---|
| 7 | 13 | JPY | 3870 |
| 9 | 41 | JPY | 3780 |
| 3 | 7 | NTD | 1020 |
| 1 | 11 | NTD | 800 |
| 5 | 33 | USD | 27 |
| 6 | 2 | USD | 25 |
Hello,
Do you speak English? I would like to talk about sponsored post. Can we talk in English?