Computing NDCG with scikit-learn

Introduction

NDCG is an important metric for measuring ranking quality. This post uses practical Python examples to show how to compute it.

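Example: computing NDCG with scikit-learn

Note that ndcg_score takes both of its arguments as a list of lists. The reason is explained further down.

To evaluate a single ranking request, use the following example: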

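from sklearn.metrics import ndcg_score

# True relevance of each document, and the score predicted for each document.
y_truth = [0, 1, 2, 3]
y_predict = [0, 1, 2, 3]

# Each argument is wrapped in an outer list: one inner list per ranking request.
ndcg = ndcg_score([y_truth], [y_predict])
print(f'ndcg = {ndcg}')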
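
To make the role of the two arguments more concrete, here is a small sketch. The relevance labels and scores are made up for illustration. It relies on ndcg_score treating its second argument as predicted scores (a higher score ranks a document earlier) and on its optional k parameter, which limits the evaluation to the top k positions.

```python
from sklearn.metrics import ndcg_score

# Hypothetical data: true relevance of four documents in one request.
y_truth = [3, 2, 1, 0]

# Scores that rank the documents in the ideal order, and in the reverse order.
perfect_scores = [0.9, 0.6, 0.3, 0.1]
flipped_scores = [0.1, 0.3, 0.6, 0.9]

ndcg_perfect = ndcg_score([y_truth], [perfect_scores])
ndcg_flipped = ndcg_score([y_truth], [flipped_scores])

# Only the top 2 ranked documents count when k=2.
ndcg_flipped_at_2 = ndcg_score([y_truth], [flipped_scores], k=2)

print(f'perfect = {ndcg_perfect}')
print(f'flipped = {ndcg_flipped}')
print(f'flipped@2 = {ndcg_flipped_at_2}')
```

The ideal ordering should score close to 1, and the reversed ordering scores lower, especially when only the top of the list is evaluated.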

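If there are multiple ranking requests, for example all the ranking requests logged over one day, use the following example (assuming 3 ranking requests):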

from sklearn.metrics import ndcg_score

# One (truth, prediction) pair per ranking request.
y_truth1 = [0, 1, 2, 3]
y_predict1 = [0, 1, 2, 3]

y_truth2 = [0, 1, 2, 3]
y_predict2 = [0, 1, 2, 3]

y_truth3 = [0, 1, 2, 3]
y_predict3 = [0, 1, 2, 3]

# Pass all requests at once: the outer list holds one inner list per request.
ndcg = ndcg_score(
    [y_truth1, y_truth2, y_truth3],
    [y_predict1, y_predict2, y_predict3]
)
print(f'ndcg = {ndcg}')

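The 3 rankings produce 3 NDCG values. The final result is the mean of all of them, returned as the overall NDCG for the whole set of requests. This is why the arguments are lists of lists: each inner list is one ranking request.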
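
The averaging can be checked with a short sketch. The labels and scores below are invented for illustration. The code scores each request on its own and compares the mean of those values with what a single batched call returns.

```python
import numpy as np
from sklearn.metrics import ndcg_score

# Hypothetical relevance labels and predicted scores for three requests.
y_true = [
    [3, 2, 1, 0],
    [0, 1, 2, 3],
    [1, 0, 3, 2],
]
y_score = [
    [0.9, 0.8, 0.2, 0.1],
    [0.9, 0.8, 0.2, 0.1],
    [0.5, 0.4, 0.3, 0.2],
]

# One call over all requests.
batch_ndcg = ndcg_score(y_true, y_score)

# One call per request, then averaged by hand.
per_request = [ndcg_score([t], [s]) for t, s in zip(y_true, y_score)]
mean_ndcg = float(np.mean(per_request))

print(f'batch = {batch_ndcg}, mean of per-request = {mean_ndcg}')
```

The two printed numbers should agree up to floating-point rounding.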

Also note that every ranking request list must contain the same number of elements, otherwise ndcg_score raises an error.
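
One possible workaround when requests have different lengths, not part of the examples above, is to score each request separately and average the results yourself. The helper name mean_ndcg_ragged and the data below are hypothetical, and the sketch assumes every request contains at least two documents.

```python
import numpy as np
from sklearn.metrics import ndcg_score


def mean_ndcg_ragged(y_true_lists, y_score_lists):
    """Average per-request NDCG for requests of different lengths.

    Hypothetical helper: each request is scored with its own ndcg_score
    call, and every request gets equal weight in the mean.
    """
    scores = [
        ndcg_score([t], [s])
        for t, s in zip(y_true_lists, y_score_lists)
    ]
    return float(np.mean(scores))


# Made-up data: one request with 4 documents, one with 3.
y_true_lists = [[3, 2, 1, 0], [1, 0, 2]]
y_score_lists = [[0.9, 0.7, 0.4, 0.1], [0.2, 0.1, 0.8]]

result = mean_ndcg_ragged(y_true_lists, y_score_lists)
print(f'mean ndcg = {result}')
```

Both made-up requests are ranked in the ideal order here, so the mean should come out close to 1.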
