Extreme Multi-label Classification: Evaluation Metrics

References:
http://manikvarma.org/downloads/XC/XMLRepository.html
https://blog.csdn.net/minfanphd/article/details/126737848?spm=1001.2014.3001.5502
https://en.wikipedia.org/wiki/Discounted_cumulative_gain
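What is extreme multi-label classification (eXtreme multi-label Classification, XC)?
The number of labels is very large (on the order of millions); a typical case is bag-of-words (BoW) labels.
A typical application of XC is image captioning (a headache). In image captioning, though, the words are ordered. XC can be seen as a key stage of image captioning: it selects the bag of words most relevant to the current image.
(All of the above comes from past experience; I have not surveyed the recent literature.)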
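Let's first look at the evaluation metrics:
Because the number of labels is very large while the ground truth of each data point is very small, the usual classification metrics such as accuracy and recall (for multi-label classification, their macro- or micro-averaged versions) do not work.
The metrics used for XC usually take head/tail labels into account, i.e., high-frequency and low-frequency labels, and possibly also the removal of reciprocal pairs?
Reciprocal pairs seem (?) to be pairs of mutually related labels: for a given data point, if label A is predicted and label B is related to A, then B can naturally be predicted as well.
To avoid such trivial predictions, reciprocal pairs should be removed.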
(1) Top-$k$ Performance:

$$\text{(Precision@$k$)}\quad \text{P}@k := \frac{1}{k}\sum_{l \in \text{rank}_k(\hat{\mathbf{y}})} \mathbf{y}_l$$

$$\text{(Discounted Cumulative Gain)}\quad \text{DCG}@k := \sum_{l \in \text{rank}_k(\hat{\mathbf{y}})} \frac{\mathbf{y}_l}{\log(l+1)}$$

$$\text{(Normalized DCG)}\quad \text{nDCG}@k := \frac{\text{DCG}@k}{\sum_{l=1}^{\min(k,\|\mathbf{y}\|_0)} \frac{1}{\log(l+1)}}$$

Here $\text{rank}_k(\mathbf{y})$ returns the indices of the $k$ largest entries of $\mathbf{y}$, sorted in descending order. Note: the $l$ inside the discount $\log(l+1)$ of the DCG formula is not actually the label index $l$, but the rank position, which runs from 1 to $k$.
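To make the three top-$k$ metrics above concrete, here is a minimal pure-Python sketch for a single data point. The function names (`rank_k`, `precision_at_k`, `dcg_at_k`, `ndcg_at_k`), the toy vectors, the tie-breaking rule, and the use of the natural logarithm are my own assumptions, not taken from the references; the log base cancels in nDCG anyway.

```python
import math


def rank_k(scores, k):
    """Return the indices of the k largest scores, highest first.

    Ties are broken by the lower index (an arbitrary choice for this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def precision_at_k(y_true, scores, k):
    """P@k: fraction of the top-k predicted labels that are relevant."""
    return sum(y_true[i] for i in rank_k(scores, k)) / k


def dcg_at_k(y_true, scores, k):
    """DCG@k: relevant labels discounted by log(position + 1), position = 1..k."""
    top = rank_k(scores, k)
    return sum(y_true[i] / math.log(pos + 1) for pos, i in enumerate(top, start=1))


def ndcg_at_k(y_true, scores, k):
    """nDCG@k: DCG@k divided by the best DCG achievable for this data point."""
    n_relevant = sum(1 for v in y_true if v != 0)  # ||y||_0
    ideal = sum(1.0 / math.log(pos + 1) for pos in range(1, min(k, n_relevant) + 1))
    return dcg_at_k(y_true, scores, k) / ideal if ideal > 0 else 0.0


# Toy example (made-up numbers): 6 labels, 3 of them relevant.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.1, 0.2, 0.3]

p3 = precision_at_k(y_true, scores, 3)   # top-3 = labels 0, 1, 2; two are relevant
ndcg3 = ndcg_at_k(y_true, scores, 3)
print(p3, ndcg3)
```

In this toy case the top-3 predictions are labels 0, 1, 2, of which two are relevant, so P@3 is 2/3. The nDCG@3 is below 1 because the irrelevant label 1 sits at position 2; a ranking that puts all three relevant labels first would score exactly 1.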
Labels ranked lower are discounted logarithmically; put plainly, this is just a weighting. Why log? Two facts: 1. it gives a smooth discount; 2. Wang et al. provided theoretical support for the logarithmic discount. The authors show that for every pair of substantially different ranking functions, the nDCG can decide which one is better in a consistent manner. (I don't understand this yet, so I'll leave it for now.)

(2) Top-$k$ Propensity-score:
Some datasets contain labels with very high frequency (usually called head labels), so a high $\text{P}@k$ can be achieved by simply predicting the head labels over and over. Propensity-scored metrics can check for this kind of trivial behavior.
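The head-label effect described above is easy to reproduce on a toy example. The sketch below uses a made-up label matrix `Y` (not a real dataset) and a hypothetical "constant predictor" that scores every label by its frequency and ignores the input entirely; the names are my own.

```python
def precision_at_k(y_true, scores, k):
    """P@k for one data point: fraction of the top-k scored labels that are relevant."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sum(y_true[i] for i in order[:k]) / k


# Made-up label matrix: rows are data points, columns are labels.
# Label 0 is a head label (present in 4 of 5 points); the others are tail labels.
Y = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]

# Constant predictor: the same scores for every data point, equal to label frequency.
label_freq = [sum(col) for col in zip(*Y)]

mean_p1 = sum(precision_at_k(y, label_freq, 1) for y in Y) / len(Y)
print(label_freq, mean_p1)
```

Always predicting label 0 is correct for four of the five points, so the mean P@1 is 0.8 even though the predictor has learned nothing about the individual data points.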
$$\text{(Propensity-score Precision)}\quad \text{PSP}@k := \frac{1}{k} \sum_{l \in \text{rank}_k(\hat{\mathbf{y}})} \frac{\mathbf{y}_l}{p_l}$$

$$\text{PSDCG}@k := \sum_{l \in \text{rank}_k(\hat{\mathbf{y}})} \frac{\mathbf{y}_l}{p_l \log(l+1)}$$

$$\text{PSnDCG}@k := \frac{\text{PSDCG}@k}{\sum_{l=1}^{k} \frac{1}{\log(l+1)}}$$
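A minimal pure-Python sketch of the three propensity-scored metrics for a single data point follows. I assume here that $p_l$ is a per-label propensity in $(0, 1]$ that is large for head labels and small for tail labels, and that it is already given; how it is estimated is outside this sketch. The function names and all numbers are made up for illustration, and the natural logarithm is an assumption.

```python
import math


def rank_k(scores, k):
    """Indices of the k largest scores, highest first (ties: lower index first)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def psp_at_k(y_true, scores, propensities, k):
    """PSP@k: like P@k, but each hit on label l is weighted by 1 / p_l."""
    return sum(y_true[i] / propensities[i] for i in rank_k(scores, k)) / k


def psdcg_at_k(y_true, scores, propensities, k):
    """PSDCG@k: propensity-weighted hits discounted by log(position + 1)."""
    top = rank_k(scores, k)
    return sum(
        y_true[i] / (propensities[i] * math.log(pos + 1))
        for pos, i in enumerate(top, start=1)
    )


def psndcg_at_k(y_true, scores, propensities, k):
    """PSnDCG@k: PSDCG@k divided by sum_{l=1..k} 1 / log(l + 1)."""
    norm = sum(1.0 / math.log(pos + 1) for pos in range(1, k + 1))
    return psdcg_at_k(y_true, scores, propensities, k) / norm


# Made-up example: label 0 is a head label, label 3 is a tail label; both are relevant.
y_true = [1, 0, 0, 1]
propensities = [1.0, 0.5, 0.5, 0.25]   # assumed values, not estimated from data

head_scores = [0.9, 0.5, 0.1, 0.2]     # ranks the head label first
tail_scores = [0.2, 0.1, 0.5, 0.9]     # ranks the tail label first

psp_head = psp_at_k(y_true, head_scores, propensities, 1)
psp_tail = psp_at_k(y_true, tail_scores, propensities, 1)
print(psp_head, psp_tail)
```

Both rankings have P@1 equal to 1, but PSP@1 is 1.0 for the head-label prediction and 4.0 for the tail-label prediction, so a correct tail label is rewarded more. As the example shows, the value as defined above is not bounded by 1.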