Extreme Multi-Label Classification: Evaluation Metrics
References:
http://manikvarma.org/downloads/XC/XMLRepository.html
https://blog.csdn.net/minfanphd/article/details/126737848?spm=1001.2014.3001.5502
https://en.wikipedia.org/wiki/Discounted_cumulative_gain

What is eXtreme multi-label Classification (XC)?
The number of labels is very large (on the order of millions); bag-of-words (BoW) labels are the typical case.
A typical application of XC is image captioning (a headache). In image captioning, however, the words are ordered. XC can be viewed as a key stage of image captioning: it selects the bag of words most relevant to the current image.
(All of the above comes from past experience; I have not surveyed the recent literature.)
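Let us first look at the evaluation metrics.
Because the number of labels is huge while the ground-truth set of each data point is very small, the usual classification metrics (macro- or micro-averaged accuracy or recall, as used in multi-label classification) do not work well.
The metrics used for XC usually take head/tail labels into account, i.e. high-frequency and low-frequency labels, and apparently also the removal of reciprocal pairs.
Reciprocal pairs seem to be pairs of labels that are related to each other: for a given data point, if label A is predicted and label B is related to A, then B can be predicted for free.
To avoid such trivial predictions, reciprocal pairs should be removed.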
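To make the point about ordinary accuracy concrete, here is a small arithmetic sketch. The label count and the number of relevant labels are made-up numbers for illustration, not taken from any dataset.

```python
# Hypothetical sizes: one million labels, only five of them relevant.
num_labels = 1_000_000
num_relevant = 5

# A useless predictor that outputs "not relevant" for every label
# is wrong only on the five relevant labels.
accuracy = (num_labels - num_relevant) / num_labels

# Yet it retrieves none of the relevant labels.
recall = 0 / num_relevant

print(f"accuracy = {accuracy:.6f}, recall = {recall:.1f}")
```

Per-label accuracy is almost perfect even though nothing useful was predicted, which is why XC is evaluated on the top of the ranking instead.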
(1) Top-k Performance:
Precision@$k$:
$$\text{P}@k := \frac{1}{k}\sum_{l \in \text{rank}_k(\hat{\mathbf{y}})} \mathbf{y}_l$$
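A minimal pure-Python sketch of P@k, assuming a binary ground-truth vector and one real-valued score per label. The helper name `top_k_indices` and the toy data are my own choices for illustration, not part of any library.

```python
def top_k_indices(scores, k):
    """Indices of the k highest scores, best first (rank_k in the text)."""
    # sorted() is stable, so tied scores keep their original label order.
    return sorted(range(len(scores)), key=lambda l: -scores[l])[:k]


def precision_at_k(y_true, scores, k):
    """P@k: fraction of the top-k predicted labels that are relevant."""
    top = top_k_indices(scores, k)
    return sum(y_true[l] for l in top) / k


# Toy data point with 6 labels; labels 1 and 4 are relevant.
y_true = [0, 1, 0, 0, 1, 0]
scores = [0.1, 0.9, 0.8, 0.3, 0.7, 0.2]

p1 = precision_at_k(y_true, scores, 1)  # top-1 is label 1, which is relevant
p3 = precision_at_k(y_true, scores, 3)  # top-3 is labels 1, 2, 4; two are relevant
print(p1, p3)
```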
Discounted Cumulative Gain (DCG):
$$\text{DCG}@k := \sum_{l \in \text{rank}_k(\hat{\mathbf{y}})} \frac{\mathbf{y}_l}{\log(l+1)}$$
Normalized DCG (nDCG):
$$\text{nDCG}@k := \frac{\text{DCG}@k}{\sum_{l=1}^{\min(k,\|\mathbf{y}\|_0)} \frac{1}{\log(l+1)}}$$
Here $\text{rank}_k(\mathbf{y})$ returns the indices of the $k$ largest entries of $\mathbf{y}$, in descending order. Note: the $l$ in the denominator of the DCG formula is not really the label index; it is the rank position, running from 1 to $k$.
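The two formulas can be sketched as follows. This assumes binary relevance and a base-2 logarithm (the convention in the Wikipedia article; the formulas above do not fix the base, and the base cancels in nDCG anyway). The discount uses the rank position, as the note above says. Function names and toy data are my own.

```python
import math


def top_k_indices(scores, k):
    """Indices of the k highest scores, best first (rank_k in the text)."""
    return sorted(range(len(scores)), key=lambda l: -scores[l])[:k]


def dcg_at_k(y_true, scores, k):
    """DCG@k with a log2 discount on the 1-based rank position."""
    top = top_k_indices(scores, k)
    return sum(y_true[l] / math.log2(pos + 1)
               for pos, l in enumerate(top, start=1))


def ndcg_at_k(y_true, scores, k):
    """nDCG@k: DCG@k divided by the best DCG achievable for this data point."""
    num_relevant = sum(1 for v in y_true if v != 0)  # ||y||_0
    ideal = sum(1.0 / math.log2(pos + 1)
                for pos in range(1, min(k, num_relevant) + 1))
    return dcg_at_k(y_true, scores, k) / ideal if ideal > 0 else 0.0


y_true = [0, 1, 0, 0, 1, 0]
scores = [0.1, 0.9, 0.8, 0.3, 0.7, 0.2]  # top-3 is labels 1, 2, 4

dcg3 = dcg_at_k(y_true, scores, 3)    # hits at positions 1 and 3
ndcg3 = ndcg_at_k(y_true, scores, 3)  # below 1 because position 2 is a miss
print(dcg3, ndcg3)
```

A ranking that places both relevant labels at positions 1 and 2 would reach nDCG@3 = 1.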
Labels further down the ranking are discounted logarithmically; put simply, this is a weighting. Why log? Two facts: 1. it gives a smooth reduction; 2. Wang et al. provide theoretical support for the logarithmic discount. The authors show that for every pair of substantially different ranking functions, the nDCG can decide which one is better in a consistent manner. (I do not follow this yet, so I will leave it for now.)

(2) Top-k Propensity-score:
Some datasets contain labels with very high frequency (usually called head labels), and a high $\text{P}@k$ can be achieved simply by predicting the head labels over and over. Propensity-scored metrics can check for this trivial behaviour.
Propensity-score Precision, where $p_l$ is the propensity score of label $l$:
$$\text{PSP}@k := \frac{1}{k} \sum_{l\in \text{rank}_k(\hat{\mathbf{y}})} \frac{\mathbf{y}_l}{p_l}$$
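A small sketch of PSP@k that shows the effect on head versus tail labels. The propensity values below are invented for illustration; as far as I understand, in practice they are estimated from label frequencies, with frequent labels getting values close to 1 and rare labels getting small values.

```python
def top_k_indices(scores, k):
    """Indices of the k highest scores, best first (rank_k in the text)."""
    return sorted(range(len(scores)), key=lambda l: -scores[l])[:k]


def psp_at_k(y_true, scores, propensities, k):
    """PSP@k: like P@k, but each hit on label l is weighted by 1 / p_l."""
    top = top_k_indices(scores, k)
    return sum(y_true[l] / propensities[l] for l in top) / k


# Labels 0 and 2 are relevant. Label 0 is frequent, label 2 is rare.
y_true = [1, 0, 1, 0]
propensities = [0.5, 0.5, 0.25, 0.25]  # hypothetical values

head_scores = [0.9, 0.1, 0.2, 0.0]  # ranks the frequent label first
tail_scores = [0.2, 0.1, 0.9, 0.0]  # ranks the rare label first

psp_head = psp_at_k(y_true, head_scores, propensities, 1)
psp_tail = psp_at_k(y_true, tail_scores, propensities, 1)
print(psp_head, psp_tail)
```

Both predictors have P@1 = 1, but the one that finds the rare label gets the higher PSP@1. Note that the raw value from this formula can exceed 1 because $p_l \le 1$.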
$$\text{PSDCG}@k := \sum_{l \in \text{rank}_k(\hat{\mathbf{y}})} \frac{\mathbf{y}_l}{p_l\log(l+1)}$$
$$\text{PSnDCG}@k := \frac{\text{PSDCG}@k}{\sum_{l=1}^{k} \frac{1}{\log(l+1)}}$$
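The propensity-scored DCG variants can be sketched the same way. This follows the formulas as written above, with the PSnDCG denominator summed over all $k$ positions. It again assumes a base-2 logarithm, a discount on the rank position, and made-up propensity values; the function names are mine.

```python
import math


def top_k_indices(scores, k):
    """Indices of the k highest scores, best first (rank_k in the text)."""
    return sorted(range(len(scores)), key=lambda l: -scores[l])[:k]


def psdcg_at_k(y_true, scores, propensities, k):
    """PSDCG@k: DCG@k with each hit on label l additionally divided by p_l."""
    top = top_k_indices(scores, k)
    return sum(y_true[l] / (propensities[l] * math.log2(pos + 1))
               for pos, l in enumerate(top, start=1))


def psndcg_at_k(y_true, scores, propensities, k):
    """PSnDCG@k: PSDCG@k divided by the sum of the k position discounts."""
    norm = sum(1.0 / math.log2(pos + 1) for pos in range(1, k + 1))
    return psdcg_at_k(y_true, scores, propensities, k) / norm


y_true = [1, 0, 1, 0]
propensities = [0.5, 0.5, 0.25, 0.25]  # hypothetical values
scores = [0.9, 0.1, 0.8, 0.0]          # top-2 is labels 0 and 2

psdcg2 = psdcg_at_k(y_true, scores, propensities, 2)
psndcg2 = psndcg_at_k(y_true, scores, propensities, 2)
print(psdcg2, psndcg2)
```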