$ r9 \* d4 {# a" M- c并将是否违约插入到最后一列# w- J/ k+ q% e; u
( M- H$ V; c* ~' \: m8 x3 D# c
+ L. x. ^; U; @9 u6 d
# 提取是否违约的列表 5 }9 }+ w) B9 `, Rm = []9 Q- W& Z% {. y" h; V" t! c
for name in code_list: 3 D2 _) e- [3 a+ I m.append(datas[datas['code']==name]['break_contract'].tolist()[0])7 f6 p- {$ l) u; T9 y9 e
df1.loc[:,len(df1)] = m 0 M- k+ G# \4 u& T8 h df1.to_excel(writer3,sheet_name='sheet1',index=False)! o6 ~0 P/ B# I
建立模型7 u! U: Y; u; I" r
Logistics违约率预测模型* G r: S0 ^# n
使用Logistics违约预测模型,代入所有的指标数据为自变量,是否违约为因变量,预测出违约率。 - a/ T4 z0 R; X4 ~( E: C" n s1 U- G
% U- i' P( r5 Y& NX=datas[['进项发票作废率','进项负数发票率','进项每月平均交易额','进项每月交易次数','销项发票作废率','销项负数发票率','销项每月平均交易额','销项每月交易次数','销售收入增长率']] 8 |' t/ y1 {, W5 p9 M$ n9 L Ny=datas['是否违约']$ a4 A" Q& v. q3 V
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.2,random_state=2020)5 L! K- B4 E. c5 S0 ^5 i
X_validation, X_test, y_validation, y_test = train_test_split(X_test,y_test, test_size=0.1,random_state=2020)9 Y, z! Z% o7 z+ T( v
model = LogisticRegression()- L+ A# {% \2 Z& b3 q
model.fit(X_train,y_train) ) v: Z! V8 t2 ]0 b ~2 s$ B9 w6 U2 Aa=model.predict_proba(X_validation) 4 V8 n+ p5 S" d! h: P" nresult=[] 3 t! n8 t2 K. j2 A/ d# T! Cfor i in range(len(a)): - Y' X2 a4 A. o* N# L- `8 q5 k7 i6 xif a[1]>0.5: 3 b5 g5 M: t8 k2 D6 ~& L result.append(1)/ @& _7 a# B/ J2 p! @' I5 ]/ q
else:# q: b. }% l) @3 A
result.append(0) ( ?( R3 _" a2 k$ T. Lfrom sklearn import metrics, s, j K( b% K( ~
print('误差: %.4f' % (1-metrics.recall_score(y_validation,result,average='weighted'))) ( P; \* O+ w2 c3 O$ E最终得到一张我们的分析表格* s9 w% G9 c# A2 X+ O: T
8 R8 i2 W8 C9 _
; J h! U5 B6 ]+ h7 x 0 [% X. ?, b" v 0 h2 ?9 W" h" Y' m- x d# \& r- @* M$ a: q. O# l$ R' h; J
通过预测是否违约,我们就能解决贷不贷款的问题。0 O$ V0 V7 j* c
4 F' l; Q4 B9 Q0 P5 w$ m $ _( z; c q9 A( J* {* P贷款金额# Y4 v" d* _; D) @
贷款金额的确认,根据该公司不违约率在所有公司中的权重,乘以总贷款金额确认: - i- n2 E" _- P0 Z4 L' ]& D + l# h, X; W) ]# r* X: h& D/ _% b A% r7 A$ Z0 z' H% @ 1 Z+ R: Y" R: T( W
0 v9 m! G7 h. U9 s2 S* P% `" P
' n' O8 [9 o: m7 |% ~ri=1−Zi∑123j=1(1−Zi)×M; n% C8 v2 N& p% e6 i
ri=1−Zi∑j=1123(1−Zi)×M8 O. R/ R1 e) E: \9 D
因此,我们得到的贷款金额是违约率和贷款总金额组成的关系式,这在第二问中能起到重要作用。+ J* H, `5 `: @3 F5 p7 u( q9 u
) z( z' \3 m3 O1 f0 L( P7 c3 u, @' [
贷款年利率5 C2 J+ z. c' v4 \* d; ~/ Z+ b
绘制出年利率与客户流失率图,可以分析出两者应该是有关系的。利用SPSS拟合出不同信誉等级,年利率与客户流失率的关系式。1 @6 V* m l' F4 k' E
* }$ K1 L6 W" k9 F+ j& j5 T3 u8 x _5 m, X8 p9 Q$ f e0 t
信誉等级 R平方 关系式. H. k" {) M: n: j% s Q+ @
A 0.9977 y = 37.97x^3-258.57x^2+640.944*x -1.121 1 _! w* U' }1 Z4 T. eB 0.9982 y = 33.995x^3-225.051x^2+552.829*x-1.017# Y1 F# _) D+ C
C 0.9982 y = 32.157x^3-207.386x^2+504.717*x-0.973$ d& Y1 `* \2 x* U% M2 `
银行获利=贷款金额x贷款年利率x(1-利率对于信誉评级客户流失率) W3 l P( k3 } o, A! A) f
u2 T' B/ g) y9 q0 G9 h% Y
0 C% R0 w, q: }% y. w2 b
在贷款金额确认,贷款年利率范围在0.4~1.5的情况下,利用上面拟合的关系式,我们能够暴力跑出最优年利率。4 z- z7 E" [( d; i
3 k; l/ }) f3 |) V: E3 S1 w' E# ~
, L& t0 A7 A$ T
double turnover_rate(double x, char ch) { 0 i- ^7 S7 s% B double y = 0, result = 0;" }, W, s; w. L( e% m3 T m
switch (ch) { # @6 @2 f) ~& R3 n case 'A': / g8 h( s# W% {6 n- ?0 g" E y = 37.969520 * pow(x, 3) - 258.570452 * pow(x, 2) + 640.944427 * x - 1.121484;2 E# K& n. _* u0 [* z9 O
result = x * (1 - y / 100.0); * d# J6 g- T+ ^ break; + J3 n8 { E4 f# t; ^7 n case 'B':7 \) J8 S$ P2 F$ S- X/ f
y = 33.994698 * pow(x, 3) - 225.050538 * pow(x, 2) + 552.829151 * x - 1.016503; * u9 e; f4 Q' i* o2 i8 g& O result = x * (1 - y / 100.0); , _( B1 I9 f5 F; [+ ~1 k+ } h break; ; \! E1 B* n" v& P/ T case 'C':$ }* t* F& j0 ~4 f V8 \
y = 32.156864 * pow(x, 3) - 207.385880 * pow(x, 2) + 504.716993 * x - 0.973497; " t `* O: U; i6 t6 d# o result = x * (1 - y / 100.0);7 o. {- [, ]. z- b/ J
break; , @7 S, z5 @$ d3 C t8 a7 } default: # L r2 ^7 e2 i& g cout << "输出有误!" << ch << endl; / i) V0 D7 f! I+ F } 4 Z6 I& F: ~) A$ W; U : d$ o' z6 ^6 r2 M
return result;; c% y' Z# F% a; Q
} ' X) h d+ n& U% N( J( W4 U ; l6 ~* O: ?( H2 r6 |; G. Q ' f5 O* y* O" ?% ^! V- h3 x5 [$ u3 t9 |" t$ P' W) h' Z4 X0 s
" q, v6 {1 V d7 {0 @问题二 d, b; \! r. P! Y+ n5 ~) I. M
利用代码,重新计算出各指标数据 8 D0 I. S# b2 p4 T1 I5 q6 _代入Logistics违约率预测模型,预测出各公司的违约率 1 E; \4 w9 }1 S, @! @/ K( n3 F根据标准普尔评级建立,主标尺,对不同违约率进行A~D等级划分,信誉等级D不予贷款 3 u$ O+ |* w6 L- c# G将违约率代入,之前得到的公式,得到具体贷款金额 7 N: T! t' ?3 A9 E. L最优年利率沿用上一问' Q8 G W! D% {& @4 ~0 I
# 信用等级& F; d g" g* J- [9 U: ?+ N1 E) E
cs = []4 d: A" e* a" K; r7 t
# 最优年利率,客户流失率,利率值+ y. Y3 p" A0 q0 q5 g
tax = [] ! k1 c# y/ C" }/ Lfor i in m:% S% C' M- B8 W. f1 I* G
if i <= 0.0069264:# n( Q0 t! C3 |$ C- C4 k9 S
cs.append('A') - {# I% P/ B3 |0 T; `' G tax.append([0.083,0.503173,0.0412366]) / \, L; V& q3 V. s9 o elif i > 0.0069264 and i <= 0.22619:7 J) p. P) l! g8 C: O/ ^0 Y. g
cs.append('B')% s+ y) U- m. t9 G6 [2 Q# `. J
tax.append([0.097,0.505215,0.0479942]) . K' f3 M! z. { o3 D elif i > 0.22619 and i <= 0.509915:3 Q* R* } b( [9 T% E
cs.append('C') # j7 P S% W0 e1 H" z" O6 n( {8 M tax.append([0.1069,0.506501,0.052755])5 _2 v2 o8 ~9 e. V& \4 U o5 S& `
elif i > 0.509915:: V0 W5 {2 s8 M/ h3 ?; x) V$ E
cs.append('D') 5 N* V, I5 o6 j( @9 m! u4 w. H tax.append([0.15,0,0]) & u) \3 U) s$ `5 g' [% N else: 2 A0 M" a+ Z9 j9 Q {( v' _! j print('违规') + a5 `8 |' v9 A9 m* _/ _ 6 d! J( P0 ~) s
parr = [] ) I( R, B* N" p& Pfor arr in list(a):6 `( B$ h8 Y8 p* Y! l& J
parr.append(list(arr)[0])4 A: A( O' Q% w1 [5 t$ H
sum_val = sum(parr) 5 ?; C' R& i) Y9 B% ?0 yamount = []/ z% M8 b3 B7 R" r1 G7 _; J, O A) m
for ival in parr: & D1 H$ z2 |5 H' k tmp = ival / sum_val * 100000000 2 _ O: c6 {1 P* b$ l8 f2 C) v! Q if ival < 1 - 0.509915: 9 Z3 w1 ^' t: Y! u amount.append(0) % B( @' `1 ]( |( D2 r O else:3 _1 ]3 ?* s- R& i' E5 U/ E
amount.append(tmp)4 @3 U4 m) s" d. K; i5 s: y# T( L6 T1 @