Mathematical Modeling Community - 数学中国
Title: Predicting PM2.5 air quality with LSTM
Author: 1047521767    Posted: 2021-10-15 10:53
Predicting time-series data with LSTM

Contents
Background
Conclusions
Code
Experimental results
Difference between RNN and DNN
Difference between RNN and LSTM
Background
Reproduce the experiments in "使用Keras进行LSTM实战" (LSTM in practice with Keras, https://blog.csdn.net/u012735708/article/details/82769711).
Get familiar with training an LSTM model.
Verify whether forecasting still works after reframing the time-series data as a supervised learning problem.
Compare the SimpleRNN and LSTM models: which performs better?
Verify whether LSTM gives an improvement over a plain Dense() model.
Compare using the previous 3 days of data against using only the previous 1 day: which works better?
Conclusions
Using the previous 3 days of data to predict the current day's air quality was no better than using only the previous day's data.
LSTM outperformed SimpleRNN.

Code
from math import sqrt
from datetime import datetime

import pandas as pd
from pandas import read_csv, DataFrame
from numpy import concatenate
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, SimpleRNN

# load data
def parse(x):
    return datetime.strptime(x, '%Y %m %d %H')

def read_raw():
    dataset = pd.read_csv('raw.csv', parse_dates=[['year', 'month', 'day', 'hour']],
                          index_col=0, date_parser=parse)
    dataset.drop('No', axis=1, inplace=True)
    # manually specify column names
    dataset.columns = ['pollution', 'dew', 'temp', 'press', 'wnd_dir', 'wnd_spd', 'snow', 'rain']
    dataset.index.name = 'date'
    # mark all NA values with 0
    dataset['pollution'].fillna(0, inplace=True)
    # drop the first 24 hours
    dataset = dataset[24:]
    # summarize first 5 rows
    print(dataset.head(5))
    # save to file
    dataset.to_csv('pollution.csv')
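# Note (illustrative usage, not in the original script): read_raw() is never
# called below; it is a one-off preprocessing step. Run it once to build
# pollution.csv from raw.csv (the hourly Beijing PM2.5 data used by the
# referenced tutorial) before running the rest of the script:
# read_raw()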
# convert series to supervised learning
def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    n_vars = 1 if type(data) is list else data.shape[1]
    df = DataFrame(data)
    cols, names = list(), list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('var%d(t-%d)' % (j + 1, i)) for j in range(n_vars)]
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('var%d(t)' % (j + 1)) for j in range(n_vars)]
        else:
            names += [('var%d(t+%d)' % (j + 1, i)) for j in range(n_vars)]
    # put it all together
    agg = pd.concat(cols, axis=1)
    agg.columns = names
    # drop rows with NaN values
    if dropnan:
        agg.dropna(inplace=True)
    return agg
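# Quick illustration (hypothetical toy input, not part of the original post):
# import numpy as np
# demo = np.arange(10).reshape(5, 2)           # 5 time steps, 2 variables
# print(series_to_supervised(demo, 1, 1))
# -> columns var1(t-1), var2(t-1), var1(t), var2(t); the first row is dropped
#    because shifting leaves NaNs in the lagged columns
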
# load dataset
dataset = read_csv('pollution.csv', header=0, index_col=0)
values = dataset.values

# integer encode wind direction
encoder = LabelEncoder()
print(values[:, 4])
values[:, 4] = encoder.fit_transform(values[:, 4])
# ensure all data is float
values = values.astype('float32')
# normalize features
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)
# frame as supervised learning
reframed = series_to_supervised(scaled, 1, 1)
#reframed = series_to_supervised(scaled, 3, 1)  # predict the current day from the previous 3 days
print("columns:", reframed.columns)
# drop columns we don't want to predict
reframed.drop(reframed.columns[[9, 10, 11, 12, 13, 14, 15]], axis=1, inplace=True)  # predict the current day from the previous 1 day
#reframed.drop(reframed.columns[[25, 26, 27, 28, 29, 30, 31]], axis=1, inplace=True)  # predict the current day from the previous 3 days
print(reframed.head())
print("new columns:", reframed.columns)
# split into train and test sets
values = reframed.values
n_train_hours = 365 * 24
train = values[:n_train_hours, :]
test = values[n_train_hours:, :]
# split into input and outputs
train_X, train_y = train[:, :-1], train[:, -1]
test_X, test_y = test[:, :-1], test[:, -1]
# reshape input to be 3D [samples, timesteps, features]
# (no reshape is needed when using the Dense() model)
train_X = train_X.reshape((train_X.shape[0], 1, train_X.shape[1]))
test_X = test_X.reshape((test_X.shape[0], 1, test_X.shape[1]))
print(train_X.shape, train_y.shape, test_X.shape, test_y.shape)
# design network
model = Sequential()
#model.add(LSTM(50, input_shape=(train_X.shape[1], train_X.shape[2])))
#model.add(Dense(50, activation='relu', input_dim=8))
model.add(SimpleRNN(50, input_shape=(train_X.shape[1], train_X.shape[2])))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
# fit network
history = model.fit(train_X, train_y, epochs=50, batch_size=72,
                    validation_data=(test_X, test_y), verbose=2, shuffle=False)
# make a prediction
yhat = model.predict(test_X)
print("yhat shape:", yhat.shape)
'''
Compute the error on the test set:

test_X = test_X.reshape((test_X.shape[0], test_X.shape[2]))
print("test_X shape:", test_X.shape)

# invert scaling for forecast
inv_yhat = concatenate((yhat, test_X[:, 1:]), axis=1)
inv_yhat = scaler.inverse_transform(inv_yhat)
inv_yhat = inv_yhat[:, 0]
# invert scaling for actual
test_y = test_y.reshape((len(test_y), 1))
inv_y = concatenate((test_y, test_X[:, 1:]), axis=1)
print("inv_y:", inv_y[:10])
print("inv_y shape:", inv_y.shape)

inv_y = scaler.inverse_transform(inv_y)
print(inv_y, "*" * 30)

inv_y = inv_y[:, 0]
# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print(inv_y[:100], inv_yhat[:100])
print('Test RMSE: %.3f' % rmse)
'''
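The evaluation block above is easiest to run without mutating test_X and test_y in place. Here is a minimal sketch of the same idea, assuming the 1-day-lookback configuration (8 input columns) and the variables yhat, test_X, test_y, and scaler from the script: since the scaler was fitted on all 8 columns, the predicted pollution column is stitched back onto the other 7 features before inverse_transform, and only column 0 is kept.

# undo the 3D reshape so predictions can be joined with the features
test_X_2d = test_X.reshape((test_X.shape[0], test_X.shape[2]))
# invert scaling for the forecast: the scaler expects all 8 columns
inv_yhat = scaler.inverse_transform(concatenate((yhat, test_X_2d[:, 1:]), axis=1))[:, 0]
# invert scaling for the ground truth the same way
test_y_2d = test_y.reshape((len(test_y), 1))
inv_y = scaler.inverse_transform(concatenate((test_y_2d, test_X_2d[:, 1:]), axis=1))[:, 0]
# RMSE in the original pollution units
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print('Test RMSE: %.3f' % rmse)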
Experimental results

Experiment 1: predict a day's air quality from the previous day's weather data, using the LSTM model.
Result:
Epoch 49/50
0s - loss: 0.0144 - val_loss: 0.0133
Epoch 50/50
0s - loss: 0.0144 - val_loss: 0.0133

Experiment 2: predict a day's air quality from the previous 3 days' weather data, using the LSTM model.
Result:
Epoch 49/50
0s - loss: 0.0147 - val_loss: 0.0149
Epoch 50/50
0s - loss: 0.0147 - val_loss: 0.0150

Experiment 3: predict a day's air quality from the previous day's weather data, using the plain fully connected Dense() model.
Result:
Epoch 49/50
0s - loss: 0.0144 - val_loss: 0.0146
Epoch 50/50
0s - loss: 0.0148 - val_loss: 0.0151

Experiment 4: predict a day's air quality from the previous 3 days' weather data, using the plain fully connected Dense() model.
Result:
Epoch 49/50
0s - loss: 0.0150 - val_loss: 0.0165
Epoch 50/50
0s - loss: 0.0148 - val_loss: 0.0141

Experiment 5: predict a day's air quality from the previous day's weather data, using SimpleRNN.
Result:
Epoch 49/50
0s - loss: 0.0160 - val_loss: 0.0140
Epoch 50/50
0s - loss: 0.0147 - val_loss: 0.0150

Experiment 6: predict a day's air quality from the previous 3 days' weather data, using SimpleRNN.
Result:
Epoch 49/50
0s - loss: 0.0164 - val_loss: 0.0233
Epoch 50/50
0s - loss: 0.0166 - val_loss: 0.0227
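The script stores the training history but never uses it; a minimal sketch (assuming matplotlib is installed) for drawing the loss curves behind the numbers above:

import matplotlib.pyplot as plt

plt.plot(history.history['loss'], label='train')
plt.plot(history.history['val_loss'], label='test')
plt.legend()
plt.show()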
Difference between RNN and DNN
The "recurrent" in RNN means that a sequence's current output also depends on the outputs that came before it. In other words, the network memorizes earlier information and applies it to the computation of the current output: the hidden-layer nodes are no longer unconnected but connected to each other, and the input to the hidden layer includes not only the output of the input layer but also the hidden layer's own output from the previous time step.
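To make the recurrence concrete, here is a minimal numpy sketch (illustrative only, with random weights; the sizes match the 8-feature input above) of the update a simple RNN performs at each step. Because the hidden state h is fed back in, the current output depends on every earlier input; a plain Dense layer would compute tanh(x_t @ W_x + b) with no h term.

import numpy as np

rng = np.random.default_rng(0)
W_x = rng.normal(size=(8, 50)) * 0.1   # input-to-hidden weights
W_h = rng.normal(size=(50, 50)) * 0.1  # hidden-to-hidden weights: the recurrent connection
b = np.zeros(50)

h = np.zeros(50)                        # initial hidden state
for x_t in rng.normal(size=(3, 8)):     # a toy 3-step input sequence
    h = np.tanh(x_t @ W_x + h @ W_h + b)  # h now summarizes all steps so far
print(h[:5])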
Difference between RNN and LSTM
LSTM's internal structure uses gates to control the cell state, remembering what needs to be kept for a long time and forgetting what is unimportant, rather than the single, indiscriminate way a plain RNN accumulates memory. This makes it especially effective for tasks that require "long-term memory".

But all this extra machinery also means many more parameters, which makes training considerably harder. In practice, therefore, GRU, which performs about as well as LSTM with fewer parameters, is often preferred for models with heavy training workloads.
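The parameter cost is easy to see in Keras itself. A quick sketch using the same layer sizes as the script above (8 features, 1 timestep; the comparison, not the exact counts, is the point): LSTM's four gated transforms carry roughly four times the recurrent weights of SimpleRNN, and GRU roughly three times.

from keras.models import Sequential
from keras.layers import SimpleRNN, LSTM, GRU, Dense

for layer_cls in (SimpleRNN, LSTM, GRU):
    m = Sequential()
    m.add(layer_cls(50, input_shape=(1, 8)))  # same shape as the models above
    m.add(Dense(1))
    print(layer_cls.__name__, m.count_params())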