Developing an Image Classification Model with a Convolutional Neural Network

Author self-introduction: 数学中国浅夏

Posted on 2021-10-29 17:31

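Introduction

This post is about convolutional networks, how they work, and their components. In it, we perform image classification with a convolutional neural network (CNN) and walk through every step in detail. So if you are new to the topic, read on.

In short, a CNN is a deep learning algorithm and one of the neural network types suited to images and video. CNNs can be used for many tasks, including image classification, image recognition, object detection, and face recognition.

Today we perform image classification on the CIFAR10 dataset, which is available through the TensorFlow library (tensorflow.keras.datasets). It consists of images of various objects such as ships, frogs, airplanes, dogs, and cars. The dataset has 60,000 color images and 10 labels. Now let's move on to the code.

Implementation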

# importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
# Jupyter/IPython magic: render plots inline (remove this line when running as a plain script)
%matplotlib inline
# To convert to categorical data
from tensorflow.keras.utils import to_categorical
#libraries for building model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPool2D, Dropout,Flatten
from tensorflow.keras.datasets import cifar10

#loading the data
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

Exploratory data analysis
#shape of the dataset
print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

Our training data has 50,000 images and the test data has 10,000 images, each 32*32 pixels with 3 channels, i.e. RGB (red, green, blue).
#checking the labels
np.unique(y_train)
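
The integer labels printed above run from 0 to 9. For readability they can be mapped to the standard CIFAR-10 class names. The list below follows the dataset's documented class order; `class_names` and `label_to_name` are names introduced here for illustration and are not part of the original post.

```python
# CIFAR-10 class names in label order (index 0 to 9).
class_names = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

def label_to_name(label):
    """Return the class name for an integer label (hypothetical helper)."""
    return class_names[int(label)]

print(label_to_name(6))
```

For instance, `label_to_name(y_train[0][0])` would name the first training image while the labels are still in their original 2D shape.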

#first image of training data
plt.subplot(121)
plt.imshow(X_train[0])
plt.title("Label : {}".format(y_train[0]))
#first image of test data
plt.subplot(122)
plt.imshow(X_test[0])
plt.title("Label : {}".format(y_test[0]));

#visualizing the first 20 images in the dataset
for i in range(20):
    #subplot
    plt.subplot(5, 5, i+1)
    # plotting pixel data (index with i to show one image per subplot;
    # no colormap is needed because the images are RGB)
    plt.imshow(X_train[i])
# show the figure
plt.show()

Preprocessing the data
Preprocessing takes only two steps here: first, scale the pixel values of the images to lie between 0 and 1; then reshape the labels from 2D to 1D.

# Scale the data to lie between 0 and 1
X_train = X_train/255
X_test = X_test/255
print(X_train)

#reshaping the train and test labels to 1D
y_train = y_train.reshape(-1,)
y_test = y_test.reshape(-1,)

The printed output shows that the pixel values have been scaled to lie between 0 and 1, and the labels have been reshaped. The data is ready for modeling, so let's build the CNN model.
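
As a small check of the two preprocessing steps, the sketch below applies them to a made-up batch instead of the real data. The arrays `fake_images` and `fake_labels` are assumptions for illustration; only the division by 255 and the `reshape(-1,)` call come from the post.

```python
import numpy as np

# Made-up stand-ins for CIFAR-10 data: 4 RGB images of 32x32 pixels (uint8)
# and a column of labels with shape (4, 1).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
fake_labels = np.array([[6], [9], [9], [4]], dtype=np.uint8)

# Step 1: scale pixel values from [0, 255] to [0, 1] (result is float64).
scaled = fake_images / 255

# Step 2: flatten the labels from shape (4, 1) to shape (4,).
flat_labels = fake_labels.reshape(-1,)

print(scaled.dtype, scaled.min() >= 0.0, scaled.max() <= 1.0)
print(fake_labels.shape, "->", flat_labels.shape)
```

The 1D label shape is convenient later, when predictions are compared element by element with `y_test`.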
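Building the model
As discussed earlier, building a deep learning model takes five steps: defining the model, compiling it, fitting it, evaluating it, and making predictions. That is what we do here.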

Step 1: Define the model
model=Sequential()
#adding the first Convolution layer
model.add(Conv2D(32,(3,3),activation='relu',input_shape=(32,32,3)))
#adding Max pooling layer
model.add(MaxPool2D(2,2))
#adding another Convolution layer
model.add(Conv2D(64,(3,3),activation='relu'))
model.add(MaxPool2D(2,2))
model.add(Flatten())
#adding dense layer
model.add(Dense(216,activation='relu'))
#adding output layer
model.add(Dense(10,activation='softmax'))

We added the first convolution layer with 32 filters of size (3*3), used ReLU as the activation function, and gave the model its input shape.
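
To make the first layer less abstract, here is a minimal sketch of what a single 3*3 filter does to a single-channel image: it slides over the input, takes a weighted sum at each position, and ReLU clips negative results to zero. The helper names `conv2d_valid` and `relu`, the toy image, and the hand-picked kernel are assumptions for illustration; the Keras layer learns 32 such filters across all three colour channels and adds a bias.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2D `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Weighted sum of the window under the kernel.
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def relu(x):
    """Replace negative values with zero."""
    return np.maximum(x, 0)

# Toy 5x5 image whose values grow from left to right and top to bottom.
image = np.arange(25, dtype=float).reshape(5, 5)
# Hand-picked kernel that responds to left-to-right increases.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

feature_map = relu(conv2d_valid(image, kernel))
print(feature_map.shape)
print(feature_map)
```

A 5*5 input shrinks to 3*3 here for the same reason the 32*32 CIFAR10 images shrink to 30*30 after the first convolution: without padding, a 3*3 window fits in two fewer positions per axis.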

Next we added a Max Pooling layer of size (2*2). Max pooling helps reduce dimensionality. For an explanation of the CNN components, see: https://www.analyticsvidhya.com/blog/2021/08/beginners-guide-to-convolutional-neural-network-with-implementation-in-python/
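
The dimensionality reduction can be seen on a tiny example. The sketch below is a plain NumPy approximation of 2*2 max pooling with stride 2, not the Keras implementation; `max_pool_2x2` and the sample array `fm` are hypothetical names.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; odd trailing rows/columns are dropped."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    # Group pixels into 2x2 blocks and keep the largest value of each block.
    return trimmed.reshape(h2, 2, w2, 2).max(axis=(1, 3))

fm = np.array([[1, 3, 2, 0],
               [4, 2, 1, 1],
               [0, 0, 5, 6],
               [1, 2, 7, 8]])
pooled = max_pool_2x2(fm)
print(pooled)
```

Each 2*2 block collapses to its maximum, so a 4*4 map becomes 2*2 and the number of values drops by a factor of four.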

Then we added another convolution layer with 64 filters of size (3*3), followed by a max pooling layer of size (2*2).

In the next step, we flattened the layers so they can be passed to the Dense layer, and added a Dense layer with 216 neurons.

Finally, we added the output layer with 10 units and a softmax activation function, since we have 10 labels.
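
The layer sizes can be checked by hand. The sketch below traces the feature-map size through the layers and counts trainable parameters, assuming the Keras defaults the post relies on (stride 1 and no padding for the convolutions, non-overlapping pooling). The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width/height of a stride-1 convolution without padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling (floor division)."""
    return size // pool

size = 32
size = conv_out(size)    # 30 after Conv2D(32, (3, 3))
size = pool_out(size)    # 15 after MaxPool2D(2, 2)
size = conv_out(size)    # 13 after Conv2D(64, (3, 3))
size = pool_out(size)    # 6 after MaxPool2D(2, 2)
flat = size * size * 64  # 2304 values after Flatten

# Trainable parameters: (weights per unit + 1 bias) * number of units.
params = {
    "conv1": (3 * 3 * 3 + 1) * 32,
    "conv2": (3 * 3 * 32 + 1) * 64,
    "dense": (flat + 1) * 216,
    "output": (216 + 1) * 10,
}
total_params = sum(params.values())
print(flat, params, total_params)
```

These figures can be compared against the output of `model.summary()`. Most of the parameters sit in the first Dense layer, which is one reason this small network can overfit.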

Step 2: Compile the model
model.compile(optimizer='rmsprop',loss='sparse_categorical_crossentropy',metrics=['accuracy'])
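
The `sparse_categorical_crossentropy` loss accepts integer labels directly, which is why the labels were not one-hot encoded. As a rough sketch of the idea for a single sample (Keras additionally averages over the batch), the loss is the negative log of the probability assigned to the true class. `sparse_ce_single` and the sample probabilities are made up for illustration.

```python
import math

def sparse_ce_single(probs, label):
    """Loss for one sample: negative log of the true class probability."""
    return -math.log(probs[label])

probs = [0.1, 0.7, 0.2]                    # softmax output for 3 classes
loss_right = sparse_ce_single(probs, 1)    # true class got probability 0.7
loss_wrong = sparse_ce_single(probs, 0)    # true class got probability 0.1
print(round(loss_right, 3), round(loss_wrong, 3))
```

The loss is small when the model puts high probability on the correct label and grows quickly as that probability approaches zero.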

Step 3: Fit the model
model.fit(X_train,y_train,epochs=10)

The training output shows an accuracy of 89% and a loss of 0.31. Let's look at the accuracy on the test data.
Step 4: Evaluate the model
model.evaluate(X_test,y_test)

The accuracy on the test data is 69%, far lower than on the training data, which means our model is overfitting.
Step 5: Make predictions
pred=model.predict(X_test)
#printing the first element from predicted data
print(pred[0])
#printing the index of the highest probability
print('Index:',np.argmax(pred[0]))

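So the predict function returns probability values for all 10 labels, and the label with the highest probability is the final prediction. In our case, the label at index 3 was the prediction.
We compare the predicted values with the actual values to see how well the model performs.
The code below prints the first ten predicted values next to the actual values.
y_classes = [np.argmax(element) for element in pred]
print('Predicted_values:',y_classes[:10])
print('Actual_values:',y_test[:10])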
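
The same comparison can be turned into a single accuracy number. The sketch below uses a made-up probability matrix in place of the real `model.predict` output; `toy_pred`, `toy_true`, and the derived names are assumptions for illustration.

```python
import numpy as np

# Toy stand-in for model.predict output: 4 samples, 3 classes.
toy_pred = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
])
toy_true = np.array([1, 0, 1, 1])

# argmax along axis 1 picks the most probable class for every row at once.
toy_classes = np.argmax(toy_pred, axis=1)
# Fraction of samples where the predicted class equals the true label.
toy_accuracy = np.mean(toy_classes == toy_true)
print(toy_classes, toy_accuracy)
```

Applied to the real arrays, `np.argmax(pred, axis=1)` is a vectorized equivalent of the list comprehension, and the mean of `np.argmax(pred, axis=1) == y_test` should agree with the accuracy reported by `model.evaluate`.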

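Since our model is overfitting, we can take some extra steps to improve its performance and reduce overfitting, such as adding Dropout layers to the model or performing data augmentation, because overfitting can also be caused by having too little data available.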
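
As background, here is a minimal NumPy sketch of what a dropout layer does during training: it zeroes a random fraction of the activations and rescales the survivors so the expected sum stays the same. This is an illustration under the usual "inverted dropout" formulation, not the Keras source; `dropout` and the sample activations are hypothetical.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    keep_prob = 1.0 - rate
    # True for units that are kept, False for units that are dropped.
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(42)
activations = np.ones((1000,))
dropped = dropout(activations, rate=0.2, rng=rng)

# Roughly 20% of the units are zero; the rest are scaled up to 1.25.
print((dropped == 0).mean(), dropped.mean())
```

Because a different random subset is silenced at every training step, the network cannot rely on any single unit, which tends to reduce overfitting. At prediction time the layer passes its input through unchanged.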
Here I will show how we can use Dropout to reduce overfitting. I will define a new model for this.

model4=Sequential()
#adding the first Convolution layer
model4.add(Conv2D(32,(3,3),activation='relu',input_shape=(32,32,3)))
#adding Max pooling layer
model4.add(MaxPool2D(2,2))
#adding dropout
model4.add(Dropout(0.2))
#adding another Convolution layer
model4.add(Conv2D(64,(3,3),activation='relu'))
model4.add(MaxPool2D(2,2))
#adding dropout
model4.add(Dropout(0.2))
model4.add(Flatten())
#adding dense layer
model4.add(Dense(216,activation='relu'))
#adding dropout
model4.add(Dropout(0.2))
#adding output layer
model4.add(Dense(10,activation='softmax'))
model4.compile(optimizer='adam',loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model4.fit(X_train,y_train,epochs=10)

model4.evaluate(X_test,y_test)

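With this model we get a training accuracy of 76% (lower than the first model) but a test accuracy of 72%. The gap between training and test accuracy shrank from 20 to 4 percentage points, which means the overfitting problem has been resolved to some extent.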
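
Data augmentation, the other remedy mentioned earlier, is not shown in the post. As a rough idea of what it involves, the sketch below randomly flips and shifts a single image with NumPy. It is an illustrative assumption, not the author's method: `augment`, `max_shift`, and the random test image are made up, and real pipelines usually pad the border instead of wrapping pixels around.

```python
import numpy as np

def augment(image, rng, max_shift=2):
    """Randomly flip an HxWxC image left-right and shift it by a few pixels."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]  # horizontal flip
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # np.roll wraps pixels around the border, which keeps the sketch short.
    return np.roll(image, shift=(int(dy), int(dx)), axis=(0, 1))

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))  # stand-in for one scaled CIFAR10 image
augmented = augment(image, rng)
print(augmented.shape)
```

Feeding such randomly altered copies to the model during training effectively enlarges the dataset without changing the labels.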

Endnotes
This is how we implement a CNN in Python. The dataset used here is a simple one suitable for learning, but do try implementing CNNs on larger and more complex datasets. That will also help you discover more challenges and solutions.