[Deep Learning] Image Recognition in Practice: Classifying the 102 Flower Categories (Flowers 102)

Posted by: 杨利霞

Posted on 2022-9-8 10:41


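Contents
Classifying flowers with a convolutional network
Data preprocessing
Network module setup
Saving and testing the model
Data download:
1. Import the packages
2. Data preprocessing and paths
3. Build the data sources
Read the real names for the labels
4. Display some data
5. Load a model from models and initialize it with pretrained weights
6. Initialize the model architecture
7. Set the parameters to train
7. Training and prediction
7.1 Optimizer setup
7.2 Start training the model
7.3 Train all layers
Start training
8. Load the trained model
9. Inference
9.1 Compute the most probable class
9.2 Display the predictions
Closing remarks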
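Classifying flowers with a convolutional network
This article classifies the Oxford flower dataset (Flowers 102). It builds a general-purpose pipeline around a pretrained network (mainly ResNet) and uses common PyTorch operations: preprocessing, training, saving the model, and loading the model.

The data folder contains 102 kinds of flowers, and the task is to classify them.
Folder structure

flower_data
    train
        1 (class)
            xxx.png / xxx.jpg
        2
            xxx.png / xxx.jpg
    valid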
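
Before training, it can help to confirm that the folders really follow this one-folder-per-class layout. The sketch below is only an illustration: count_images_per_class is a hypothetical helper (not part of torchvision), the extension list is an assumption, and the tiny temporary tree stands in for the real flower_data/train directory.

```python
import os
import tempfile


def count_images_per_class(split_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_folder_name: number_of_image_files} for one split."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        counts[class_name] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extensions)
        )
    return counts


# Build a tiny throwaway tree so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as root:
    for class_name, n_files in [("1", 2), ("2", 1)]:
        class_dir = os.path.join(root, "train", class_name)
        os.makedirs(class_dir)
        for i in range(n_files):
            open(os.path.join(class_dir, f"img_{i}.jpg"), "wb").close()
    demo_counts = count_images_per_class(os.path.join(root, "train"))

print(demo_counts)
```

Pointing the helper at './flower_data/train' instead of the temporary directory would list how many images each class folder holds.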

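The work is split into the following main modules

Data preprocessing
    Data augmentation
    Data preprocessing
Network module setup
    Load a pretrained model, using one of the classic architectures shipped with torchvision
    The pretrained model was usually trained on a different task (for example 1000-class classification), so its output layer has to be changed to match our own task
Saving and testing the model
    The model can be saved selectively
Data download:
https://www.kaggle.com/datasets/nunenuh/pytorch-challange-flower-dataset

Rename the downloaded folder and place it in the same root directory as the code.

Below is my data root directory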

1. Import the packages
import os
import matplotlib.pyplot as plt
# Jupyter magic: render plots inline so that plt.show() is not required
%matplotlib inline
import numpy as np
import torch
from torch import nn

import torch.optim as optim
import torchvision
from torchvision import transforms, models, datasets

import imageio
import time
import warnings
import random
import sys
import copy
import json
from PIL import Image

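2. Data preprocessing and paths
# Path settings
data_dir = './flower_data'  # the flower_data directory under the current folder
# os.path.join avoids the doubled slash that './flower_data/' + '/train' would produce
train_dir = os.path.join(data_dir, 'train')
valid_dir = os.path.join(data_dir, 'valid')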
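Related reading: combinations and differences of dots and slashes in Python paths
Note: that article explains what the dot and slash notations mean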
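
As a small illustration of the dot and slash notation, the sketch below compares string concatenation with the standard library's path helpers. It uses posixpath (an assumption made here so that the result is the same on every operating system); in a notebook, os.path behaves the same way on Linux and macOS.

```python
import posixpath

data_dir = "./flower_data/"  # "." means the current working directory

# Plain string concatenation keeps both slashes.
concatenated = data_dir + "/train"

# join() inserts a separator only when one is needed.
joined = posixpath.join(data_dir, "train")

# normpath() collapses the leading "./" and repeated slashes.
normalized = posixpath.normpath(concatenated)

print(concatenated)  # ./flower_data//train
print(joined)        # ./flower_data/train
print(normalized)    # flower_data/train
```

All three strings refer to the same directory, but the joined form is the easiest to read and to compare.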

3. Build the data sources
data_transforms defines all of the image preprocessing operations
ImageFolder assumes the files are organized into folders, with each folder holding the images of one class
data_transforms = {
    # Two parts: the first one is used for training
    'train': transforms.Compose([
        transforms.RandomRotation(45),  # random rotation between -45 and 45 degrees
        transforms.CenterCrop(224),  # crop 224 x 224 from the center
        # each flip is applied with probability 0.5
        transforms.RandomHorizontalFlip(p=0.5),  # random horizontal flip
        transforms.RandomVerticalFlip(p=0.5),  # random vertical flip
        # the four arguments are brightness, contrast, saturation and hue
        transforms.ColorJitter(brightness=0.2, contrast=0.1, saturation=0.1, hue=0.1),
        # convert to grayscale with probability 0.025;
        # the result still has three channels, with R = G = B
        transforms.RandomGrayscale(p=0.025),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])  # mean, standard deviation
    ]),
    # Resize the shorter side to 256, take the central 224 x 224, convert to a tensor, then normalize
    'valid': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])  # same mean and standard deviation as the training set
    ]),
}

batch_size = 8
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir,x), data_transforms[x]) for x in ['train', 'valid']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True) for x in ['train', 'valid']}
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'valid']}
class_names = image_datasets['train'].classes

# Inspect the datasets
image_datasets

{'train': Dataset ImageFolder
   Number of datapoints: 6552
   Root location: ./flower_data/train
   StandardTransform
Transform: Compose(
            RandomRotation(degrees=[-45.0, 45.0], interpolation=nearest, expand=False, fill=0)
            CenterCrop(size=(224, 224))
            RandomHorizontalFlip(p=0.5)
            RandomVerticalFlip(p=0.5)
            ColorJitter(brightness=[0.8, 1.2], contrast=[0.9, 1.1], saturation=[0.9, 1.1], hue=[-0.1, 0.1])
            RandomGrayscale(p=0.025)
            ToTensor()
            Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
         ),
'valid': Dataset ImageFolder
   Number of datapoints: 818
   Root location: ./flower_data/valid
   StandardTransform
Transform: Compose(
            Resize(size=256, interpolation=bilinear, max_size=None, antialias=None)
            CenterCrop(size=(224, 224))
            ToTensor()
            Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
         )}
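
The printed transform shows that each single ColorJitter argument was expanded into an interval, for example brightness=0.2 became [0.8, 1.2] and hue=0.1 became [-0.1, 0.1]. The sketch below reproduces that expansion so the printed numbers are easier to read; jitter_range is a hypothetical helper written for this illustration, not a torchvision function.

```python
def jitter_range(value, center=1.0, clip_at_zero=True):
    """Turn one jitter strength into the [low, high] interval it stands for."""
    low, high = center - value, center + value
    if clip_at_zero:
        low = max(low, 0.0)  # a scaling factor cannot be negative
    return [round(low, 6), round(high, 6)]


brightness = jitter_range(0.2)  # factor around 1
contrast = jitter_range(0.1)
saturation = jitter_range(0.1)
hue = jitter_range(0.1, center=0.0, clip_at_zero=False)  # shift around 0

print(brightness, contrast, saturation, hue)
```

Brightness, contrast and saturation are multiplicative factors centered on 1, while hue is a shift centered on 0.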

# Check that the data loaders have been built
dataloaders
{'train': <torch.utils.data.dataloader.DataLoader at 0x2796a9c0940>,
'valid': <torch.utils.data.dataloader.DataLoader at 0x2796aaca6d8>}
dataset_sizes
{'train': 6552, 'valid': 818}
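
With these sizes and batch_size = 8, the number of batches per epoch follows directly. The sketch below assumes the DataLoader keeps a smaller final batch (its default behavior, drop_last=False); the variable names are chosen for this illustration only.

```python
import math

sizes = {"train": 6552, "valid": 818}  # the sizes printed above
batch_size = 8

# Number of batches each loader yields per epoch.
batches_per_epoch = {split: math.ceil(n / batch_size) for split, n in sizes.items()}

# Size of the final batch: a full batch when n divides evenly, otherwise the remainder.
last_batch_size = {split: (n % batch_size) or batch_size for split, n in sizes.items()}

print(batches_per_epoch)  # {'train': 819, 'valid': 103}
print(last_batch_size)    # {'train': 8, 'valid': 2}
```

The validation loader ends with a batch of only 2 images, which is why the training loop later weights each batch loss by the batch size.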
Read the real names for the labels
Use the JSON file in the data directory to map each class label back to the flower's name

with open('./flower_data/cat_to_name.json', 'r') as f:
    cat_to_name = json.load(f)
cat_to_name
{'21': 'fire lily',
'3': 'canterbury bells',
'45': 'bolero deep blue',
'1': 'pink primrose',
'34': 'mexican aster',
'27': 'prince of wales feathers',
'7': 'moon orchid',
'16': 'globe-flower',
'25': 'grape hyacinth',
'26': 'corn poppy',
'79': 'toad lily',
'39': 'siam tulip',
'24': 'red ginger',
'67': 'spring crocus',
'35': 'alpine sea holly',
'32': 'garden phlox',
'10': 'globe thistle',
'6': 'tiger lily',
'93': 'ball moss',
'33': 'love in the mist',
'9': 'monkshood',
'102': 'blackberry lily',
'14': 'spear thistle',
'19': 'balloon flower',
'100': 'blanket flower',
'13': 'king protea',
'49': 'oxeye daisy',
'15': 'yellow iris',
'61': 'cautleya spicata',
'31': 'carnation',
'64': 'silverbush',
'68': 'bearded iris',
'63': 'black-eyed susan',
'69': 'windflower',
'62': 'japanese anemone',
'20': 'giant white arum lily',
'38': 'great masterwort',
'4': 'sweet pea',
'86': 'tree mallow',
'101': 'trumpet creeper',
'42': 'daffodil',
'22': 'pincushion flower',
'2': 'hard-leaved pocket orchid',
'54': 'sunflower',
'66': 'osteospermum',
'70': 'tree poppy',
'85': 'desert-rose',
'99': 'bromelia',
'87': 'magnolia',
'5': 'english marigold',
'92': 'bee balm',
'28': 'stemless gentian',
'97': 'mallow',
'57': 'gaura',
'40': 'lenten rose',
'47': 'marigold',
'59': 'orange dahlia',
'48': 'buttercup',
'55': 'pelargonium',
'36': 'ruby-lipped cattleya',
'91': 'hippeastrum',
'29': 'artichoke',
'71': 'gazania',
'90': 'canna lily',
'18': 'peruvian lily',
'98': 'mexican petunia',
'8': 'bird of paradise',
'30': 'sweet william',
'17': 'purple coneflower',
'52': 'wild pansy',
'84': 'columbine',
'12': "colt's foot",
'11': 'snapdragon',
'96': 'camellia',
'23': 'fritillary',
'50': 'common dandelion',
'44': 'poinsettia',
'53': 'primula',
'72': 'azalea',
'65': 'californian poppy',
'80': 'anthurium',
'76': 'morning glory',
'37': 'cape flower',
'56': 'bishop of llandaff',
'60': 'pink-yellow dahlia',
'82': 'clematis',
'58': 'geranium',
'75': 'thorn apple',
'41': 'barbeton daisy',
'95': 'bougainvillea',
'43': 'sword lily',
'83': 'hibiscus',
'78': 'lotus lotus',
'88': 'cyclamen',
'94': 'foxglove',
'81': 'frangipani',
'74': 'rose',
'89': 'watercress',
'73': 'water lily',
'46': 'wallflower',
'77': 'passion flower',
'51': 'petunia'}
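
The keys of this dictionary are folder names, not the class indices that the model predicts. ImageFolder sorts the folder names as strings, so index 1 is folder '10', not folder '2'. The sketch below shows the two-step lookup from index to folder name to flower name; the demo_ names and index_to_name are hypothetical and chosen so that running it does not overwrite the notebook's own variables.

```python
# Folder names "1" .. "102", as in flower_data/train.
folder_names = [str(i) for i in range(1, 103)]

# ImageFolder sorts the folder names as strings, not as numbers.
demo_class_names = sorted(folder_names)
demo_class_to_idx = {name: idx for idx, name in enumerate(demo_class_names)}

# Three entries copied from the cat_to_name dictionary above.
demo_cat_to_name = {"1": "pink primrose", "10": "globe thistle", "100": "blanket flower"}


def index_to_name(class_index):
    """Class index -> folder name -> flower name."""
    return demo_cat_to_name[demo_class_names[class_index]]


print(demo_class_names[:5])    # ['1', '10', '100', '101', '102']
print(demo_class_to_idx["2"])  # 14
print(index_to_name(1))        # globe thistle
```

Any code that prints a flower name for a prediction therefore needs to go through class_names first, as the display code in section 4 does.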

4. Display some data
def im_convert(tensor):
    """Convert a normalized (c, h, w) image tensor into an (h, w, c) array for plotting."""
    image = tensor.to("cpu").clone().detach()
    # squeeze drops any dimensions of size 1
    image = image.numpy().squeeze()
    # transpose reorders the axes: the tensor is (c, h, w), but plotting needs (h, w, c)
    image = image.transpose(1, 2, 0)
    # Undo the normalization: x * std + mean
    image = image * np.array((0.229, 0.224, 0.225)) + np.array((0.485, 0.456, 0.406))

    # Clamp values below 0 to 0 and values above 1 to 1
    image = image.clip(0, 1)

    return image
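
To see why multiplying by the standard deviation and adding the mean restores the picture, the sketch below runs a single RGB pixel through the normalization and back. It is a plain-Python illustration with hypothetical helper names, not the code path used by torchvision.

```python
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize(pixel):
    """Per-channel (x - mean) / std, as Normalize does."""
    return tuple((v - m) / s for v, m, s in zip(pixel, MEAN, STD))


def denormalize(pixel):
    """Per-channel x * std + mean, then clamp to [0, 1] as im_convert does."""
    restored = (v * s + m for v, m, s in zip(pixel, MEAN, STD))
    return tuple(min(max(v, 0.0), 1.0) for v in restored)


original = (0.2, 0.5, 0.9)
recovered = denormalize(normalize(original))
clamped = denormalize((10.0, -10.0, 0.0))  # out-of-range values get clamped

print(recovered)
print(clamped)
```

The round trip returns the original pixel up to floating-point error, and the clamp only matters for values that fall outside the displayable range.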
# Plot using the function defined above
fig = plt.figure(figsize=(20, 12))
columns = 4
rows = 2

# Take one batch from the validation loader for display
dataiter = iter(dataloaders['valid'])
inputs, classes = next(dataiter)  # next() works across PyTorch versions; recent releases have no .next() method

for idx in range(columns * rows):
    ax = fig.add_subplot(rows, columns, idx + 1, xticks=[], yticks=[])
    # Class index -> folder name -> flower name from the JSON file
    ax.set_title(cat_to_name[str(int(class_names[classes[idx]]))])
    plt.imshow(im_convert(inputs[idx]))
plt.show()


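5. Load a model from models and initialize it with pretrained weights
model_name = 'resnet'  # other options: ['resnet', 'alexnet', 'vgg', 'squeezenet', 'densenet', 'inception']
# This article uses resnet for the image recognition
# Whether to reuse the pretrained features as they are (True freezes them, so only the new output layer is trained)
feature_extract = True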
# Whether to train on the GPU
train_on_gpu = torch.cuda.is_available()

if not train_on_gpu:
    print('CUDA is not available. Training on CPU ...')
else:
    print('CUDA is available! Training on GPU ...')

device = torch.device("cuda:0" if torch.cuda.is_available() else 'cpu')
CUDA is not available. Training on CPU ...
# Set requires_grad to False on the existing layers so that they are not updated during training
def set_parameter_requires_grad(model, feature_extracting):
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False
# Print the model architecture to see how it is built up step by step
# These layers mainly serve to extract features for us

model_ft = models.resnet152()
model_ft
ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
(0): Bottleneck(
   (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
   (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
   (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
   (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (relu): ReLU(inplace=True)
   (downsample): Sequential(
      (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   )
)
... (many intermediate layers omitted; the first and last blocks are enough to understand the architecture) ...
(2): Bottleneck(
   (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
   (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
   (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
   (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
   (relu): ReLU(inplace=True)
)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=2048, out_features=1000, bias=True)
)

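The last layer is a fully connected layer with 2048 inputs and 1000 outputs, one output per class of the original task
Our task has 102 classes, so this layer has to be changed from 1000 outputs to 102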
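
To get a feeling for how small the replaced layer is, the sketch below counts the weights and biases of a fully connected layer with 2048 inputs, first for 1000 outputs and then for 102. linear_param_count is a hypothetical helper written for this illustration.

```python
def linear_param_count(in_features, out_features, bias=True):
    """Weights (in x out) plus one bias per output."""
    return in_features * out_features + (out_features if bias else 0)


original_head = linear_param_count(2048, 1000)  # the pretrained 1000-class layer
flower_head = linear_param_count(2048, 102)     # the new 102-class layer

print(original_head)  # 2049000
print(flower_head)    # 208998
```

When every other layer is frozen, these 208,998 values are the only parameters that training updates.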

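6. Initialize the model architecture
The steps are as follows:

Take an already trained model and pass pretrained = True to obtain its weights
Optionally freeze some of the layers; a layer is frozen by setting requires_grad to False on its parameters
Whether the task is classification or regression, replace the final FC layer with one whose output size matches the task
Official documentation
https://pytorch.org/vision/stable/models.html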

# Load a pretrained model and adapt it to our task
def initialize_model(model_name, num_classes, feature_extract, use_pretrained=True):
    # Pick the requested model; each architecture names its final layer differently.
    # torchvision 0.13 and later prefer the weights= argument over pretrained=.
    # Only the resnet branch appends LogSoftmax; the other branches return raw scores.
    model_ft = None
    input_size = 0

    if model_name == "resnet":
        # ResNet-152
        # 1. Load the pretrained network
        model_ft = models.resnet152(pretrained=use_pretrained)
        # 2. Optionally freeze the feature extractor so that only the FC layer is trained
        set_parameter_requires_grad(model_ft, feature_extract)
        # 3. Get the number of input features of the fully connected layer
        num_frts = model_ft.fc.in_features
        # 4. Replace the fully connected layer so that it outputs num_classes values
        #    dim=1 normalizes across the classes of each sample, so each row's probabilities sum to 1
        model_ft.fc = nn.Sequential(nn.Linear(num_frts, num_classes),
                                    nn.LogSoftmax(dim=1))
        input_size = 224

    elif model_name == "alexnet":
        # AlexNet
        model_ft = models.alexnet(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)

        # The last layer is entry [6] of the classifier; replace it
        num_frts = model_ft.classifier[6].in_features  # input size of the FC layer
        model_ft.classifier[6] = nn.Linear(num_frts, num_classes)
        input_size = 224

    elif model_name == "vgg":
        # VGG16
        model_ft = models.vgg16(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_frts = model_ft.classifier[6].in_features
        model_ft.classifier[6] = nn.Linear(num_frts, num_classes)
        input_size = 224

    elif model_name == "squeezenet":
        # SqueezeNet 1.0
        model_ft = models.squeezenet1_0(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        model_ft.classifier[1] = nn.Conv2d(512, num_classes, kernel_size=(1, 1), stride=(1, 1))
        model_ft.num_classes = num_classes
        input_size = 224

    elif model_name == "densenet":
        # DenseNet-121
        model_ft = models.densenet121(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)
        num_frts = model_ft.classifier.in_features
        model_ft.classifier = nn.Linear(num_frts, num_classes)
        input_size = 224

    elif model_name == "inception":
        # Inception v3 (expects 299 x 299 inputs and has an auxiliary output)
        model_ft = models.inception_v3(pretrained=use_pretrained)
        set_parameter_requires_grad(model_ft, feature_extract)

        # Auxiliary classifier
        num_frts = model_ft.AuxLogits.fc.in_features
        model_ft.AuxLogits.fc = nn.Linear(num_frts, num_classes)

        # Main classifier
        num_frts = model_ft.fc.in_features
        model_ft.fc = nn.Linear(num_frts, num_classes)
        input_size = 299

    else:
        raise ValueError("Invalid model name: {}".format(model_name))

    return model_ft, input_size
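
The resnet branch ends with LogSoftmax(dim=1), which turns each row of scores (one row per image) into log-probabilities. The sketch below is a plain-Python version of that operation for illustration only; log_softmax here is a hypothetical helper, and the scores are made-up numbers.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one row of class scores."""
    shift = max(row)
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in row))
    return [v - log_sum for v in row]


scores = [[2.0, 1.0, 0.1],   # one row per image, one column per class
          [0.5, 0.5, 0.5]]
log_probs = [log_softmax(row) for row in scores]

# Exponentiating a row recovers probabilities that sum to 1.
row_sums = [sum(math.exp(v) for v in row) for row in log_probs]

print(log_probs)
print(row_sums)
```

Normalizing along dim=1 means each image gets its own probability distribution over the classes, independent of the other images in the batch.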

7. Set the parameters to train
# Set the model name and the number of output classes
model_ft, input_size = initialize_model(model_name, 102, feature_extract, use_pretrained=True)

# Move the model to the GPU if one is available
model_ft = model_ft.to(device)

# Checkpoint file: stores the trained model so that it can be loaded directly later
filename = 'checkpoint.pth'

# Whether to train all layers
params_to_update = model_ft.parameters()
# Print the parameters that will be trained
print("Params to learn:")
if feature_extract:
    params_to_update = []
    for name, param in model_ft.named_parameters():
        if param.requires_grad:
            params_to_update.append(param)
            print("\t", name)
else:
    for name, param in model_ft.named_parameters():
        if param.requires_grad:
            print("\t", name)

Params to learn:
   fc.0.weight
   fc.0.bias
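7. Training and prediction
7.1 Optimizer setup
# Optimizer
optimizer_ft = optim.Adam(params_to_update, lr=1e-2)
# Learning-rate decay schedule
scheduler = optim.lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)
# The learning rate is multiplied by 0.1 every 7 epochs
# The last layer is already LogSoftmax(), so it is paired with nn.NLLLoss();
# nn.CrossEntropyLoss() expects raw scores and applies log-softmax itself

criterion = nn.NLLLoss()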
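
The pairing of LogSoftmax with NLLLoss can be checked by hand: the negative log-likelihood loss simply picks the log-probability of the correct class for each sample, negates it, and averages over the batch. The sketch below is a plain-Python illustration with made-up scores and hypothetical helper names; it is not how PyTorch implements these layers.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one row of class scores."""
    shift = max(row)
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in row))
    return [v - log_sum for v in row]


def nll_loss(log_probs, targets):
    """Mean of -log p[target] over the batch."""
    picked = [-row[t] for row, t in zip(log_probs, targets)]
    return sum(picked) / len(picked)


scores = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]  # raw scores for two samples
targets = [0, 1]                             # index of the correct class

log_probs = [log_softmax(row) for row in scores]
loss = nll_loss(log_probs, targets)

# The same value computed directly from the softmax definition (cross-entropy).
direct = sum(
    -math.log(math.exp(row[t]) / sum(math.exp(v) for v in row))
    for row, t in zip(scores, targets)
) / len(scores)

print(loss, direct)
```

Both numbers agree, which is why a model ending in LogSoftmax trained with NLLLoss optimizes the usual cross-entropy objective.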
# Define the training function
# is_inception: whether the model is Inception v3, which returns an auxiliary output during training
def train_model(model, dataloaders, criterion, optimizer, num_epochs=10, is_inception=False, filename=filename):
    since = time.time()
    # Best validation accuracy so far
    best_acc = 0
    # To resume from an earlier checkpoint, uncomment:
    # checkpoint = torch.load(filename)
    # best_acc = checkpoint['best_acc']
    # model.load_state_dict(checkpoint['state_dict'])
    # optimizer.load_state_dict(checkpoint['optimizer'])
    # model.class_to_idx = checkpoint['mapping']

    # Run on the GPU or the CPU
    model.to(device)
    # Histories kept for plotting later
    val_acc_history = []
    train_acc_history = []
    train_losses = []
    valid_losses = []
    LRs = [optimizer.param_groups[0]['lr']]
    # Keep a copy of the best weights
    best_model_wts = copy.deepcopy(model.state_dict())

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training phase and a validation phase
        for phase in ['train', 'valid']:
            if phase == 'train':
                model.train()  # training mode
            else:
                model.eval()  # evaluation mode

            running_loss = 0.0
            running_corrects = 0

            # Go through all of the data once
            for inputs, labels in dataloaders[phase]:
                # Move inputs and labels to the device
                inputs = inputs.to(device)
                labels = labels.to(device)

                # Reset the gradients
                optimizer.zero_grad()
                # Compute gradients only in the training phase
                with torch.set_grad_enabled(phase == 'train'):
                    # Inception v3 only: combine the main and auxiliary losses
                    if is_inception and phase == 'train':
                        outputs, aux_outputs = model(inputs)
                        loss1 = criterion(outputs, labels)
                        loss2 = criterion(aux_outputs, labels)
                        loss = loss1 + 0.4 * loss2
                    else:  # resnet takes this branch
                        outputs = model(inputs)
                        loss = criterion(outputs, labels)

                    # The class with the highest score is the prediction
                    _, preds = torch.max(outputs, 1)

                    # Update the weights in the training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # Accumulate the loss (weighted by batch size) and the number of correct predictions
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            # Per-epoch statistics
            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

            time_elapsed = time.time() - since
            print('Time elapsed {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # Save the model whenever the validation accuracy improves
            if phase == 'valid' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
                state = {
                    # state_dict holds the learned weights and biases
                    'state_dict': model.state_dict(),
                    'best_acc': best_acc,
                    'optimizer': optimizer.state_dict(),
                }
                torch.save(state, filename)
            if phase == 'valid':
                val_acc_history.append(epoch_acc)
                valid_losses.append(epoch_loss)
                # StepLR counts epochs itself and takes no metric. Calling
                # scheduler.step(epoch_loss) would treat the loss as the epoch number
                # and make the learning rate jump around. This uses the module-level scheduler.
                scheduler.step()
            if phase == 'train':
                train_acc_history.append(epoch_acc)
                train_losses.append(epoch_loss)

        print('Optimizer learning rate : {:.7f}'.format(optimizer.param_groups[0]['lr']))
        LRs.append(optimizer.param_groups[0]['lr'])
        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # After training, restore the best weights as the final model
    model.load_state_dict(best_model_wts)
    return model, val_acc_history, train_acc_history, valid_losses, train_losses, LRs
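
The training loop multiplies each batch loss by the batch size before summing, then divides by the dataset size. The sketch below shows why: when the last batch is smaller than the others, a plain average of the batch losses gives that batch too much weight. The numbers and the helper name epoch_average are made up for this illustration.

```python
def epoch_average(batch_means, batch_sizes):
    """Average per-sample loss over an epoch from per-batch mean losses."""
    total = sum(mean * size for mean, size in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)


batch_means = [1.0, 1.0, 4.0]  # mean loss reported for each batch
batch_sizes = [8, 8, 2]        # the last batch holds only 2 samples

weighted = epoch_average(batch_means, batch_sizes)  # (8 + 8 + 8) / 18
unweighted = sum(batch_means) / len(batch_means)    # treats every batch equally

print(weighted, unweighted)
```

The weighted value is the true mean loss per sample, which is what the loop reports as the epoch loss.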

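7.2 Start training the model
I only trained for 5 epochs here (training really takes a long time); increase the number of epochs when you try it yourself

# If training is too slow, lower the number of epochs; around 50 epochs would probably give better results
# While training, check whether the loss goes down and the accuracy goes up. Is there a large gap between validation and training? If so, the model is overfitting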
model_ft, val_acc_history, train_acc_history, valid_losses, train_losses, LRs  = train_model(model_ft, dataloaders, criterion, optimizer_ft, num_epochs=5, is_inception=(model_name=="inception"))

Epoch 0/4
----------
Time elapsed 29m 41s
train Loss: 10.4774 Acc: 0.3147
Time elapsed 32m 54s
valid Loss: 8.2902 Acc: 0.4719
Optimizer learning rate : 0.0010000

Epoch 1/4
----------
Time elapsed 60m 11s
train Loss: 2.3126 Acc: 0.7053
Time elapsed 63m 16s
valid Loss: 3.2325 Acc: 0.6626
Optimizer learning rate : 0.0100000

Epoch 2/4
----------
Time elapsed 90m 58s
train Loss: 9.9720 Acc: 0.4734
Time elapsed 94m 4s
valid Loss: 14.0426 Acc: 0.4413
Optimizer learning rate : 0.0001000

Epoch 3/4
----------
Time elapsed 132m 49s
train Loss: 5.4290 Acc: 0.6548
Time elapsed 138m 49s
valid Loss: 6.4208 Acc: 0.6027
Optimizer learning rate : 0.0100000

Epoch 4/4
----------
Time elapsed 195m 56s
train Loss: 8.8911 Acc: 0.5519
Time elapsed 199m 16s
valid Loss: 13.2221 Acc: 0.4914
Optimizer learning rate : 0.0010000

Training complete in 199m 16s
Best val Acc: 0.662592
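
The learning rates in this log jump between 0.01, 0.001 and 0.0001 instead of staying at 0.01 for the first 7 epochs. They appear consistent with StepLR having been stepped with the validation loss as its epoch argument, so that the rate became base_lr * gamma ** (loss // step_size). The sketch below checks that reading against the logged numbers; it reimplements the step formula in plain Python as an assumption about what happened, not as a trace of the actual run.

```python
base_lr, step_size, gamma = 1e-2, 7, 0.1


def step_lr(epoch_value):
    """Step-decay learning rate when epoch_value is taken as the epoch number."""
    return base_lr * gamma ** (epoch_value // step_size)


# Validation losses copied from the log above, one per epoch.
logged_valid_losses = [8.2902, 3.2325, 14.0426, 6.4208, 13.2221]

# Rate if the loss is (mis)used as the epoch number.
from_losses = ["{:.7f}".format(step_lr(loss)) for loss in logged_valid_losses]

# Rate if the scheduler simply counts epochs 1..5.
from_epochs = ["{:.7f}".format(step_lr(epoch)) for epoch in range(1, 6)]

print(from_losses)  # matches the rates printed in the log
print(from_epochs)  # constant during the first 7 epochs
```

With a scheduler that counts epochs, the rate would have stayed at 0.01 for all five epochs, which may also explain some of the instability in the logged losses.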

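7.3 Train all layers
# Unfreeze the whole network for training
for param in model_ft.parameters():
    param.requires_grad = True

# Continue training all of the parameters with a smaller learning rate
# params_to_update only holds the FC layer, so pass every parameter to the new optimizer
optimizer = optim.Adam(model_ft.parameters(), lr=1e-4)
# The scheduler must wrap the new optimizer, not optimizer_ft
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

# Loss function
criterion = nn.NLLLoss()

# Load the saved weights and continue training from that model
# The checkpoint holds the best model from the previous training run
checkpoint = torch.load(filename, map_location=device)
best_acc = checkpoint['best_acc']
model_ft.load_state_dict(checkpoint['state_dict'])
# The saved optimizer state only covers the FC layer, so it is not loaded into the new optimizer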
Start training
Note: training becomes especially slow at this stage. My graphics card is a 1660 Ti, for your reference

model_ft, val_acc_history, train_acc_history, valid_losses, train_losses, LRs  = train_model(model_ft, dataloaders, criterion, optimizer, num_epochs=2, is_inception=(model_name=="inception"))
Epoch 0/1
----------
Time elapsed 35m 22s
train Loss: 1.7636 Acc: 0.7346
Time elapsed 38m 42s
valid Loss: 3.6377 Acc: 0.6455
Optimizer learning rate : 0.0010000

Epoch 1/1
----------
Time elapsed 82m 59s
train Loss: 1.7543 Acc: 0.7340
Time elapsed 86m 11s
valid Loss: 3.8275 Acc: 0.6137
Optimizer learning rate : 0.0010000

Training complete in 86m 11s
Best val Acc: 0.645477

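8. Load the trained model
This amounts to a simple forward pass (inference); no parameters are updated

model_ft, input_size = initialize_model(model_name, 102, feature_extract, use_pretrained=True)

# Move the model to the GPU if one is available
model_ft = model_ft.to(device)

# Name of the checkpoint file
filename = 'checkpoint.pth'

# Load the model; map_location lets a checkpoint saved on the GPU load on a CPU-only machine
checkpoint = torch.load(filename, map_location=device)
best_acc = checkpoint['best_acc']
model_ft.load_state_dict(checkpoint['state_dict'])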
<All keys matched successfully>
def process_image(image_path):
    # Read one image and make sure it has three color channels
    img = Image.open(image_path).convert('RGB')
    # thumbnail() can only shrink, keeps the aspect ratio, and modifies the image in place (it returns None),
    # whereas resize() returns a new image of exactly the requested size.
    # Shrink so that the shorter side becomes 256
    if img.size[0] > img.size[1]:
        img.thumbnail((10000, 256))
    else:
        img.thumbnail((256, 10000))

    # Crop the central 224 x 224 region; PIL's crop box is (left, upper, right, lower)
    left = (img.width - 224) // 2
    upper = (img.height - 224) // 2
    right = left + 224
    lower = upper + 224

    img = img.crop((left, upper, right, lower))

    # Same preprocessing as the validation transforms:
    # scale to [0, 1], then normalize with the mean and standard deviation
    img = np.array(img) / 255
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    img = (img - mean) / std

    # Move the color channel to the front: (h, w, c) -> (c, h, w)
    img = img.transpose((2, 0, 1))

    return img

def imshow(image, ax=None, title=None):
    """Display a preprocessed image."""
    if ax is None:
        fig, ax = plt.subplots()

    # Restore the channel order to (h, w, c)
    image = np.array(image).transpose((1, 2, 0))

    # Undo the normalization
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    image = std * image + mean
    image = np.clip(image, 0, 1)

    ax.imshow(image)
    ax.set_title(title)

    return ax

image_path = r'./flower_data/valid/3/image_06621.jpg'
img = process_image(image_path)  # the same function can be reused for any other image
imshow(img)

<AxesSubplot:>

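The image above is the result of preprocessing one test image. We check its shape to confirm that the preprocessing function works correctly

img.shape
(3, 224, 224)
This confirms that the channel dimension has been moved to the front and that the image is 224 x 224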
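
The center crop inside process_image comes down to computing one box in PIL's (left, upper, right, lower) order. The sketch below isolates that arithmetic; center_crop_box is a hypothetical helper, and the 256 x 341 size is just an example of a portrait image after its shorter side has been shrunk to 256.

```python
def center_crop_box(width, height, size=224):
    """Return the (left, upper, right, lower) box of a centered size x size crop."""
    left = (width - size) // 2
    upper = (height - size) // 2
    return (left, upper, left + size, upper + size)


box = center_crop_box(256, 341)
crop_width = box[2] - box[0]
crop_height = box[3] - box[1]

print(box)  # (16, 58, 240, 282)
```

Whatever the input size, the resulting crop is always 224 x 224, which matches the shape check above.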

9. Inference
# Get one batch of validation data
dataiter = iter(dataloaders['valid'])
images, labels = next(dataiter)

model_ft.eval()

# One forward pass produces the output; no gradients are needed for inference
with torch.no_grad():
    output = model_ft(images.to(device))

# The batch holds 8 images and each image gets 102 values: the log-probability of each class
output.shape

torch.Size([8, 102])
9.1 Compute the most probable class
_, preds_tensor = torch.max(output, 1)

# Move the predictions to the CPU and convert them to a 1-D NumPy array of class indices
preds = np.squeeze(preds_tensor.cpu().numpy())
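
Because the network ends in LogSoftmax, the values being compared are log-probabilities. Taking the index of the largest value in each row gives the predicted class, and exponentiating that value gives the model's confidence. The sketch below shows this with made-up numbers in plain Python; argmax is a hypothetical helper standing in for torch.max(output, 1).

```python
import math


def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=row.__getitem__)


# Log-probabilities for two images over three classes (illustrative values).
demo_output = [
    [math.log(0.1), math.log(0.8), math.log(0.1)],
    [math.log(0.9), math.log(0.05), math.log(0.05)],
]

demo_preds = [argmax(row) for row in demo_output]
confidences = [math.exp(row[p]) for row, p in zip(demo_output, demo_preds)]

print(demo_preds)   # [1, 0]
print(confidences)  # approximately [0.8, 0.9]
```

The predicted values are class indices, so they still need to be mapped through class_names before looking up a flower name.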
9.2 Display the predictions
fig = plt.figure(figsize=(20, 20))
columns = 4
rows = 2

for idx in range(columns * rows):
    ax = fig.add_subplot(rows, columns, idx + 1, xticks=[], yticks=[])
    plt.imshow(im_convert(images[idx]))
    # preds and labels are ImageFolder class indices; class_names maps an index
    # back to its folder name, which is the key used in cat_to_name
    pred_name = cat_to_name[class_names[preds[idx]]]
    true_name = cat_to_name[class_names[labels[idx].item()]]
    ax.set_title("{} ({})".format(pred_name, true_name),
                 color=("green" if pred_name == true_name else "red"))
plt.show()
# Green titles mark correct predictions and red titles mark wrong ones

————————————————
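Copyright notice: This is an original article by CSDN blogger "FeverTwice", released under the CC 4.0 BY-SA license. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/LeungSr/article/details/126747940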
