Open data/predefined_classes.txt in the labelImg source folder with Notepad++ and edit the default classes, for example to the three classes person, car, and motorcycle.
Use "Open Dir" to open the image folder and select the first image to start annotating. Start a box with "Create RectBox" (or Ctrl+N), click to finish the box, then double-click to choose its class. When an image is done, click "Save"; the XML file is then written to disk. Click "Next Image" to move on to the next one.
You can go back and revise an annotation at any time during labeling; saving overwrites the previous file.
After annotation is complete, open one of the XML files: it uses the same format as PASCAL VOC.
Extracting image information from the XML files
Below we show how to extract the image information from the XML files. The images are saved in the image folder and the XML files hold the annotations; an image and its annotation share the same file name.
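Since an image and its annotation share the same base name, the annotation path can be derived directly from the image file name. A minimal sketch (the helper name `xml_path_for` is our own; the folder layout follows the description above):

```python
import os

def xml_path_for(image_path, xml_dir):
    """Map an image file to its annotation file: same base name, .xml extension."""
    base = os.path.splitext(os.path.basename(image_path))[0]
    return os.path.join(xml_dir, base + '.xml')

print(xml_path_for('train/train/image/apple_30.jpg', 'train/train/xml'))
```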
Below is one of the images in the images folder.
Below is the corresponding XML file.
<annotation>
	<folder>train</folder>
	<filename>apple_30.jpg</filename>
	<path>C:\tensorflow1\models\research\object_detection\images\train\apple_30.jpg</path>
	<source>
		<database>Unknown</database>
	</source>
	<size>
		<width>800</width>
		<height>800</height>
		<depth>3</depth>
	</size>
	<segmented>0</segmented>
	<object>
		<name>apple</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>254</xmin>
			<ymin>163</ymin>
			<xmax>582</xmax>
			<ymax>487</ymax>
		</bndbox>
	</object>
	<object>
		<name>apple</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>217</xmin>
			<ymin>448</ymin>
			<xmax>535</xmax>
			<ymax>713</ymax>
		</bndbox>
	</object>
	<object>
		<name>apple</name>
		<pose>Unspecified</pose>
		<truncated>1</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>603</xmin>
			<ymin>470</ymin>
			<xmax>800</xmax>
			<ymax>716</ymax>
		</bndbox>
	</object>
	<object>
		<name>apple</name>
		<pose>Unspecified</pose>
		<truncated>0</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>468</xmin>
			<ymin>179</ymin>
			<xmax>727</xmax>
			<ymax>467</ymax>
		</bndbox>
	</object>
	<object>
		<name>apple</name>
		<pose>Unspecified</pose>
		<truncated>1</truncated>
		<difficult>0</difficult>
		<bndbox>
			<xmin>1</xmin>
			<ymin>63</ymin>
			<xmax>308</xmax>
			<ymax>414</ymax>
		</bndbox>
	</object>
</annotation>
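The fields that matter for training (class name and box corners for each object) can be read back with the standard-library ElementTree parser. A minimal sketch, using a trimmed two-object version of the annotation above:

```python
from xml.etree import ElementTree as et

xml_text = """<annotation>
  <filename>apple_30.jpg</filename>
  <size><width>800</width><height>800</height><depth>3</depth></size>
  <object>
    <name>apple</name>
    <bndbox><xmin>254</xmin><ymin>163</ymin><xmax>582</xmax><ymax>487</ymax></bndbox>
  </object>
  <object>
    <name>apple</name>
    <bndbox><xmin>217</xmin><ymin>448</ymin><xmax>535</xmax><ymax>713</ymax></bndbox>
  </object>
</annotation>"""

root = et.fromstring(xml_text)
boxes = []
for obj in root.findall('object'):
    bb = obj.find('bndbox')
    boxes.append((obj.find('name').text,
                  int(bb.find('xmin').text), int(bb.find('ymin').text),
                  int(bb.find('xmax').text), int(bb.find('ymax').text)))

print(boxes)  # -> [('apple', 254, 163, 582, 487), ('apple', 217, 448, 535, 713)]
```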
To pull the image information out of the XML files we mainly use the xml module and OpenCV, wrapping everything in a torch Dataset. The code is a little rough.
import os
import numpy as np
import cv2
import torch
import matplotlib.patches as patches
import albumentations as A
from albumentations.pytorch.transforms import ToTensorV2
from matplotlib import pyplot as plt
from torch.utils.data import Dataset
from xml.etree import ElementTree as et
from torchvision import transforms as torchtrans

# defining the files directory and testing directory
train_image_dir = 'train/train/image'
train_xml_dir = 'train/train/xml'
# test_image_dir = 'test/test/image'
# test_xml_dir = 'test/test/xml'


class FruitImagesDataset(Dataset):

    def __init__(self, image_dir, xml_dir, width, height, transforms=None):
        self.transforms = transforms
        self.image_dir = image_dir
        self.xml_dir = xml_dir
        self.height = height
        self.width = width

        # sorting the images for consistency
        # to get images, the extension of the filename is checked to be jpg
        self.imgs = sorted(image for image in os.listdir(self.image_dir)
                           if image.endswith('.jpg'))
        self.xmls = sorted(xml for xml in os.listdir(self.xml_dir)
                           if xml.endswith('.xml'))

        # classes: index 0 is reserved for the background
        self.classes = ['_background_', 'apple', 'banana', 'orange']

    def __getitem__(self, idx):
        img_name = self.imgs[idx]
        image_path = os.path.join(self.image_dir, img_name)

        # reading the image and converting it to the correct size and color
        img = cv2.imread(image_path)
        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32)
        img_res = cv2.resize(img_rgb, (self.width, self.height),
                             interpolation=cv2.INTER_AREA)
        # dividing by 255 to scale pixel values to [0, 1]
        img_res /= 255.0

        # annotation file: same base name, .xml extension
        annot_filename = img_name[:-4] + '.xml'
        annot_file_path = os.path.join(self.xml_dir, annot_filename)

        boxes = []
        labels = []
        tree = et.parse(annot_file_path)
        root = tree.getroot()

        # cv2 gives the image shape as height x width
        wt = img.shape[1]
        ht = img.shape[0]

        # box coordinates in the xml files are extracted and rescaled to the target image size
        for member in root.findall('object'):
            labels.append(self.classes.index(member.find('name').text))

            # bounding box
            xmin = int(member.find('bndbox').find('xmin').text)
            xmax = int(member.find('bndbox').find('xmax').text)
            ymin = int(member.find('bndbox').find('ymin').text)
            ymax = int(member.find('bndbox').find('ymax').text)

            xmin_corr = (xmin / wt) * self.width
            xmax_corr = (xmax / wt) * self.width
            ymin_corr = (ymin / ht) * self.height
            ymax_corr = (ymax / ht) * self.height
            boxes.append([xmin_corr, ymin_corr, xmax_corr, ymax_corr])

        # convert boxes into a torch.Tensor
        boxes = torch.as_tensor(boxes, dtype=torch.float32)

        # getting the areas of the boxes
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])

        # suppose all instances are not crowd
        iscrowd = torch.zeros((boxes.shape[0],), dtype=torch.int64)

        labels = torch.as_tensor(labels, dtype=torch.int64)

        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["area"] = area
        target["iscrowd"] = iscrowd
        # image_id
        image_id = torch.tensor([idx])
        target["image_id"] = image_id

        if self.transforms:
            sample = self.transforms(image=img_res,
                                     bboxes=target['boxes'],
                                     labels=labels)
            # apply the albumentations transform and update the image and boxes
            img_res = sample['image']
            target['boxes'] = torch.as_tensor(sample['bboxes'],
                                              dtype=torch.float32)

        return img_res, target

    def __len__(self):
        return len(self.imgs)
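The coordinate correction in __getitem__ is a plain proportional rescale from the original image size to the target size. A small standalone check, using the first box of apple_30.jpg above and a hypothetical target size of 224x224:

```python
def rescale_box(box, orig_w, orig_h, new_w, new_h):
    """Rescale (xmin, ymin, xmax, ymax) from the original image size to the resized one."""
    xmin, ymin, xmax, ymax = box
    return ((xmin / orig_w) * new_w, (ymin / orig_h) * new_h,
            (xmax / orig_w) * new_w, (ymax / orig_h) * new_h)

# each coordinate is scaled by 224/800 = 0.28, e.g. 254 * 0.28 = 71.12
print(rescale_box((254, 163, 582, 487), 800, 800, 224, 224))
```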