# A Simple and Fast Implementation of Faster R-CNN

## 1. Introduction

**[Update:]** I've further simplified the code for PyTorch 1.5 and torchvision 0.6, and replaced the customized ops `roipool` and `nms` with the ones from torchvision. If you want the old version of the code, please check out branch [v1.0](https://github.com/chenyuntc/simple-faster-rcnn-pytorch/tree/v1.0).
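
Both ops now ship with torchvision, so no compilation step is needed. As a quick orientation, here is a minimal sketch of those two ops with dummy tensors (illustration only, not the project's actual call sites):

```python
# Minimal sketch of the torchvision ops that replace the old custom builds
# (assumes torchvision >= 0.6; dummy data for illustration only).
import torch
from torchvision.ops import nms, RoIPool

boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 10., 10.],
                      [50., 50., 60., 60.]])
scores = torch.tensor([0.9, 0.8, 0.7])
keep = nms(boxes, scores, iou_threshold=0.7)  # indices of boxes surviving NMS

# RoIPool takes rois as (batch_index, x1, y1, x2, y2) rows;
# spatial_scale maps image coordinates onto the feature map (1/16 for vgg16).
feature_map = torch.randn(1, 512, 50, 50)
rois = torch.tensor([[0., 0., 0., 160., 160.]])
pool = RoIPool(output_size=(7, 7), spatial_scale=1. / 16)
pooled = pool(feature_map, rois)  # shape: (num_rois, 512, 7, 7)
```
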


This project is a **simplified** Faster R-CNN implementation based on [chainercv](https://github.com/chainer/chainercv) and other [projects](#acknowledgement). I hope it can serve as starter code for those who want to know the details of Faster R-CNN. It aims to:

- Simplify the code (*Simple is better than complex*)
- Make the code more straightforward (*Flat is better than nested*)
- Match the performance reported in the [original paper](https://arxiv.org/abs/1506.01497) (*Speed Counts and mAP Matters*)

And it has the following features:
- It can be run as pure Python code; no more build hassle.
- It's a minimal implementation: around 2000 lines of valid code with plenty of comments and instructions (thanks to chainercv's excellent documentation).
- It achieves higher mAP than the original implementation (0.712 vs. 0.699).
- It achieves speed comparable with other implementations (6 fps for training and 14 fps for testing on a TITAN Xp).
- It's memory-efficient (about 3GB for vgg16)


![img](imgs/faster-speed.jpg)



## 2. Performance

### 2.1 mAP

VGG16, trained on the `trainval` split and tested on the `test` split.

**Note**: training shows great randomness; you may need a bit of luck and more epochs of training to reach the highest mAP. However, it should be easy to surpass the lower bound.

|              Implementation              |     mAP     |
| :--------------------------------------: | :---------: |
| [original paper](https://arxiv.org/abs/1506.01497) |    0.699    |
|    train with caffe pretrained model     | 0.700-0.712 |
| train with torchvision pretrained model  | 0.685-0.701 |
| model converted from [chainercv](https://github.com/chainer/chainercv/tree/master/examples/faster_rcnn) (reported 0.706) |   0.7053    |

### 2.2 Speed

|              Implementation              |   GPU    | Inference | Training |
| :--------------------------------------: | :------: | :-------: | :--------: |
| [original paper](https://arxiv.org/abs/1506.01497) |   K40    |   5 fps   |     NA     |
|                 This[1]                  | TITAN Xp | 14-15 fps |   6 fps    |
| [pytorch-faster-rcnn](https://github.com/ruotianluo/pytorch-faster-rcnn) | TITAN Xp | 15-17 fps |   6 fps    |

[1]: make sure you install cupy correctly and that only one program runs on the GPU. The training speed is sensitive to your GPU status; see [troubleshooting](troubleshooting) for more info. Moreover, it's slow at the start of the program -- it needs time to warm up.

It could be faster by removing visualization, logging, loss averaging, etc.
## 3. Install dependencies


Here is an example of creating the environment **from scratch** with `anaconda`:

```sh
# create conda env
conda create --name simp python=3.7
conda activate simp
# install pytorch
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

# install other dependencies
pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet

# start visdom
nohup python -m visdom.server &

```

If you don't use anaconda, then:

- install PyTorch with GPU support (the code is GPU-only); refer to the [official website](http://pytorch.org)

- install other dependencies:  `pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet`

- start visdom for visualization

```Bash
nohup python -m visdom.server &
```



## 4. Demo

Download the pretrained model from [Google Drive](https://drive.google.com/open?id=1cQ27LIn-Rig4-Uayzy_gH5-cW-NRGVzY) or [Baidu Netdisk (passwd: scxn)](https://pan.baidu.com/s/1o87RuXW).

See [demo.ipynb](https://github.com/chenyuntc/simple-faster-rcnn-pytorch/blob/master/demo.ipynb) for more details.
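
If you'd rather not open the notebook, the gist of it is roughly the sketch below (class and method names as used in this repo; the checkpoint path is a placeholder for the downloaded model -- see the notebook for the authoritative version):

```python
# Condensed sketch of the demo; demo.ipynb is the authoritative version.
import torch
from model import FasterRCNNVGG16
from trainer import FasterRCNNTrainer
from data.util import read_image

img = read_image('misc/demo.jpg')  # CHW float32 numpy array, RGB
img = torch.from_numpy(img)[None]  # add a batch dimension

faster_rcnn = FasterRCNNVGG16()
trainer = FasterRCNNTrainer(faster_rcnn).cuda()
trainer.load('/path/to/pretrained_model.pth')  # placeholder path

# visualize=True applies the proper preprocessing before predicting
bboxes, labels, scores = trainer.faster_rcnn.predict(img, visualize=True)
```
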

## 5. Train

### 5.1 Prepare data

#### Pascal VOC2007

1. Download the training, validation, test data and VOCdevkit

   ```Bash
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
   wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
   ```

2. Extract all of these tars into one directory named `VOCdevkit`

   ```Bash
   tar xvf VOCtrainval_06-Nov-2007.tar
   tar xvf VOCtest_06-Nov-2007.tar
   tar xvf VOCdevkit_08-Jun-2007.tar
   ```

3. It should have this basic structure

   ```Bash
   $VOCdevkit/                           # development kit
   $VOCdevkit/VOCcode/                   # VOC utility code
   $VOCdevkit/VOC2007                    # image sets, annotations, etc.
   # ... and several other directories ...
   ```

4. Modify the `voc_data_dir` cfg item in `utils/config.py`, or pass it to the program using an argument like `--voc-data-dir=/path/to/VOCdevkit/VOC2007/`.
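
For context, `utils/config.py` keeps every option as a plain attribute on a single config object, and command-line flags simply overwrite the matching attribute. A condensed sketch of that pattern (attribute list abbreviated; check the file for the real defaults):

```python
# Condensed sketch of the pattern in utils/config.py (abbreviated).
class Config:
    voc_data_dir = '/path/to/VOCdevkit/VOC2007/'  # placeholder default

    def _parse(self, kwargs):
        # overwrite matching attributes with values passed on the CLI
        for k, v in kwargs.items():
            if not hasattr(self, k):
                raise ValueError('Unknown option: --%s' % k)
            setattr(self, k, v)

opt = Config()
```
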


### 5.2 [Optional] Prepare caffe-pretrained vgg16

If you want to use the caffe-pretrained model as the initial weights, you can run the command below to get vgg16 weights converted from caffe, which is what the original paper uses.

```Bash
python misc/convert_caffe_pretrain.py
```

This script downloads the pretrained model and converts it to a format compatible with torchvision. If you are in China and cannot download the pretrained model, you may refer to [this issue](https://github.com/chenyuntc/simple-faster-rcnn-pytorch/issues/63).
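
In essence, the conversion amounts to loading the caffe-converted weights into a torchvision `vgg16` and re-saving the state dict. A rough, illustrative sketch (the download URL below is a placeholder; the real one is in the script):

```python
# Illustrative sketch only -- see misc/convert_caffe_pretrain.py for the
# actual source URL and details.
import torch
from torch.utils import model_zoo
from torchvision import models

model = models.vgg16(pretrained=False)
state_dict = model_zoo.load_url('https://example.com/vgg16-caffe.pth')  # placeholder URL
model.load_state_dict(state_dict)
torch.save(model.state_dict(), 'vgg16_caffe.pth')
```
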

Then you can specify where the caffe-pretrained model `vgg16_caffe.pth` is stored in `utils/config.py` by setting `caffe_pretrain_path`. The default path is fine.

If you want to use the pretrained model from torchvision, you may skip this step.

**NOTE**: the caffe-pretrained model has shown slightly better performance.

**NOTE**: the caffe model requires images in BGR with pixel values in 0-255, while the torchvision model requires images in RGB with values in 0-1. See `data/dataset.py` for more details.
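
A condensed sketch of the two conventions (adapted from `data/dataset.py`; double-check the exact mean/std values there):

```python
# Two normalization conventions, sketched after data/dataset.py.
import numpy as np
import torch
from torchvision import transforms

def caffe_normalize(img):
    """img: CHW float32 RGB in [0, 1] -> BGR in 0-255, mean-subtracted."""
    img = img[[2, 1, 0], :, :]  # RGB -> BGR
    img = img * 255.
    mean = np.array([122.7717, 115.9465, 102.9801]).reshape(3, 1, 1)
    return (img - mean).astype(np.float32)

def pytorch_normalize(img):
    """img: CHW float32 RGB in [0, 1] -> normalized with ImageNet stats."""
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])
    return normalize(torch.from_numpy(img)).numpy()
```
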

### 5.3 Begin training


```bash
python train.py train --env='fasterrcnn' --plot-every=100
```

You may refer to `utils/config.py` for more arguments.

Some key arguments:

- `--caffe-pretrain=False`: use the pretrained model from caffe or torchvision (default: torchvision)
- `--plot-every=n`: visualize predictions, loss, etc. every `n` batches.
- `--env`: visdom env for visualization
- `--voc_data_dir`: where the VOC data is stored
- `--use-drop`: use dropout in RoI head, default False
- `--use-Adam`: use Adam instead of SGD; default is SGD. (You need to set a very low `lr` for Adam.)
- `--load-path`: pretrained model path, default `None`; if specified, the model will be loaded.
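
Under the hood, the CLI is handled by [fire](https://github.com/google/fire): the first token picks a function in `train.py`, and the remaining `--key=value` pairs become its keyword arguments (hyphens map to underscores), which are then written into the config. A rough sketch of that plumbing:

```python
# Rough sketch of how CLI flags reach the config (see train.py and
# utils/config.py): `python train.py train --env='fasterrcnn' --plot-every=100`
# calls train(env='fasterrcnn', plot_every=100).
from utils.config import opt

def train(**kwargs):
    opt._parse(kwargs)  # overwrite matching attributes on the config object
    # ... build the dataset, model, and trainer, then run the epochs ...

if __name__ == '__main__':
    import fire
    fire.Fire()
```
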

You may open a browser, visit `http://<ip>:8097`, and see the visualization of the training procedure as below:

![visdom](imgs/visdom-fasterrcnn.png)

## Troubleshooting

- dataloader: `received 0 items of ancdata` 

  See this [discussion](https://github.com/pytorch/pytorch/issues/973#issuecomment-346405667); it's already fixed in [train.py](https://github.com/chenyuntc/simple-faster-rcnn-pytorch/blob/master/train.py#L17-L22), so you should be free from this problem. (A sketch of the fix appears after this list.)
  
- Windows support

  I don't have a Windows machine with a GPU to debug and test it. Pull requests that add and test Windows support are welcome.
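
For the dataloader issue above, the fix in train.py amounts to raising the open-file limit before workers spawn -- roughly this (Unix-only; see the linked discussion for alternatives such as `torch.multiprocessing.set_sharing_strategy('file_system')`):

```python
# Raise the soft limit on open file descriptors so multi-worker
# dataloaders don't exhaust them (roughly what train.py does).
import resource

rlimit = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (20480, rlimit[1]))
```
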

## Acknowledgement
This work builds on many excellent works, which include:

- [Yusuke Niitani's ChainerCV](https://github.com/chainer/chainercv) (mainly)
- [Ruotian Luo's pytorch-faster-rcnn](https://github.com/ruotianluo/pytorch-faster-rcnn), which is based on [Xinlei Chen's tf-faster-rcnn](https://github.com/endernewton/tf-faster-rcnn)
- [faster-rcnn.pytorch by Jianwei Yang and Jiasen Lu](https://github.com/jwyang/faster-rcnn.pytorch), which mainly refers to [longcw's faster_rcnn_pytorch](https://github.com/longcw/faster_rcnn_pytorch)
- All the above repositories refer to [py-faster-rcnn by Ross Girshick and Sean Bell](https://github.com/rbgirshick/py-faster-rcnn) either directly or indirectly.

## ^_^
Licensed under MIT; see LICENSE for more details.

Contributions welcome.

If you encounter any problem, feel free to open an issue, though I've been too busy lately to respond quickly.

Correct me if anything is wrong or unclear.

Model structure:
![img](imgs/model_all.png)