README.MD

    A Pythonic, Extensible and Minimal Implementation of Faster R-CNN Without Harming Performance

    Introduction

    This project is a simplified Faster R-CNN implementation based mostly on chainercv and other projects. It aims to:

    • Simplify the code (Simple is better than complex)
    • Make the code more straightforward (Flat is better than nested)
    • Match the performance reported in the original paper (Speed Counts and mAP Matters)

    Performance

    • mAP

    VGG16, trained on trainval and tested on test. Note that training shows great randomness; you may need to train for more epochs to reach the highest mAP, but it should be easy to reach the lower bound.

    Implementation                                               | mAP
    ------------------------------------------------------------ | -----------
    original paper                                               | 0.699
    using caffe pretrained model (enabled with --caffe-pretrain) | 0.700-0.708
    using torchvision pretrained model                           | 0.690-0.701
    model converted from chainercv (reported 0.706)              | 0.7053
    the best I've ever seen (ruotian's)                          | 0.711
    • Speed
    Implementation      | GPU      | Inference | Training
    ------------------- | -------- | --------- | ---------
    original paper      | K40      | 5 fps     | NA
    This                | TITAN Xp | 12 fps*   | 5-6 fps
    pytorch-faster-rcnn | TITAN Xp | NA        | 5-6 fps**

    * includes reading images from disk, preprocessing, etc. See eval in train.py for more detail.

    ** depends on the environment.
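
    For reference, here is a rough sketch of how such an end-to-end measurement can be taken. It is not code from this repo: model is a placeholder, and a predict() method taking a list of images is an assumption.

      import time

      def measure_fps(model, image_paths, read_image):
          # Times disk read + preprocessing + forward pass, matching the
          # footnote above that says image reading is included.
          start = time.time()
          for path in image_paths:
              img = read_image(path)   # disk read + preprocessing (callable supplied by you)
              model.predict([img])     # forward pass (API assumed, not confirmed by this repo)
          return len(image_paths) / (time.time() - start)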

    NOTE: make sure cupy is installed correctly to reach the benchmark speed.
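
    As a quick sanity check (not part of the repo), you can verify from a Python shell that cupy is installed and sees the GPU:

      import cupy as cp

      print('GPUs visible:', cp.cuda.runtime.getDeviceCount())
      x = cp.arange(10)                   # allocates on the current device
      print('sum on GPU:', int(x.sum()))  # runs the reduction on the GPU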

    Install Prerequisites

    • install PyTorch >=0.3 with GPU support (the code is GPU-only); refer to the official website

    • install cupy. You can install it via pip install cupy, but it's better to read the docs and make sure the environment is set up correctly

    • install other dependencies: pip install -r requirements.txt

    • Optional but recommended: build nms_gpu_post:

      cd model/utils/nms
      python3 build.py build_ext --inplace

    • start visdom for visualization

    nohup python3 -m visdom.server &

    If you're in China and encounter problems with visdom (e.g. timeouts or a blank screen), you may refer to the visdom issue, and a temporary solution provided by me.
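
    Once the server is up, a minimal connectivity check from Python (assuming the default local server on port 8097) looks like:

      import visdom

      vis = visdom.Visdom(env='sanity-check')  # connects to localhost:8097 by default
      assert vis.check_connection(), 'is the visdom server running?'
      vis.text('visdom is up')                 # should appear in the sanity-check env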

    Demo

    Download the pretrained model from [..............................................]

    See demo.ipynb for details.
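
    If you prefer a script to a notebook, the following is a rough sketch of the demo flow. The names (read_image, FasterRCNNVGG16, FasterRCNNTrainer, predict) are assumptions based on the repo layout; treat demo.ipynb as authoritative.

      import torch
      from data.util import read_image      # assumed helper: loads a CHW float32 image
      from model import FasterRCNNVGG16
      from trainer import FasterRCNNTrainer

      img = torch.from_numpy(read_image('misc/demo.jpg'))[None]  # add a batch dimension
      faster_rcnn = FasterRCNNVGG16()
      trainer = FasterRCNNTrainer(faster_rcnn).cuda()
      trainer.load('/path/to/pretrained_model.pth')              # the model downloaded above
      bboxes, labels, scores = trainer.faster_rcnn.predict(img, visualize=True)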

    Train

    Data

    Pascal VOC2007

    1. Download the training, validation, test data and VOCdevkit

      wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
      wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
      wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
    2. Extract all of these tars into one directory named VOCdevkit

      tar xvf VOCtrainval_06-Nov-2007.tar
      tar xvf VOCtest_06-Nov-2007.tar
      tar xvf VOCdevkit_08-Jun-2007.tar
    3. It should have this basic structure

      $VOCdevkit/                           # development kit
      $VOCdevkit/VOCcode/                   # VOC utility code
      $VOCdevkit/VOC2007                    # image sets, annotations, etc.
      # ... and several other directories ...
    4. Specify voc_data_dir in config.py, or pass it to the program with an argument like --voc-data-dir=/path/to/VOCdevkit/VOC2007/ (see the optional check below).
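
    As an optional sanity check (not part of the repo), you can verify that voc_data_dir points at the right level, i.e. VOC2007 rather than VOCdevkit; the subdirectory names below are the standard VOC layout:

      import os

      voc_data_dir = '/path/to/VOCdevkit/VOC2007/'
      for sub in ('Annotations', 'ImageSets', 'JPEGImages'):
          assert os.path.isdir(os.path.join(voc_data_dir, sub)), sub + ' is missing'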

    COCO

    TBD

    Prepare caffe-pretrained VGG16

    If you want to use the caffe-pretrained model, run:

    python misc/convert_caffe_pretrain.py

    Then specify in config.py where the caffe-pretrained model vgg16_caffe.pth is stored.
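
    A quick way to verify the converted weights load cleanly (adjust the path to wherever you stored vgg16_caffe.pth):

      import torch

      state_dict = torch.load('vgg16_caffe.pth')  # plain state dict of VGG16 weights
      print(len(state_dict), 'tensors, e.g.', sorted(state_dict)[0])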

    If you want to use the torchvision pretrained model, you can skip this step.

    Begin training

    mkdir checkpoints/ # make dir for storing snapshots
    python3 train.py train --env='fasterrcnn-caffe' --plot-every=100 --caffe-pretrain

    You may refer to config.py for more arguments.

    Some key arguments:

    • --caffe-pretrain=True: use the caffe-pretrained model instead of the torchvision one (default: torchvision)
    • --plot-every=n: visualize predictions, losses, etc. every n batches
    • --env: visdom env for visualization
    • --voc-data-dir: where the VOC data is stored
    • --use-drop: use dropout in the RoI head (default: no dropout)
    • --use-adam: use Adam instead of SGD (default: SGD)
    • --load-path: pretrained model path (default: None); if specified, the model is loaded from it
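
    For illustration, flags like these typically map onto attributes of a config object. The sketch below mirrors the flags above, but the _parse helper is hypothetical; config.py is authoritative.

      # Hypothetical sketch of a config object; defaults here are illustrative.
      class Config:
          caffe_pretrain = False
          plot_every = 40
          env = 'faster-rcnn'
          voc_data_dir = '/path/to/VOCdevkit/VOC2007/'
          use_drop = False
          use_adam = False
          load_path = None

          def _parse(self, kwargs):
              # Override defaults with whatever was passed on the command line.
              for k, v in kwargs.items():
                  if not hasattr(self, k):
                      raise ValueError('unknown option: %s' % k)
                  setattr(self, k, v)

      opt = Config()
      opt._parse({'caffe_pretrain': True, 'plot_every': 100})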

    Troubleshooting

    • visdom
    • dataloader/ulimit
    • cupy
    • vgg

    TODO

    [ ] training on COCO
    [ ] resnet
    [ ] replace cupy with THTensor+cffi?

    Acknowledgement

    This work builds on many excellent works, among them chainercv.

    Source project: https://github.com/chenyuntc/simple-faster-rcnn-pytorch