# Getting Started
---
Please refer to [Installation](install.md) to set up the environment first, and prepare the ImageNet1K data by following the instructions in [data](data.md).

## 1. Training and Evaluation on Windows or CPU

If training and evaluation are performed on a Windows system or on CPU, it is recommended to use the `tools/train_multi_platform.py` and `tools/eval_multi_platform.py` scripts.


### 1.1 Model training

After preparing the configuration file, the training process can be started in the following way.

```
python tools/train_multi_platform.py \
    -c configs/ResNet/ResNet50.yaml \
    -o model_save_dir=./output/ \
    -o use_gpu=True
```

Here, `-c` specifies the path of the configuration file, and `-o` specifies the parameters to be modified or added. For example, `-o model_save_dir=./output/` changes `model_save_dir` in the configuration file to `./output/`, and `-o use_gpu=True` means that training runs on GPU. If you want to train on CPU, set `use_gpu` to `False`.


Of course, you can also modify the configuration file directly to update the configuration. For the specific configuration parameters, please refer to the [Configuration Document](config.md).
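
Multiple `-o` options can be combined in one command, and nested configuration fields can be addressed with a dotted path such as `ARCHITECTURE.name` (used in the evaluation example later in this document). A minimal sketch using only the fields shown above:

```bash
# Train on CPU and write the model to ./output/ by combining several overrides
python tools/train_multi_platform.py \
    -c configs/ResNet/ResNet50.yaml \
    -o model_save_dir=./output/ \
    -o use_gpu=False
```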

* The output log examples are as follows:
    * If mixup or cutmix is used during training, only the loss, lr (learning rate) and training time of the minibatch are printed in the log.

    ```
    train step:890  loss:  6.8473 lr: 0.100000 elapse: 0.157s
    ```

    * If mixup or cutmix is not used during training, then in addition to the loss, lr (learning rate) and training time of the minibatch, the top-1 and top-k (k is 5 by default) accuracy are also printed in the log.

    ```
    epoch:0    train    step:13    loss:7.9561    top1:0.0156    top5:0.1094    lr:0.100000    elapse:0.193s
    ```

During training, you can view loss changes in real time through `VisualDL`. The command is as follows.

```bash
visualdl --logdir ./scalar --host <host_IP> --port <port_num>
```
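
For example, with a hypothetical host and port (the log directory should match the directory that the training script writes VisualDL scalars to):

```bash
# Serve the dashboard, then open http://127.0.0.1:8040 in a browser
visualdl --logdir ./scalar --host 127.0.0.1 --port 8040
```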

### 1.2 Model finetuning

* After preparing the configuration file, you can finetune the model by loading the pretrained weights. The command is shown below.

```
python tools/train_multi_platform.py \
    -c configs/ResNet/ResNet50.yaml \
    -o pretrained_model="./pretrained/ResNet50_pretrained"
```

Here, `pretrained_model` specifies the path of the pretrained weights. Replace it with your own path, or modify the path directly in the configuration file.

### 1.3 Resume Training

* If the training process is terminated for some reason, you can also load the checkpoints to continue training.

```
python tools/train_multi_platform.py \
    -c configs/ResNet/ResNet50.yaml \
    -o checkpoints="./output/ResNet/0/ppcls"
```

The configuration file does not need to be modified. You only need to add the `checkpoints` parameter during training, which specifies the path of the checkpoints. The model weights, learning rate, optimizer state and other information will be loaded from this path.
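
Like `pretrained_model` (see the note in section 1.4), the `checkpoints` value is a path prefix rather than a full filename. A hypothetical layout for the command above, assuming the default `ppcls` prefix and that the optimizer state is saved next to the weights:

```bash
# Assumed contents of ./output/ResNet/0/ for the prefix "./output/ResNet/0/ppcls"
ls ./output/ResNet/0/
# ppcls.pdparams    <- model weights
# ppcls.pdopt       <- optimizer state (filename assumed)
```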


### 1.4 Model evaluation

* The model evaluation process can be started as follows.

```bash
python tools/eval_multi_platform.py \
    -c ./configs/eval.yaml \
    -o ARCHITECTURE.name="ResNet50_vd" \
    -o pretrained_model=path_to_pretrained_models
```

You can modify the `ARCHITECTURE.name` field and `pretrained_model` field in `configs/eval.yaml` to configure the evaluation model, and you can also update the configuration through the `-o` parameter.


**Note:** When loading the pretrained model, you need to specify the prefix of the pretrained model files. For example, if the pretrained model is saved under `output/ResNet50_vd/19/` and the weights file is `output/ResNet50_vd/19/ppcls.pdparams`, then `pretrained_model` should be set to `output/ResNet50_vd/19/ppcls`; PaddleClas will automatically append the `.pdparams` suffix.
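
For example, applying the note above to the evaluation command:

```bash
# Weights are stored as output/ResNet50_vd/19/ppcls.pdparams,
# so the prefix (without the .pdparams suffix) is passed to the script
python tools/eval_multi_platform.py \
    -c ./configs/eval.yaml \
    -o ARCHITECTURE.name="ResNet50_vd" \
    -o pretrained_model="output/ResNet50_vd/19/ppcls"
```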

## 2. Training and Evaluation on Linux+GPU

If you want to run PaddleClas on Linux with GPU, it is highly recommended to use the model training and evaluation scripts provided by PaddleClas: `tools/train.py` and `tools/eval.py`.

### 2.1 Model training

After preparing the configuration file, the training process can be started in the following way.

```bash
# PaddleClas starts multi-card and multi-process training through launch
# Specify the GPU cards to use via the --selected_gpus argument
python -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
    tools/train.py \
        -c ./configs/ResNet/ResNet50_vd.yaml
```

The configuration can be updated by adding the `-o` parameter.

```bash
python -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
    tools/train.py \
        -c ./configs/ResNet/ResNet50_vd.yaml \
        -o use_mix=1 \
        --vdl_dir=./scalar/
```

The format of output log information is the same as above.
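
If `--vdl_dir` is set as above, the resulting scalar logs can be visualized with the same `VisualDL` command as in section 1.1, pointing `--logdir` at that directory:

```bash
visualdl --logdir ./scalar/ --host <host_IP> --port <port_num>
```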



### 2.2 Model finetuning

* After preparing the configuration file, you can finetune the model by loading the pretrained weights. The command is shown below.

```
python -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
    tools/train.py \
        -c configs/ResNet/ResNet50.yaml \
        -o pretrained_model="./pretrained/ResNet50_pretrained"
```

Here, `pretrained_model` specifies the path of the pretrained weights. Replace it with your own path, or modify the path directly in the configuration file.

* [The quick start tutorial](./quick_start_en.md) contains many examples of model finetuning. You can refer to it to finetune the model on a specific dataset.

### 2.3 Resume Training

* If the training process is terminated for some reason, you can also load the checkpoints to continue training.

```
python -m paddle.distributed.launch \
    --selected_gpus="0,1,2,3" \
    tools/train.py \
        -c configs/ResNet/ResNet50.yaml \
        -o checkpoints="./output/ResNet/0/ppcls"
```

The configuration file does not need to be modified. You only need to add the `checkpoints` parameter during training, which specifies the path of the checkpoints. The model weights, learning rate, optimizer state and other information will be loaded from this path.

### 2.4 Model evaluation

* The model evaluation process can be started as follows.

```bash
python tools/eval_multi_platform.py \
    -c ./configs/eval.yaml \
    -o ARCHITECTURE.name="ResNet50_vd" \
    -o pretrained_model=path_to_pretrained_models
```

You can modify the `ARCHITECTURE.name` field and `pretrained_model` field in `configs/eval.yaml` to configure the evaluation model, and you can also update the configuration through the `-o` parameter.


## 3. Model inference

PaddlePaddle provides three ways to perform model inference. The following introduces how to use the inference engine to perform model inference.

First, export the inference model using `tools/export_model.py`.

```bash
python tools/export_model.py \
    --model=model_name \
    --pretrained_model=pretrained_model_dir \
    --output_path=save_inference_dir

```

Second, the inference engine can be started using the following command.

```bash
python tools/infer/predict.py \
    -m model_path \
    -p params_path \
    -i image_path \
    --use_gpu=1 \
    --use_tensorrt=True
```
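
As an end-to-end sketch (the model name, image path, and exported file names below are hypothetical; check the actual files written to the output directory by `tools/export_model.py`):

```bash
# 1. Export the inference model (example paths only)
python tools/export_model.py \
    --model=ResNet50_vd \
    --pretrained_model=./pretrained/ResNet50_vd_pretrained \
    --output_path=./inference/

# 2. Run the exported model with the inference engine
#    The "model" and "params" filenames are assumed here
python tools/infer/predict.py \
    -m ./inference/model \
    -p ./inference/params \
    -i ./images/demo.jpg \
    --use_gpu=1
```
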
Please refer to [inference](../extension/paddle_inference_en.md) for more details.