---
template: hub1
title: WeightNet 
summary:
    en_US: "WeightNet: Revisiting the Design Space of Weight Network"
    zh_CN: WeightNet - ShuffleNet V2(ImageNet 预训练权重)
author: MegEngine Team
tags: [vision, classification]
github-link: https://github.com/megvii-model/WeightNet
---

```python
import megengine.hub
# Load the pre-trained model from MegEngine Hub
model = megengine.hub.load('megvii-model/weightnet', 'shufflenet_v2_x0_5', pretrained=True)
model.eval()  # switch to inference (evaluation) mode
```
<!-- section: zh_CN -->

All pre-trained models expect the input data to be pre-processed in the same way.
The model takes BGR images, with the short edge resized to `256` and then center-cropped to `(224 x 224)`; no normalization is required.

Below is a sample snippet that processes a single image.

```python
# Download an example image from the MegEngine data website
import urllib.request
url, filename = ("https://data.megengine.org.cn/images/cat.jpg", "cat.jpg")
urllib.request.urlretrieve(url, filename)

# Read and pre-process the image
import cv2
import numpy as np
import megengine
import megengine.data.transform as T
import megengine.functional as F

image = cv2.imread("cat.jpg").astype(np.float32)  # HWC, BGR
transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToMode("CHW"),
])
processed_img = transform.apply(image)[np.newaxis, :]  # CHW -> 1CHW
processed_img = megengine.Tensor(processed_img)  # wrap as a MegEngine tensor
logits = model(processed_img)
probs = F.softmax(logits)
print(probs)
```
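
The output `probs` is a tensor of class probabilities with one row per input image. As a minimal sketch (class indices only; mapping indices to human-readable ImageNet labels would require a separate label file not shown here), the top-5 predictions can be read out as follows:

```python
# Inspect the 5 most likely ImageNet class indices.
probs_np = probs.numpy().squeeze(0)    # (1, 1000) -> (1000,)
top5 = np.argsort(probs_np)[::-1][:5]  # indices sorted by descending probability
for idx in top5:
    print(f"class {idx}: probability {probs_np[idx]:.4f}")
```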

### Model Description

Currently we provide several models pre-trained on ImageNet (see the tables below). The performance of each architecture on the ImageNet validation set is as follows:


| Model               | #Params. | FLOPs | Top-1 err. (%) |
|---------------------|----------|-------|------------|
| ShuffleNetV2 (0.5×) | 1.4M     | 41M   | 39.7       |
| + WeightNet (1×)    | 1.5M     | 41M   | **36.7**   |
| ShuffleNetV2 (1.0×) | 2.2M     | 138M  | 30.9       |
| + WeightNet (1×)    | 2.4M     | 139M  | **28.8**   |
| ShuffleNetV2 (1.5×) | 3.5M     | 299M  | 27.4       |
| + WeightNet (1×)    | 3.9M     | 301M  | **25.6**   |
| ShuffleNetV2 (2.0×) | 5.5M     | 557M  | 25.5       |
| + WeightNet (1×)    | 6.1M     | 562M  | **24.1**   |



| Model               | #Params. | FLOPs | Top-1 err. (%) |
|---------------------|----------|-------|------------|
| ShuffleNetV2 (0.5×) | 1.4M     | 41M   | 39.7       |
| + WeightNet (8×)    | 2.7M     | 42M   | **34.0**   |
| ShuffleNetV2 (1.0×) | 2.2M     | 138M  | 30.9       |
| + WeightNet (4×)    | 5.1M     | 141M  | **27.6**   |
| ShuffleNetV2 (1.5×) | 3.5M     | 299M  | 27.4       |
| + WeightNet (4×)    | 9.6M     | 307M  | **25.0**   |
| ShuffleNetV2 (2.0×) | 5.5M     | 557M  | 25.5       |
| + WeightNet (4×)    | 18.1M    | 573M  | **23.5**   |
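
To see which of these variants the hub repository actually exposes, you can query its entry points. This is a minimal sketch; the exact entry-point names are defined by the repository's `hubconf.py`, not by this page:

```python
import megengine.hub

# List the entry points defined in the repository's hubconf.py.
print(megengine.hub.list('megvii-model/weightnet'))

# Any listed name can then be loaded the same way as above, e.g.:
# model = megengine.hub.load('megvii-model/weightnet', '<entry_point_name>', pretrained=True)
```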

### References

- Ma, Ningning, et al. ["WeightNet: Revisiting the Design Space of Weight Network"](https://arxiv.org/abs/2007.11823). Proceedings of the European Conference on Computer Vision (ECCV), 2020.

<!-- section: en_US -->

All pre-trained models expect input images pre-processed in the same way:
3-channel BGR images of shape `(H x W x 3)`, with the short edge resized to `256` and then center-cropped to `(224 x 224)`.
No further normalization is required.

Here's a sample execution.

```python
# Download an example image from the MegEngine data website
import urllib.request
url, filename = ("https://data.megengine.org.cn/images/cat.jpg", "cat.jpg")
urllib.request.urlretrieve(url, filename)

# Read and pre-process the image
import cv2
import numpy as np
import megengine
import megengine.data.transform as T
import megengine.functional as F

image = cv2.imread("cat.jpg").astype(np.float32)  # HWC, BGR
transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToMode("CHW"),
])
processed_img = transform.apply(image)[np.newaxis, :]  # CHW -> 1CHW
processed_img = megengine.Tensor(processed_img)  # wrap as a MegEngine tensor
logits = model(processed_img)
probs = F.softmax(logits)
print(probs)
```
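
The output `probs` is a tensor of class probabilities with one row per input image. As a minimal sketch (class indices only; mapping indices to human-readable ImageNet labels would require a separate label file not shown here), the top-5 predictions can be read out as follows:

```python
# Inspect the 5 most likely ImageNet class indices.
probs_np = probs.numpy().squeeze(0)    # (1, 1000) -> (1000,)
top5 = np.argsort(probs_np)[::-1][:5]  # indices sorted by descending probability
for idx in top5:
    print(f"class {idx}: probability {probs_np[idx]:.4f}")
```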

### Model Description

Currently we provide several pre-trained models (see the tables below); their single-crop Top-1 error on the ImageNet validation set is reported in the following tables.

| Model               | #Params. | FLOPs | Top-1 err. (%) |
|---------------------|----------|-------|------------|
| ShuffleNetV2 (0.5×) | 1.4M     | 41M   | 39.7       |
| + WeightNet (1×)    | 1.5M     | 41M   | **36.7**   |
| ShuffleNetV2 (1.0×) | 2.2M     | 138M  | 30.9       |
| + WeightNet (1×)    | 2.4M     | 139M  | **28.8**   |
| ShuffleNetV2 (1.5×) | 3.5M     | 299M  | 27.4       |
| + WeightNet (1×)    | 3.9M     | 301M  | **25.6**   |
| ShuffleNetV2 (2.0×) | 5.5M     | 557M  | 25.5       |
| + WeightNet (1×)    | 6.1M     | 562M  | **24.1**   |



| Model               | #Params. | FLOPs | Top-1 err. (%) |
|---------------------|----------|-------|------------|
| ShuffleNetV2 (0.5×) | 1.4M     | 41M   | 39.7       |
| + WeightNet (8×)    | 2.7M     | 42M   | **34.0**   |
| ShuffleNetV2 (1.0×) | 2.2M     | 138M  | 30.9       |
| + WeightNet (4×)    | 5.1M     | 141M  | **27.6**   |
| ShuffleNetV2 (1.5×) | 3.5M     | 299M  | 27.4       |
| + WeightNet (4×)    | 9.6M     | 307M  | **25.0**   |
| ShuffleNetV2 (2.0×) | 5.5M     | 557M  | 25.5       |
| + WeightNet (4×)    | 18.1M    | 573M  | **23.5**   |
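
To see which of these variants the hub repository actually exposes, you can query its entry points. This is a minimal sketch; the exact entry-point names are defined by the repository's `hubconf.py`, not by this page:

```python
import megengine.hub

# List the entry points defined in the repository's hubconf.py.
print(megengine.hub.list('megvii-model/weightnet'))

# Any listed name can then be loaded the same way as above, e.g.:
# model = megengine.hub.load('megvii-model/weightnet', '<entry_point_name>', pretrained=True)
```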

### References

- Ma, Ningning, et al. ["WeightNet: Revisiting the Design Space of Weight Network"](https://arxiv.org/abs/2007.11823). Proceedings of the European Conference on Computer Vision (ECCV), 2020.