diff --git a/models/megengine_vision_frelu.md b/models/megengine_vision_frelu.md new file mode 100644 index 0000000000000000000000000000000000000000..c9cc46f64ac6b30f87247f08ea7184713a4427b8 --- /dev/null +++ b/models/megengine_vision_frelu.md @@ -0,0 +1,119 @@ +--- +template: hub1 +title: FReLU +summary: + en_US: "Funnel Activation for Visual Recognition" + zh_CN: FReLU (ImageNet 预训练权重) +author: MegEngine Team +tags: [vision, classification] +github-link: https://github.com/megvii-model/FunnelAct +--- + +```python +import megengine.hub +model = megengine.hub.load('megvii-model/funnelact', 'resnet50_frelu', pretrained=True) +# model = megengine.hub.load('megvii-model/funnelact', 'shufflenet_v2_x0_5', pretrained=True) +model.eval() +``` + + +所有预训练模型希望数据被正确预处理。 +模型要求输入BGR的图片, 短边缩放到`256`, 并中心裁剪至`(224 x 224)`的大小,无需归一化处理。 + +下面是一段处理一张图片的样例代码。 + +```python +# Download an example image from the megengine data website +import urllib +url, filename = ("https://data.megengine.org.cn/images/cat.jpg", "cat.jpg") +try: urllib.URLopener().retrieve(url, filename) +except: urllib.request.urlretrieve(url, filename) + +# Read and pre-process the image +import cv2 +import numpy as np +import megengine.data.transform as T +import megengine.functional as F + +image = cv2.imread("cat.jpg").astype(np.float32) +transform = T.Compose([ + T.Resize(256), + T.CenterCrop(224), + T.ToMode("CHW"), +]) +processed_img = transform.apply(image)[np.newaxis, :] # CHW -> 1CHW +logits = model(processed_img) +probs = F.softmax(logits) +print(probs) +``` + +### 模型描述 + +目前我们提供了部分在ImageNet上的预训练模型(见下表),各个网络结构在ImageNet验证集上的表现如下: + +| Model | Activation | Top-1 err.| +| :---------------------- | :--------: | :------: | +| ResNet50 | ReLU | 24.0 | +| ResNet50 | PReLU | 23.7 | +| ResNet50 | Swish | 23.5 | +| ResNet50 | FReLU | **22.4** | +| ShuffleNetV2 0.5x | ReLU | 39.6 | +| ShuffleNetV2 0.5x | PReLU | 39.1 | +| ShuffleNetV2 0.5x | Swish | 38.7 | +| ShuffleNetV2 0.5x | FReLU | **37.1** | + +### 参考文献 + +- [Funnel Activation for Visual Recognition](https://arxiv.org/abs/2007.11824), Ma, Ningning, et al. "Funnel Activation for Visual Recognition." Proceedings of the European Conference on Computer Vision (ECCV). 2020. + + + +All pre-trained models expect input images normalized in the same way, +i.e. input images must be 3-channel BGR images of shape `(H x W x 3)`, and reszied shortedge to `256`, center-cropped to `(224 x 224)`. +No normalization required. + +Here's a sample execution. + +```python +# Download an example image from the megengine data website +import urllib +url, filename = ("https://data.megengine.org.cn/images/cat.jpg", "cat.jpg") +try: urllib.URLopener().retrieve(url, filename) +except: urllib.request.urlretrieve(url, filename) + +# Read and pre-process the image +import cv2 +import numpy as np +import megengine.data.transform as T +import megengine.functional as F + +image = cv2.imread("cat.jpg").astype(np.float32) +transform = T.Compose([ + T.Resize(256), + T.CenterCrop(224), + T.ToMode("CHW"), +]) +processed_img = transform.apply(image)[np.newaxis, :] # CHW -> 1CHW +logits = model(processed_img) +probs = F.softmax(logits) +print(probs) +``` + +### Model Description + +Currently we provide several pretrained models(see the table below), Their 1-crop accuracy on ImageNet validation dataset can be found in following table. + +| Model | Activation | Top-1 err.| +| :---------------------- | :--------: | :------: | +| ResNet50 | ReLU | 24.0 | +| ResNet50 | PReLU | 23.7 | +| ResNet50 | Swish | 23.5 | +| ResNet50 | FReLU | **22.4** | +| ShuffleNetV2 0.5x | ReLU | 39.6 | +| ShuffleNetV2 0.5x | PReLU | 39.1 | +| ShuffleNetV2 0.5x | Swish | 38.7 | +| ShuffleNetV2 0.5x | FReLU | **37.1** | + +### References + +- [Funnel Activation for Visual Recognition](https://arxiv.org/abs/2007.11824), Ma, Ningning, et al. "Funnel Activation for Visual Recognition." Proceedings of the European Conference on Computer Vision (ECCV). 2020. diff --git a/models/megengine_vision_weightnet.md b/models/megengine_vision_weightnet.md new file mode 100644 index 0000000000000000000000000000000000000000..b922a18172b33c07465ff5770471033f4bd63f2a --- /dev/null +++ b/models/megengine_vision_weightnet.md @@ -0,0 +1,145 @@ +--- +template: hub1 +title: WeightNet +summary: + en_US: "WeightNet: Revisiting the Design Space of Weight Network" + zh_CN: WeightNet - ShuffleNet V2(ImageNet 预训练权重) +author: MegEngine Team +tags: [vision, classification] +github-link: https://github.com/megvii-model/WeightNet +--- + +```python +import megengine.hub +model = megengine.hub.load('megvii-model/weightnet', 'shufflenet_v2_x0_5', pretrained=True) +model.eval() +``` + + +所有预训练模型希望数据被正确预处理。 +模型要求输入BGR的图片, 短边缩放到`256`, 并中心裁剪至`(224 x 224)`的大小,无需归一化处理。 + +下面是一段处理一张图片的样例代码。 + +```python +# Download an example image from the megengine data website +import urllib +url, filename = ("https://data.megengine.org.cn/images/cat.jpg", "cat.jpg") +try: urllib.URLopener().retrieve(url, filename) +except: urllib.request.urlretrieve(url, filename) + +# Read and pre-process the image +import cv2 +import numpy as np +import megengine.data.transform as T +import megengine.functional as F + +image = cv2.imread("cat.jpg").astype(np.float32) +transform = T.Compose([ + T.Resize(256), + T.CenterCrop(224), + T.ToMode("CHW"), +]) +processed_img = transform.apply(image)[np.newaxis, :] # CHW -> 1CHW +logits = model(processed_img) +probs = F.softmax(logits) +print(probs) +``` + +### 模型描述 + +目前我们提供了部分在ImageNet上的预训练模型(见下表),各个网络结构在ImageNet验证集上的表现如下: + + +| Model | #Params. | FLOPs | Top-1 err. | +|---------------------|----------|-------|------------| +| ShuffleNetV2 (0.5×) | 1.4M | 41M | 39.7 | +| + WeightNet (1×) | 1.5M | 41M | **36.7** | +| ShuffleNetV2 (1.0×) | 2.2M | 138M | 30.9 | +| + WeightNet (1×) | 2.4M | 139M | **28.8** | +| ShuffleNetV2 (1.5×) | 3.5M | 299M | 27.4 | +| + WeightNet (1×) | 3.9M | 301M | **25.6** | +| ShuffleNetV2 (2.0×) | 5.5M | 557M | 25.5 | +| + WeightNet (1×) | 6.1M | 562M | **24.1** | + + + +| Model | #Params. | FLOPs | Top-1 err. | +|---------------------|----------|-------|------------| +| ShuffleNetV2 (0.5×) | 1.4M | 41M | 39.7 | +| + WeightNet (8×) | 2.7M | 42M | **34.0** | +| ShuffleNetV2 (1.0×) | 2.2M | 138M | 30.9 | +| + WeightNet (4×) | 5.1M | 141M | **27.6** | +| ShuffleNetV2 (1.5×) | 3.5M | 299M | 27.4 | +| + WeightNet (4×) | 9.6M | 307M | **25.0** | +| ShuffleNetV2 (2.0×) | 5.5M | 557M | 25.5 | +| + WeightNet (4×) | 18.1M | 573M | **23.5** | + +### 参考文献 + +- [WeightNet: Revisiting the Design Space of Weight Network](https://arxiv.org/abs/2007.11823), Ma, Ningning, et al. "WeightNet: Revisiting the Design Space of Weight Network." Proceedings of the European Conference on Computer Vision (ECCV). 2020. + + + +All pre-trained models expect input images normalized in the same way, +i.e. input images must be 3-channel BGR images of shape `(H x W x 3)`, and reszied shortedge to `256`, center-cropped to `(224 x 224)`. +No normalizations required. + +Here's a sample execution. + +```python +# Download an example image from the megengine data website +import urllib +url, filename = ("https://data.megengine.org.cn/images/cat.jpg", "cat.jpg") +try: urllib.URLopener().retrieve(url, filename) +except: urllib.request.urlretrieve(url, filename) + +# Read and pre-process the image +import cv2 +import numpy as np +import megengine.data.transform as T +import megengine.functional as F + +image = cv2.imread("cat.jpg").astype(np.float32) +transform = T.Compose([ + T.Resize(256), + T.CenterCrop(224), + T.ToMode("CHW"), +]) +processed_img = transform.apply(image)[np.newaxis, :] # CHW -> 1CHW +logits = model(processed_img) +probs = F.softmax(logits) +print(probs) +``` + +### Model Description + +Currently we provide several pretrained models(see the table below), Their 1-crop accuracy on ImageNet validation dataset can be found in following table. + +| Model | #Params. | FLOPs | Top-1 err. | +|---------------------|----------|-------|------------| +| ShuffleNetV2 (0.5×) | 1.4M | 41M | 39.7 | +| + WeightNet (1×) | 1.5M | 41M | **36.7** | +| ShuffleNetV2 (1.0×) | 2.2M | 138M | 30.9 | +| + WeightNet (1×) | 2.4M | 139M | **28.8** | +| ShuffleNetV2 (1.5×) | 3.5M | 299M | 27.4 | +| + WeightNet (1×) | 3.9M | 301M | **25.6** | +| ShuffleNetV2 (2.0×) | 5.5M | 557M | 25.5 | +| + WeightNet (1×) | 6.1M | 562M | **24.1** | + + + +| Model | #Params. | FLOPs | Top-1 err. | +|---------------------|----------|-------|------------| +| ShuffleNetV2 (0.5×) | 1.4M | 41M | 39.7 | +| + WeightNet (8×) | 2.7M | 42M | **34.0** | +| ShuffleNetV2 (1.0×) | 2.2M | 138M | 30.9 | +| + WeightNet (4×) | 5.1M | 141M | **27.6** | +| ShuffleNetV2 (1.5×) | 3.5M | 299M | 27.4 | +| + WeightNet (4×) | 9.6M | 307M | **25.0** | +| ShuffleNetV2 (2.0×) | 5.5M | 557M | 25.5 | +| + WeightNet (4×) | 18.1M | 573M | **23.5** | + +### References + +- [WeightNet: Revisiting the Design Space of Weight Network](https://arxiv.org/abs/2007.11823), Ma, Ningning, et al. "WeightNet: Revisiting the Design Space of Weight Network." Proceedings of the European Conference on Computer Vision (ECCV). 2020.