Unverified commit 186276d4 authored by Leowolfking, committed by GitHub

New feature (#842)

Parent 271b8257
......@@ -7,7 +7,8 @@ if __name__ == "__main__":
text = ["今天是个好日子", "天气预报说今天要下雨"]
# Specify, by key, the parameter name under which the text is passed to the prediction method; in this example it is "data"
# For local deployment, the equivalent call is lac.lexical_analysis(data=text, batch_size=1)
data = {"texts": text, "batch_size": 1}
# For lac versions below 2.2.0, the `text` parameter must be named `texts` instead
data = {"text": text, "batch_size": 1}
# Specify lac as the prediction endpoint and send a POST request; the Content-Type should be JSON
url = "http://127.0.0.1:8866/predict/lac"
# Set the POST request headers to application/json
......
Secondary Development
======================
This chapter describes how to customize existing Tasks and related components by modifying the source code.

.. toctree::
   :maxdepth: 1

   Custom Task<how_to_define_task>
   Hook Mechanism<hook>
......@@ -8,10 +8,13 @@ PaddleHub Documentation
   :titlesonly:

   Overview<overview>
   Installation<installation>
   Quick Experience<quickstart>
   Tutorials<tutorial/tutorial_index>
   Installation<install>
   Quick Experience<quick_experience/quick_index>
   Transfer Learning<tutorial/tutorial_index>
   Deploying Pretrained Models<tutorial/Serving_index>
   Secondary Development<Secondary_development/Secondary_index>
   API<reference/ref_index>
   Command Line Reference<tutorial/cmdintro>
   FAQ<faq>
   Community Contribution<contribution/contri_index.rst>
   Community Contribution<contribution/contri_index>
   Release Notes<release>
\ No newline at end of file
# Installing PaddleHub
## Environment Setup
PaddleHub is used together with PaddlePaddle; its hardware and operating system requirements are the same as those of [PaddlePaddle](https://www.paddlepaddle.org.cn/install/quick).
> Note: PaddlePaddle version >= 1.7.0 is required.
```shell
# check whether PaddlePaddle is installed
$ python  # enter the Python interpreter
```
```python
import paddle.fluid
paddle.fluid.install_check.run_check()
```
> If `Your Paddle Fluid is installed successfully` appears, PaddlePaddle has been installed successfully.
```shell
$ pip list | grep paddlepaddle  # check the PaddlePaddle version; `pip list` shows all installed packages and `grep` filters by keyword
```
## Installation
Install PaddleHub with the Python package manager pip. Depending on your needs, run one of the following commands to install PaddleHub (the first is recommended).
> 1. A network connection is required during installation; make sure your machine has network access. Once installed, PaddleHub can be used offline.
2. If PaddleHub is already installed, installing again will uninstall it first and then reinstall. Both installing a specific version and installing the latest version are supported.
3. Because of network speeds within China, installing directly from the default PyPI index is often very slow and prone to failing midway; using a mirror within China saves time and makes pip installs more reliable.
```
Mirror list (China):
Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple/
Baidu: https://mirror.baidu.com/pypi/simple
```
```shell
$ pip install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple # install the latest version via the Tsinghua mirror
```
```shell
$ pip install paddlehub==1.6.1 -i https://pypi.tuna.tsinghua.edu.cn/simple # install a specific version (==1.6.1 is the PaddleHub version) via the Tsinghua mirror
```
```shell
$ pip install paddlehub --upgrade -i https://mirror.baidu.com/pypi/simple # install the latest version via the Baidu mirror
```
```shell
$ pip install paddlehub==1.6.1 -i https://mirror.baidu.com/pypi/simple # install a specific version (==1.6.1 is the PaddleHub version) via the Baidu mirror
```
> Installation completes after a short wait. If `Successfully installed paddlehub` appears, PaddleHub has been installed successfully.
## Verifying the Installation
Check whether PaddleHub was installed successfully.
```shell
$ pip list | grep paddlehub  # `pip list` shows all installed packages; `grep` filters by keyword
```
```shell
$ pip show paddlehub  # show detailed information about PaddleHub
```
The detailed information looks like the following, including the PaddleHub version, install location, and more.
```
Name: paddlehub
Version: 1.7.1
Summary: A toolkit for managing pretrained models of PaddlePaddle and helping user getting started with transfer learning more efficiently.
Home-page: https://github.com/PaddlePaddle/PaddleHub
Author: PaddlePaddle Author
Author-email: paddle-dev@baidu.com
License: Apache 2.0
Location: /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages
Requires: pandas, pre-commit, gunicorn, flake8, visualdl, yapf, flask, protobuf, sentencepiece, six, cma, colorlog
Required-by:
```
## Uninstalling
This only uninstalls PaddleHub itself; downloaded model files and datasets are kept.
```shell
$ pip uninstall paddlehub -y  # uninstall PaddleHub
```
> If `Successfully uninstalled paddlehub` appears, PaddleHub has been uninstalled successfully.
## Common pip Commands
pip is the most widely used Python package manager; it helps you obtain and manage up-to-date Python packages.
Common commands:
```shell
$ pip install [package-name]              # install the package [package-name]
$ pip install [package-name]==X.X         # install [package-name] pinned to version X.X
$ pip install [package-name] --proxy=proxy_ip:port   # install via a proxy server
$ pip install [package-name] --upgrade    # upgrade the package [package-name]
$ pip uninstall [package-name]            # uninstall the package [package-name]
$ pip list                                # list all packages installed in the current environment
```
## FAQ
1. With PaddleHub installed, can I upgrade PaddlePaddle?
Answer: Yes. Simply upgrade PaddlePaddle as you normally would.
2. With PaddleHub installed, how do I upgrade it?
Answer: Run `pip install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple` to upgrade PaddleHub to the latest version.
3. What is the difference between an `upgrade` install and installing a specific version?
Answer: `upgrade` installs the latest version; installing a specific version lets you install any version you choose.
4. How do I change where PaddleHub stores its downloads?
Answer: PaddleHub Modules are saved under the user directory by default; you can change this location via the `HUB_HOME` environment variable, as in the sketch below.
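A minimal sketch of that last point (it assumes `HUB_HOME` is read when paddlehub is imported; `/data/paddlehub` is a hypothetical path):
```python
import os

# Redirect PaddleHub's storage to a custom directory (hypothetical path).
# Set the variable before importing paddlehub so it takes effect.
os.environ["HUB_HOME"] = "/data/paddlehub"

import paddlehub as hub  # Modules now download under /data/paddlehub
```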
# Using PaddleHub from the Command Line
The code and commands on this page can be run online on [AIStudio](https://aistudio.baidu.com/aistudio/projectdetail/643120), a notebook-like environment accessible from a browser with no setup required, which makes it easy for developers to try things out quickly.
PaddleHub provides a command line tool for managing and using models, including running predictions with PaddleHub models directly from the command line. For example, the portrait segmentation and Chinese word segmentation tasks from the previous chapters can also be done this way.
### Install PaddleHub Before You Start
```shell
# install the latest version; the Tsinghua mirror is faster and more reliable
$ pip install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```
### Portrait Matting
```shell
# download a test image
$ wget https://paddlehub.bj.bcebos.com/resources/test_image.jpg
# run portrait matting from the command line
$ hub run deeplabv3p_xception65_humanseg --input_path test_image.jpg --visualization=True --output_dir="humanseg_output"
```
--2020-07-22 12:19:52-- https://paddlehub.bj.bcebos.com/resources/test_image.jpg
Resolving paddlehub.bj.bcebos.com (paddlehub.bj.bcebos.com)... 182.61.200.195, 182.61.200.229
Connecting to paddlehub.bj.bcebos.com (paddlehub.bj.bcebos.com)|182.61.200.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 967120 (944K) [image/jpeg]
Saving to: ‘test_image.jpg’
test_image.jpg 100%[===================>] 944.45K 6.13MB/s in 0.2s
2020-07-22 12:19:53 (6.13 MB/s) - ‘test_image.jpg’ saved [967120/967120]
[{'save_path': 'humanseg_output/test_image.png', 'data': array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], dtype=float32)}]
![png](../imgs/humanseg_test_res.png)
### Chinese Word Segmentation
```shell
# run text segmentation from the command line
$ hub run lac --input_text "今天是个好日子"
```
Install Module lac
Downloading lac
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpjcskpj8x/lac
[==================================================] 100.00%
Successfully installed lac-2.1.1
[{'word': ['今天', '是', '个', '好日子'], 'tag': ['TIME', 'v', 'q', 'n']}]
The commands above consist of four parts:
- hub: the PaddleHub command line entry point.
- run: runs a model's prediction.
- deeplabv3p_xception65_humanseg, lac: the models to invoke.
- --input_path / --input_text: the model's input data; images and text are passed in differently.
In addition, `visualization=True` on the command line writes the results as visualized images, and `output_dir="humanseg_output"` sets the directory where predictions are saved; you can inspect the output images there.
Next, let's look at an OCR example and a mask detection example.
### OCR Text Recognition
```shell
# download a test image
$ wget https://paddlehub.bj.bcebos.com/model/image/ocr/test_ocr.jpg
# this Module depends on the third-party libraries shapely and pyclipper; install them first
$ pip install shapely
$ pip install pyclipper
# run text recognition from the command line
$ hub run chinese_ocr_db_crnn_mobile --input_path test_ocr.jpg --visualization=True --output_dir='ocr_result'
```
--2020-07-22 15:00:50-- https://paddlehub.bj.bcebos.com/model/image/ocr/test_ocr.jpg
Resolving paddlehub.bj.bcebos.com (paddlehub.bj.bcebos.com)... 182.61.200.195, 182.61.200.229
Connecting to paddlehub.bj.bcebos.com (paddlehub.bj.bcebos.com)|182.61.200.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 48680 (48K) [image/jpeg]
Saving to: ‘test_ocr.jpg’
test_ocr.jpg 100%[===================>] 47.54K --.-KB/s in 0.02s
2020-07-22 15:00:51 (2.88 MB/s) - ‘test_ocr.jpg’ saved [48680/48680]
Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
Requirement already satisfied: shapely in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.7.0)
Looking in indexes: https://pypi.mirrors.ustc.edu.cn/simple/
Requirement already satisfied: pyclipper in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (1.2.0)
[{'save_path': 'ocr_result/ndarray_1595401261.294494.jpg', 'data': [{'text': '纯臻营养护发素', 'confidence': 0.9438689351081848, 'text_box_position': [[24, 36], [304, 34], [304, 72], [24, 74]]}, {'text': '产品信息/参数', 'confidence': 0.9843138456344604, 'text_box_position': [[24, 80], [172, 80], [172, 104], [24, 104]]}, {'text': '(45元/每公斤,100公斤起订)', 'confidence': 0.9210420250892639, 'text_box_position': [[24, 109], [333, 109], [333, 136], [24, 136]]}, {'text': '每瓶22元,1000瓶起订)', 'confidence': 0.9685984253883362, 'text_box_position': [[22, 139], [283, 139], [283, 166], [22, 166]]}, {'text': '【品牌】', 'confidence': 0.9527574181556702, 'text_box_position': [[22, 174], [85, 174], [85, 198], [22, 198]]}, {'text': ':代加工方式/OEMODM', 'confidence': 0.9442129135131836, 'text_box_position': [[90, 176], [301, 176], [301, 196], [90, 196]]}, {'text': '【品名】', 'confidence': 0.8793742060661316, 'text_box_position': [[23, 205], [85, 205], [85, 229], [23, 229]]}, {'text': ':纯臻营养护发素', 'confidence': 0.9230973124504089, 'text_box_position': [[95, 204], [235, 206], [235, 229], [95, 227]]}, {'text': '【产品编号】', 'confidence': 0.9311650395393372, 'text_box_position': [[24, 238], [120, 238], [120, 260], [24, 260]]}, {'text': 'J:YM-X-3011', 'confidence': 0.8866629004478455, 'text_box_position': [[110, 239], [239, 239], [239, 256], [110, 256]]}, {'text': 'ODMOEM', 'confidence': 0.9916308522224426, 'text_box_position': [[414, 233], [430, 233], [430, 304], [414, 304]]}, {'text': '【净含量】:220ml', 'confidence': 0.8709315657615662, 'text_box_position': [[23, 268], [181, 268], [181, 292], [23, 292]]}, {'text': '【适用人群】', 'confidence': 0.9589888453483582, 'text_box_position': [[24, 301], [118, 301], [118, 321], [24, 321]]}, {'text': ':适合所有肤质', 'confidence': 0.935418963432312, 'text_box_position': [[131, 300], [254, 300], [254, 323], [131, 323]]}, {'text': '【主要成分】', 'confidence': 0.9366627335548401, 'text_box_position': [[24, 332], [117, 332], [117, 353], [24, 353]]}, {'text': '鲸蜡硬脂醇', 'confidence': 0.9033458828926086, 'text_box_position': [[138, 331], [235, 331], [235, 351], [138, 351]]}, {'text': '燕麦B-葡聚', 'confidence': 0.8497812747955322, 'text_box_position': [[248, 332], [345, 332], [345, 352], [248, 352]]}, {'text': '椰油酰胺丙基甜菜碱、', 'confidence': 0.8935506939888, 'text_box_position': [[54, 363], [232, 363], [232, 383], [54, 383]]}, {'text': '糖、', 'confidence': 0.8750994205474854, 'text_box_position': [[25, 364], [62, 364], [62, 383], [25, 383]]}, {'text': '泛酯', 'confidence': 0.5581164956092834, 'text_box_position': [[244, 363], [281, 363], [281, 382], [244, 382]]}, {'text': '(成品包材)', 'confidence': 0.9566792845726013, 'text_box_position': [[368, 367], [475, 367], [475, 388], [368, 388]]}, {'text': '【主要功能】', 'confidence': 0.9493741393089294, 'text_box_position': [[24, 395], [119, 395], [119, 416], [24, 416]]}, {'text': ':可紧致头发磷层', 'confidence': 0.9692543745040894, 'text_box_position': [[128, 397], [273, 397], [273, 414], [128, 414]]}, {'text': '美,从而达到', 'confidence': 0.8662520051002502, 'text_box_position': [[265, 395], [361, 395], [361, 415], [265, 415]]}, {'text': '即时持久改善头发光泽的效果,给干燥的头', 'confidence': 0.9690631031990051, 'text_box_position': [[25, 425], [372, 425], [372, 448], [25, 448]]}, {'text': '发足够的滋养', 'confidence': 0.8946213126182556, 'text_box_position': [[26, 457], [136, 457], [136, 477], [26, 477]]}]}]
```shell
# view the prediction result
```
![png](../imgs/ocr_res.jpg)
### Mask Detection
```shell
# download a test image
$ wget https://paddlehub.bj.bcebos.com/resources/test_mask_detection.jpg
# run mask detection from the command line
$ hub run pyramidbox_lite_mobile_mask --input_path test_mask_detection.jpg --visualization=True --output_dir='detection_result'
```
--2020-07-22 15:08:11-- https://paddlehub.bj.bcebos.com/resources/test_mask_detection.jpg
Resolving paddlehub.bj.bcebos.com (paddlehub.bj.bcebos.com)... 182.61.200.229, 182.61.200.195
Connecting to paddlehub.bj.bcebos.com (paddlehub.bj.bcebos.com)|182.61.200.229|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 299133 (292K) [image/jpeg]
Saving to: ‘test_mask_detection.jpg’
test_mask_detection 100%[===================>] 292.12K --.-KB/s in 0.06s
2020-07-22 15:08:11 (4.55 MB/s) - ‘test_mask_detection.jpg’ saved [299133/299133]
Install Module pyramidbox_lite_mobile_mask
Downloading pyramidbox_lite_mobile_mask
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmp8oes9jid/pyramidbox_lite_mobile_mask
[==================================================] 100.00%
Successfully installed pyramidbox_lite_mobile_mask-1.3.0
Downloading pyramidbox_lite_mobile
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpvhjhlr10/pyramidbox_lite_mobile
[==================================================] 100.00%
[{'data': [{'label': 'MASK', 'confidence': 0.9992434978485107, 'top': 181, 'bottom': 440, 'left': 457, 'right': 654}, {'label': 'MASK', 'confidence': 0.9224318265914917, 'top': 340, 'bottom': 578, 'left': 945, 'right': 1125}, {'label': 'NO MASK', 'confidence': 0.9996706247329712, 'top': 292, 'bottom': 500, 'left': 1166, 'right': 1323}], 'path': 'test_mask_detection.jpg'}]
```shell
# view the prediction result
```
![png](../imgs/test_mask_detection_result.jpg)
### Overview of the PaddleHub Command Line Tool
PaddleHub's command line tool borrows ideas from package managers such as Anaconda and pip, making it quick and convenient to search, download, install, upgrade, and run models. Below is a brief overview of the 12 commands PaddleHub supports; see the [Command Line Reference](../tutorial/cmdintro.md) chapter for details:
* install: installs a Module locally, by default under the {HUB_HOME}/.paddlehub/modules directory;
* uninstall: uninstalls a local Module;
* show: shows the properties of a locally installed Module, or of a Module in a given directory, including its name, version, description, author, and other information;
* download: downloads a Module provided by Baidu PaddlePaddle's PaddleHub;
* search: searches the server for Modules matching a keyword. When looking for a Module for a specific model, search gets you results quickly; for example, hub search ssd finds all Modules whose names contain "ssd". The command supports regular expressions, e.g. hub search ^s.\* finds all resources starting with "s";
* list: lists locally installed Modules;
* run: runs a Module's prediction;
* version: shows the PaddleHub version;
* help: shows help information;
* clear: PaddleHub produces cached data during use, stored by default under the ${HUB_HOME}/.paddlehub/cache directory; the clear command empties this cache;
* autofinetune: automatically tunes the hyperparameters of a Fine-tune task; see the [PaddleHub AutoDL Finetuner](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.5/docs/tutorial/autofinetune.md) tutorial for details;
* config: views and sets PaddleHub configuration, including the server address and log level;
* serving: deploys a Module prediction service with one command; see [PaddleHub Serving one-click service deployment](https://github.com/PaddlePaddle/PaddleHub/blob/release/v1.5/docs/tutorial/serving.md) for details.
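As a minimal illustration of the serving workflow, the sketch below sends a prediction request from Python. It assumes a local service has already been started with `hub serving start -m lac` (default port 8866), and it reuses the request format shown in the diff at the top of this page; for lac versions below 2.2.0, the key would be `texts` instead of `text`.
```python
import json
import requests

# Request payload: the key "text" names the parameter passed to lac's
# prediction method ("texts" for lac versions below 2.2.0).
data = {"text": ["今天是个好日子"], "batch_size": 1}
headers = {"Content-Type": "application/json"}
url = "http://127.0.0.1:8866/predict/lac"

r = requests.post(url=url, headers=headers, data=json.dumps(data))
print(r.json())  # word segmentation result as JSON
```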
## Summary
PaddleHub's product philosophy is "model as software": models are invoked through a Python API or the command line, so you can quickly try out or integrate PaddlePaddle's signature pretrained models.
In addition, when you want to improve a pretrained model with a small amount of your own data, PaddleHub supports transfer learning: its Fine-tune API ships with multiple built-in optimization strategies, so fine-tuning a pretrained model takes only a little code. See the transfer learning chapter later in the docs.
> Note that not every Module supports prediction from the command line (e.g. BERT/ERNIE Transformer models generally need to be fine-tuned on a task first), and not every Module can be fine-tuned (e.g. fine-tuning the LAC lexical analysis model is generally not recommended). It is advisable to read the [pretrained model documentation](https://www.paddlepaddle.org.cn/hublist) in advance to understand the applicable scenarios.
# More PaddleHub Demos
## Official PaddleHub Demo Collection
All official demos can be run online in AI Studio, a notebook-like environment accessible from a browser with no setup required, which makes it easy for developers to try them quickly.
The collection is continuously updated; bookmarking it is recommended.
[https://aistudio.baidu.com/aistudio/personalcenter/thirdview/79927](https://aistudio.baidu.com/aistudio/personalcenter/thirdview/79927)
## Fun Community Projects Built with PaddleHub
The following are fun hands-on projects that developers built with PaddleHub during earlier PaddleHub courses and events. They are all hosted on AI Studio and can be run online; feel free to explore them for inspiration.
1. [布剪刀石头 (switching local windows with face recognition)](https://aistudio.baidu.com/aistudio/projectdetail/507630)
1. [秋水中的鱼 (yesok dance background matting with anime style transfer)](http://aistudio.baidu.com/aistudio/projectdetail/517066)
1. [Ninetailskim (playing retro Windows pinball on faces)](https://aistudio.baidu.com/aistudio/projectdetail/518861)
1. [乌拉__ (mask monitoring with voice alerts and background logging)](https://aistudio.baidu.com/aistudio/projectdetail/506931)
1. [九品炼丹师 (face recognition with headwear plus portrait segmentation clones)](https://aistudio.baidu.com/aistudio/projectdetail/505168)
1. [七年期限 (style transfer and local deployment)](https://aistudio.baidu.com/aistudio/projectdetail/520453)
1. [Fanas无敌 (lipstick shade try-on)](https://aistudio.baidu.com/aistudio/projectdetail/516520)
1. [skywalk163 (word-frequency statistics of the PaddlePaddle source code with paddlehub, plus word clouds and portrait display)](https://aistudio.baidu.com/aistudio/projectdetail/519841)
1. [AIStudio261428 (face recognition + comic meme stickers)](https://aistudio.baidu.com/aistudio/projectdetail/519616)
1. [土豆芽 (inviting cartoon characters over for Children's Day)](https://aistudio.baidu.com/aistudio/projectdetail/520925)
1. [大熊猫的感觉 (changing masks)](https://aistudio.baidu.com/aistudio/projectdetail/520996)
1. [kly1997 (one-click travel + sunglasses)](https://aistudio.baidu.com/aistudio/projectdetail/518117)
1. [寞寞_默默 (stepping into an oil painting)](https://aistudio.baidu.com/aistudio/projectdetail/516332)
1. [isse7 (creative project: "funny face" style transformation)](https://aistudio.baidu.com/aistudio/projectdetail/515307)
1. [Pda (fun face transformations)](https://aistudio.baidu.com/aistudio/projectdetail/516306)
1. [Kgkzhiwen (my new clothes)](https://aistudio.baidu.com/aistudio/projectdetail/516663)
1. [哎呀呀好好学习 (automatic face-shape adjustment)](https://aistudio.baidu.com/aistudio/projectdetail/513640)
1. [Tfboy (ID-photo background replacement)](https://aistudio.baidu.com/aistudio/projectdetail/509443)
1. [Leigangblog (celebrity look-alike faces)](https://aistudio.baidu.com/aistudio/projectdetail/505537)
1. [wpb3dm (outfit swapping for fashion models)](https://aistudio.baidu.com/aistudio/projectdetail/519349)
1. [lsvine_bai (turn your girlfriend into a mysterious blonde goddess in seconds)](https://aistudio.baidu.com/aistudio/projectdetail/521784)
1. [Lemonadeqk (easy star-chasing)](https://aistudio.baidu.com/aistudio/projectdetail/520488)
1. [XM1436gr (AI cartoon face swap using PaddleHub keypoint detection)](https://aistudio.baidu.com/aistudio/projectdetail/514547)
1. [旺仔 (round-eyed cuties for everyone)](https://aistudio.baidu.com/aistudio/projectdetail/519222)
1. [Arrowarcher (one-click AI hair swap)](https://aistudio.baidu.com/aistudio/projectdetail/508270)
1. [WHY197598 (basics of moving objects and swapping backgrounds)](https://aistudio.baidu.com/aistudio/projectdetail/517961)
1. [署名景逸 (fatigue detection based on paddlehub face keypoint detection)](https://aistudio.baidu.com/aistudio/projectdetail/506024)
1. [thunder95 (gaze and expression voting with PaddleHub)](https://aistudio.baidu.com/aistudio/projectdetail/514205)
1. [上弦月C (graveyard-disco graduation photos)](https://aistudio.baidu.com/aistudio/projectdetail/511253)
1. [如意_鸡蛋 (Chow Yun-fat from the left, Andy Lau from the right)](https://aistudio.baidu.com/aistudio/projectdetail/507231)
# Using PaddleHub from Python Code
The code and commands on this page can be run online on [AIStudio](https://aistudio.baidu.com/aistudio/projectdetail/635335), a notebook-like environment accessible from a browser with no setup required, which makes it easy for developers to try things out quickly.
## PaddleHub Examples for Computer Vision Tasks
Starting with computer vision tasks, we use one test image, test.jpg, to implement the following four functions:
* Portrait matting ([deeplabv3p_xception65_humanseg](https://www.paddlepaddle.org.cn/hubdetail?name=deeplabv3p_xception65_humanseg&en_category=ImageSegmentation))
* Human body part segmentation ([ace2p](https://www.paddlepaddle.org.cn/hubdetail?name=ace2p&en_category=ImageSegmentation))
* Face detection ([ultra_light_fast_generic_face_detector_1mb_640](https://www.paddlepaddle.org.cn/hubdetail?name=ultra_light_fast_generic_face_detector_1mb_640&en_category=FaceDetection))
* Keypoint detection ([human_pose_estimation_resnet50_mpii](https://www.paddlepaddle.org.cn/hubdetail?name=human_pose_estimation_resnet50_mpii&en_category=KeyPointDetection))
> Note: To find out which pretrained models PaddleHub can invoke and to get a model's name (e.g. deeplabv3p_xception65_humanseg; the code below calls models by this name), see the [official documentation](https://www.paddlepaddle.org.cn/hublist), where models are organized by category for easy lookup, each with a detailed introduction.
### Install PaddleHub Before You Start
```shell
# install the latest version; the Tsinghua mirror is faster and more reliable
$ pip install paddlehub --upgrade -i https://pypi.tuna.tsinghua.edu.cn/simple
```
### Original Image
```shell
# download a test image
$ wget https://paddlehub.bj.bcebos.com/resources/test_image.jpg
```
--2020-07-22 12:22:19-- https://paddlehub.bj.bcebos.com/resources/test_image.jpg
Resolving paddlehub.bj.bcebos.com (paddlehub.bj.bcebos.com)... 182.61.200.195, 182.61.200.229
Connecting to paddlehub.bj.bcebos.com (paddlehub.bj.bcebos.com)|182.61.200.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 967120 (944K) [image/jpeg]
Saving to: ‘test_image.jpg.1’
test_image.jpg.1 100%[===================>] 944.45K 5.51MB/s in 0.2s
2020-07-22 12:22:19 (5.51 MB/s) - ‘test_image.jpg.1’ saved [967120/967120]
![png](../imgs/humanseg_test.png)
### Portrait Matting
PaddleHub follows the "model as software" design philosophy: every pretrained model is versioned like a Python package, and the `hub install` and `hub uninstall` commands make it easy to install, upgrade, and uninstall models.
> The command below downloads the latest version of the model by default; to pin a version, append a version specifier such as `==1.1.1`.
```shell
# install the pretrained model; deeplabv3p_xception65_humanseg is the model name
$ hub install deeplabv3p_xception65_humanseg
```
Downloading deeplabv3p_xception65_humanseg
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpo32jeve0/deeplabv3p_xception65_humanseg
[==================================================] 100.00%
Successfully installed deeplabv3p_xception65_humanseg-1.1.1
```python
# import the paddlehub library
import paddlehub as hub
# specify the model name, the input image paths, and the output directory, then run prediction
module = hub.Module(name="deeplabv3p_xception65_humanseg")
res = module.segmentation(paths = ["./test_image.jpg"], visualization=True, output_dir='humanseg_output')
```
[2020-07-22 12:22:49,474] [ INFO] - Installing deeplabv3p_xception65_humanseg module
Downloading deeplabv3p_xception65_humanseg
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpzrrl1duq/deeplabv3p_xception65_humanseg
[==================================================] 100.00%
[2020-07-22 12:23:11,811] [ INFO] - Successfully installed deeplabv3p_xception65_humanseg-1.1.1
![png](../imgs/output_8_3.png)
As you can see, invoking PaddleHub from Python takes only three lines of code:
```
import paddlehub as hub  # import the PaddleHub library
module = hub.Module(name="deeplabv3p_xception65_humanseg")  # specify the model name
res = module.segmentation(paths = ["./test.jpg"], visualization=True, output_dir='humanseg_output')  # specify the model's input and output, then run prediction; visualization=True saves the result as an image
```
* The model name is always specified through the `hub.Module` API;
* `module.segmentation` runs image-segmentation prediction tasks; different task types use different prediction APIs (for example, face detection uses the `face_detection` function), so it is best to read a model's documentation carefully before invoking it.
* Predictions are saved under the directory given by `output_dir='humanseg_output'`; you can inspect the output images there.
All other tasks follow this same pattern; let's see how the next few tasks are implemented.
### Human Body Part Segmentation
```shell
# install the pretrained model
$ hub install ace2p
```
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py:47: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
Downloading ace2p
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpfsovt3f8/ace2p
[==================================================] 100.00%
Successfully installed ace2p-1.1.0
```python
# import the paddlehub library
import paddlehub as hub
# specify the model name, the input image paths, and the output directory, then run prediction
module = hub.Module(name="ace2p")
res = module.segmentation(paths = ["./test_image.jpg"], visualization=True, output_dir='ace2p_output')
```
[2020-07-22 12:23:58,027] [ INFO] - Installing ace2p module
Downloading ace2p
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmptrogpj6j/ace2p
[==================================================] 100.00%
[2020-07-22 12:24:22,575] [ INFO] - Successfully installed ace2p-1.1.0
![png](../imgs/output_12_3.png)
### Face Detection
```shell
# install the pretrained model
$ hub install ultra_light_fast_generic_face_detector_1mb_640
```
Downloading ultra_light_fast_generic_face_detector_1mb_640
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpz82xnmy6/ultra_light_fast_generic_face_detector_1mb_640
[==================================================] 100.00%
Successfully installed ultra_light_fast_generic_face_detector_1mb_640-1.1.2
```python
# import the paddlehub library
import paddlehub as hub
# specify the model name, the input image paths, and the output directory, then run prediction
module = hub.Module(name="ultra_light_fast_generic_face_detector_1mb_640")
res = module.face_detection(paths = ["./test_image.jpg"], visualization=True, output_dir='face_detection_output')
```
[2020-07-22 12:25:12,948] [ INFO] - Installing ultra_light_fast_generic_face_detector_1mb_640 module
Downloading ultra_light_fast_generic_face_detector_1mb_640
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpw44mo56p/ultra_light_fast_generic_face_detector_1mb_640
[==================================================] 100.00%
[2020-07-22 12:25:14,698] [ INFO] - Successfully installed ultra_light_fast_generic_face_detector_1mb_640-1.1.2
![png](../imgs/output_15_3.png)
### Keypoint Detection
```shell
# install the pretrained model
$ hub install human_pose_estimation_resnet50_mpii
```
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py:47: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
Downloading human_pose_estimation_resnet50_mpii
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmpn_ppwkzq/human_pose_estimation_resnet50_mpii
[======== ] 17.99%
```python
# import the paddlehub library
import paddlehub as hub
# specify the model name, the input image paths, and the output directory, then run prediction
module = hub.Module(name="human_pose_estimation_resnet50_mpii")
res = module.keypoint_detection(paths = ["./test_image.jpg"], visualization=True, output_dir='keypoint_output')
```
[2020-07-23 11:27:33,989] [ INFO] - Installing human_pose_estimation_resnet50_mpii module
[2020-07-23 11:27:33,992] [ INFO] - Module human_pose_estimation_resnet50_mpii already installed in /home/aistudio/.paddlehub/modules/human_pose_estimation_resnet50_mpii
image saved in keypoint_output/test_imagetime=1595474855.jpg
![png](../imgs/output_18_2.png)
## PaddleHub Examples for NLP Tasks
Now let's look at two natural language processing examples: Chinese word segmentation and sentiment classification.
* Chinese word segmentation ([lac](https://www.paddlepaddle.org.cn/hubdetail?name=lac&en_category=LexicalAnalysis))
* Sentiment analysis ([senta_bilstm](https://www.paddlepaddle.org.cn/hubdetail?name=senta_bilstm&en_category=SentimentAnalysis))
### Chinese Word Segmentation
```shell
# install the pretrained model
$ hub install lac
```
2020-07-22 10:03:09,866-INFO: font search path ['/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/ttf', '/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/afm', '/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/pdfcorefonts']
2020-07-22 10:03:10,208-INFO: generated new fontManager
Downloading lac
[==================================================] 100.00%
Uncompress /home/aistudio/.paddlehub/tmp/tmp8ukaz690/lac
[==================================================] 100.00%
Successfully installed lac-2.1.1
```python
# import the paddlehub library
import paddlehub as hub
# specify the model name and the text to segment, then run prediction
lac = hub.Module(name="lac")
test_text = ["1996年,曾经是微软员工的加布·纽维尔和麦克·哈灵顿一同创建了Valve软件公司。他们在1996年下半年从id software取得了雷神之锤引擎的使用许可,用来开发半条命系列。"]
res = lac.lexical_analysis(texts = test_text)
# print the prediction result
print("中文词法分析结果:", res)
```
[2020-07-22 10:03:18,439] [ INFO] - Installing lac module
[2020-07-22 10:03:18,531] [ INFO] - Module lac already installed in /home/aistudio/.paddlehub/modules/lac
中文词法分析结果: [{'word': ['1996年', ',', '曾经', '是', '微软', '员工', '的', '加布·纽维尔', '和', '麦克·哈灵顿', '一同', '创建', '了', 'Valve软件公司', '。', '他们', '在', '1996年下半年', '从', 'id', ' ', 'software', '取得', '了', '雷神之锤', '引擎', '的', '使用', '许可', ',', '用来', '开发', '半条命', '系列', '。'], 'tag': ['TIME', 'w', 'd', 'v', 'ORG', 'n', 'u', 'PER', 'c', 'PER', 'd', 'v', 'u', 'ORG', 'w', 'r', 'p', 'TIME', 'p', 'nz', 'w', 'n', 'v', 'u', 'n', 'n', 'u', 'vn', 'vn', 'w', 'v', 'v', 'n', 'n', 'w']}]
As you can see, compared with computer vision tasks, the input and output interfaces differ (here, text is passed in as a function argument); this depends on the task type, and the details are in each pretrained model's API documentation.
### Sentiment Classification
```shell
# install the pretrained model
$ hub install senta_bilstm
```
Module senta_bilstm-1.1.0 already installed in /home/aistudio/.paddlehub/modules/senta_bilstm
```python
import paddlehub as hub
senta = hub.Module(name="senta_bilstm")
test_text = ["味道不错,确实不算太辣,适合不能吃辣的人。就在长江边上,抬头就能看到长江的风景。鸭肠、黄鳝都比较新鲜。"]
res = senta.sentiment_classify(texts = test_text)
print("情感分析结果:", res)
```
[2020-07-22 10:34:06,922] [ INFO] - Installing senta_bilstm module
[2020-07-22 10:34:06,984] [ INFO] - Module senta_bilstm already installed in /home/aistudio/.paddlehub/modules/senta_bilstm
[2020-07-22 10:34:08,937] [ INFO] - Installing lac module
[2020-07-22 10:34:08,939] [ INFO] - Module lac already installed in /home/aistudio/.paddlehub/modules/lac
情感分析结果: [{'text': '味道不错,确实不算太辣,适合不能吃辣的人。就在长江边上,抬头就能看到长江的风景。鸭肠、黄鳝都比较新鲜。', 'sentiment_label': 1, 'sentiment_key': 'positive', 'positive_probs': 0.9771, 'negative_probs': 0.0229}]
## Summary
PaddleHub offers a rich collection of pretrained models covering mainstream tasks such as image classification, semantic models, video classification, image generation, image segmentation, text moderation, and keypoint detection. Any of them can be invoked in just three lines of Python, returning predictions immediately, which is very convenient. Give it a try: pick a few models from the [pretrained model list](https://www.paddlepaddle.org.cn/hublist).
Quick Experience
==================
PaddleHub can be used in two ways: from Python code and from the command line.
The command line takes just one command to try out a PaddleHub pretrained model, making it the best choice for a quick first experience; the Python API takes only three lines of code, and is the way to go when you want to fine-tune a model with your own data.
Command line example:

::

    $ hub run chinese_ocr_db_crnn_mobile --input_path test_ocr.jpg

Python code example:

::

    import paddlehub as hub
    ocr = hub.Module(name="chinese_ocr_db_crnn_mobile")
    result = ocr.recognize_text(paths = ["./test_image.jpg"], visualization=True, output_dir='ocr_output')

Sample result:

.. image:: ../imgs/ocr_res.jpg

This chapter walks you through both approaches for a quick start, and provides a rich set of demos covering many scenarios; feel free to try them. For real use you will also want the detailed API parameters and command line options, covered in the later Python API and Command Line Reference chapters.

.. toctree::
   :maxdepth: 1

   Using PaddleHub from the Command Line<cmd_quick_run>
   Using PaddleHub from Python Code<python_use_hub>
   More PaddleHub Demos<more_demos>
Deploying Pretrained Models
===========================
This chapter describes how to generate models suitable for serving, how to deploy them as services, and how to obtain ERNIE/BERT Embedding services.
For details, see the following tutorials:

.. toctree::
   :maxdepth: 1

   Converting a Fine-tuned Model into a PaddleHub Module<finetuned_model_to_module.md>
   Serving Deployment<serving>
   Text Embedding Service<bert_service>
This diff is collapsed.
# Custom Data
Training a new task from scratch is time-consuming and may fall short of the quality you want. Instead, you can fine-tune a pretrained model provided by PaddleHub for your specific task: you only need to preprocess your custom data appropriately and feed it into the pretrained model to get results. Please organize your dataset as described below.
## 1. Custom Data for NLP Tasks
Taking the pretrained model ERNIE fine-tuned for text classification as an example, this section shows how to adapt custom data to complete a fine-tune with PaddleHub.
### Data Preparation
The data directory is laid out as follows.
```
├─data: data directory
......@@ -30,49 +30,13 @@ text_a label
1.接电源没有几分钟,电源适配器热的不行. 2.摄像头用不起来. 3.机盖的钢琴漆,手不能摸,一摸一个印. 4.硬盘分区不好办. 0
```
### Loading Custom Data
To load a custom text dataset, simply subclass BaseNLPDataset and set the dataset location and label classes, as follows:
**NOTE:**
* UTF-8 encoding is recommended for dataset files.
* If a dataset file has no header line, e.g. train.tsv without the leading `text_a label` row, set train_file_with_header=False.
* If you also have prediction data (without text labels), put it in a predict.tsv file with the same format as train.tsv, minus the label column.
* For classification tasks, dataset labels must be numbered starting from 0.
```python
from paddlehub.dataset.base_nlp_dataset import BaseNLPDataset

class DemoDataset(BaseNLPDataset):
    """DemoDataset"""
    def __init__(self):
        # dataset location
        self.dataset_dir = "path/to/dataset"
        super(DemoDataset, self).__init__(
            base_path=self.dataset_dir,
            train_file="train.tsv",
            dev_file="dev.tsv",
            test_file="test.tsv",
            # prediction data (with no text labels) can go in predict.tsv
            predict_file="predict.tsv",
            train_file_with_header=True,
            dev_file_with_header=True,
            test_file_with_header=True,
            predict_file_with_header=True,
            # the set of dataset label classes
            label_list=["0", "1"])

dataset = DemoDataset()
```
After this, DemoDataset() gives you the custom dataset, which you can pair with ClassifyReader and a pretrained model such as ERNIE to complete a text classification task, as in the sketch below.
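A minimal sketch of that pairing, following the PaddleHub 1.x fine-tune API (the `max_seq_len` value here is an arbitrary choice for illustration):
```python
import paddlehub as hub

dataset = DemoDataset()
module = hub.Module(name="ernie")

# The reader turns the dataset's raw text into model inputs,
# tokenizing with ERNIE's own vocabulary file.
reader = hub.reader.ClassifyReader(
    dataset=dataset,
    vocab_path=module.get_vocab_path(),
    max_seq_len=128)  # illustrative sequence length
```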
## 2. Custom Data for CV Tasks
When using custom data for CV transfer tasks with PaddleHub, you need to split the dataset yourself into training, validation, and test sets.
### Data Preparation
The data directory is laid out as follows. Three text files record the image paths and labels, and one additional label file records the label names.
```
├─data: data directory
  ├─train_list.txt: training set list
......@@ -108,33 +72,3 @@ cat
dog
```
### Loading Custom Data
To load a custom image dataset, simply subclass BaseCVDataset and set the dataset location, as follows:
**NOTE:**
* UTF-8 encoding is recommended for dataset files.
* dataset_dir is the actual dataset path and must be a full path; the example below uses `/test/data`.
* Image paths in the training/validation/test list files must be relative to dataset_dir. For example, if an image actually lives at `/test/data/dog/dog1.jpg` and base_path is `/test/data`, the path written in the list file should be `dog/dog1.jpg`.
* If you also have prediction data (without labels), put it in predict_list.txt with the same format as train_list.txt, minus the label column.
* If your dataset has only a few classes, you can skip label_list.txt and instead pass label_list=["the dataset's classes"].
* For classification tasks, dataset labels must be numbered starting from 0.
```python
from paddlehub.dataset.base_cv_dataset import BaseCVDataset

class DemoDataset(BaseCVDataset):
    def __init__(self):
        # dataset location
        self.dataset_dir = "/test/data"
        super(DemoDataset, self).__init__(
            base_path=self.dataset_dir,
            train_list_file="train_list.txt",
            validate_list_file="validate_list.txt",
            test_list_file="test_list.txt",
            predict_file="predict_list.txt",
            label_list_file="label_list.txt",
            # label_list=["the dataset's classes"]
        )

dataset = DemoDataset()
```
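As in the NLP case, the dataset is then paired with a reader and a pretrained model. A minimal sketch following the PaddleHub 1.x API (resnet_v2_50_imagenet is just one example of an image classification module):
```python
import paddlehub as hub

dataset = DemoDataset()
module = hub.Module(name="resnet_v2_50_imagenet")

# The reader resizes and normalizes images to match what the
# pretrained backbone expects.
reader = hub.reader.ImageClassificationReader(
    image_width=module.get_expected_image_width(),
    image_height=module.get_expected_image_height(),
    images_mean=module.get_pretrained_images_mean(),
    images_std=module.get_pretrained_images_std(),
    dataset=dataset)
```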
Tutorials
Transfer Learning
==================
The following tutorials cover PaddleHub usage: the command line, fine-tuning with custom data, defining custom transfer tasks, deploying pretrained models as services, obtaining ERNIE/BERT Embeddings, semantic similarity with word2vec, the ULMFiT optimization strategies, hyperparameter optimization with AutoDL Finetuner, and overriding built-in Task methods with the Hook mechanism.
This chapter covers how to prepare custom data, how to fine-tune, and how to use the AutoDL Finetuner for hyperparameter optimization.
For details, see the following tutorials:
......@@ -9,13 +9,8 @@
.. toctree::
   :maxdepth: 1

   Command Line Tool<cmdintro>
   Custom Data<how_to_load_data>
   Converting a Fine-tuned Model into a PaddleHub Module<finetuned_model_to_module.md>
   Custom Task<how_to_define_task>
   Serving Deployment<serving>
   Text Embedding Service<bert_service>
   Semantic Similarity<sentence_sim>
   ULMFiT Optimization Strategies<strategy_exp>
   How to Do Transfer Learning<how_to_finetune>
   Hyperparameter Optimization<autofinetune>
   Hook Mechanism<hook>
## Model Overview
HumanSeg_lite is a portrait segmentation model optimized on top of the ShuffleNetV2 architecture, with the network size reduced further to just 541K (187K after quantization). It is suitable for real-time segmentation scenarios such as selfie portrait segmentation on mobile phones.
## Command Line Prediction
```
hub run humanseg_lite --input_path "/PATH/TO/IMAGE"
```
## API
```python
def segment(images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_lite_output')
```
Prediction API for portrait segmentation.
**Parameters**
* images (list\[numpy.ndarray\]): image data with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths of the images;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise leave it unset;
* visualization (bool): whether to save the results as image files;
* output\_dir (str): directory where the images are saved.
**Returns**
* res (list\[dict\]): list of results; each element is a dict with keys 'save\_path' and 'data':
    * save\_path (str, optional): path of the visualized image (present only when visualization=True);
    * data (numpy.ndarray): the segmentation result, alpha channel only, with values 0-255 (0 fully transparent, 255 opaque); the higher a pixel's value, the more likely it belongs to a person, and the lower, the more likely it is background.
```python
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
```
Prediction API for frame-by-frame video portrait segmentation.
**Parameters**
* frame_org (numpy.ndarray): single-frame image data with ndarray.shape \[H, W, C\], in BGR format;
* frame_id (int): index of the current frame;
* prev_gray (numpy.ndarray): grayscale of the previous frame's network input;
* prev_cfd (numpy.ndarray): fusion of the previous frame's optical-flow tracking map and prediction;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise leave it unset.
**Returns**
* img_matting (numpy.ndarray): the segmentation result, alpha channel only, with values 0-1 (0 fully transparent, 1 opaque);
* cur_gray (numpy.ndarray): grayscale of the current frame's network input;
* optflow_map (numpy.ndarray): fusion of the current frame's optical-flow tracking map and prediction.
```python
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_lite_video_result'):
```
Prediction API for video portrait segmentation.
**Parameters**
* video\_path (str): path of the video to segment. If None, video is captured from the local camera and a window pops up showing the live segmentation result.
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise leave it unset;
* save\_dir (str): directory where the video is saved; used only when video\_path is not None, to store the offline processing result.
```python
def save_inference_model(dirname='humanseg_lite_model',
model_filename=None,
params_filename=None,
combined=True)
```
Saves the model to the given path.
**Parameters**
* dirname: directory in which to store the model;
* model\_filename: model file name, defaults to \_\_model\_\_;
* params\_filename: parameters file name, defaults to \_\_params\_\_ (effective only when `combined` is True);
* combined: whether to save the parameters into a single file.
## Code Example
Example for image and video segmentation:
```python
import cv2
import paddlehub as hub
human_seg = hub.Module('humanseg_lite')
im = cv2.imread('/PATH/TO/IMAGE')
# visualization=True saves the segmentation result as an image for inspection; set it to False for faster runs.
res = human_seg.segment(images=[im],visualization=True)
print(res[0]['data'])
human_seg.video_segment('/PATH/TO/VIDEO')
human_seg.save_inference_model('/PATH/TO/SAVE/MODEL')
```
Example for streaming video prediction:
```python
import cv2
import numpy as np
import paddlehub as hub
human_seg = hub.Module('humanseg_lite')
cap_video = cv2.VideoCapture('/PATH/TO/VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_lite_video.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = human_seg.video_stream_segment(frame_org=frame_org, frame_id=cap_video.get(1), prev_gray=prev_gray, prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
```
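To run these predictions on GPU instead, the API docs above say to set CUDA_VISIBLE_DEVICES before predicting; a minimal sketch (device 0 is an assumption):
```python
import os
# Select the GPU before the module builds its predictor
# (see the use_gpu parameter description above); device 0 is an assumption.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import cv2
import paddlehub as hub

human_seg = hub.Module('humanseg_lite')
im = cv2.imread('/PATH/TO/IMAGE')
res = human_seg.segment(images=[im], use_gpu=True)
```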
## Serving Deployment
PaddleHub Serving can deploy portrait segmentation as an online service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m humanseg_lite
```
This deploys a portrait segmentation service API, listening on port 8866 by default.
**NOTE:** To predict on GPU, set the CUDA\_VISIBLE\_DEVICES environment variable before starting the service; otherwise leave it unset.
## Step 2: Send a Prediction Request
With the server up, the few lines of code below send a prediction request and fetch the result:
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
# send the HTTP request
org_im = cv2.imread('PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_lite"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# save the image
mask =cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_lite.png", rgba)
```
### Source Code
https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/HumanSeg
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
from PIL import Image
__all__ = ['reader', 'preprocess_v']
def preprocess_v(img, w, h):
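    """Resize to (w, h), convert to CHW layout, scale to [0, 1], then normalize with mean/std 0.5 per channel."""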
img = cv2.resize(img, (w, h), cv2.INTER_LINEAR).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
return img
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
#print(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.resize(img, (192, 192)).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
element['image'] = img
yield element
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import os.path as osp
import argparse
import cv2
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from humanseg_lite.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
from humanseg_lite.data_feed import reader, preprocess_v
from humanseg_lite.optimal import postprocess_v, threshold_mask
@moduleinfo(
name="humanseg_lite",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="humanseg_lite is a semantic segmentation model.",
version="1.1.0")
class ShufflenetHumanSeg(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "humanseg_lite_inference")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = os.path.join(self.default_pretrained_model_path,
'__model__')
self.params_file_path = os.path.join(self.default_pretrained_model_path,
'__params__')
cpu_config = AnalysisConfig(self.model_file_path, self.params_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path,
self.params_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def segment(self,
images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_lite_output'):
"""
API for human segmentation.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
batch_size (int): batch size.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
loop_num = int(np.ceil(total_num / batch_size))
res = list()
for iter_id in range(loop_num):
batch_data = list()
handle_id = iter_id * batch_size
for image_id in range(batch_size):
try:
batch_data.append(all_data[handle_id + image_id])
except:
pass
# feed batch image
batch_image = np.array([data['image'] for data in batch_data])
batch_image = PaddleTensor(batch_image.copy())
output = self.gpu_predictor.run([
batch_image
]) if use_gpu else self.cpu_predictor.run([batch_image])
output = output[1].as_ndarray()
output = np.expand_dims(output[:, 1, :, :], axis=1)
# postprocess one by one
for i in range(len(batch_data)):
out = postprocess(
data_out=output[i],
org_im=batch_data[i]['org_im'],
org_im_shape=batch_data[i]['org_im_shape'],
org_im_path=batch_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
"""
API for human video segmentation.
Args:
frame_org (numpy.ndarray): frame data, shape of each is [H, W, C], the color space is BGR.
frame_id (int): index of the frame to be decoded.
prev_gray (numpy.ndarray): gray scale image of last frame, shape of each is [H, W]
prev_cfd (numpy.ndarray): fusion image from optical flow image and segment result, shape of each is [H, W]
use_gpu (bool): Whether to use gpu.
Returns:
img_matting (numpy.ndarray): data of segmentation mask.
cur_gray (numpy.ndarray): gray scale image of current frame, shape of each is [H, W]
optflow_map (numpy.ndarray): optical flow image of current frame, shape of each is [H, W]
"""
resize_h = 192
resize_w = 192
is_init = True
width = int(frame_org.shape[0])
height = int(frame_org.shape[1])
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
if frame_id == 1:
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
else:
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(optflow_map, thresh_bg=0.2, thresh_fg=0.8)
img_matting = cv2.resize(optflow_map, (height, width), cv2.INTER_LINEAR)
return [img_matting, cur_gray, optflow_map]
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_lite_video_result'):
"""
API for human video segmentation.
Args:
video_path (str): The path to take the video under preprocess. If video_path is None, it will capture
the vedio from your camera.
use_gpu (bool): Whether to use gpu.
save_dir (str): The path to store output video.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. "
"If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
resize_h = 192
resize_w = 192
if not video_path:
cap_video = cv2.VideoCapture(0)
else:
cap_video = cv2.VideoCapture(video_path)
if not cap_video.isOpened():
raise IOError("Error opening video stream or file, "
"--video_path whether existing: {}"
" or camera whether working".format(video_path))
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
is_init = True
fps = cap_video.get(cv2.CAP_PROP_FPS)
if video_path is not None:
print('Please wait. It is computing......')
if not osp.exists(save_dir):
os.makedirs(save_dir)
save_path = osp.join(save_dir, 'result' + '.avi')
cap_out = cv2.VideoWriter(
save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
img_matting = cv2.resize(optflow_map, (width, height),
cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
else:
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
prev_cfd, disflow, is_init)
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
img_matting = cv2.resize(optflow_map, (width, height),
cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cv2.imshow('HumanSegmentation', comb)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
else:
break
cap_video.release()
def save_inference_model(self,
dirname='humanseg_lite_model',
model_filename=None,
params_filename=None,
combined=True):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path,
model_filename=model_filename,
params_filename=params_filename,
executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.segment(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.segment(
paths=[args.input_path],
batch_size=args.batch_size,
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='humanseg_lite_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='humanseg_lite_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=False,
help="whether to save output as images.")
self.arg_config_group.add_argument(
'--batch_size',
type=ast.literal_eval,
default=1,
help="batch size.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
m = ShufflenetHumanSeg()
#shuffle.video_segment()
img = cv2.imread('photo.jpg')
# res = m.segment(images=[img], visualization=True)
# print(res[0]['data'])
# m.video_segment('')
cap_video = cv2.VideoCapture('video_test.mp4')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'result_frame.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path,
cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = m.video_stream_segment(
frame_org=frame_org,
frame_id=cap_video.get(1),
prev_gray=prev_gray,
prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(
np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
# -*- coding:utf-8 -*-
import numpy as np
def human_seg_tracking(pre_gray, cur_gray, prev_cfd, dl_weights, disflow):
"""计算光流跟踪匹配点和光流图
输入参数:
pre_gray: 上一帧灰度图
cur_gray: 当前帧灰度图
prev_cfd: 上一帧光流图
dl_weights: 融合权重图
disflow: 光流数据结构
返回值:
is_track: 光流点跟踪二值图,即是否具有光流点匹配
track_cfd: 光流跟踪图
"""
check_thres = 8
h, w = pre_gray.shape[:2]
track_cfd = np.zeros_like(prev_cfd)
is_track = np.zeros_like(pre_gray)
flow_fw = disflow.calc(pre_gray, cur_gray, None)
flow_bw = disflow.calc(cur_gray, pre_gray, None)
flow_fw = np.round(flow_fw).astype(np.int)
flow_bw = np.round(flow_bw).astype(np.int)
y_list = np.array(range(h))
x_list = np.array(range(w))
yv, xv = np.meshgrid(y_list, x_list)
yv, xv = yv.T, xv.T
cur_x = xv + flow_fw[:, :, 0]
cur_y = yv + flow_fw[:, :, 1]
# do not track points that flow outside the image bounds
not_track = (cur_x < 0) + (cur_x >= w) + (cur_y < 0) + (cur_y >= h)
flow_bw[~not_track] = flow_bw[cur_y[~not_track], cur_x[~not_track]]
not_track += (np.square(flow_fw[:, :, 0] + flow_bw[:, :, 0]) +
np.square(flow_fw[:, :, 1] + flow_bw[:, :, 1])) >= check_thres
track_cfd[cur_y[~not_track], cur_x[~not_track]] = prev_cfd[~not_track]
is_track[cur_y[~not_track], cur_x[~not_track]] = 1
not_flow = np.all(
np.abs(flow_fw) == 0, axis=-1) * np.all(
np.abs(flow_bw) == 0, axis=-1)
dl_weights[cur_y[not_flow], cur_x[not_flow]] = 0.05
return track_cfd, is_track, dl_weights
def human_seg_track_fuse(track_cfd, dl_cfd, dl_weights, is_track):
"""光流追踪图和人像分割结构融合
输入参数:
track_cfd: 光流追踪图
dl_cfd: 当前帧分割结果
dl_weights: 融合权重图
is_track: 光流点匹配二值图
返回
cur_cfd: 光流跟踪图和人像分割结果融合图
"""
fusion_cfd = dl_cfd.copy()
is_track = is_track.astype(np.bool)
fusion_cfd[is_track] = dl_weights[is_track] * dl_cfd[is_track] + (
1 - dl_weights[is_track]) * track_cfd[is_track]
# regions where the segmentation is already confident
index_certain = ((dl_cfd > 0.9) + (dl_cfd < 0.1)) * is_track
index_less01 = (dl_weights < 0.1) * index_certain
fusion_cfd[index_less01] = 0.3 * dl_cfd[index_less01] + 0.7 * track_cfd[
index_less01]
index_larger09 = (dl_weights >= 0.1) * index_certain
fusion_cfd[index_larger09] = 0.4 * dl_cfd[index_larger09] + 0.6 * track_cfd[
index_larger09]
return fusion_cfd
def threshold_mask(img, thresh_bg, thresh_fg):
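    """Linearly map img (0-255) into [0, 1] between the background and foreground thresholds, clipping values outside."""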
dst = (img / 255.0 - thresh_bg) / (thresh_fg - thresh_bg)
dst[np.where(dst > 1)] = 1
dst[np.where(dst < 0)] = 0
return dst.astype(np.float32)
def postprocess_v(cur_gray, scoremap, prev_gray, pre_cfd, disflow, is_init):
"""光流优化
Args:
cur_gray : 当前帧灰度图
pre_gray : 前一帧灰度图
pre_cfd :前一帧融合结果
scoremap : 当前帧分割结果
difflow : 光流
is_init : 是否第一帧
Returns:
fusion_cfd : 光流追踪图和预测结果融合图
"""
h, w = scoremap.shape
cur_cfd = scoremap.copy()
if is_init:
if h <= 64 or w <= 64:
disflow.setFinestScale(1)
elif h <= 160 or w <= 160:
disflow.setFinestScale(2)
else:
disflow.setFinestScale(3)
fusion_cfd = cur_cfd
else:
weights = np.ones((h, w), np.float32) * 0.3
track_cfd, is_track, weights = human_seg_tracking(
prev_gray, cur_gray, pre_cfd, weights, disflow)
fusion_cfd = human_seg_track_fuse(track_cfd, cur_cfd, weights, is_track)
return fusion_cfd
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
data = cv2.imencode('.jpg', image)[1]
return base64.b64encode(data.tostring()).decode('utf8')
def base64_to_cv2(b64str):
data = base64.b64decode(b64str.encode('utf8'))
data = np.fromstring(data, np.uint8)
data = cv2.imdecode(data, cv2.IMREAD_COLOR)
return data
def postprocess(data_out, org_im, org_im_shape, org_im_path, output_dir,
visualization):
"""
Postprocess output of network. one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of the original image.
        org_im_path (str): path of the original image.
output_dir (str): output directory to store image.
visualization (bool): whether to save image or not.
Returns:
result (dict): The data of processed image.
"""
result = dict()
for logit in data_out:
logit = (logit * 255).astype(np.uint8)
logit = cv2.resize(logit, (org_im_shape[1], org_im_shape[0]))
rgba = np.concatenate((org_im, np.expand_dims(logit, axis=2)), axis=2)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, rgba)
result['save_path'] = save_im_path
result['data'] = logit
else:
result['data'] = logit
print("result['data'] shape", result['data'].shape)
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of the original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview
HumanSeg-mobile is a portrait segmentation network based on HRNet (Deep High-Resolution Representation Learning for Visual Recognition). HRNet preserves high-resolution information and object detail during feature extraction, and the model size can be adjusted by controlling the channel count of each branch. HumanSeg-mobile uses the HRNet_w18_small_v1 architecture, with a model size of only 5.8M, suitable for front-camera scenarios on mobile devices or server-side CPUs.
## Command Line Prediction
```
hub run humanseg_mobile --input_path "/PATH/TO/IMAGE"
```
## API
```python
def segment(images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_mobile_output')
```
Prediction API for portrait segmentation.
**Parameters**
* images (list\[numpy.ndarray\]): image data, each with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths to the images;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise leave it unset;
* visualization (bool): whether to save the result as an image file;
* output\_dir (str): directory where result images are saved.
**Returns**
* res (list\[dict\]): list of results; each element is a dict with the keys 'save\_path' and 'data':
    * save\_path (str, optional): path of the visualized image (present only when visualization=True);
    * data (numpy.ndarray): the portrait segmentation result, containing only the alpha channel, with values in 0-255 (0 fully transparent, 255 opaque); the larger the value, the more likely the pixel belongs to a person, and the smaller, the more likely it is background.
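For instance, the alpha mask in 'data' can be attached to the original image and written out as a transparent PNG. This is a minimal sketch (the image path and output file name are placeholders):

```python
import cv2
import numpy as np
import paddlehub as hub

human_seg = hub.Module('humanseg_mobile')
im = cv2.imread('/PATH/TO/IMAGE')                 # BGR, uint8
mask = human_seg.segment(images=[im])[0]['data']  # alpha mask, uint8, 0-255
rgba = np.concatenate((im, mask[:, :, np.newaxis]), axis=2)  # BGR + A -> BGRA
cv2.imwrite('person_rgba.png', rgba)
```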
```python
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
```
Prediction API for frame-by-frame portrait segmentation of a video.
**Parameters**
* frame_org (numpy.ndarray): data of a single frame, ndarray.shape \[H, W, C\], in BGR format;
* frame_id (int): index of the current frame;
* prev_gray (numpy.ndarray): grayscale image of the previous frame fed to the network;
* prev_cfd (numpy.ndarray): fusion of the optical-flow tracking map and the prediction for the previous frame;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise leave it unset.
**Returns**
* img_matting (numpy.ndarray): the portrait segmentation result, containing only the alpha channel, with values in 0-1 (0 fully transparent, 1 opaque);
* cur_gray (numpy.ndarray): grayscale image of the current frame fed to the network;
* optflow_map (numpy.ndarray): fusion of the optical-flow tracking map and the prediction for the current frame.
```python
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_mobile_video_result'):
```
Prediction API for portrait segmentation of a whole video.
**Parameters**
* video\_path (str): path of the video to segment. If None, video is captured from the local camera and a window pops up showing the live segmentation result;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise leave it unset;
* save\_dir (str): directory where the processed video is saved; used only when video\_path is not None.
```python
def save_inference_model(dirname='humanseg_mobile_model',
model_filename=None,
params_filename=None,
combined=True)
```
Saves the model to the given path.
**Parameters**
* dirname: directory in which the model is saved
* model\_filename: model file name, \_\_model\_\_ by default
* params\_filename: parameter file name, \_\_params\_\_ by default (effective only when `combined` is True)
* combined: whether to save all parameters into a single file
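As a minimal sketch, the exported model can be reloaded with the Paddle 1.x fluid API (assuming the default directory and file names above):

```python
import paddle.fluid as fluid

exe = fluid.Executor(fluid.CPUPlace())
# returns the inference program plus the names of its feed variables
# and the fetch targets, ready for exe.run(...)
program, feed_names, fetch_targets = fluid.io.load_inference_model(
    dirname='humanseg_mobile_model',
    model_filename='__model__',
    params_filename='__params__',
    executor=exe)
```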
## Code Example
Example of image segmentation and video segmentation:
```python
import cv2
import paddlehub as hub
human_seg = hub.Module('humanseg_mobile')
im = cv2.imread('/PATH/TO/IMAGE')
# Set visualization=True to inspect the segmentation result; set it to False for faster runs.
res = human_seg.segment(images=[im], visualization=True)
print(res[0]['data'])
human_seg.video_segment('/PATH/TO/VIDEO')
human_seg.save_inference_model('/PATH/TO/SAVE/MODEL')
```
Example of streaming video prediction:
```python
import cv2
import numpy as np
import paddlehub as hub
human_seg = hub.Module('humanseg_mobile')
cap_video = cv2.VideoCapture('/PATH/TO/VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_mobile_video.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
        [img_matting, prev_gray, prev_cfd] = human_seg.video_stream_segment(frame_org=frame_org, frame_id=int(cap_video.get(cv2.CAP_PROP_POS_FRAMES)), prev_gray=prev_gray, prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
```
## Serving Deployment
PaddleHub Serving can deploy an online portrait segmentation service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m humanseg_mobile
```
This completes the deployment of a portrait segmentation service API; the default port is 8866.
**NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise leave it unset.
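If the default port is occupied, the service can be started on a different port with the `-p` option (8867 below is only an example):

```shell
$ hub serving start -m humanseg_mobile -p 8867
```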
## Step 2: Send a Prediction Request
With the server configured, the few lines of code below send a prediction request and fetch the result:
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# save the image
mask = cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_mobile.png", rgba)
```
### Source Code
<https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/HumanSeg>
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
__all__ = ['reader', 'preprocess_v']
def preprocess_v(img, w, h):
    img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
return img
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.resize(img, (192, 192)).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
element['image'] = img
yield element
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import os.path as osp
import argparse
import cv2
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from humanseg_mobile.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
from humanseg_mobile.data_feed import reader, preprocess_v
from humanseg_mobile.optimal import postprocess_v, threshold_mask
@moduleinfo(
name="humanseg_mobile",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="HRNet_w18_samll_v1 is a semantic segmentation model.",
version="1.1.0")
class HRNetw18samllv1humanseg(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "humanseg_mobile_inference")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = os.path.join(self.default_pretrained_model_path,
'__model__')
self.params_file_path = os.path.join(self.default_pretrained_model_path,
'__params__')
cpu_config = AnalysisConfig(self.model_file_path, self.params_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path,
self.params_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def segment(self,
images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_mobile_output'):
"""
API for human segmentation.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
batch_size (int): batch size.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
                raise RuntimeError(
                    "Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. "
                    "If you want to use GPU, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
                )
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
loop_num = int(np.ceil(total_num / batch_size))
res = list()
for iter_id in range(loop_num):
batch_data = list()
handle_id = iter_id * batch_size
for image_id in range(batch_size):
try:
batch_data.append(all_data[handle_id + image_id])
except:
pass
# feed batch image
batch_image = np.array([data['image'] for data in batch_data])
batch_image = PaddleTensor(batch_image.copy())
output = self.gpu_predictor.run([
batch_image
]) if use_gpu else self.cpu_predictor.run([batch_image])
output = output[1].as_ndarray()
output = np.expand_dims(output[:, 1, :, :], axis=1)
# postprocess one by one
for i in range(len(batch_data)):
out = postprocess(
data_out=output[i],
org_im=batch_data[i]['org_im'],
org_im_shape=batch_data[i]['org_im_shape'],
org_im_path=batch_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
"""
API for human video segmentation.
Args:
frame_org (numpy.ndarray): frame data, shape of each is [H, W, C], the color space is BGR.
frame_id (int): index of the frame to be decoded.
prev_gray (numpy.ndarray): gray scale image of last frame, shape of each is [H, W]
prev_cfd (numpy.ndarray): fusion image from optical flow image and segment result, shape of each is [H, W]
use_gpu (bool): Whether to use gpu.
Returns:
img_matting (numpy.ndarray): data of segmentation mask.
cur_gray (numpy.ndarray): gray scale image of current frame, shape of each is [H, W]
optflow_map (numpy.ndarray): optical flow image of current frame, shape of each is [H, W]
"""
resize_h = 192
resize_w = 192
is_init = True
        height = int(frame_org.shape[0])
        width = int(frame_org.shape[1])
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
        if frame_id == 1:
            prev_gray = np.zeros((resize_h, resize_w), np.uint8)
            prev_cfd = np.zeros((resize_h, resize_w), np.float32)
        else:
            # only the very first frame initializes the optical-flow state
            is_init = False
        optflow_map = postprocess_v(cur_gray, score_map, prev_gray, prev_cfd,
                                    disflow, is_init)
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(optflow_map, thresh_bg=0.2, thresh_fg=0.8)
        img_matting = cv2.resize(
            optflow_map, (width, height), interpolation=cv2.INTER_LINEAR)
return [img_matting, cur_gray, optflow_map]
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_mobile_video_result'):
"""
API for human video segmentation.
Args:
            video_path (str): The path of the video to be processed. If video_path is None, it will capture
                the video from your camera.
use_gpu (bool): Whether to use gpu.
save_dir (str): The path to store output video.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. "
"If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
resize_h = 192
resize_w = 192
if not video_path:
cap_video = cv2.VideoCapture(0)
else:
cap_video = cv2.VideoCapture(video_path)
if not cap_video.isOpened():
raise IOError("Error opening video stream or file, "
"--video_path whether existing: {}"
" or camera whether working".format(video_path))
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
is_init = True
fps = cap_video.get(cv2.CAP_PROP_FPS)
if video_path is not None:
print('Please wait. It is computing......')
if not osp.exists(save_dir):
os.makedirs(save_dir)
save_path = osp.join(save_dir, 'result' + '.avi')
cap_out = cv2.VideoWriter(
save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
                    optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
                                                prev_cfd, disflow, is_init)
                    # optical-flow fusion starts from the second frame
                    is_init = False
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(
                        optflow_map, (width, height),
                        interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
else:
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
                    optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
                                                prev_cfd, disflow, is_init)
                    # optical-flow fusion starts from the second frame
                    is_init = False
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(
                        optflow_map, (width, height),
                        interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cv2.imshow('HumanSegmentation', comb)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
else:
break
cap_video.release()
def save_inference_model(self,
dirname='humanseg_mobile_model',
model_filename=None,
params_filename=None,
combined=True):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path,
model_filename=model_filename,
params_filename=params_filename,
executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.segment(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.segment(
paths=[args.input_path],
batch_size=args.batch_size,
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='humanseg_mobile_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='humanseg_mobile_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=False,
help="whether to save output as images.")
self.arg_config_group.add_argument(
'--batch_size',
type=ast.literal_eval,
default=1,
help="batch size.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
m = HRNetw18samllv1humanseg()
    # image segmentation example:
    # img = cv2.imread('photo.jpg')
    # res = m.segment(images=[img], visualization=True)
    # print(res[0]['data'])
    # offline video example:
    # m.video_segment('video_test.mp4')
cap_video = cv2.VideoCapture('video_test.mp4')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'result_frame.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path,
cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = m.video_stream_segment(
frame_org=frame_org,
            frame_id=int(cap_video.get(cv2.CAP_PROP_POS_FRAMES)),
prev_gray=prev_gray,
prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(
np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
# -*- coding:utf-8 -*-
import numpy as np
def human_seg_tracking(pre_gray, cur_gray, prev_cfd, dl_weights, disflow):
    """Compute optical-flow point matches and the tracking map.

    Args:
        pre_gray: grayscale image of the previous frame
        cur_gray: grayscale image of the current frame
        prev_cfd: fusion map of the previous frame
        dl_weights: fusion weight map
        disflow: DIS optical-flow handle
    Returns:
        track_cfd: optical-flow tracking map
        is_track: binary map of tracked points (whether a flow match exists)
        dl_weights: updated fusion weight map
    """
check_thres = 8
h, w = pre_gray.shape[:2]
track_cfd = np.zeros_like(prev_cfd)
is_track = np.zeros_like(pre_gray)
flow_fw = disflow.calc(pre_gray, cur_gray, None)
flow_bw = disflow.calc(cur_gray, pre_gray, None)
    flow_fw = np.round(flow_fw).astype(int)
    flow_bw = np.round(flow_bw).astype(int)
y_list = np.array(range(h))
x_list = np.array(range(w))
yv, xv = np.meshgrid(y_list, x_list)
yv, xv = yv.T, xv.T
cur_x = xv + flow_fw[:, :, 0]
cur_y = yv + flow_fw[:, :, 1]
    # do not track points that flow outside the image bounds
not_track = (cur_x < 0) + (cur_x >= w) + (cur_y < 0) + (cur_y >= h)
flow_bw[~not_track] = flow_bw[cur_y[~not_track], cur_x[~not_track]]
not_track += (np.square(flow_fw[:, :, 0] + flow_bw[:, :, 0]) +
np.square(flow_fw[:, :, 1] + flow_bw[:, :, 1])) >= check_thres
track_cfd[cur_y[~not_track], cur_x[~not_track]] = prev_cfd[~not_track]
is_track[cur_y[~not_track], cur_x[~not_track]] = 1
not_flow = np.all(
np.abs(flow_fw) == 0, axis=-1) * np.all(
np.abs(flow_bw) == 0, axis=-1)
dl_weights[cur_y[not_flow], cur_x[not_flow]] = 0.05
return track_cfd, is_track, dl_weights
def human_seg_track_fuse(track_cfd, dl_cfd, dl_weights, is_track):
    """Fuse the optical-flow tracking map with the segmentation result.

    Args:
        track_cfd: optical-flow tracking map
        dl_cfd: segmentation result of the current frame
        dl_weights: fusion weight map
        is_track: binary map of tracked points
    Returns:
        fusion_cfd: fusion of the tracking map and the segmentation result
    """
fusion_cfd = dl_cfd.copy()
    is_track = is_track.astype(bool)
fusion_cfd[is_track] = dl_weights[is_track] * dl_cfd[is_track] + (
1 - dl_weights[is_track]) * track_cfd[is_track]
    # regions where the segmentation is confident
index_certain = ((dl_cfd > 0.9) + (dl_cfd < 0.1)) * is_track
index_less01 = (dl_weights < 0.1) * index_certain
fusion_cfd[index_less01] = 0.3 * dl_cfd[index_less01] + 0.7 * track_cfd[
index_less01]
index_larger09 = (dl_weights >= 0.1) * index_certain
fusion_cfd[index_larger09] = 0.4 * dl_cfd[index_larger09] + 0.6 * track_cfd[
index_larger09]
return fusion_cfd
def threshold_mask(img, thresh_bg, thresh_fg):
dst = (img / 255.0 - thresh_bg) / (thresh_fg - thresh_bg)
dst[np.where(dst > 1)] = 1
dst[np.where(dst < 0)] = 0
return dst.astype(np.float32)
def postprocess_v(cur_gray, scoremap, prev_gray, pre_cfd, disflow, is_init):
    """Optical-flow refinement.

    Args:
        cur_gray : grayscale image of the current frame
        prev_gray : grayscale image of the previous frame
        pre_cfd : fusion result of the previous frame
        scoremap : segmentation result of the current frame
        disflow : DIS optical-flow handle
        is_init : whether this is the first frame
    Returns:
        fusion_cfd : fusion of the optical-flow tracking map and the prediction
    """
h, w = scoremap.shape
cur_cfd = scoremap.copy()
if is_init:
if h <= 64 or w <= 64:
disflow.setFinestScale(1)
elif h <= 160 or w <= 160:
disflow.setFinestScale(2)
else:
disflow.setFinestScale(3)
fusion_cfd = cur_cfd
else:
weights = np.ones((h, w), np.float32) * 0.3
track_cfd, is_track, weights = human_seg_tracking(
prev_gray, cur_gray, pre_cfd, weights, disflow)
fusion_cfd = human_seg_track_fuse(track_cfd, cur_cfd, weights, is_track)
return fusion_cfd
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
def postprocess(data_out,
org_im,
org_im_shape,
org_im_path,
output_dir,
visualization,
thresh=120):
"""
    Postprocess the network output, one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of original image.
        org_im_path (list): path of original image.
        output_dir (str): output directory to store image.
        visualization (bool): whether to save image or not.
        thresh (float): threshold (currently unused).
Returns:
result (dict): The data of processed image.
"""
result = dict()
for logit in data_out:
logit = (logit * 255).astype(np.uint8)
logit = cv2.resize(logit, (org_im_shape[1], org_im_shape[0]))
rgba = np.concatenate((org_im, np.expand_dims(logit, axis=2)), axis=2)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, rgba)
result['save_path'] = save_im_path
result['data'] = logit
else:
result['data'] = logit
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of the original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview
A high-accuracy model suited to server-side GPUs and portrait scenes with complex backgrounds. The network is DeepLabv3+ with an Xception65 backbone, and the model size is 158 MB. The architecture is shown below:
<p align="center">
<img src="https://paddlehub.bj.bcebos.com/paddlehub-img/deeplabv3plus.png" hspace='10'/> <br />
</p>
## Command-Line Prediction
```
hub run humanseg_server --input_path "/PATH/TO/IMAGE"
```
## API
```python
def segment(self,
images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_server_output'):
```
Prediction API for portrait segmentation.
**Parameters**
* images (list\[numpy.ndarray\]): image data, each with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths to the images;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to use GPU;
* visualization (bool): whether to save the result as an image file;
* output\_dir (str): directory where result images are saved.
**Returns**
* res (list\[dict\]): list of results; each element is a dict with the keys 'save\_path' and 'data':
    * save\_path (str, optional): path of the visualized image (present only when visualization=True);
    * data (numpy.ndarray): the portrait segmentation result, containing only the alpha channel, with values in 0-255 (0 fully transparent, 255 opaque); the larger the value, the more likely the pixel belongs to a person, and the smaller, the more likely it is background.
```python
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
```
Prediction API for frame-by-frame portrait segmentation of a video.
**Parameters**
* frame_org (numpy.ndarray): data of a single frame, ndarray.shape \[H, W, C\], in BGR format;
* frame_id (int): index of the current frame;
* prev_gray (numpy.ndarray): grayscale image of the previous frame fed to the network;
* prev_cfd (numpy.ndarray): fusion of the optical-flow tracking map and the prediction for the previous frame;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise leave it unset.
**Returns**
* img_matting (numpy.ndarray): the portrait segmentation result, containing only the alpha channel, with values in 0-1 (0 fully transparent, 1 opaque);
* cur_gray (numpy.ndarray): grayscale image of the current frame fed to the segmentation network;
* optflow_map (numpy.ndarray): fusion of the optical-flow tracking map and the prediction for the current frame.
```python
def video_segment(self,
video_path=None,
use_gpu=False,
save_dir='humanseg_server_video'):
```
Prediction API for portrait segmentation of a whole video.
**Parameters**
* video\_path (str): path of the video to segment. If None, video is captured from the local camera and a window pops up showing the live segmentation result;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise leave it unset;
* save\_dir (str): directory where the processed video is saved; used only when video\_path is not None.
```python
def save_inference_model(dirname,
model_filename=None,
params_filename=None,
combined=True):
```
Saves the model to the given path.
**Parameters**
* dirname: directory in which the model is saved
* model\_filename: model file name, \_\_model\_\_ by default
* params\_filename: parameter file name, \_\_params\_\_ by default (effective only when `combined` is True)
* combined: whether to save all parameters into a single file
## Code Example
Example of image segmentation and video segmentation:
```python
import cv2
import paddlehub as hub
human_seg = hub.Module('humanseg_server')
im = cv2.imread('/PATH/TO/IMAGE')
# Set visualization=True to inspect the segmentation result; set it to False for faster runs.
res = human_seg.segment(images=[im], visualization=True)
print(res[0]['data'])
human_seg.video_segment('/PATH/TO/VIDEO')
human_seg.save_inference_model('/PATH/TO/SAVE/MODEL')
```
Example of streaming video prediction:
```python
import cv2
import numpy as np
import paddlehub as hub
human_seg = hub.Module('humanseg_server')
cap_video = cv2.VideoCapture('/PATH/TO/VIDEO')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'humanseg_server_video.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps, (width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
        [img_matting, prev_gray, prev_cfd] = human_seg.video_stream_segment(frame_org=frame_org, frame_id=int(cap_video.get(cv2.CAP_PROP_POS_FRAMES)), prev_gray=prev_gray, prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
```
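As a small variation on the loop above, the plain white background can be swapped for any image resized to the frame size (the background path is a placeholder):

```python
bg_im = cv2.resize(cv2.imread('/PATH/TO/BACKGROUND'), (width, height)).astype(np.float32)
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(np.uint8)
```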
## Serving Deployment
PaddleHub Serving can deploy an online portrait segmentation service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m humanseg_server
```
This completes the deployment of a portrait segmentation service API; the default port is 8866.
**NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise leave it unset.
## Step 2: Send a Prediction Request
With the server configured, the few lines of code below send a prediction request and fetch the result:
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# send the HTTP request
org_im = cv2.imread('/PATH/TO/IMAGE')
data = {'images':[cv2_to_base64(org_im)]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_server"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# save the image
mask = cv2.cvtColor(base64_to_cv2(r.json()["results"][0]['data']), cv2.COLOR_BGR2GRAY)
rgba = np.concatenate((org_im, np.expand_dims(mask, axis=2)), axis=2)
cv2.imwrite("segment_human_server.png", rgba)
```
### Source Code
https://github.com/PaddlePaddle/PaddleSeg/tree/develop/contrib/HumanSeg
### Dependencies
paddlepaddle >= 1.8.0
paddlehub >= 1.7.1
# coding=utf-8
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
from PIL import Image
__all__ = ['reader', 'preprocess_v']
def preprocess_v(img, w, h):
    img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
return img
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.resize(img, (513, 513)).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
element['image'] = img
yield element
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import os.path as osp
import argparse
import cv2
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from humanseg_server.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
from humanseg_server.data_feed import reader, preprocess_v
from humanseg_server.optimal import postprocess_v, threshold_mask
@moduleinfo(
name="humanseg_server",
type="CV/semantic_segmentation",
author="baidu-vis",
author_email="",
summary="DeepLabv3+ is a semantic segmentation model.",
version="1.1.0")
class DeeplabV3pXception65HumanSeg(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "humanseg_server_inference")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = os.path.join(self.default_pretrained_model_path,
'__model__')
self.params_file_path = os.path.join(self.default_pretrained_model_path,
'__params__')
cpu_config = AnalysisConfig(self.model_file_path, self.params_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path,
self.params_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def segment(self,
images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_server_output'):
"""
API for human segmentation.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
batch_size (int): batch size.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
loop_num = int(np.ceil(total_num / batch_size))
res = list()
for iter_id in range(loop_num):
batch_data = list()
handle_id = iter_id * batch_size
for image_id in range(batch_size):
try:
batch_data.append(all_data[handle_id + image_id])
except:
pass
# feed batch image
batch_image = np.array([data['image'] for data in batch_data])
batch_image = PaddleTensor(batch_image.copy())
output = self.gpu_predictor.run([
batch_image
]) if use_gpu else self.cpu_predictor.run([batch_image])
output = output[1].as_ndarray()
output = np.expand_dims(output[:, 1, :, :], axis=1)
# postprocess one by one
for i in range(len(batch_data)):
out = postprocess(
data_out=output[i],
org_im=batch_data[i]['org_im'],
org_im_shape=batch_data[i]['org_im_shape'],
org_im_path=batch_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def video_stream_segment(self,
frame_org,
frame_id,
prev_gray,
prev_cfd,
use_gpu=False):
"""
API for human video segmentation.
Args:
frame_org (numpy.ndarray): frame data, shape of each is [H, W, C], the color space is BGR.
frame_id (int): index of the frame to be decoded.
prev_gray (numpy.ndarray): gray scale image of last frame, shape of each is [H, W]
prev_cfd (numpy.ndarray): fusion image from optical flow image and segment result, shape of each is [H, W]
use_gpu (bool): Whether to use gpu.
Returns:
img_matting (numpy.ndarray): data of segmentation mask.
cur_gray (numpy.ndarray): gray scale image of current frame, shape of each is [H, W]
optflow_map (numpy.ndarray): optical flow image of current frame, shape of each is [H, W]
"""
resize_h = 512
resize_w = 512
is_init = True
        height = int(frame_org.shape[0])
        width = int(frame_org.shape[1])
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
        if frame_id == 1:
            prev_gray = np.zeros((resize_h, resize_w), np.uint8)
            prev_cfd = np.zeros((resize_h, resize_w), np.float32)
        else:
            # only the very first frame initializes the optical-flow state
            is_init = False
        optflow_map = postprocess_v(cur_gray, score_map, prev_gray, prev_cfd,
                                    disflow, is_init)
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(optflow_map, thresh_bg=0.2, thresh_fg=0.8)
        img_matting = cv2.resize(
            optflow_map, (width, height), interpolation=cv2.INTER_LINEAR)
return [img_matting, cur_gray, optflow_map]
    def video_segment(self,
                      video_path=None,
                      use_gpu=False,
                      save_dir='humanseg_server_video'):
        """
        API for human video segmentation.

        Args:
            video_path (str): The path of the video to be processed. If video_path is None, it will capture
                the video from your camera.
            use_gpu (bool): Whether to use gpu.
            save_dir (str): The path to store the output video.
        """
        resize_h = 512
resize_w = 512
if not video_path:
cap_video = cv2.VideoCapture(0)
else:
cap_video = cv2.VideoCapture(video_path)
if not cap_video.isOpened():
raise IOError("Error opening video stream or file, "
"--video_path whether existing: {}"
" or camera whether working".format(video_path))
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
disflow = cv2.DISOpticalFlow_create(
cv2.DISOPTICAL_FLOW_PRESET_ULTRAFAST)
prev_gray = np.zeros((resize_h, resize_w), np.uint8)
prev_cfd = np.zeros((resize_h, resize_w), np.float32)
is_init = True
fps = cap_video.get(cv2.CAP_PROP_FPS)
if video_path is not None:
print('Please wait. It is computing......')
if not osp.exists(save_dir):
os.makedirs(save_dir)
save_path = osp.join(save_dir, 'result' + '.avi')
cap_out = cv2.VideoWriter(
save_path, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
                    optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
                                                prev_cfd, disflow, is_init)
                    # optical-flow fusion starts from the second frame
                    is_init = False
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(
                        optflow_map, (width, height),
                        interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
else:
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
frame = preprocess_v(frame_org, resize_w, resize_h)
image = PaddleTensor(np.array([frame.copy()]))
output = self.gpu_predictor.run(
[image]) if use_gpu else self.cpu_predictor.run([image])
score_map = output[1].as_ndarray()
frame = np.transpose(frame, axes=[1, 2, 0])
score_map = np.transpose(
np.squeeze(score_map, 0), axes=[1, 2, 0])
cur_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cur_gray = cv2.resize(cur_gray, (resize_w, resize_h))
score_map = 255 * score_map[:, :, 1]
                    optflow_map = postprocess_v(cur_gray, score_map, prev_gray,
                                                prev_cfd, disflow, is_init)
                    # optical-flow fusion starts from the second frame
                    is_init = False
prev_gray = cur_gray.copy()
prev_cfd = optflow_map.copy()
optflow_map = cv2.GaussianBlur(optflow_map, (3, 3), 0)
optflow_map = threshold_mask(
optflow_map, thresh_bg=0.2, thresh_fg=0.8)
                    img_matting = cv2.resize(
                        optflow_map, (width, height),
                        interpolation=cv2.INTER_LINEAR)
img_matting = np.repeat(
img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org +
(1 - img_matting) * bg_im).astype(np.uint8)
cv2.imshow('HumanSegmentation', comb)
if cv2.waitKey(1) & 0xFF == ord('q'):
break
else:
break
cap_video.release()
def save_inference_model(self,
dirname='humanseg_server_model',
model_filename=None,
params_filename=None,
combined=True):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path,
model_filename=model_filename,
params_filename=params_filename,
executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.segment(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.segment(
paths=[args.input_path],
batch_size=args.batch_size,
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='humanseg_server_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='humanseg_server_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=False,
help="whether to save output as images.")
self.arg_config_group.add_argument(
'--batch_size',
type=ast.literal_eval,
default=1,
help="batch size.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
m = DeeplabV3pXception65HumanSeg()
# img = cv2.imread('photo.jpg')
# res = m.segment(images=[img])
# print(res[0]['data'])
# m.save_inference_model()
#m.video_segment(video_path='video_test.mp4')
img = cv2.imread('photo.jpg')
# res = m.segment(images=[img], visualization=True)
# print(res[0]['data'])
# m.video_segment('')
cap_video = cv2.VideoCapture('video_test.mp4')
fps = cap_video.get(cv2.CAP_PROP_FPS)
save_path = 'result_frame.avi'
width = int(cap_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
cap_out = cv2.VideoWriter(save_path,
cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), fps,
(width, height))
prev_gray = None
prev_cfd = None
while cap_video.isOpened():
ret, frame_org = cap_video.read()
if ret:
[img_matting, prev_gray, prev_cfd] = m.video_stream_segment(
frame_org=frame_org,
            frame_id=int(cap_video.get(cv2.CAP_PROP_POS_FRAMES)),
prev_gray=prev_gray,
prev_cfd=prev_cfd)
img_matting = np.repeat(img_matting[:, :, np.newaxis], 3, axis=2)
bg_im = np.ones_like(img_matting) * 255
comb = (img_matting * frame_org + (1 - img_matting) * bg_im).astype(
np.uint8)
cap_out.write(comb)
else:
break
cap_video.release()
cap_out.release()
# -*- coding:utf-8 -*-
import numpy as np
def human_seg_tracking(pre_gray, cur_gray, prev_cfd, dl_weights, disflow):
    """Compute optical-flow point matches and the tracking map.

    Args:
        pre_gray: grayscale image of the previous frame
        cur_gray: grayscale image of the current frame
        prev_cfd: fusion map of the previous frame
        dl_weights: fusion weight map
        disflow: DIS optical-flow handle
    Returns:
        track_cfd: optical-flow tracking map
        is_track: binary map of tracked points (whether a flow match exists)
        dl_weights: updated fusion weight map
    """
check_thres = 8
h, w = pre_gray.shape[:2]
track_cfd = np.zeros_like(prev_cfd)
is_track = np.zeros_like(pre_gray)
flow_fw = disflow.calc(pre_gray, cur_gray, None)
flow_bw = disflow.calc(cur_gray, pre_gray, None)
    flow_fw = np.round(flow_fw).astype(int)
    flow_bw = np.round(flow_bw).astype(int)
y_list = np.array(range(h))
x_list = np.array(range(w))
yv, xv = np.meshgrid(y_list, x_list)
yv, xv = yv.T, xv.T
cur_x = xv + flow_fw[:, :, 0]
cur_y = yv + flow_fw[:, :, 1]
    # do not track points that flow outside the image bounds
not_track = (cur_x < 0) + (cur_x >= w) + (cur_y < 0) + (cur_y >= h)
flow_bw[~not_track] = flow_bw[cur_y[~not_track], cur_x[~not_track]]
not_track += (np.square(flow_fw[:, :, 0] + flow_bw[:, :, 0]) +
np.square(flow_fw[:, :, 1] + flow_bw[:, :, 1])) >= check_thres
track_cfd[cur_y[~not_track], cur_x[~not_track]] = prev_cfd[~not_track]
is_track[cur_y[~not_track], cur_x[~not_track]] = 1
not_flow = np.all(
np.abs(flow_fw) == 0, axis=-1) * np.all(
np.abs(flow_bw) == 0, axis=-1)
dl_weights[cur_y[not_flow], cur_x[not_flow]] = 0.05
return track_cfd, is_track, dl_weights
def human_seg_track_fuse(track_cfd, dl_cfd, dl_weights, is_track):
    """Fuse the optical-flow tracking map with the segmentation result.

    Args:
        track_cfd: optical-flow tracking map
        dl_cfd: segmentation result of the current frame
        dl_weights: fusion weight map
        is_track: binary map of tracked points
    Returns:
        fusion_cfd: fusion of the tracking map and the segmentation result
    """
fusion_cfd = dl_cfd.copy()
    is_track = is_track.astype(bool)
fusion_cfd[is_track] = dl_weights[is_track] * dl_cfd[is_track] + (
1 - dl_weights[is_track]) * track_cfd[is_track]
    # regions where the segmentation is confident
index_certain = ((dl_cfd > 0.9) + (dl_cfd < 0.1)) * is_track
index_less01 = (dl_weights < 0.1) * index_certain
fusion_cfd[index_less01] = 0.3 * dl_cfd[index_less01] + 0.7 * track_cfd[
index_less01]
index_larger09 = (dl_weights >= 0.1) * index_certain
fusion_cfd[index_larger09] = 0.4 * dl_cfd[index_larger09] + 0.6 * track_cfd[
index_larger09]
return fusion_cfd
def threshold_mask(img, thresh_bg, thresh_fg):
dst = (img / 255.0 - thresh_bg) / (thresh_fg - thresh_bg)
dst[np.where(dst > 1)] = 1
dst[np.where(dst < 0)] = 0
return dst.astype(np.float32)
def postprocess_v(cur_gray, scoremap, prev_gray, pre_cfd, disflow, is_init):
    """Optical-flow refinement.

    Args:
        cur_gray : grayscale image of the current frame
        prev_gray : grayscale image of the previous frame
        pre_cfd : fusion result of the previous frame
        scoremap : segmentation result of the current frame
        disflow : DIS optical-flow handle
        is_init : whether this is the first frame
    Returns:
        fusion_cfd : fusion of the optical-flow tracking map and the prediction
    """
h, w = scoremap.shape
cur_cfd = scoremap.copy()
if is_init:
if h <= 64 or w <= 64:
disflow.setFinestScale(1)
elif h <= 160 or w <= 160:
disflow.setFinestScale(2)
else:
disflow.setFinestScale(3)
fusion_cfd = cur_cfd
else:
weights = np.ones((h, w), np.float32) * 0.3
track_cfd, is_track, weights = human_seg_tracking(
prev_gray, cur_gray, pre_cfd, weights, disflow)
fusion_cfd = human_seg_track_fuse(track_cfd, cur_cfd, weights, is_track)
return fusion_cfd
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
def postprocess(data_out, org_im, org_im_shape, org_im_path, output_dir,
visualization):
"""
    Postprocess the network output, one image at a time.
Args:
data_out (numpy.ndarray): output of network.
org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of original image.
        org_im_path (list): path of original image.
output_dir (str): output directory to store image.
visualization (bool): whether to save image or not.
Returns:
result (dict): The data of processed image.
"""
result = dict()
for logit in data_out:
logit = (logit * 255).astype(np.uint8)
logit = cv2.resize(logit, (org_im_shape[1], org_im_shape[0]))
rgba = np.concatenate((org_im, np.expand_dims(logit, axis=2)), axis=2)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, rgba)
result['save_path'] = save_im_path
result['data'] = rgba[:, :, 3]
else:
result['data'] = rgba[:, :, 3]
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of the original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview
HumanSeg-lite is an optimized, slimmed-down network based on ShuffleNetV2. The model is only 541 KB (187 KB after quantization), making it suitable for selfie portrait segmentation and capable of real-time segmentation on mobile devices.
## Command-Line Prediction
```
hub run humanseg_lite --input_path "/PATH/TO/IMAGE"
```
## API
```python
def segment(images=None,
paths=None,
batch_size=1,
use_gpu=False,
visualization=True,
output_dir='humanseg_output')
```
Prediction API for portrait segmentation.
**Parameters**
* images (list\[numpy.ndarray\]): image data, each with ndarray.shape \[H, W, C\], in BGR format;
* paths (list\[str\]): paths to the images;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to predict on GPU; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise leave it unset;
* visualization (bool): whether to save the result as an image file;
* output\_dir (str): directory where result images are saved.
**Returns**
* res (list\[dict\]): list of results; each element is a dict with the keys 'save\_path' and 'data':
    * save\_path (str, optional): path of the visualized image (present only when visualization=True);
    * data (numpy.ndarray): the portrait segmentation result, containing only the alpha channel, with values in 0-255 (0 fully transparent, 255 opaque); the larger the value, the more likely the pixel belongs to a person, and the smaller, the more likely it is background.
```python
def save_inference_model(dirname,
model_filename=None,
params_filename=None,
combined=True)
```
Saves the model to the given path.
**Parameters**
* dirname: directory in which the model is saved
* model\_filename: model file name, \_\_model\_\_ by default
* params\_filename: parameter file name, \_\_params\_\_ by default (effective only when `combined` is True)
* combined: whether to save all parameters into a single file
## Code Example
```python
import cv2
import paddlehub as hub
human_seg = hub.Module('humanseg_lite')
im = cv2.imread('/PATH/TO/IMAGE')
res = human_seg.segment(images=[im], visualization=True)
```
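A minimal sketch of consuming the returned mask from the example above (the threshold and output file name are placeholders):

```python
import numpy as np

mask = res[0]['data']                        # alpha mask, uint8, 0-255
binary = (mask > 127).astype(np.uint8)       # hard foreground/background split
person_only = im * binary[:, :, np.newaxis]  # zero out background pixels
cv2.imwrite('person_only.jpg', person_only)
```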
## Serving Deployment
PaddleHub Serving can deploy an online portrait segmentation service.
## Step 1: Start PaddleHub Serving
Run the start command:
```shell
$ hub serving start -m humanseg_lite
```
This completes the deployment of a portrait segmentation service API; the default port is 8866.
**NOTE:** To predict on GPU, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise leave it unset.
## Step 2: Send a Prediction Request
With the server configured, the few lines of code below send a prediction request and fetch the result:
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')
def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# send the HTTP request
data = {'images':[cv2_to_base64(cv2.imread('/PATH/TO/IMAGE'))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_lite"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# print the prediction result
print(base64_to_cv2(r.json()["results"][0]['data']))
```
### Dependencies
paddlepaddle >= 1.8.1
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
from PIL import Image
__all__ = ['reader']
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
for element in component:
img = element['org_im'].copy()
img = cv2.resize(img, (192, 192)).astype(np.float32)
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img = img.transpose((2, 0, 1)) / 255
img -= img_mean
img /= img_std
element['image'] = img
yield element
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import argparse
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from humanseg_lite.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
from humanseg_lite.data_feed import reader
@moduleinfo(
name="humanseg_lite",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
summary="humanseg_lite is a semantic segmentation model.",
version="1.0.0")
class ShufflenetHumanSeg(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "humanseg_lite_inference")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = os.path.join(self.default_pretrained_model_path,
'__model__')
self.params_file_path = os.path.join(self.default_pretrained_model_path,
'__params__')
cpu_config = AnalysisConfig(self.model_file_path, self.params_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
use_gpu = True
except:
use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path,
self.params_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def segment(self,
images=None,
paths=None,
data=None,
batch_size=1,
use_gpu=False,
visualization=True,
output_dir='humanseg_output'):
"""
API for human segmentation.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
data (dict): key is 'image', the corresponding value is the path to image.
batch_size (int): batch size.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
if use_gpu:
try:
_places = os.environ["CUDA_VISIBLE_DEVICES"]
int(_places[0])
except:
raise RuntimeError(
"Environment Variable CUDA_VISIBLE_DEVICES is not set correctly. If you wanna use gpu, please set CUDA_VISIBLE_DEVICES as cuda_device_id."
)
if data and 'image' in data:
if paths is None:
paths = list()
paths += data['image']
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
loop_num = int(np.ceil(total_num / batch_size))
res = list()
for iter_id in range(loop_num):
batch_data = list()
handle_id = iter_id * batch_size
for image_id in range(batch_size):
try:
batch_data.append(all_data[handle_id + image_id])
except:
pass
# feed batch image
batch_image = np.array([data['image'] for data in batch_data])
batch_image = PaddleTensor(batch_image.copy())
output = self.gpu_predictor.run([
batch_image
]) if use_gpu else self.cpu_predictor.run([batch_image])
output = np.expand_dims(output[0].as_ndarray(), axis=1)
# postprocess one by one
for i in range(len(batch_data)):
out = postprocess(
data_out=output[i],
org_im=batch_data[i]['org_im'],
org_im_shape=batch_data[i]['org_im_shape'],
org_im_path=batch_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def save_inference_model(self,
dirname,
model_filename=None,
params_filename=None,
combined=True):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path,
model_filename=model_filename,
params_filename=params_filename,
executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.segment(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.segment(
paths=[args.input_path],
batch_size=args.batch_size,
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='humanseg_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='humanseg_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=True,
help="whether to save output as images.")
self.arg_config_group.add_argument(
'--batch_size',
type=ast.literal_eval,
default=1,
help="batch size.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
m = ShufflenetHumanSeg()
import cv2
img = cv2.imread('./meditation.jpg')
res = m.segment(images=[img])
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')


def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
def postprocess(data_out,
org_im,
org_im_shape,
org_im_path,
output_dir,
visualization,
thresh=120):
"""
    Postprocess the network output, one image at a time.

    Args:
        data_out (numpy.ndarray): output of the network.
        org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of the original image.
        org_im_path (str): path of the original image.
        output_dir (str): output directory to store the image.
        visualization (bool): whether to save the image or not.
        thresh (float): threshold (currently unused).
Returns:
result (dict): The data of processed image.
"""
result = dict()
for logit in data_out:
logit = np.squeeze(logit * 255, axis=2).astype(np.uint8)
logit = cv2.resize(logit, (org_im_shape[1], org_im_shape[0]))
rgba = np.concatenate((org_im, np.expand_dims(logit, axis=2)), axis=2)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, rgba)
result['save_path'] = save_im_path
result['data'] = logit
else:
result['data'] = logit
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of the original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path
## Model Overview

HumanSeg-mobile is a human segmentation network based on HRNet (Deep High-Resolution Representation Learning for Visual Recognition). HRNet keeps high-resolution representations throughout feature extraction, preserving fine object detail, and its size can be adjusted by controlling the channel width of each branch. HumanSeg-mobile uses the HRNet_w18_small_v1 network structure; the model is only 5.8 MB, making it suitable for front-camera scenarios on mobile devices or on a server CPU.

## Command-Line Prediction

```shell
hub run humanseg_mobile --input_path "/PATH/TO/IMAGE"
```
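The config options defined by the module (`--use_gpu`, `--batch_size`, `--visualization`, `--output_dir`, `--save_dir`) can be appended as well; for example, to save the visualized result into a specific directory:

```shell
hub run humanseg_mobile --input_path "/PATH/TO/IMAGE" --visualization True --output_dir humanseg_output
```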
## API
```python
def segment(images=None,
paths=None,
batch_size=1,
use_gpu=False,
            visualization=False,
output_dir='humanseg_output')
```
Prediction API for human segmentation.

**Parameters**

* images (list\[numpy.ndarray\]): image data, each with ndarray.shape \[H, W, C\] in BGR color space;
* paths (list\[str\]): paths of the images;
* batch\_size (int): batch size;
* use\_gpu (bool): whether to use GPU for prediction; if so, set the CUDA_VISIBLE_DEVICES environment variable before predicting, otherwise it need not be set;
* visualization (bool): whether to save the results as image files;
* output\_dir (str): directory where output images are saved.

**Returns**

* res (list\[dict\]): list of results, where each element is a dict whose keys are 'save\_path' and 'data':
    * save\_path (str, optional): save path of the visualized image (present only when visualization=True);
    * data (numpy.ndarray): the segmentation result, containing only the alpha channel with values 0-255 (0 fully transparent, 255 opaque); the higher a pixel's value, the more likely it belongs to a person, and the lower, the more likely it is background.
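As a usage sketch (assuming `im` is the original BGR image passed to `segment` and `res` is its return value), the returned alpha mask can be merged with the original image into a transparent-background PNG:

```python
import cv2
import numpy as np

mask = res[0]['data']  # uint8 alpha mask, same height and width as im
# Append the mask as an alpha channel: BGR + A -> BGRA.
rgba = np.concatenate((im, mask[:, :, np.newaxis]), axis=2)
cv2.imwrite('person.png', rgba)  # PNG preserves the alpha channel
```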
```python
def save_inference_model(dirname,
model_filename=None,
params_filename=None,
combined=True)
```
Saves the model to the specified path.

**Parameters**

* dirname: directory in which to save the model
* model\_filename: model file name, defaults to \_\_model\_\_
* params\_filename: parameter file name, defaults to \_\_params\_\_ (takes effect only when `combined` is True)
* combined: whether to save the parameters into one combined file
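A minimal usage sketch (the directory name `./inference_model` is a placeholder):

```python
import paddlehub as hub

human_seg = hub.Module(name='humanseg_mobile')
# With combined=True (the default), this writes __model__ and __params__
# into the target directory.
human_seg.save_inference_model(dirname='./inference_model')
```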
## Code Example
```python
import cv2
import paddlehub as hub
human_seg = hub.Module(name='humanseg_mobile')
im = cv2.imread('/PATH/TO/IMAGE')
res = human_seg.segment(images=[im], visualization=True)
```
## Serving Deployment

PaddleHub Serving can deploy an online human segmentation service.

## Step 1: Start PaddleHub Serving

Run the start command:
```shell
$ hub serving start -m humanseg_mobile
```
This deploys a human segmentation service API, listening on port 8866 by default.

**NOTE:** To use GPU prediction, set the CUDA_VISIBLE_DEVICES environment variable before starting the service; otherwise it need not be set.

## Step 2: Send a prediction request

With the server configured, the following few lines of code send a prediction request and retrieve the result:
```python
import requests
import json
import base64
import cv2
import numpy as np
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')


def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
# Send an HTTP request
data = {'images':[cv2_to_base64(cv2.imread('/PATH/TO/IMAGE'))]}
headers = {"Content-type": "application/json"}
url = "http://127.0.0.1:8866/predict/humanseg_mobile"
r = requests.post(url=url, headers=headers, data=json.dumps(data))
# Print the prediction result
print(base64_to_cv2(r.json()["results"][0]['data']))
```
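To persist the returned mask instead of printing it, a short follow-on sketch (the output file name is arbitrary):

```python
# Decode the base64-encoded mask from the response and write it to disk.
mask = base64_to_cv2(r.json()["results"][0]['data'])
cv2.imwrite('humanseg_mask.png', mask)
```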
### Dependencies
paddlepaddle >= 1.8.1
paddlehub >= 1.7.1
# -*- coding:utf-8 -*-
import os
import time
from collections import OrderedDict
import cv2
import numpy as np
__all__ = ['reader']
def reader(images=None, paths=None):
"""
Preprocess to yield image.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C]
paths (list[str]): paths to images.
Yield:
each (collections.OrderedDict): info of original image, preprocessed image.
"""
component = list()
if paths:
for im_path in paths:
each = OrderedDict()
assert os.path.isfile(
im_path), "The {} isn't a valid file path.".format(im_path)
im = cv2.imread(im_path).astype('float32')
each['org_im'] = im
each['org_im_path'] = im_path
each['org_im_shape'] = im.shape
component.append(each)
if images is not None:
assert type(images) is list, "images should be a list."
for im in images:
each = OrderedDict()
each['org_im'] = im
each['org_im_path'] = 'ndarray_time={}'.format(
round(time.time(), 6) * 1e6)
each['org_im_shape'] = im.shape
component.append(each)
    for element in component:
        img = element['org_im'].copy()
        # resize to the network input size
        img = cv2.resize(img, (192, 192)).astype(np.float32)
        img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
        img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
        # HWC -> CHW, scale to [0, 1], then normalize to [-1, 1]
        img = img.transpose((2, 0, 1)) / 255
        img -= img_mean
        img /= img_std
        element['image'] = img
        yield element
# -*- coding:utf-8 -*-
# Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import ast
import os
import argparse
import numpy as np
import paddle.fluid as fluid
import paddlehub as hub
from paddle.fluid.core import PaddleTensor, AnalysisConfig, create_paddle_predictor
from paddlehub.module.module import moduleinfo, runnable, serving
from humanseg_mobile.processor import postprocess, base64_to_cv2, cv2_to_base64, check_dir
from humanseg_mobile.data_feed import reader
@moduleinfo(
name="humanseg_mobile",
type="CV/semantic_segmentation",
author="paddlepaddle",
author_email="",
    summary="HRNet_w18_small_v1 is a semantic segmentation model.",
version="1.0.0")
class HRNetw18samllv1humanseg(hub.Module):
def _initialize(self):
self.default_pretrained_model_path = os.path.join(
self.directory, "humanseg_mobile_inference")
self._set_config()
def _set_config(self):
"""
predictor config setting
"""
self.model_file_path = os.path.join(self.default_pretrained_model_path,
'__model__')
self.params_file_path = os.path.join(self.default_pretrained_model_path,
'__params__')
cpu_config = AnalysisConfig(self.model_file_path, self.params_file_path)
cpu_config.disable_glog_info()
cpu_config.disable_gpu()
self.cpu_predictor = create_paddle_predictor(cpu_config)
        try:
            _places = os.environ["CUDA_VISIBLE_DEVICES"]
            int(_places[0])
            use_gpu = True
        except (KeyError, ValueError, IndexError):
            use_gpu = False
if use_gpu:
gpu_config = AnalysisConfig(self.model_file_path,
self.params_file_path)
gpu_config.disable_glog_info()
gpu_config.enable_use_gpu(
memory_pool_init_size_mb=1000, device_id=0)
self.gpu_predictor = create_paddle_predictor(gpu_config)
def segment(self,
images=None,
paths=None,
data=None,
batch_size=1,
use_gpu=False,
visualization=False,
output_dir='humanseg_output'):
"""
API for human segmentation.
Args:
images (list(numpy.ndarray)): images data, shape of each is [H, W, C], the color space is BGR.
paths (list[str]): The paths of images.
            data (dict): key is 'image', the corresponding value is a list of image paths (kept for backward compatibility).
batch_size (int): batch size.
use_gpu (bool): Whether to use gpu.
visualization (bool): Whether to save image or not.
output_dir (str): The path to store output images.
Returns:
res (list[dict]): each element in the list is a dict, the keys and values are:
save_path (str, optional): the path to save images. (Exists only if visualization is True)
data (numpy.ndarray): data of post processed image.
"""
        if use_gpu:
            try:
                _places = os.environ["CUDA_VISIBLE_DEVICES"]
                int(_places[0])
            except (KeyError, ValueError, IndexError):
                raise RuntimeError(
                    "Environment variable CUDA_VISIBLE_DEVICES is not set correctly. "
                    "If you want to use GPU, please set CUDA_VISIBLE_DEVICES to the id of the CUDA device."
                )
# compatibility with older versions
if data and 'image' in data:
if paths is None:
paths = list()
paths += data['image']
all_data = list()
for yield_data in reader(images, paths):
all_data.append(yield_data)
total_num = len(all_data)
loop_num = int(np.ceil(total_num / batch_size))
res = list()
for iter_id in range(loop_num):
batch_data = list()
handle_id = iter_id * batch_size
            for image_id in range(batch_size):
                try:
                    batch_data.append(all_data[handle_id + image_id])
                except IndexError:
                    # the last batch may be smaller than batch_size
                    break
# feed batch image
batch_image = np.array([data['image'] for data in batch_data])
batch_image = PaddleTensor(batch_image.copy())
output = self.gpu_predictor.run([
batch_image
]) if use_gpu else self.cpu_predictor.run([batch_image])
output = np.expand_dims(output[0].as_ndarray(), axis=1)
# postprocess one by one
for i in range(len(batch_data)):
out = postprocess(
data_out=output[i],
org_im=batch_data[i]['org_im'],
org_im_shape=batch_data[i]['org_im_shape'],
org_im_path=batch_data[i]['org_im_path'],
output_dir=output_dir,
visualization=visualization)
res.append(out)
return res
def save_inference_model(self,
dirname,
model_filename=None,
params_filename=None,
combined=True):
if combined:
model_filename = "__model__" if not model_filename else model_filename
params_filename = "__params__" if not params_filename else params_filename
place = fluid.CPUPlace()
exe = fluid.Executor(place)
program, feeded_var_names, target_vars = fluid.io.load_inference_model(
dirname=self.default_pretrained_model_path,
model_filename=model_filename,
params_filename=params_filename,
executor=exe)
fluid.io.save_inference_model(
dirname=dirname,
main_program=program,
executor=exe,
feeded_var_names=feeded_var_names,
target_vars=target_vars,
model_filename=model_filename,
params_filename=params_filename)
@serving
def serving_method(self, images, **kwargs):
"""
Run as a service.
"""
images_decode = [base64_to_cv2(image) for image in images]
results = self.segment(images=images_decode, **kwargs)
results = [{
'data': cv2_to_base64(result['data'])
} for result in results]
return results
@runnable
def run_cmd(self, argvs):
"""
Run as a command.
"""
self.parser = argparse.ArgumentParser(
description="Run the {} module.".format(self.name),
prog='hub run {}'.format(self.name),
usage='%(prog)s',
add_help=True)
self.arg_input_group = self.parser.add_argument_group(
title="Input options", description="Input data. Required")
self.arg_config_group = self.parser.add_argument_group(
title="Config options",
description=
"Run configuration for controlling module behavior, not required.")
self.add_module_config_arg()
self.add_module_input_arg()
args = self.parser.parse_args(argvs)
results = self.segment(
paths=[args.input_path],
batch_size=args.batch_size,
use_gpu=args.use_gpu,
output_dir=args.output_dir,
visualization=args.visualization)
if args.save_dir is not None:
check_dir(args.save_dir)
self.save_inference_model(args.save_dir)
return results
def add_module_config_arg(self):
"""
Add the command config options.
"""
self.arg_config_group.add_argument(
'--use_gpu',
type=ast.literal_eval,
default=False,
help="whether use GPU or not")
self.arg_config_group.add_argument(
'--output_dir',
type=str,
default='humanseg_output',
help="The directory to save output images.")
self.arg_config_group.add_argument(
'--save_dir',
type=str,
default='humanseg_model',
help="The directory to save model.")
self.arg_config_group.add_argument(
'--visualization',
type=ast.literal_eval,
default=False,
help="whether to save output as images.")
self.arg_config_group.add_argument(
'--batch_size',
type=ast.literal_eval,
default=1,
help="batch size.")
def add_module_input_arg(self):
"""
Add the command input options.
"""
self.arg_input_group.add_argument(
'--input_path', type=str, help="path to image.")
if __name__ == "__main__":
m = HRNetw18samllv1humanseg()
import cv2
img = cv2.imread('./meditation.jpg')
res = m.segment(images=[img], visualization=True)
print(res[0]['data'])
# -*- coding:utf-8 -*-
import os
import time
import base64
import cv2
import numpy as np
__all__ = ['cv2_to_base64', 'base64_to_cv2', 'postprocess']
def cv2_to_base64(image):
    data = cv2.imencode('.jpg', image)[1]
    return base64.b64encode(data.tobytes()).decode('utf8')


def base64_to_cv2(b64str):
    data = base64.b64decode(b64str.encode('utf8'))
    data = np.frombuffer(data, np.uint8)
    data = cv2.imdecode(data, cv2.IMREAD_COLOR)
    return data
def postprocess(data_out,
org_im,
org_im_shape,
org_im_path,
output_dir,
visualization,
thresh=120):
"""
    Postprocess the network output, one image at a time.

    Args:
        data_out (numpy.ndarray): output of the network.
        org_im (numpy.ndarray): original image.
        org_im_shape (list): shape of the original image.
        org_im_path (str): path of the original image.
        output_dir (str): output directory to store the image.
        visualization (bool): whether to save the image or not.
        thresh (float): threshold (currently unused).
Returns:
result (dict): The data of processed image.
"""
result = dict()
for logit in data_out:
logit = np.squeeze(logit * 255, axis=2).astype(np.uint8)
logit = cv2.resize(logit, (org_im_shape[1], org_im_shape[0]))
rgba = np.concatenate((org_im, np.expand_dims(logit, axis=2)), axis=2)
if visualization:
check_dir(output_dir)
save_im_path = get_save_image_name(org_im, org_im_path, output_dir)
cv2.imwrite(save_im_path, rgba)
result['save_path'] = save_im_path
result['data'] = logit
else:
result['data'] = logit
return result
def check_dir(dir_path):
if not os.path.exists(dir_path):
os.makedirs(dir_path)
elif os.path.isfile(dir_path):
os.remove(dir_path)
os.makedirs(dir_path)
def get_save_image_name(org_im, org_im_path, output_dir):
"""
Get save image name from source image path.
"""
    # name prefix of the original image
org_im_name = os.path.split(org_im_path)[-1]
im_prefix = os.path.splitext(org_im_name)[0]
ext = '.png'
# save image path
save_im_path = os.path.join(output_dir, im_prefix + ext)
if os.path.exists(save_im_path):
save_im_path = os.path.join(
output_dir, im_prefix + 'time={}'.format(int(time.time())) + ext)
return save_im_path