Commit e52e6c8f authored by: H hypox64

Merge branch 'dev'

......@@ -143,6 +143,7 @@ video_tmp/
result/
nohup.out
#./
/.vscode
/pix2pix
/pix2pixHD
/tmp
......@@ -183,4 +184,5 @@ nohup.out
*.MP4
*.JPEG
*.exe
*.npy
\ No newline at end of file
*.npy
*.psd
\ No newline at end of file
![image](./imgs/hand.gif)
# <img src="./imgs/icon.jpg" width="48">DeepMosaics
You can use it to automatically remove the mosaics in images and videos, or add mosaics to them.<br>
This project is based on "semantic segmentation" and "Image-to-Image Translation".<br>
* [Chinese README](./README_CN.md)<br>
<div align="center">
<img src="./imgs/logo.png" width="250"><br><br>
<img src="https://badgen.net/github/stars/hypox64/deepmosaics?icon=github&color=4ab8a1">&emsp;<img src="https://badgen.net/github/forks/hypox64/deepmosaics?icon=github&color=4ab8a1">&emsp;<a href="https://github.com/HypoX64/DeepMosaics/releases"><img src=https://img.shields.io/github/downloads/hypox64/deepmosaics/total></a>&emsp;<a href="https://github.com/HypoX64/DeepMosaics/releases"><img src=https://img.shields.io/github/v/release/hypox64/DeepMosaics></a>&emsp;<img src=https://img.shields.io/github/license/hypox64/deepmosaics>
</div>
### More Examples
# DeepMosaics
**English | [Chinese](./README_CN.md)**<br>
You can use it to automatically remove the mosaics in images and videos, or add mosaics to them.<br>This project is based on "semantic segmentation" and "Image-to-Image Translation".<br>Try it at this [website](http://118.89.27.46:5000/)!<br>
### Examples
![image](./imgs/hand.gif)
origin | auto add mosaic | auto clean mosaic
:-:|:-:|:-:
![image](./imgs/example/lena.jpg) | ![image](./imgs/example/lena_add.jpg) | ![image](./imgs/example/lena_clean.jpg)
......@@ -30,18 +32,21 @@ An interesting example:[Ricardo Milos to cat](https://www.bilibili.com/video/BV1
## Run DeepMosaics
You can run DeepMosaics either via a pre-built binary package or from source.<br>
### Try it on web
You can try removing mosaics from faces at this [website](http://118.89.27.46:5000/).<br>
### Pre-built binary package
For Windows, we provide a pre-built GUI version for easy testing.<br>
Download this version and the pre-trained models via [[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[Baidu Cloud, extraction code 1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ) <br>
* [[How to use]](./docs/exe_help.md)<br>
* [[Help document]](./docs/exe_help.md)<br>
* Video tutorial => [[youtube]](https://www.youtube.com/watch?v=1kEmYawJ_vk) [[bilibili]](https://www.bilibili.com/video/BV1QK4y1a7Av)
![image](./imgs/GUI.png)<br>
Notes:<br>
- Requires 64-bit Windows (x86_64); Windows 10 is recommended.<br>
- Different pre-trained models are suitable for different effects. [[Introduction to pre-trained models]](./docs/pre-trained_models_introduction.md)<br>
- Run time depends on the computer's performance (this version does not support GPU; if you need GPU acceleration, please run from source).<br>
- Run time depends on the computer's performance (the GPU version performs better but requires CUDA to be installed).<br>
- If the output video cannot be played, you can try [potplayer](https://daumpotplayer.com/download/).<br>
- The GUI version is updated less frequently than the source code.<br>
......@@ -67,11 +72,11 @@ You can download pre_trained models and put them into './pretrained_models'.<br>
#### Simple Example
* Add Mosaic (output media will save in './result')<br>
```bash
python deepmosaic.py --media_path ./imgs/ruoruo.jpg --model_path ./pretrained_models/mosaic/add_face.pth --use_gpu 0
python deepmosaic.py --media_path ./imgs/ruoruo.jpg --model_path ./pretrained_models/mosaic/add_face.pth --gpu_id 0
```
* Clean Mosaic (output media will save in './result')<br>
```bash
python deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrained_models/mosaic/clean_face_HD.pth --use_gpu 0
python deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrained_models/mosaic/clean_face_HD.pth --gpu_id 0
```
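If no CUDA device is available, the same commands can run on the CPU; per the option's help text, `--gpu_id -1` selects the CPU. A minimal sketch, assuming the same paths as above:
```bash
# CPU fallback: --gpu_id -1 disables the GPU (slower, but works without CUDA)
python deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrained_models/mosaic/clean_face_HD.pth --gpu_id -1
```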
#### More Parameters
If you want to test other images or videos, please refer to this file.<br>
......@@ -81,5 +86,4 @@ If you want to test other images or videos, please refer to this file.<br>
If you want to train with your own dataset, please refer to [training_with_your_own_dataset.md](./docs/training_with_your_own_dataset.md)
## Acknowledgements
This code borrows heavily from [[pytorch-CycleGAN-and-pix2pix]](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) [[Pytorch-UNet]](https://github.com/milesial/Pytorch-UNet) [[pix2pixHD]](https://github.com/NVIDIA/pix2pixHD) [[BiSeNet]](https://github.com/ooooverflow/BiSeNet).
This code borrows heavily from [[pytorch-CycleGAN-and-pix2pix]](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) [[Pytorch-UNet]](https://github.com/milesial/Pytorch-UNet) [[pix2pixHD]](https://github.com/NVIDIA/pix2pixHD) [[BiSeNet]](https://github.com/ooooverflow/BiSeNet) [[DFDNet]](https://github.com/csxmli2016/DFDNet) [[GFRNet_pytorch_new]](https://github.com/sonack/GFRNet_pytorch_new).
![image](./imgs/hand.gif)
# <img src="./imgs/icon.jpg" width="48">DeepMosaics
This project uses deep learning to automatically add mosaics to images/videos, or remove them.<br>It is based on "semantic segmentation" and "Image-to-Image Translation".<br>
<div align="center">
<img src="./imgs/logo.png" width="250"><br><br>
<img src="https://badgen.net/github/stars/hypox64/deepmosaics?icon=github&color=4ab8a1">&emsp;<img src="https://badgen.net/github/forks/hypox64/deepmosaics?icon=github&color=4ab8a1">&emsp;<a href="https://github.com/HypoX64/DeepMosaics/releases"><img src=https://img.shields.io/github/downloads/hypox64/deepmosaics/total></a>&emsp;<a href="https://github.com/HypoX64/DeepMosaics/releases"><img src=https://img.shields.io/github/v/release/hypox64/DeepMosaics></a>&emsp;<img src=https://img.shields.io/github/license/hypox64/deepmosaics>
</div>
# DeepMosaics
**[English](./README.md) | Chinese**<br>
### More Examples
This project uses deep learning to automatically add mosaics to images/videos, or remove them.<br>It is based on "semantic segmentation" and "Image-to-Image Translation".<br>You can now try removing mosaics with this project at this [website](http://118.89.27.46:5000/)!<br>
### Examples
![image](./imgs/hand.gif)
origin | auto add mosaic | auto clean mosaic
:-:|:-:|:-:
![image](./imgs/example/lena.jpg) | ![image](./imgs/example/lena_add.jpg) | ![image](./imgs/example/lena_clean.jpg)
......@@ -26,19 +32,20 @@
## How to Run
You can run it via our pre-built binary package or from source.<br>
### Try it on the web
Open [this website](http://118.89.27.46:5000/) and upload a photo to get the result with the mosaic removed; restricted by local laws, only faces are currently supported.<br>
### Pre-built binary package
For Windows users, we provide an installation-free package with a GUI.<br>
It can be downloaded in either of the following two ways: [[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[Baidu Cloud, extraction code 1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ) <br>
* [[How to use]](./docs/exe_help_CN.md)<br>
* [[Help document]](./docs/exe_help_CN.md)<br>
* [[Video tutorial]](https://www.bilibili.com/video/BV1QK4y1a7Av)<br>
![image](./imgs/GUI.png)<br>
Notes:<br>
- Requires a 64-bit Windows OS; I have only run it on Windows 10, and other versions are untested<br>
- Please choose a suitable pre-trained model for your needs; different pre-trained models have different effects. [[Introduction to pre-trained models]](./docs/pre-trained_models_introduction_CN.md)<br>
- Run time depends on the computer's performance; for video files, we recommend running from source on a GPU.<br>
- Run time depends on the computer's performance; for video files, we recommend running on a GPU.<br>
- If the output video cannot be played, try [potplayer](https://daumpotplayer.com/download/).<br>
- Updates to this version lag behind the source code.
......@@ -62,13 +69,13 @@ cd DeepMosaics
[[Introduction to pre-trained models]](./docs/pre-trained_models_introduction_CN.md)<br>
#### Simple Example
* Add mosaic to a video; in this example faces are treated as the region to be mosaicked, and the automatically mosaicked region can be changed by switching the pre-trained model (output will be saved to './result')<br>
* Add mosaic to a video or photo; in this example faces are treated as the region to be mosaicked, and the automatically mosaicked region can be changed by switching the pre-trained model (output will be saved to './result')<br>
```bash
python deepmosaic.py --media_path ./imgs/ruoruo.jpg --model_path ./pretrained_models/mosaic/add_face.pth --use_gpu 0
python deepmosaic.py --media_path ./imgs/ruoruo.jpg --model_path ./pretrained_models/mosaic/add_face.pth --gpu_id 0
```
* Remove the mosaic from a video; different mosaicked objects require the corresponding pre-trained model for removal (output will be saved to './result')<br>
* Remove the mosaic from a video or photo; different mosaicked objects require the corresponding pre-trained model for removal (output will be saved to './result')<br>
```bash
python deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrained_models/mosaic/clean_face_HD.pth --use_gpu 0
python deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrained_models/mosaic/clean_face_HD.pth --gpu_id 0
```
#### More Parameters
If you want to test other images or videos, please refer to the following file for the input parameters.<br>
......@@ -78,5 +85,5 @@ python deepmosaic.py --media_path ./result/ruoruo_add.jpg --model_path ./pretrai
If you want to train with your own dataset, please refer to [training_with_your_own_dataset.md](./docs/training_with_your_own_dataset.md)
## Acknowledgements
This code borrows heavily from the following projects: [[pytorch-CycleGAN-and-pix2pix]](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) [[Pytorch-UNet]](https://github.com/milesial/Pytorch-UNet) [[pix2pixHD]](https://github.com/NVIDIA/pix2pixHD) [[BiSeNet]](https://github.com/ooooverflow/BiSeNet).
This code borrows heavily from the following projects: [[pytorch-CycleGAN-and-pix2pix]](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) [[Pytorch-UNet]](https://github.com/milesial/Pytorch-UNet) [[pix2pixHD]](https://github.com/NVIDIA/pix2pixHD) [[BiSeNet]](https://github.com/ooooverflow/BiSeNet) [[DFDNet]](https://github.com/csxmli2016/DFDNet) [[GFRNet_pytorch_new]](https://github.com/sonack/GFRNet_pytorch_new).
import os
import time
import torch
import numpy as np
import cv2
......@@ -30,6 +31,7 @@ def video_init(opt,path):
continue_flag = True
if not continue_flag:
print('Step:1/4 -- Convert video to images')
util.file_init(opt)
ffmpeg.video2voice(path,opt.temp_dir+'/voice_tmp.mp3',opt.start_time,opt.last_time)
ffmpeg.video2image(path,opt.temp_dir+'/video2image/output_%06d.'+opt.tempimage_type,fps,opt.start_time,opt.last_time)
......@@ -59,7 +61,7 @@ def addmosaic_video(opt,netS):
if not opt.no_preview:
cv2.namedWindow('preview', cv2.WINDOW_NORMAL)
print('Find ROI location:')
print('Step:2/4 -- Find ROI location')
for i,imagepath in enumerate(imagepaths,1):
img = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
mask,x,y,size,area = runmodel.get_ROI_position(img,netS,opt)
......@@ -77,7 +79,7 @@ def addmosaic_video(opt,netS):
mask_index = filt.position_medfilt(np.array(positions), 7)
# add mosaic
print('Add Mosaic:')
print('Step:3/4 -- Add Mosaic:')
t1 = time.time()
for i,imagepath in enumerate(imagepaths,1):
mask = impro.imread(os.path.join(opt.temp_dir+'/ROI_mask',imagepaths[mask_index[i-1]]),'gray')
......@@ -100,6 +102,7 @@ def addmosaic_video(opt,netS):
print()
if not opt.no_preview:
cv2.destroyAllWindows()
print('Step:4/4 -- Convert images to video')
ffmpeg.image2video( fps,
opt.temp_dir+'/addmosaic_image/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
......@@ -119,7 +122,7 @@ def styletransfer_video(opt,netG):
path = opt.media_path
positions = []
fps,imagepaths = video_init(opt,path)[:2]
print('Transfer:')
print('Step:2/4 -- Transfer')
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('preview', cv2.WINDOW_NORMAL)
......@@ -142,6 +145,7 @@ def styletransfer_video(opt,netG):
if not opt.no_preview:
cv2.destroyAllWindows()
suffix = os.path.basename(opt.model_path).replace('.pth','').replace('style_','')
print('Step:4/4 -- Convert images to video')
ffmpeg.image2video( fps,
opt.temp_dir+'/style_transfer/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
......@@ -156,8 +160,7 @@ def get_mosaic_positions(opt,netM,imagepaths,savemask=True):
t1 = time.time()
if not opt.no_preview:
cv2.namedWindow('mosaic mask', cv2.WINDOW_NORMAL)
print('Find mosaic location:')
print('Step:2/4 -- Find mosaic location')
for i,imagepath in enumerate(imagepaths,1):
img_origin = impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepath))
x,y,size,mask = runmodel.get_mosaic_position(img_origin,netM,opt)
......@@ -186,7 +189,7 @@ def cleanmosaic_img(opt,netG,netM):
print('Clean Mosaic:',path)
img_origin = impro.imread(path)
x,y,size,mask = runmodel.get_mosaic_position(img_origin,netM,opt)
cv2.imwrite('./mask/'+os.path.basename(path), mask)
#cv2.imwrite('./mask/'+os.path.basename(path), mask)
img_result = img_origin.copy()
if size > 100 :
img_mosaic = img_origin[y-size:y+size,x-size:x+size]
......@@ -199,6 +202,18 @@ def cleanmosaic_img(opt,netG,netM):
print('No mosaic found')
impro.imwrite(os.path.join(opt.result_dir,os.path.splitext(os.path.basename(path))[0]+'_clean.jpg'),img_result)
def cleanmosaic_img_server(opt,img_origin,netG,netM):
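# In-memory variant of cleanmosaic_img: takes an already-decoded image and returns the
# cleaned result instead of reading/writing files (presumably backing the web demo).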
x,y,size,mask = runmodel.get_mosaic_position(img_origin,netM,opt)
img_result = img_origin.copy()
if size > 100 :
img_mosaic = img_origin[y-size:y+size,x-size:x+size]
if opt.traditional:
img_fake = runmodel.traditional_cleaner(img_mosaic,opt)
else:
img_fake = runmodel.run_pix2pix(img_mosaic,netG,opt)
img_result = impro.replace_mosaic(img_origin,img_fake,mask,x,y,size,opt.no_feather)
return img_result
def cleanmosaic_video_byframe(opt,netG,netM):
path = opt.media_path
fps,imagepaths = video_init(opt,path)[:2]
......@@ -208,7 +223,7 @@ def cleanmosaic_video_byframe(opt,netG,netM):
cv2.namedWindow('clean', cv2.WINDOW_NORMAL)
# clean mosaic
print('Clean Mosaic:')
print('Step:3/4 -- Clean Mosaic:')
length = len(imagepaths)
for i,imagepath in enumerate(imagepaths,0):
x,y,size = positions[i][0],positions[i][1],positions[i][2]
......@@ -237,6 +252,7 @@ def cleanmosaic_video_byframe(opt,netG,netM):
print()
if not opt.no_preview:
cv2.destroyAllWindows()
print('Step:4/4 -- Convert images to video')
ffmpeg.image2video( fps,
opt.temp_dir+'/replace_mosaic/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
......@@ -244,11 +260,15 @@ def cleanmosaic_video_byframe(opt,netG,netM):
def cleanmosaic_video_fusion(opt,netG,netM):
path = opt.media_path
N = 25
if 'HD' in os.path.basename(opt.model_path):
INPUT_SIZE = 256
else:
INPUT_SIZE = 128
N,T,S = 2,5,3
LEFT_FRAME = (N*S)
POOL_NUM = LEFT_FRAME*2+1
INPUT_SIZE = 256
FRAME_POS = np.linspace(0, (T-1)*S,T,dtype=np.int64)
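# T=5 frames are sampled at stride S=3 (FRAME_POS = [0,3,6,9,12]) from a sliding pool of
# POOL_NUM=13 consecutive frames; LEFT_FRAME=6 is the pool's center, i.e. the frame being restored.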
img_pool = []
previous_frame = None
init_flag = True
fps,imagepaths,height,width = video_init(opt,path)
positions = get_mosaic_positions(opt,netM,imagepaths,savemask=True)
t1 = time.time()
......@@ -256,39 +276,45 @@ def cleanmosaic_video_fusion(opt,netG,netM):
cv2.namedWindow('clean', cv2.WINDOW_NORMAL)
# clean mosaic
print('Clean Mosaic:')
print('Step:3/4 -- Clean Mosaic:')
length = len(imagepaths)
img_pool = []
mosaic_input = np.zeros((INPUT_SIZE,INPUT_SIZE,3*N+1), dtype='uint8')
for i,imagepath in enumerate(imagepaths,0):
x,y,size = positions[i][0],positions[i][1],positions[i][2]
input_stream = []
# image read stream
mask = cv2.imread(os.path.join(opt.temp_dir+'/mosaic_mask',imagepath),0)
if i==0 :
for j in range(0,N):
img_pool.append(impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepaths[np.clip(i+j-12,0,len(imagepaths)-1)])))
else:
if i==0 :# init
for j in range(POOL_NUM):
img_pool.append(impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepaths[np.clip(i+j-LEFT_FRAME,0,len(imagepaths)-1)])))
else: # load next frame
img_pool.pop(0)
img_pool.append(impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepaths[np.clip(i+12,0,len(imagepaths)-1)])))
img_origin = img_pool[12]
img_pool.append(impro.imread(os.path.join(opt.temp_dir+'/video2image',imagepaths[np.clip(i+LEFT_FRAME,0,len(imagepaths)-1)])))
img_origin = img_pool[LEFT_FRAME]
img_result = img_origin.copy()
if size>100:
if size>50:
try: # avoid unknown errors
#reshape to network input shape
for k in range(N):
mosaic_input[:,:,k*3:(k+1)*3] = impro.resize(img_pool[k][y-size:y+size,x-size:x+size], INPUT_SIZE)
mask_input = impro.resize(mask,np.min(img_origin.shape[:2]))[y-size:y+size,x-size:x+size]
mosaic_input[:,:,-1] = impro.resize(mask_input, INPUT_SIZE)
mosaic_input_tensor = data.im2tensor(mosaic_input,bgr2rgb=False,use_gpu=opt.use_gpu,use_transform = False,is0_1 = False)
unmosaic_pred = netG(mosaic_input_tensor)
img_fake = data.tensor2im(unmosaic_pred,rgb2bgr = False ,is0_1 = False)
for pos in FRAME_POS:
input_stream.append(impro.resize(img_pool[pos][y-size:y+size,x-size:x+size], INPUT_SIZE)[:,:,::-1])
if init_flag:
init_flag = False
previous_frame = input_stream[N]
previous_frame = data.im2tensor(previous_frame,bgr2rgb=True,gpu_id=opt.gpu_id)
input_stream = np.array(input_stream).reshape(1,T,INPUT_SIZE,INPUT_SIZE,3).transpose((0,4,1,2,3))
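# (1,T,H,W,3) -> (1,3,T,H,W): channels-first clip layout expected by the 3D encoder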
input_stream = data.to_tensor(data.normalize(input_stream),gpu_id=opt.gpu_id)
with torch.no_grad():
unmosaic_pred = netG(input_stream,previous_frame)
img_fake = data.tensor2im(unmosaic_pred,rgb2bgr = True)
previous_frame = unmosaic_pred
# previous_frame = data.tensor2im(unmosaic_pred,rgb2bgr = True)
mask = cv2.imread(os.path.join(opt.temp_dir+'/mosaic_mask',imagepath),0)
img_result = impro.replace_mosaic(img_origin,img_fake,mask,x,y,size,opt.no_feather)
except Exception as e:
print('Warning:',e)
init_flag = True
print('Error:',e)
else:
init_flag = True
cv2.imwrite(os.path.join(opt.temp_dir+'/replace_mosaic',imagepath),img_result)
os.remove(os.path.join(opt.temp_dir+'/video2image',imagepath))
......@@ -301,6 +327,7 @@ def cleanmosaic_video_fusion(opt,netG,netM):
print()
if not opt.no_preview:
cv2.destroyAllWindows()
print('Step:4/4 -- Convert images to video')
ffmpeg.image2video( fps,
opt.temp_dir+'/replace_mosaic/output_%06d.'+opt.tempimage_type,
opt.temp_dir+'/voice_tmp.mp3',
......
......@@ -11,7 +11,8 @@ class Options():
def initialize(self):
#base
self.parser.add_argument('--use_gpu', type=int,default=0, help='if -1, use cpu')
self.parser.add_argument('--debug', action='store_true', help='if specified, start debug mode')
self.parser.add_argument('--gpu_id', type=str,default='0', help='if -1, use cpu')
self.parser.add_argument('--media_path', type=str, default='./imgs/ruoruo.jpg',help='your videos or images path')
self.parser.add_argument('-ss', '--start_time', type=str, default='00:00:00',help='start position of video, default is the beginning of video')
self.parser.add_argument('-t', '--last_time', type=str, default='00:00:00',help='duration of the video, default is the entire video')
......@@ -59,62 +60,71 @@ class Options():
model_name = os.path.basename(self.opt.model_path)
self.opt.temp_dir = os.path.join(self.opt.temp_dir, 'DeepMosaics_temp')
os.environ["CUDA_VISIBLE_DEVICES"] = str(self.opt.use_gpu)
import torch
if torch.cuda.is_available() and self.opt.use_gpu > -1:
pass
else:
self.opt.use_gpu = -1
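# Restrict CUDA to the requested device id; fall back to CPU ('-1') when CUDA is unavailable.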
if self.opt.gpu_id != '-1':
os.environ["CUDA_VISIBLE_DEVICES"] = str(self.opt.gpu_id)
import torch
if not torch.cuda.is_available():
self.opt.gpu_id = '-1'
# else:
# self.opt.gpu_id = '-1'
if test_flag:
if not os.path.exists(self.opt.media_path):
print('Error: Bad media path!')
print('Error: Media does not exist!')
input('Please press any key to exit.\n')
sys.exit(0)
if self.opt.mode == 'auto':
if 'clean' in model_name or self.opt.traditional:
self.opt.mode = 'clean'
elif 'add' in model_name:
self.opt.mode = 'add'
elif 'style' in model_name or 'edges' in model_name:
self.opt.mode = 'style'
else:
print('Please specify a running mode!')
input('Please press any key to exit.\n')
sys.exit(0)
if self.opt.output_size == 0 and self.opt.mode == 'style':
self.opt.output_size = 512
if 'edges' in model_name or 'edges' in self.opt.preprocess:
self.opt.edges = True
if self.opt.netG == 'auto' and self.opt.mode =='clean':
if 'unet_128' in model_name:
self.opt.netG = 'unet_128'
elif 'resnet_9blocks' in model_name:
self.opt.netG = 'resnet_9blocks'
elif 'HD' in model_name and 'video' not in model_name:
self.opt.netG = 'HD'
elif 'video' in model_name:
self.opt.netG = 'video'
else:
print('Type of Generator error!')
if not os.path.exists(self.opt.model_path):
print('Error: Model does not exist!')
input('Please press any key to exit.\n')
sys.exit(0)
if self.opt.ex_mult == 'auto':
if 'face' in model_name:
self.opt.ex_mult = 1.1
if self.opt.mode == 'auto':
if 'clean' in model_name or self.opt.traditional:
self.opt.mode = 'clean'
elif 'add' in model_name:
self.opt.mode = 'add'
elif 'style' in model_name or 'edges' in model_name:
self.opt.mode = 'style'
else:
print('Please check model_path!')
input('Please press any key to exit.\n')
sys.exit(0)
if self.opt.output_size == 0 and self.opt.mode == 'style':
self.opt.output_size = 512
if 'edges' in model_name or 'edges' in self.opt.preprocess:
self.opt.edges = True
if self.opt.netG == 'auto' and self.opt.mode =='clean':
if 'unet_128' in model_name:
self.opt.netG = 'unet_128'
elif 'resnet_9blocks' in model_name:
self.opt.netG = 'resnet_9blocks'
elif 'HD' in model_name and 'video' not in model_name:
self.opt.netG = 'HD'
elif 'video' in model_name:
self.opt.netG = 'video'
else:
print('Type of Generator error!')
input('Please press any key to exit.\n')
sys.exit(0)
if self.opt.ex_mult == 'auto':
if 'face' in model_name:
self.opt.ex_mult = 1.1
else:
self.opt.ex_mult = 1.5
else:
self.opt.ex_mult = 1.5
else:
self.opt.ex_mult = float(self.opt.ex_mult)
if self.opt.mosaic_position_model_path == 'auto':
_path = os.path.join(os.path.split(self.opt.model_path)[0],'mosaic_position.pth')
self.opt.mosaic_position_model_path = _path
# print(self.opt.mosaic_position_model_path)
self.opt.ex_mult = float(self.opt.ex_mult)
if self.opt.mosaic_position_model_path == 'auto' and self.opt.mode == 'clean':
_path = os.path.join(os.path.split(self.opt.model_path)[0],'mosaic_position.pth')
if os.path.isfile(_path):
self.opt.mosaic_position_model_path = _path
else:
print('Please check mosaic_position_model_path!')
input('Please press any key to exit.\n')
sys.exit(0)
return self.opt
\ No newline at end of file
......@@ -68,15 +68,18 @@ def main():
print('This type of file is not supported')
util.clean_tempfiles(opt, tmp_init = False)
if __name__ == '__main__':
if opt.debug:
main()
sys.exit(0)
try:
main()
print('Finished!')
except Exception as ex:
print('--------------------ERROR--------------------')
print('--------------Environment--------------')
print('DeepMosaics: 0.4.0')
print('DeepMosaics: 0.5.0')
print('Python:',sys.version)
import torch
print('Pytorch:',torch.__version__)
......
## DeepMosaics.exe Instructions
[[中文版]](./exe_help_CN.md)
**[[中文版]](./exe_help_CN.md)**
This is a GUI version compiled for Windows.<br>
Download this version and the pre-trained models via [[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[Baidu Cloud, extraction code 1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ) <br>
Video tutorial => [[youtube]](https://www.youtube.com/watch?v=1kEmYawJ_vk) [[bilibili]](https://www.bilibili.com/video/BV1QK4y1a7Av)<br>
Notes:<br>
- Requires 64-bit Windows (x86_64); Windows 10 is recommended.<br>
......@@ -9,11 +11,29 @@ Attentions:<br>
- Run time depends on computer performance.<br>
- If the output video cannot be played, you can try [potplayer](https://daumpotplayer.com/download/).<br>
- The GUI version is updated less frequently than the source code.<br>
### How to install
#### CPU version
* 1. Download and install Microsoft Visual C++
https://aka.ms/vs/16/release/vc_redist.x64.exe
#### GPU version
Only supports NVIDIA GPUs at GTX 1060 level or above (driver version 460 or above & CUDA 11.0)
* 1. Download and install Microsoft Visual C++
https://aka.ms/vs/16/release/vc_redist.x64.exe
* 2. Update your GPU driver to version 460 (or above)
https://www.nvidia.com/en-us/geforce/drivers/
* 3. Download and install CUDA 11.0:
https://developer.nvidia.com/cuda-toolkit-archive
You can also download these from Baidu Netdisk:
https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ
Password: 1x0a
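If you also run from source, you can check that the driver and CUDA installation are visible to PyTorch; a minimal sketch, assuming a Python environment with torch installed:
```bash
# prints the torch version and True when a usable CUDA device is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```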
### How to use
* step 1: Choose an image or video.
* step 2: Choose a model (different pre-trained models are suitable for different effects)
* step3: Run program and wait.
* step4: Check result in './result'.
* step 3: Run the program and wait.
* step 4: Check the result in './result'.
### Introduction to pre-trained models
* Mosaic
......@@ -22,10 +42,10 @@ Attentions:<br>
| :------------------------------: | :---------------------------------------------------------: |
| add_face.pth | Add mosaic to all faces in images/videos. |
| clean_face_HD.pth | Remove the mosaic from all faces in images/videos.<br>(RAM > 8GB). |
| add_youknow.pth | Add mosaic to all (FBI Warning) content in images/videos. |
| clean_youknow_resnet_9blocks.pth | Remove the mosaic from (FBI Warning) content in images/videos. |
| clean_youknow_video.pth | Remove the mosaic from (FBI Warning) content in videos. |
| clean_youknow_video_HD.pth | Remove the mosaic from (FBI Warning) content in videos.<br>(RAM > 8GB) |
| add_youknow.pth | Add mosaic to ... in images/videos. |
| clean_youknow_resnet_9blocks.pth | Remove the mosaic from ... in images/videos. |
| clean_youknow_video.pth | Remove the mosaic from ... in videos; this model works better for video mosaics |
* Style Transfer
......@@ -50,8 +70,8 @@ Attentions:<br>
* 7. More options can be entered.
* 8. Run program.
* 9. Open help file.
* 10. Sponsor our project.
* 11. Version information.
* 12. Open the URL on github.
### Introduction to options
......@@ -60,7 +80,7 @@ If you need more effects, use '--option your-parameters' to enter what you need
| Option | Description | Default |
| :----------: | :----------------------------------------: | :-------------------------------------: |
| --use_gpu | if -1, do not use gpu | 0 |
| --gpu_id | if -1, do not use gpu | 0 |
| --media_path | your videos or images path | ./imgs/ruoruo.jpg |
| --mode | program running mode (auto/clean/add/style) | 'auto' |
| --model_path | pretrained model path | ./pretrained_models/mosaic/add_face.pth |
......
## DeepMosaics.exe Instructions
Download the program and the pre-trained models via [[Google Drive]](https://drive.google.com/open?id=1LTERcN33McoiztYEwBxMuRjjgxh4DEPs) [[Baidu Cloud, extraction code 1x0a]](https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ) <br>
[Video tutorial](https://www.bilibili.com/video/BV1QK4y1a7Av)<br>
Notes:<br>
- Requires a 64-bit Windows OS; I have only run it on Windows 10, and other versions are untested<br>
- Requires a 64-bit Windows OS; we have only run it on Windows 10, and other versions are untested<br>
- Please choose a suitable pre-trained model for your needs<br>
- Run time depends on the computer's performance; for video files, we recommend running from source and on a GPU<br>
- Run time depends on the computer's performance; for video files, we recommend running on a GPU<br>
- If the output video cannot be played, try [potplayer](https://daumpotplayer.com/download/).<br>
- Updates to this version lag behind the source code.
### How to install
#### CPU version
* 1. Download and install Microsoft Visual C++
https://aka.ms/vs/16/release/vc_redist.x64.exe
#### GPU version
Only supports NVIDIA GPUs at GTX 1060 level or above (requires a driver version of 460 or above and CUDA 11.0; note that it must be exactly 11.0)
* 1. Download and install Microsoft Visual C++
https://aka.ms/vs/16/release/vc_redist.x64.exe
* 2. Update your GPU driver to version 460 (or above)
https://www.nvidia.com/en-us/geforce/drivers/
* 3. Download and install CUDA 11.0:
https://developer.nvidia.com/cuda-toolkit-archive
These can also be downloaded from Baidu Netdisk:
https://pan.baidu.com/s/10rN3U3zd5TmfGpO_PEShqQ
Extraction code: 1x0a
### How to use
* step 1: Choose the image or video to process
* step 2: Choose a pre-trained model (different pre-trained models have different effects)
* step3: Run the program and wait
* step4: Check the result (saved in the result folder)
* step 3: Run the program and wait
* step 4: Check the result (saved in the result folder)
## Introduction to pre-trained models
The current pre-trained models fall into two categories: adding/removing mosaics, and style transfer.
......@@ -23,10 +44,10 @@
| :------------------------------: | :-------------------------------------------: |
| add_face.pth | Add mosaic to faces in images or videos |
| clean_face_HD.pth | Remove the mosaic from faces in images or videos<br>(requires RAM > 8GB). |
| add_youknow.pth | Add mosaic to adult content in images or videos |
| clean_youknow_resnet_9blocks.pth | Remove the mosaic from adult content in images or videos |
| clean_youknow_video.pth | Remove the mosaic from adult content in videos |
| clean_youknow_video_HD.pth | Remove the mosaic from adult content in videos<br>(requires RAM > 8GB) |
| add_youknow.pth | Add mosaic to ... content in images or videos |
| clean_youknow_resnet_9blocks.pth | Remove the mosaic from ... content in images or videos |
| clean_youknow_video.pth | Remove the mosaic from ... content in videos; models with 'video' in the name are recommended for removing mosaics from videos |
* Style Transfer
......@@ -52,8 +73,8 @@
* 7. Enter more parameters yourself; see below for details
* 8. Run
* 9. Open the help file
* 10. Support us
* 11. Version information
* 12. Open the project's GitHub page
### Introduction to options
......@@ -62,7 +83,7 @@
| Option | Description | Default |
| :----------: | :------------------------: | :-------------------------------------: |
| --use_gpu | if -1, do not use gpu | 0 |
| --gpu_id | if -1, do not use gpu | 0 |
| --media_path | path of the video or image to process | ./imgs/ruoruo.jpg |
| --mode | running mode (auto/clean/add/style) | 'auto' |
| --model_path | path of the pre-trained model | ./pretrained_models/mosaic/add_face.pth |
......@@ -75,7 +96,7 @@
| --mosaic_mod | mosaic type -> squa_avg/ squa_random/ squa_avg_circle_edge/ rect_avg/random | squa_avg |
| --mosaic_size | mosaic size; 0 means automatic | 0 |
| --mask_extend | extend the mosaic area | 10 |
| --mask_threshold | detection threshold for the mosaic area, 0~255 | 64 |
| --mask_threshold | detection threshold for the mosaic area, 0~255; the smaller the value, the more easily a region is classified as mosaic | 64 |
* Clean mosaic
......
......@@ -5,7 +5,7 @@ If you need more effects, use '--option your-parameters' to enter what you need
| Option | Description | Default |
| :----------: | :------------------------: | :-------------------------------------: |
| --use_gpu | if -1, do not use gpu | 0 |
| --gpu_id | if -1, do not use gpu | 0 |
| --media_path | your videos or images path | ./imgs/ruoruo.jpg |
| --start_time | start position of video, default is the beginning of video | '00:00:00' |
| --last_time | limit the duration of the video, default is the entire video | '00:00:00' |
......
......@@ -5,7 +5,7 @@
| Option | Description | Default |
| :----------: | :------------------------: | :-------------------------------------: |
| --use_gpu | if -1, do not use gpu | 0 |
| --gpu_id | if -1, do not use gpu | 0 |
| --media_path | path of the video or image to process | ./imgs/ruoruo.jpg |
| --start_time | start position of the video; defaults to the beginning | '00:00:00' |
| --last_time | duration of the video to process; defaults to the entire video | '00:00:00' |
......
......@@ -10,8 +10,8 @@ Download pre-trained model via [[Google Drive]](https://drive.google.com/open?i
| clean_face_HD.pth | Remove the mosaic from faces in images/videos.<br>(RAM > 8GB). |
| add_youknow.pth | Add mosaic to ... in images/videos. |
| clean_youknow_resnet_9blocks.pth | Remove the mosaic from ... in images/videos. |
| clean_youknow_video.pth | Remove the mosaic from ... in videos. |
| clean_youknow_video_HD.pth | Remove the mosaic from ... in videos.<br>(RAM > 8GB) |
| clean_youknow_video.pth | Remove the mosaic from ... in videos; this model works better for video mosaics |
### Style Transfer
......
......@@ -10,8 +10,8 @@
| clean_face_HD.pth | Remove the mosaic from faces in images or videos<br>(requires RAM > 8GB). |
| add_youknow.pth | Add mosaic to ... content in images or videos |
| clean_youknow_resnet_9blocks.pth | Remove the mosaic from ... content in images or videos |
| clean_youknow_video.pth | Remove the mosaic from ... content in videos |
| clean_youknow_video_HD.pth | Remove the mosaic from ... content in videos<br>(requires RAM > 8GB) |
| clean_youknow_video.pth | Remove the mosaic from ... content in videos; models with 'video' in the name are recommended for removing mosaics from videos |
### Style Transfer
......
......@@ -10,7 +10,11 @@ We will make "face" as an example. If you don't have any picture, you can downlo
- [Pytorch 1.0+](https://pytorch.org/)
- NVIDIA GPU(with more than 6G memory) + CUDA CuDNN<br>
#### Dependencies
This code depends on opencv-python, torchvision, and matplotlib, available via pip install.
This code depends on opencv-python, torchvision, matplotlib, tensorboardX, and scikit-image, available via conda install.
```bash
# or install all dependencies at once via pip:
pip install -r requirements.txt
```
#### Clone this repo
```bash
git clone https://github.com/HypoX64/DeepMosaics
......@@ -32,31 +36,31 @@ python draw_mask.py --datadir 'dir for your pictures' --savedir ../datasets/draw
python get_image_from_video.py --datadir 'dir for your videos' --savedir ../datasets/video2image --fps 1
```
### Clean mosaic dataset
We provide several methods for generating clean mosaic datasets. However, for better results, we recommend training an addmosaic model on a small dataset first and then using it to automatically generate datasets from a larger one.(recommend: Method 2 (for images) & Method 4 (for videos))
* Method 1: Use drawn masks to make pix2pix(HD) datasets(Requires ```origin_image``` and ```mask```)
We provide several methods for generating clean mosaic datasets. However, for better results, we recommend training an addmosaic model on a small dataset first and then using it to automatically generate datasets from a larger one. (recommended: Method 2 (for images) & Method 4 (for videos))
* Method 1: Use drawn masks to make pix2pix(HD) datasets (Requires ```origin_image``` and ```mask```)
```bash
python make_pix2pix_dataset.py --datadir ../datasets/draw/face --hd --outsize 512 --fold 1 --name face --savedir ../datasets/pix2pix/face --mod drawn --minsize 128 --square
```
* Method 2: Use an addmosaic model to make pix2pix(HD) datasets(Requires an addmosaic pre-trained model)
* Method 2: Use an addmosaic model to make pix2pix(HD) datasets (Requires an addmosaic pre-trained model)
```bash
python make_pix2pix_dataset.py --datadir 'dir for your pictures' --hd --outsize 512 --fold 1 --name face --savedir ../datasets/pix2pix/face --mod network --model_path ../pretrained_models/mosaic/add_face.pth --minsize 128 --square --mask_threshold 128
```
* Method 3: Use Irregular Masks to make pix2pix(HD) datasets(Requires [Irregular Masks](https://nv-adlr.github.io/publication/partialconv-inpainting))
* Method 3: Use Irregular Masks to make pix2pix(HD) datasets (Requires [Irregular Masks](https://nv-adlr.github.io/publication/partialconv-inpainting))
```bash
python make_pix2pix_dataset.py --datadir 'dir for your pictures' --hd --outsize 512 --fold 1 --name face --savedir ../datasets/pix2pix/face --mod irregular --irrholedir ../datasets/Irregular_Holes_mask --square
```
* Method 4: Use an addmosaic model to make video datasets(Requires an addmosaic pre-trained model; this works better for video mosaics)
* Method 4: Use an addmosaic model to make video datasets (Requires an addmosaic pre-trained model; this works better for video mosaics)
```bash
python make_video_dataset.py --datadir 'dir for your videos' --model_path ../pretrained_models/mosaic/add_face.pth --mask_threshold 96 --savedir ../datasets/video/face
python make_video_dataset.py --model_path ../pretrained_models/mosaic/add_face.pth --gpu_id 0 --datadir 'dir for your videos' --savedir ../datasets/video/face
```
## Training
### Add
```bash
cd train/add
python train.py --use_gpu 0 --dataset ../../datasets/draw/face --savename face --loadsize 512 --finesize 360 --batchsize 16
python train.py --gpu_id 0 --dataset ../../datasets/draw/face --savename face --loadsize 512 --finesize 360 --batchsize 16
```
### Clean
* For image datasets (generated by ```make_pix2pix_dataset.py```)
We use [pix2pix](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix) or [pix2pixHD](https://github.com/NVIDIA/pix2pixHD) to train the model; we take pix2pixHD as an example.
```bash
git clone https://github.com/NVIDIA/pix2pixHD
......@@ -64,10 +68,10 @@ cd pix2pixHD
pip install dominate
python train.py --name face --resize_or_crop resize_and_crop --loadSize 563 --fineSize 512 --label_nc 0 --no_instance --dataroot ../datasets/pix2pix/face
```
* For video datasets(generated by ```make_video_dataset.py```)
* For video datasets (generated by ```make_video_dataset.py```)
```bash
cd train/clean
python train.py --dataset ../../datasets/video/face --savename face --savefreq 100000 --gan --hd --lr 0.0002 --lambda_gan 1 --use_gpu 0
python train.py --dataset ../../datasets/video/face --savename face --n_blocks 4 --lambda_GAN 0.01 --loadsize 286 --finesize 256 --batchsize 16 --n_layers_D 2 --num_D 3 --n_epoch 200 --gpu_id 4,5,6,7 --load_thread 16
```
## Testing
Put the saved network into ```./pretrained_models/mosaic/``` and rename it ```add_face.pth```, ```clean_face_HD.pth```, or ```clean_face_video_HD.pth```
Put the saved network into ```./pretrained_models/mosaic/```, rename it ```add_face.pth```, ```clean_face_HD.pth```, or ```clean_face_video_HD.pth```, and then run ```deepmosaic.py --model_path ./pretrained_models/mosaic/your_model_name```
This diff is collapsed.
......@@ -16,7 +16,7 @@ import torch
from models import runmodel,loadmodel
import util.image_processing as impro
from util import util,mosaic,data
from util import degradater, util,mosaic,data
opt.parser.add_argument('--datadir',type=str,default='../datasets/draw/face', help='')
......@@ -87,11 +87,11 @@ for fold in range(opt.fold):
mask = mask_drawn
if 'irregular' in opt.mod:
mask_irr = impro.imread(irrpaths[random.randint(0,12000-1)],'gray')
mask_irr = data.random_transform_single(mask_irr, (img.shape[0],img.shape[1]))
mask_irr = data.random_transform_single_mask(mask_irr, (img.shape[0],img.shape[1]))
mask = mask_irr
if 'network' in opt.mod:
mask_net = runmodel.get_ROI_position(img,net,opt,keepsize=True)[0]
if opt.use_gpu != -1:
if opt.gpu_id != '-1': # gpu_id is a string option, so compare with '-1'
torch.cuda.empty_cache()
if not opt.all_mosaic_area:
mask_net = impro.find_mostlikely_ROI(mask_net)
......@@ -107,11 +107,11 @@ for fold in range(opt.fold):
saveflag = True
if opt.mod == ['drawn','irregular']:
x,y,size,area = impro.boundingSquare(mask_drawn, random.uniform(1.2,1.6))
x,y,size,area = impro.boundingSquare(mask_drawn, random.uniform(1.1,1.6))
elif opt.mod == ['network','irregular']:
x,y,size,area = impro.boundingSquare(mask_net, random.uniform(1.2,1.6))
x,y,size,area = impro.boundingSquare(mask_net, random.uniform(1.1,1.6))
else:
x,y,size,area = impro.boundingSquare(mask, random.uniform(1.2,1.6))
x,y,size,area = impro.boundingSquare(mask, random.uniform(1.1,1.6))
if area < 1000:
saveflag = False
......@@ -130,11 +130,15 @@ for fold in range(opt.fold):
if saveflag:
# add mosaic
img_mosaic = mosaic.addmosaic_random(img, mask)
# random blur
# random degradation
if random.random()>0.5:
Q = random.randint(1,15)
img = impro.dctblur(img,Q)
img_mosaic = impro.dctblur(img_mosaic,Q)
degradate_params = degradater.get_random_degenerate_params(mod='weaker_2')
img = degradater.degradate(img,degradate_params)
img_mosaic = degradater.degradate(img_mosaic,degradate_params)
# if random.random()>0.5:
# Q = random.randint(1,15)
# img = impro.dctblur(img,Q)
# img_mosaic = impro.dctblur(img_mosaic,Q)
savecnt += 1
......
......@@ -14,7 +14,7 @@ import torch
from models import runmodel,loadmodel
import util.image_processing as impro
from util import util,mosaic,data,ffmpeg
from util import filt, util,mosaic,data,ffmpeg
opt.parser.add_argument('--datadir',type=str,default='your video dir', help='')
......@@ -56,6 +56,7 @@ for videopath in videopaths:
ffmpeg.video2image(videopath, opt.temp_dir+'/video2image/%05d.'+opt.tempimage_type,fps=1,
start_time = util.second2stamp(cut_point*opt.interval),last_time = util.second2stamp(opt.time))
imagepaths = util.Traversal(opt.temp_dir+'/video2image')
imagepaths = sorted(imagepaths)
cnt = 0
for i in range(opt.time):
img = impro.imread(imagepaths[i])
......@@ -92,30 +93,65 @@ for videopath in videopaths:
imagepaths = util.Traversal(opt.temp_dir+'/video2image')
imagepaths = sorted(imagepaths)
imgs=[];masks=[]
mask_flag = False
# mask_flag = False
# for imagepath in imagepaths:
# img = impro.imread(imagepath)
# mask = runmodel.get_ROI_position(img,net,opt,keepsize=True)[0]
# imgs.append(img)
# masks.append(mask)
# if not mask_flag:
# mask_avg = mask.astype(np.float64)
# mask_flag = True
# else:
# mask_avg += mask.astype(np.float64)
# mask_avg = np.clip(mask_avg/len(imagepaths),0,255).astype('uint8')
# mask_avg = impro.mask_threshold(mask_avg,20,64)
# if not opt.all_mosaic_area:
# mask_avg = impro.find_mostlikely_ROI(mask_avg)
# x,y,size,area = impro.boundingSquare(mask_avg,Ex_mul=random.uniform(1.1,1.5))
# for i in range(len(imagepaths)):
# img = impro.resize(imgs[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
# mask = impro.resize(masks[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
# impro.imwrite(os.path.join(origindir,'%05d'%(i+1)+'.jpg'), img)
# impro.imwrite(os.path.join(maskdir,'%05d'%(i+1)+'.png'), mask)
ex_mul = random.uniform(1.2,1.7)
positions = []
for imagepath in imagepaths:
img = impro.imread(imagepath)
mask = runmodel.get_ROI_position(img,net,opt,keepsize=True)[0]
imgs.append(img)
masks.append(mask)
if not mask_flag:
mask_avg = mask.astype(np.float64)
mask_flag = True
else:
mask_avg += mask.astype(np.float64)
mask_avg = np.clip(mask_avg/len(imagepaths),0,255).astype('uint8')
mask_avg = impro.mask_threshold(mask_avg,20,64)
if not opt.all_mosaic_area:
mask_avg = impro.find_mostlikely_ROI(mask_avg)
x,y,size,area = impro.boundingSquare(mask_avg,Ex_mul=random.uniform(1.1,1.5))
for i in range(len(imagepaths)):
img = impro.resize(imgs[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
x,y,size,area = impro.boundingSquare(mask,Ex_mul=ex_mul)
positions.append([x,y,size])
positions =np.array(positions)
for i in range(3):positions[:,i] = filt.medfilt(positions[:,i],opt.medfilt_num)
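# Median-filter the per-frame x/y/size positions so the crop window moves smoothly instead of jittering from frame to frame.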
for i,imagepath in enumerate(imagepaths):
x,y,size = positions[i][0],positions[i][1],positions[i][2]
tmp_cnt = i
while size<opt.minsize//2:
tmp_cnt = tmp_cnt-1
x,y,size = positions[tmp_cnt][0],positions[tmp_cnt][1],positions[tmp_cnt][2]
img = impro.resize(imgs[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
mask = impro.resize(masks[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
impro.imwrite(os.path.join(origindir,'%05d'%(i+1)+'.jpg'), img)
impro.imwrite(os.path.join(maskdir,'%05d'%(i+1)+'.png'), mask)
# x_tmp,y_tmp,size_tmp
# for imagepath in imagepaths:
# img = impro.imread(imagepath)
# mask,x,y,halfsize,area = runmodel.get_ROI_position(img,net,opt,keepsize=True)
# if halfsize>opt.minsize//4:
# if not opt.all_mosaic_area:
# mask_avg = impro.find_mostlikely_ROI(mask_avg)
# x,y,size,area = impro.boundingSquare(mask_avg,Ex_mul=ex_mul)
# img = impro.resize(imgs[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
# mask = impro.resize(masks[i][y-size:y+size,x-size:x+size],opt.outsize,interpolation=cv2.INTER_CUBIC)
# impro.imwrite(os.path.join(origindir,'%05d'%(i+1)+'.jpg'), img)
# impro.imwrite(os.path.join(maskdir,'%05d'%(i+1)+'.png'), mask)
result_cnt+=1
......@@ -124,5 +160,5 @@ for videopath in videopaths:
util.writelog(os.path.join(opt.savedir,'opt.txt'),
videopath+'\n'+str(result_cnt)+'\n'+str(e))
video_cnt +=1
if opt.use_gpu != -1:
if opt.gpu_id != '-1':
torch.cuda.empty_cache()
import numpy as np
import torch
import torch.nn as nn
from .pix2pixHD_model import *
from .model_util import *
from models import model_util
class UpBlock(nn.Module):
def __init__(self, in_channel, out_channel, kernel_size=3, padding=1):
super().__init__()
self.convup = nn.Sequential(
nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
nn.ReflectionPad2d(padding),
# EqualConv2d(out_channel, out_channel, kernel_size, padding=padding),
SpectralNorm(nn.Conv2d(in_channel, out_channel, kernel_size)),
nn.LeakyReLU(0.2),
# Blur(out_channel),
)
def forward(self, input):
outup = self.convup(input)
return outup
class Encoder2d(nn.Module):
def __init__(self, input_nc, ngf=64, n_downsampling=3, activation = nn.LeakyReLU(0.2)):
super(Encoder2d, self).__init__()
model = [nn.ReflectionPad2d(3), SpectralNorm(nn.Conv2d(input_nc, ngf, kernel_size=7, padding=0)), activation]
### downsample
for i in range(n_downsampling):
mult = 2**i
model += [ nn.ReflectionPad2d(1),
SpectralNorm(nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=0)),
activation]
self.model = nn.Sequential(*model)
def forward(self, input):
return self.model(input)
class Encoder3d(nn.Module):
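# 3D-conv encoder over the (B,C,T,H,W) clip; with n_downsampling=3, the three stride-2 convs
# shrink T from 5 to 1 and H,W by 8x, matching the 2D encoder's output resolution.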
def __init__(self, input_nc, ngf=64, n_downsampling=3, activation = nn.LeakyReLU(0.2)):
super(Encoder3d, self).__init__()
model = [SpectralNorm(nn.Conv3d(input_nc, ngf, kernel_size=3, padding=1)), activation]
### downsample
for i in range(n_downsampling):
mult = 2**i
model += [ SpectralNorm(nn.Conv3d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=1)),
activation]
self.model = nn.Sequential(*model)
def forward(self, input):
return self.model(input)
class BVDNet(nn.Module):
def __init__(self, N=2, n_downsampling=3, n_blocks=4, input_nc=3, output_nc=3,activation=nn.LeakyReLU(0.2)):
super(BVDNet, self).__init__()
ngf = 64
padding_type = 'reflect'
self.N = N
### encoder
self.encoder3d = Encoder3d(input_nc,64,n_downsampling,activation)
self.encoder2d = Encoder2d(input_nc,64,n_downsampling,activation)
### resnet blocks
self.blocks = []
mult = 2**n_downsampling
for i in range(n_blocks):
self.blocks += [ResnetBlockSpectralNorm(ngf * mult, padding_type=padding_type, activation=activation)]
self.blocks = nn.Sequential(*self.blocks)
### decoder
self.decoder = []
for i in range(n_downsampling):
mult = 2**(n_downsampling - i)
self.decoder += [UpBlock(ngf * mult, int(ngf * mult / 2))]
self.decoder += [nn.ReflectionPad2d(3), nn.Conv2d(ngf, output_nc, kernel_size=7, padding=0)]
self.decoder = nn.Sequential(*self.decoder)
self.limiter = nn.Tanh()
def forward(self, stream, previous):
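# stream: (B,3,T,H,W) clip of mosaic frames; previous: (B,3,H,W) previously generated frame.
# The clip's center frame stream[:,:,N] is kept as a residual shortcut, so the decoder only
# predicts a correction to it before the tanh limiter.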
this_shortcut = stream[:,:,self.N]
stream = self.encoder3d(stream)
stream = stream.reshape(stream.size(0),stream.size(1),stream.size(3),stream.size(4))
previous = self.encoder2d(previous)
x = stream + previous
x = self.blocks(x)
x = self.decoder(x)
x = x+this_shortcut
x = self.limiter(x)
return x
def define_G(N=2, n_blocks=1, gpu_id='-1'):
netG = BVDNet(N = N, n_blocks=n_blocks)
netG = model_util.todevice(netG,gpu_id)
netG.apply(model_util.init_weights)
return netG
################################Discriminator################################
def define_D(input_nc=6, ndf=64, n_layers_D=1, use_sigmoid=False, num_D=3, gpu_id='-1'):
netD = MultiscaleDiscriminator(input_nc, ndf, n_layers_D, use_sigmoid, num_D)
netD = model_util.todevice(netD,gpu_id)
netD.apply(model_util.init_weights)
return netD
class MultiscaleDiscriminator(nn.Module):
def __init__(self, input_nc, ndf=64, n_layers=3, use_sigmoid=False, num_D=3):
super(MultiscaleDiscriminator, self).__init__()
self.num_D = num_D
self.n_layers = n_layers
for i in range(num_D):
netD = NLayerDiscriminator(input_nc, ndf, n_layers, use_sigmoid)
setattr(self, 'layer'+str(i), netD.model)
self.downsample = nn.AvgPool2d(3, stride=2, padding=[1, 1], count_include_pad=False)
def singleD_forward(self, model, input):
return [model(input)]
def forward(self, input):
num_D = self.num_D
result = []
input_downsampled = input
for i in range(num_D):
model = getattr(self, 'layer'+str(num_D-1-i))
result.append(self.singleD_forward(model, input_downsampled))
if i != (num_D-1):
input_downsampled = self.downsample(input_downsampled)
return result
# Defines the PatchGAN discriminator with the specified arguments.
class NLayerDiscriminator(nn.Module):
def __init__(self, input_nc, ndf=64, n_layers=3, use_sigmoid=False):
super(NLayerDiscriminator, self).__init__()
self.n_layers = n_layers
kw = 4
padw = int(np.ceil((kw-1.0)/2))
sequence = [[nn.Conv2d(input_nc, ndf, kernel_size=kw, stride=2, padding=padw), nn.LeakyReLU(0.2)]]
nf = ndf
for n in range(1, n_layers):
nf_prev = nf
nf = min(nf * 2, 512)
sequence += [[
SpectralNorm(nn.Conv2d(nf_prev, nf, kernel_size=kw, stride=2, padding=padw)),
nn.LeakyReLU(0.2)
]]
nf_prev = nf
nf = min(nf * 2, 512)
sequence += [[
SpectralNorm(nn.Conv2d(nf_prev, nf, kernel_size=kw, stride=1, padding=padw)),
nn.LeakyReLU(0.2)
]]
sequence += [[nn.Conv2d(nf, 1, kernel_size=kw, stride=1, padding=padw)]]
if use_sigmoid:
sequence += [[nn.Sigmoid()]]
sequence_stream = []
for n in range(len(sequence)):
sequence_stream += sequence[n]
self.model = nn.Sequential(*sequence_stream)
def forward(self, input):
return self.model(input)
class GANLoss(nn.Module):
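# Hinge GAN loss wrapper: mode 'D' applies HingeLossD(fake, real), mode 'G' applies HingeLossG(fake).
# With multiscale discriminator outputs (lists), the G losses are summed with geometrically decaying weights.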
def __init__(self, mode='D'):
super(GANLoss, self).__init__()
if mode == 'D':
self.lossf = model_util.HingeLossD()
elif mode == 'G':
self.lossf = model_util.HingeLossG()
self.mode = mode
def forward(self, dis_fake = None, dis_real = None):
if isinstance(dis_fake, list):
if self.mode == 'D':
loss = 0
for i in range(len(dis_fake)):
loss += self.lossf(dis_fake[i][-1],dis_real[i][-1])
elif self.mode =='G':
loss = 0
weight = 2**len(dis_fake)
for i in range(len(dis_fake)):
weight = weight/2
loss += weight*self.lossf(dis_fake[i][-1])
return loss
else:
if self.mode == 'D':
return self.lossf(dis_fake[-1],dis_real[-1])
elif self.mode =='G':
return self.lossf(dis_fake[-1])
......@@ -2,7 +2,7 @@
import torch.nn as nn
import torch
import torch.nn.functional as F
from . import components
from . import model_util
import warnings
warnings.filterwarnings(action='ignore')
......@@ -43,7 +43,7 @@ class DiceLoss(nn.Module):
class resnet18(torch.nn.Module):
def __init__(self, pretrained=True):
super().__init__()
self.features = components.resnet18(pretrained=pretrained)
self.features = model_util.resnet18(pretrained=pretrained)
self.conv1 = self.features.conv1
self.bn1 = self.features.bn1
self.relu = self.features.relu
......@@ -70,7 +70,7 @@ class resnet18(torch.nn.Module):
class resnet101(torch.nn.Module):
def __init__(self, pretrained=True):
super().__init__()
self.features = components.resnet101(pretrained=pretrained)
self.features = model_util.resnet101(pretrained=pretrained)
self.conv1 = self.features.conv1
self.bn1 = self.features.bn1
self.relu = self.features.relu
......
import torch
from .pix2pix_model import define_G
from .pix2pixHD_model import define_G as define_G_HD
from .unet_model import UNet
from .video_model import MosaicNet
from .videoHD_model import MosaicNet as MosaicNet_HD
from . import model_util
from .pix2pix_model import define_G as pix2pix_G
from .pix2pixHD_model import define_G as pix2pixHD_G
# from .video_model import MosaicNet
# from .videoHD_model import MosaicNet as MosaicNet_HD
from .BiSeNet_model import BiSeNet
from .BVDNet import define_G as video_G
def show_paramsnumber(net,netname='net'):
parameters = sum(param.numel() for param in net.parameters())
parameters = round(parameters/1e6,2)
print(netname+' parameters: '+str(parameters)+'M')
def __patch_instance_norm_state_dict(state_dict, module, keys, i=0):
"""Fix InstanceNorm checkpoints incompatibility (prior to 0.4)"""
key = keys[i]
if i + 1 == len(keys): # at the end, pointing to a parameter/buffer
if module.__class__.__name__.startswith('InstanceNorm') and \
(key == 'running_mean' or key == 'running_var'):
if getattr(module, key) is None:
state_dict.pop('.'.join(keys))
if module.__class__.__name__.startswith('InstanceNorm') and \
(key == 'num_batches_tracked'):
state_dict.pop('.'.join(keys))
else:
__patch_instance_norm_state_dict(state_dict, getattr(module, key), keys, i + 1)
def pix2pix(opt):
# print(opt.model_path,opt.netG)
if opt.netG == 'HD':
netG = define_G_HD(3, 3, 64, 'global' ,4)
netG = pix2pixHD_G(3, 3, 64, 'global' ,4)
else:
netG = define_G(3, 3, 64, opt.netG, norm='batch',use_dropout=True, init_type='normal', gpu_ids=[])
netG = pix2pix_G(3, 3, 64, opt.netG, norm='batch',use_dropout=True, init_type='normal', gpu_ids=[])
show_paramsnumber(netG,'netG')
netG.load_state_dict(torch.load(opt.model_path))
netG = model_util.todevice(netG,opt.gpu_id)
netG.eval()
if opt.use_gpu != -1:
netG.cuda()
return netG
def style(opt):
if opt.edges:
netG = define_G(1, 3, 64, 'resnet_9blocks', norm='instance',use_dropout=True, init_type='normal', gpu_ids=[])
netG = pix2pix_G(1, 3, 64, 'resnet_9blocks', norm='instance',use_dropout=True, init_type='normal', gpu_ids=[])
else:
netG = define_G(3, 3, 64, 'resnet_9blocks', norm='instance',use_dropout=False, init_type='normal', gpu_ids=[])
netG = pix2pix_G(3, 3, 64, 'resnet_9blocks', norm='instance',use_dropout=False, init_type='normal', gpu_ids=[])
# in order to load the old pretrained model
#https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/models/base_model.py
......@@ -57,23 +43,19 @@ def style(opt):
# patch InstanceNorm checkpoints prior to 0.4
for key in list(state_dict.keys()): # need to copy keys here because we mutate in loop
__patch_instance_norm_state_dict(state_dict, netG, key.split('.'))
model_util.patch_instance_norm_state_dict(state_dict, netG, key.split('.'))
netG.load_state_dict(state_dict)
if opt.use_gpu != -1:
netG.cuda()
netG = model_util.todevice(netG,opt.gpu_id)
netG.eval()
return netG
def video(opt):
if 'HD' in opt.model_path:
netG = MosaicNet_HD(3*25+1, 3, norm='instance')
else:
netG = MosaicNet(3*25+1, 3,norm = 'batch')
netG = video_G(N=2,n_blocks=4,gpu_id=opt.gpu_id)
show_paramsnumber(netG,'netG')
netG.load_state_dict(torch.load(opt.model_path))
netG = model_util.todevice(netG,opt.gpu_id)
netG.eval()
if opt.use_gpu != -1:
netG.cuda()
return netG
def bisenet(opt,type='roi'):
......@@ -86,7 +68,6 @@ def bisenet(opt,type='roi'):
net.load_state_dict(torch.load(opt.model_path))
elif type == 'mosaic':
net.load_state_dict(torch.load(opt.mosaic_position_model_path))
net = model_util.todevice(net,opt.gpu_id)
net.eval()
if opt.use_gpu != -1:
net.cuda()
return net
......@@ -7,11 +7,11 @@ from util import data
import torch
import numpy as np
def run_segment(img,net,size = 360,use_gpu = 0):
def run_segment(img,net,size = 360,gpu_id = '-1'):
img = impro.resize(img,size)
img = data.im2tensor(img,use_gpu = use_gpu, bgr2rgb = False,use_transform = False , is0_1 = True)
img = data.im2tensor(img,gpu_id = gpu_id, bgr2rgb = False, is0_1 = True)
mask = net(img)
mask = data.tensor2im(mask, gray=True,rgb2bgr = False, is0_1 = True)
mask = data.tensor2im(mask, gray=True, is0_1 = True)
return mask
def run_pix2pix(img,net,opt):
......@@ -19,7 +19,7 @@ def run_pix2pix(img,net,opt):
img = impro.resize(img,512)
else:
img = impro.resize(img,128)
img = data.im2tensor(img,use_gpu=opt.use_gpu)
img = data.im2tensor(img,gpu_id=opt.gpu_id)
img_fake = net(img)
img_fake = data.tensor2im(img_fake)
return img_fake
......@@ -50,18 +50,18 @@ def run_styletransfer(opt, net, img):
else:
canny_low = opt.canny-int(opt.canny/2)
canny_high = opt.canny+int(opt.canny/2)
img = cv2.Canny(img,opt.canny-50,opt.canny+50)
img = cv2.Canny(img,canny_low,canny_high)
if opt.only_edges:
return img
img = data.im2tensor(img,use_gpu=opt.use_gpu,gray=True,use_transform = False,is0_1 = False)
img = data.im2tensor(img,gpu_id=opt.gpu_id,gray=True)
else:
img = data.im2tensor(img,use_gpu=opt.use_gpu,gray=False,use_transform = True)
img = data.im2tensor(img,gpu_id=opt.gpu_id)
img = net(img)
img = data.tensor2im(img)
return img
def get_ROI_position(img,net,opt,keepsize=True):
mask = run_segment(img,net,size=360,use_gpu = opt.use_gpu)
mask = run_segment(img,net,size=360,gpu_id = opt.gpu_id)
mask = impro.mask_threshold(mask,opt.mask_extend,opt.mask_threshold)
if keepsize:
mask = impro.resize_like(mask, img)
......@@ -70,7 +70,7 @@ def get_ROI_position(img,net,opt,keepsize=True):
def get_mosaic_position(img_origin,net_mosaic_pos,opt):
h,w = img_origin.shape[:2]
mask = run_segment(img_origin,net_mosaic_pos,size=360,use_gpu = opt.use_gpu)
mask = run_segment(img_origin,net_mosaic_pos,size=360,gpu_id = opt.gpu_id)
# mask_1 = mask.copy()
mask = impro.mask_threshold(mask,ex_mun=int(min(h,w)/20),threshold=opt.mask_threshold)
if not opt.all_mosaic_area:
......
import torch
import torch.nn as nn
import torch.nn.functional as F
from .pix2pixHD_model import *
class encoder_2d(nn.Module):
def __init__(self, input_nc, output_nc, ngf=64, n_downsampling=3, n_blocks=9, norm_layer=nn.BatchNorm2d,
padding_type='reflect'):
assert(n_blocks >= 0)
super(encoder_2d, self).__init__()
activation = nn.ReLU(True)
model = [nn.ReflectionPad2d(3), nn.Conv2d(input_nc, ngf, kernel_size=7, padding=0), norm_layer(ngf), activation]
### downsample
for i in range(n_downsampling):
mult = 2**i
model += [nn.ReflectionPad2d(1),nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=0),
norm_layer(ngf * mult * 2), activation]
self.model = nn.Sequential(*model)
def forward(self, input):
return self.model(input)
class decoder_2d(nn.Module):
def __init__(self, input_nc, output_nc, ngf=64, n_downsampling=3, n_blocks=9, norm_layer=nn.BatchNorm2d,
padding_type='reflect'):
assert(n_blocks >= 0)
super(decoder_2d, self).__init__()
activation = nn.ReLU(True)
model = []
### resnet blocks
mult = 2**n_downsampling
for i in range(n_blocks):
model += [ResnetBlock(ngf * mult, padding_type=padding_type, activation=activation, norm_layer=norm_layer)]
### upsample
for i in range(n_downsampling):
mult = 2**(n_downsampling - i)
# model += [ nn.Upsample(scale_factor = 2, mode='nearest'),
# nn.ReflectionPad2d(1),
# nn.Conv2d(ngf * mult, int(ngf * mult / 2),kernel_size=3, stride=1, padding=0),
# norm_layer(int(ngf * mult / 2)),
# nn.ReLU(True)]
model += [nn.ConvTranspose2d(ngf * mult, int(ngf * mult / 2), kernel_size=3, stride=2, padding=1, output_padding=1),
norm_layer(int(ngf * mult / 2)), activation]
model += [nn.ReflectionPad2d(3), nn.Conv2d(ngf, output_nc, kernel_size=7, padding=0), nn.Tanh()]
self.model = nn.Sequential(*model)
def forward(self, input):
return self.model(input)
class conv_3d(nn.Module):
def __init__(self,inchannel,outchannel,kernel_size=3,stride=2,padding=1,norm_layer_3d=nn.BatchNorm3d,use_bias=True):
super(conv_3d, self).__init__()
self.conv = nn.Sequential(
nn.Conv3d(inchannel, outchannel, kernel_size=kernel_size, stride=stride, padding=padding, bias=use_bias),
norm_layer_3d(outchannel),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = self.conv(x)
return x
class conv_2d(nn.Module):
def __init__(self,inchannel,outchannel,kernel_size=3,stride=1,padding=1,norm_layer_2d=nn.BatchNorm2d,use_bias=True):
super(conv_2d, self).__init__()
self.conv = nn.Sequential(
nn.ReflectionPad2d(padding),
nn.Conv2d(inchannel, outchannel, kernel_size=kernel_size, stride=stride, padding=0, bias=use_bias),
norm_layer_2d(outchannel),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = self.conv(x)
return x
class encoder_3d(nn.Module):
def __init__(self,in_channel,norm_layer_2d,norm_layer_3d,use_bias):
super(encoder_3d, self).__init__()
self.inconv = conv_3d(1, 64, 7, 2, 3,norm_layer_3d,use_bias)
self.down1 = conv_3d(64, 128, 3, 2, 1,norm_layer_3d,use_bias)
self.down2 = conv_3d(128, 256, 3, 2, 1,norm_layer_3d,use_bias)
self.down3 = conv_3d(256, 512, 3, 2, 1,norm_layer_3d,use_bias)
self.down4 = conv_3d(512, 1024, 3, 1, 1,norm_layer_3d,use_bias)
self.pool = nn.AvgPool3d((5,1,1))
# self.conver2d = nn.Sequential(
# nn.Conv2d(256*int(in_channel/4), 256, kernel_size=3, stride=1, padding=1, bias=use_bias),
# norm_layer_2d(256),
# nn.ReLU(inplace=True),
# )
def forward(self, x):
x = x.view(x.size(0),1,x.size(1),x.size(2),x.size(3))
x = self.inconv(x)
x = self.down1(x)
x = self.down2(x)
x = self.down3(x)
x = self.down4(x)
#print(x.size())
x = self.pool(x)
#print(x.size())
# torch.Size([1, 1024, 16, 16])
# torch.Size([1, 512, 5, 16, 16])
x = x.view(x.size(0),x.size(1),x.size(3),x.size(4))
# x = self.conver2d(x)
return x
class ALL(nn.Module):
    '''Fuse the 2D encoder (center frame + mask) with the 3D encoder (whole frame stream),
    then decode the merged features into a clean frame.'''
    def __init__(self, in_channel, out_channel,norm_layer_2d,norm_layer_3d,use_bias):
        super(ALL, self).__init__()
self.encoder_2d = encoder_2d(4,3,64,4,norm_layer=norm_layer_2d,padding_type='reflect')
self.encoder_3d = encoder_3d(in_channel,norm_layer_2d,norm_layer_3d,use_bias)
self.decoder_2d = decoder_2d(4,3,64,4,norm_layer=norm_layer_2d,padding_type='reflect')
# self.shortcut_cov = conv_2d(3,64,7,1,3,norm_layer_2d,use_bias)
self.merge1 = conv_2d(2048,1024,3,1,1,norm_layer_2d,use_bias)
# self.merge2 = nn.Sequential(
# conv_2d(128,64,3,1,1,norm_layer_2d,use_bias),
# nn.ReflectionPad2d(3),
# nn.Conv2d(64, out_channel, kernel_size=7, padding=0),
# nn.Tanh()
# )
def forward(self, x):
N = int((x.size()[1])/3)
x_2d = torch.cat((x[:,int((N-1)/2)*3:(int((N-1)/2)+1)*3,:,:], x[:,N-1:N,:,:]), 1)
#shortcut_2d = x[:,int((N-1)/2)*3:(int((N-1)/2)+1)*3,:,:]
x_2d = self.encoder_2d(x_2d)
x_3d = self.encoder_3d(x)
#x = x_2d + x_3d
x = torch.cat((x_2d,x_3d),1)
x = self.merge1(x)
x = self.decoder_2d(x)
#shortcut_2d = self.shortcut_cov(shortcut_2d)
#x = torch.cat((x,shortcut_2d),1)
#x = self.merge2(x)
return x
def MosaicNet(in_channel, out_channel, norm='batch'):
if norm == 'batch':
# norm_layer_2d = nn.BatchNorm2d
# norm_layer_3d = nn.BatchNorm3d
norm_layer_2d = functools.partial(nn.BatchNorm2d, affine=True, track_running_stats=True)
norm_layer_3d = functools.partial(nn.BatchNorm3d, affine=True, track_running_stats=True)
use_bias = False
elif norm == 'instance':
norm_layer_2d = functools.partial(nn.InstanceNorm2d, affine=False, track_running_stats=False)
norm_layer_3d = functools.partial(nn.InstanceNorm3d, affine=False, track_running_stats=False)
        use_bias = True
    else:
        raise NotImplementedError('normalization layer [%s] is not found' % norm)
    return ALL(in_channel, out_channel, norm_layer_2d, norm_layer_3d, use_bias)
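
# Minimal smoke test (an added sketch, not part of the original pipeline). It
# assumes N=25 input frames at 256x256, i.e. in_channel = 3*25+1 = 76 (25 stacked
# RGB frames plus one mosaic-mask channel); with a depth of 76, the four strided
# 3D convs reduce the temporal axis to 5, which AvgPool3d((5,1,1)) then collapses
# to 1. Other frame counts may not line up with that pooling.
if __name__ == '__main__':
    net = MosaicNet(in_channel=76, out_channel=3, norm='batch')
    x = torch.rand(2, 76, 256, 256)  # B, 3N+1, H, W
    print(net(x).shape)              # expected: torch.Size([2, 3, 256, 256])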
import functools
import torch
import torch.nn as nn
import torch.nn.functional as F
from .pix2pix_model import *
class encoder_2d(nn.Module):
"""Resnet-based generator that consists of Resnet blocks between a few downsampling/upsampling operations.
We adapt Torch code and idea from Justin Johnson's neural style transfer project(https://github.com/jcjohnson/fast-neural-style)
"""
def __init__(self, input_nc, output_nc, ngf=64, norm_layer=nn.BatchNorm2d, use_dropout=False, n_blocks=6, padding_type='reflect'):
"""Construct a Resnet-based generator
Parameters:
input_nc (int) -- the number of channels in input images
output_nc (int) -- the number of channels in output images
ngf (int) -- the number of filters in the last conv layer
norm_layer -- normalization layer
use_dropout (bool) -- if use dropout layers
n_blocks (int) -- the number of ResNet blocks
padding_type (str) -- the name of padding layer in conv layers: reflect | replicate | zero
"""
assert(n_blocks >= 0)
super(encoder_2d, self).__init__()
if type(norm_layer) == functools.partial:
use_bias = norm_layer.func == nn.InstanceNorm2d
else:
use_bias = norm_layer == nn.InstanceNorm2d
model = [nn.ReflectionPad2d(3),
nn.Conv2d(input_nc, ngf, kernel_size=7, padding=0, bias=use_bias),
norm_layer(ngf),
nn.ReLU(True)]
n_downsampling = 2
for i in range(n_downsampling): # add downsampling layers
mult = 2 ** i
model += [nn.Conv2d(ngf * mult, ngf * mult * 2, kernel_size=3, stride=2, padding=1, bias=use_bias),
norm_layer(ngf * mult * 2),
nn.ReLU(True)]
#torch.Size([1, 256, 32, 32])
self.model = nn.Sequential(*model)
def forward(self, input):
"""Standard forward"""
return self.model(input)
class decoder_2d(nn.Module):
"""Resnet-based generator that consists of Resnet blocks between a few downsampling/upsampling operations.
We adapt Torch code and idea from Justin Johnson's neural style transfer project(https://github.com/jcjohnson/fast-neural-style)
"""
def __init__(self, input_nc, output_nc, ngf=64, norm_layer=nn.BatchNorm2d, use_dropout=False, n_blocks=6, padding_type='reflect'):
"""Construct a Resnet-based generator
Parameters:
input_nc (int) -- the number of channels in input images
output_nc (int) -- the number of channels in output images
ngf (int) -- the number of filters in the last conv layer
norm_layer -- normalization layer
use_dropout (bool) -- if use dropout layers
n_blocks (int) -- the number of ResNet blocks
padding_type (str) -- the name of padding layer in conv layers: reflect | replicate | zero
"""
super(decoder_2d, self).__init__()
if type(norm_layer) == functools.partial:
use_bias = norm_layer.func == nn.InstanceNorm2d
else:
use_bias = norm_layer == nn.InstanceNorm2d
model = []
n_downsampling = 2
mult = 2 ** n_downsampling
for i in range(n_blocks): # add ResNet blocks
model += [ResnetBlock(ngf * mult, padding_type=padding_type, norm_layer=norm_layer, use_dropout=use_dropout, use_bias=use_bias)]
#torch.Size([1, 256, 32, 32])
for i in range(n_downsampling): # add upsampling layers
mult = 2 ** (n_downsampling - i)
# model += [nn.ConvTranspose2d(ngf * mult, int(ngf * mult / 2),
# kernel_size=3, stride=2,
# padding=1, output_padding=1,
# bias=use_bias),
# norm_layer(int(ngf * mult / 2)),
# nn.ReLU(True)]
#https://distill.pub/2016/deconv-checkerboard/
#https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix/issues/190
model += [ nn.Upsample(scale_factor = 2, mode='nearest'),
nn.ReflectionPad2d(1),
nn.Conv2d(ngf * mult, int(ngf * mult / 2),kernel_size=3, stride=1, padding=0),
norm_layer(int(ngf * mult / 2)),
nn.ReLU(True)]
# model += [nn.ReflectionPad2d(3)]
# model += [nn.Conv2d(ngf, output_nc, kernel_size=7, padding=0)]
# model += [nn.Tanh()]
# model += [nn.Sigmoid()]
self.model = nn.Sequential(*model)
def forward(self, input):
"""Standard forward"""
return self.model(input)
class conv_3d(nn.Module):
def __init__(self,inchannel,outchannel,kernel_size=3,stride=2,padding=1,norm_layer_3d=nn.BatchNorm3d,use_bias=True):
super(conv_3d, self).__init__()
self.conv = nn.Sequential(
nn.Conv3d(inchannel, outchannel, kernel_size=kernel_size, stride=stride, padding=padding, bias=use_bias),
norm_layer_3d(outchannel),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = self.conv(x)
return x
class conv_2d(nn.Module):
def __init__(self,inchannel,outchannel,kernel_size=3,stride=1,padding=1,norm_layer_2d=nn.BatchNorm2d,use_bias=True):
super(conv_2d, self).__init__()
self.conv = nn.Sequential(
nn.ReflectionPad2d(padding),
nn.Conv2d(inchannel, outchannel, kernel_size=kernel_size, stride=stride, padding=0, bias=use_bias),
norm_layer_2d(outchannel),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = self.conv(x)
return x
class encoder_3d(nn.Module):
def __init__(self,in_channel,norm_layer_2d,norm_layer_3d,use_bias):
super(encoder_3d, self).__init__()
self.down1 = conv_3d(1, 64, 3, 2, 1,norm_layer_3d,use_bias)
self.down2 = conv_3d(64, 128, 3, 2, 1,norm_layer_3d,use_bias)
self.down3 = conv_3d(128, 256, 3, 1, 1,norm_layer_3d,use_bias)
self.conver2d = nn.Sequential(
nn.Conv2d(256*int(in_channel/4), 256, kernel_size=3, stride=1, padding=1, bias=use_bias),
norm_layer_2d(256),
nn.ReLU(inplace=True),
)
def forward(self, x):
x = x.view(x.size(0),1,x.size(1),x.size(2),x.size(3))
x = self.down1(x)
x = self.down2(x)
x = self.down3(x)
x = x.view(x.size(0),x.size(1)*x.size(2),x.size(3),x.size(4))
x = self.conver2d(x)
return x
class ALL(nn.Module):
def __init__(self, in_channel, out_channel,norm_layer_2d,norm_layer_3d,use_bias):
super(ALL, self).__init__()
self.encoder_2d = encoder_2d(4,-1,64,norm_layer=norm_layer_2d,n_blocks=9)
self.encoder_3d = encoder_3d(in_channel,norm_layer_2d,norm_layer_3d,use_bias)
self.decoder_2d = decoder_2d(4,3,64,norm_layer=norm_layer_2d,n_blocks=9)
self.shortcut_cov = conv_2d(3,64,7,1,3,norm_layer_2d,use_bias)
self.merge1 = conv_2d(512,256,3,1,1,norm_layer_2d,use_bias)
self.merge2 = nn.Sequential(
conv_2d(128,64,3,1,1,norm_layer_2d,use_bias),
nn.ReflectionPad2d(3),
nn.Conv2d(64, out_channel, kernel_size=7, padding=0),
nn.Tanh()
)
def forward(self, x):
N = int((x.size()[1])/3)
x_2d = torch.cat((x[:,int((N-1)/2)*3:(int((N-1)/2)+1)*3,:,:], x[:,N-1:N,:,:]), 1)
shortcut_2d = x[:,int((N-1)/2)*3:(int((N-1)/2)+1)*3,:,:]
x_2d = self.encoder_2d(x_2d)
x_3d = self.encoder_3d(x)
x = torch.cat((x_2d,x_3d),1)
x = self.merge1(x)
x = self.decoder_2d(x)
shortcut_2d = self.shortcut_cov(shortcut_2d)
x = torch.cat((x,shortcut_2d),1)
x = self.merge2(x)
return x
def MosaicNet(in_channel, out_channel, norm='batch'):
if norm == 'batch':
# norm_layer_2d = nn.BatchNorm2d
# norm_layer_3d = nn.BatchNorm3d
norm_layer_2d = functools.partial(nn.BatchNorm2d, affine=True, track_running_stats=True)
norm_layer_3d = functools.partial(nn.BatchNorm3d, affine=True, track_running_stats=True)
use_bias = False
elif norm == 'instance':
norm_layer_2d = functools.partial(nn.InstanceNorm2d, affine=False, track_running_stats=False)
norm_layer_3d = functools.partial(nn.InstanceNorm3d, affine=False, track_running_stats=False)
        use_bias = True
    else:
        raise NotImplementedError('normalization layer [%s] is not found' % norm)
    return ALL(in_channel, out_channel, norm_layer_2d, norm_layer_3d, use_bias)
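
# The same kind of smoke test applies to this pix2pix-based variant (an added
# sketch). Here int(in_channel/4) must equal the temporal depth left after the
# two strided 3D convs, which holds for in_channel = 76 (N=25 frames plus the mask):
if __name__ == '__main__':
    net = MosaicNet(in_channel=76, out_channel=3, norm='instance')
    x = torch.rand(1, 76, 256, 256)
    print(net(x).shape)  # expected: torch.Size([1, 3, 256, 256])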
opencv_python==4.5.1.48
numpy==1.19.2
torchvision==0.8.2
torch==1.7.1
matplotlib==3.3.2
tensorboardX==2.2
scikit-image==0.17.2
import os
import sys
import traceback
import cv2
import numpy as np
try:
from cores import Options,core
from util import util
from util import image_processing as impro
from models import loadmodel
except Exception as e:
print(e)
input('Please press any key to exit.\n')
sys.exit(0)
# python server.py --gpu_id 0 --model_path ./pretrained_models/mosaic/clean_face_HD.pth
opt = Options()
opt.parser.add_argument('--port',type=int,default=4000, help='port for the HTTP server to listen on')
opt = opt.getparse(True)
netM = loadmodel.bisenet(opt,'mosaic')
netG = loadmodel.pix2pix(opt)
from flask import Flask, request
import base64
import shutil
app = Flask(__name__)
@app.route("/handle", methods=["POST"])
def handle():
    result = {}
    imgRec = ''
    # decode base64 to an opencv image
    try:
        imgRec = request.form['img']
        imgByte = base64.b64decode(imgRec)
        img_np_arr = np.frombuffer(imgByte, np.uint8)
        img = cv2.imdecode(img_np_arr, cv2.IMREAD_COLOR)
    except Exception as e:
        result['img'] = imgRec  # stays '' if the form field itself was missing
        result['info'] = 'readfailed'
        return result
# run model
try:
if max(img.shape)>1080:
img = impro.resize(img,720,interpolation=cv2.INTER_CUBIC)
img = core.cleanmosaic_img_server(opt,img,netG,netM)
except Exception as e:
result['img'] = imgRec
result['info'] = 'procfailed'
return result
# return
imgbytes = cv2.imencode('.jpg', img)[1]
imgString = base64.b64encode(imgbytes).decode('utf-8')
result['img'] = imgString
result['info'] = 'ok'
return result
app.run("0.0.0.0", port= opt.port, debug=opt.debug)
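
# A possible client for the '/handle' endpoint above (an added sketch; the server
# address and 'test.jpg' are assumptions):
#
#   import base64, requests
#   with open('test.jpg', 'rb') as f:
#       img_b64 = base64.b64encode(f.read()).decode('utf-8')
#   res = requests.post('http://localhost:4000/handle', data={'img': img_b64}).json()
#   if res['info'] == 'ok':
#       with open('result.jpg', 'wb') as f:
#           f.write(base64.b64decode(res['img']))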
......@@ -54,10 +54,10 @@ util.makedirs(dir_checkpoint)
util.writelog(os.path.join(dir_checkpoint,'loss.txt'),
str(time.asctime(time.localtime(time.time())))+'\n'+util.opt2str(opt))
def Totensor(img,use_gpu=True):
def Totensor(img,gpu_id=True):
size=img.shape[0]
img = torch.from_numpy(img).float()
if opt.use_gpu != -1:
if opt.gpu_id != -1:
img = img.cuda()
return img
......@@ -68,11 +68,11 @@ def loadimage(imagepaths,maskpaths,opt,test_flag = False):
for i in range(len(imagepaths)):
img = impro.resize(impro.imread(imagepaths[i]),opt.loadsize)
mask = impro.resize(impro.imread(maskpaths[i],mod = 'gray'),opt.loadsize)
img,mask = data.random_transform_image(img, mask, opt.finesize, test_flag)
img,mask = data.random_transform_pair_image(img, mask, opt.finesize, test_flag)
images[i] = (img.transpose((2, 0, 1))/255.0)
masks[i] = (mask.reshape(1,1,opt.finesize,opt.finesize)/255.0)
images = Totensor(images,opt.use_gpu)
masks = Totensor(masks,opt.use_gpu)
images = data.to_tensor(images,opt.gpu_id)
masks = data.to_tensor(masks,opt.gpu_id)
return images,masks
......@@ -111,7 +111,7 @@ if opt.continue_train:
f = open(os.path.join(dir_checkpoint,'epoch_log.txt'),'r')
opt.startepoch = int(f.read())
f.close()
if opt.use_gpu != -1:
if opt.gpu_id != -1:
net.cuda()
cudnn.benchmark = True
......@@ -135,7 +135,7 @@ for epoch in range(opt.startepoch,opt.maxepoch):
starttime = datetime.datetime.now()
util.writelog(os.path.join(dir_checkpoint,'loss.txt'),'Epoch {}/{}.'.format(epoch + 1, opt.maxepoch),True)
net.train()
if opt.use_gpu != -1:
if opt.gpu_id != -1:
net.cuda()
epoch_loss = 0
for i in range(int(img_num*0.8/opt.batchsize)):
......
This diff is collapsed.
import random
import os
from util.mosaic import get_random_parameter
import numpy as np
import torch
import torchvision.transforms as transforms
import cv2
from . import image_processing as impro
from . import mosaic
transform = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize(mean = (0.5, 0.5, 0.5), std = (0.5, 0.5, 0.5))
]
)
def tensor2im(image_tensor, imtype=np.uint8, gray=False, rgb2bgr = True ,is0_1 = False):
from . import degradater
def to_tensor(data,gpu_id):
data = torch.from_numpy(data)
if gpu_id != '-1':
data = data.cuda()
return data
def normalize(data):
'''
normalize to -1 ~ 1
'''
return (data.astype(np.float32)/255.0-0.5)/0.5
def anti_normalize(data):
return np.clip((data*0.5+0.5)*255,0,255).astype(np.uint8)
def tensor2im(image_tensor, gray=False, rgb2bgr = True ,is0_1 = False, batch_index=0):
image_tensor =image_tensor.data
image_numpy = image_tensor[0].cpu().float().numpy()
image_numpy = image_tensor[batch_index].cpu().float().numpy()
if not is0_1:
image_numpy = (image_numpy + 1)/2.0
......@@ -24,7 +35,7 @@ def tensor2im(image_tensor, imtype=np.uint8, gray=False, rgb2bgr = True ,is0_1 =
if gray:
h, w = image_numpy.shape[1:]
image_numpy = image_numpy.reshape(h,w)
return image_numpy.astype(imtype)
return image_numpy.astype(np.uint8)
# output 3ch
if image_numpy.shape[0] == 1:
......@@ -32,11 +43,10 @@ def tensor2im(image_tensor, imtype=np.uint8, gray=False, rgb2bgr = True ,is0_1 =
image_numpy = image_numpy.transpose((1, 2, 0))
if rgb2bgr and not gray:
image_numpy = image_numpy[...,::-1]-np.zeros_like(image_numpy)
return image_numpy.astype(imtype)
return image_numpy.astype(np.uint8)
def im2tensor(image_numpy, imtype=np.uint8, gray=False,bgr2rgb = True, reshape = True, use_gpu = 0, use_transform = True,is0_1 = True):
def im2tensor(image_numpy, gray=False,bgr2rgb = True, reshape = True, gpu_id = '-1',is0_1 = False):
if gray:
h, w = image_numpy.shape
image_numpy = (image_numpy/255.0-0.5)/0.5
......@@ -47,18 +57,15 @@ def im2tensor(image_numpy, imtype=np.uint8, gray=False,bgr2rgb = True, reshape =
h, w ,ch = image_numpy.shape
if bgr2rgb:
image_numpy = image_numpy[...,::-1]-np.zeros_like(image_numpy)
if use_transform:
image_tensor = transform(image_numpy)
if is0_1:
image_numpy = image_numpy/255.0
else:
if is0_1:
image_numpy = image_numpy/255.0
else:
image_numpy = (image_numpy/255.0-0.5)/0.5
image_numpy = image_numpy.transpose((2, 0, 1))
image_tensor = torch.from_numpy(image_numpy).float()
image_numpy = (image_numpy/255.0-0.5)/0.5
image_numpy = image_numpy.transpose((2, 0, 1))
image_tensor = torch.from_numpy(image_numpy).float()
if reshape:
image_tensor = image_tensor.reshape(1,ch,h,w)
if use_gpu != -1:
if gpu_id != '-1':
image_tensor = image_tensor.cuda()
return image_tensor
......@@ -68,53 +75,7 @@ def shuffledata(data,target):
np.random.set_state(state)
np.random.shuffle(target)
def random_transform_video(src,target,finesize,N):
#random blur
if random.random()<0.2:
h,w = src.shape[:2]
src = src[:8*(h//8),:8*(w//8)]
Q_ran = random.randint(1,15)
src[:,:,:3*N] = impro.dctblur(src[:,:,:3*N],Q_ran)
target = impro.dctblur(target,Q_ran)
#random crop
h,w = target.shape[:2]
h_move = int((h-finesize)*random.random())
w_move = int((w-finesize)*random.random())
target = target[h_move:h_move+finesize,w_move:w_move+finesize,:]
src = src[h_move:h_move+finesize,w_move:w_move+finesize,:]
#random flip
if random.random()<0.5:
src = src[:,::-1,:]
target = target[:,::-1,:]
#random color
alpha = random.uniform(-0.1,0.1)
beta = random.uniform(-0.1,0.1)
b = random.uniform(-0.05,0.05)
g = random.uniform(-0.05,0.05)
r = random.uniform(-0.05,0.05)
for i in range(N):
src[:,:,i*3:(i+1)*3] = impro.color_adjust(src[:,:,i*3:(i+1)*3],alpha,beta,b,g,r)
target = impro.color_adjust(target,alpha,beta,b,g,r)
#random resize blur
if random.random()<0.5:
interpolations = [cv2.INTER_LINEAR,cv2.INTER_CUBIC,cv2.INTER_LANCZOS4]
size_ran = random.uniform(0.7,1.5)
interpolation_up = interpolations[random.randint(0,2)]
interpolation_down =interpolations[random.randint(0,2)]
tmp = cv2.resize(src[:,:,:3*N], (int(finesize*size_ran),int(finesize*size_ran)),interpolation=interpolation_up)
src[:,:,:3*N] = cv2.resize(tmp, (finesize,finesize),interpolation=interpolation_down)
tmp = cv2.resize(target, (int(finesize*size_ran),int(finesize*size_ran)),interpolation=interpolation_up)
target = cv2.resize(tmp, (finesize,finesize),interpolation=interpolation_down)
return src,target
def random_transform_single(img,out_shape):
def random_transform_single_mask(img,out_shape):
out_h,out_w = out_shape
img = cv2.resize(img,(int(out_w*random.uniform(1.1, 1.5)),int(out_h*random.uniform(1.1, 1.5))))
h,w = img.shape[:2]
......@@ -130,90 +91,65 @@ def random_transform_single(img,out_shape):
img = cv2.resize(img,(out_w,out_h))
return img
def random_transform_image(img,mask,finesize,test_flag = False):
#random scale
if random.random()<0.5:
h,w = img.shape[:2]
loadsize = min((h,w))
a = (float(h)/float(w))*random.uniform(0.9, 1.1)
if h<w:
mask = cv2.resize(mask, (int(loadsize/a),loadsize))
img = cv2.resize(img, (int(loadsize/a),loadsize))
else:
mask = cv2.resize(mask, (loadsize,int(loadsize*a)))
img = cv2.resize(img, (loadsize,int(loadsize*a)))
#random crop
h,w = img.shape[:2]
h_move = int((h-finesize)*random.random())
w_move = int((w-finesize)*random.random())
img_crop = img[h_move:h_move+finesize,w_move:w_move+finesize]
mask_crop = mask[h_move:h_move+finesize,w_move:w_move+finesize]
def get_transform_params():
crop_flag = True
rotat_flag = np.random.random()<0.2
color_flag = True
flip_flag = np.random.random()<0.2
degradate_flag = np.random.random()<0.5
flag_dict = {'crop':crop_flag,'rotat':rotat_flag,'color':color_flag,'flip':flip_flag,'degradate':degradate_flag}
crop_rate = [np.random.random(),np.random.random()]
rotat_rate = np.random.random()
color_rate = [np.random.uniform(-0.05,0.05),np.random.uniform(-0.05,0.05),np.random.uniform(-0.05,0.05),
np.random.uniform(-0.05,0.05),np.random.uniform(-0.05,0.05)]
flip_rate = np.random.random()
degradate_params = degradater.get_random_degenerate_params(mod='weaker_2')
rate_dict = {'crop':crop_rate,'rotat':rotat_rate,'color':color_rate,'flip':flip_rate,'degradate':degradate_params}
return {'flag':flag_dict,'rate':rate_dict}
def random_transform_single_image(img,finesize,params=None,test_flag = False):
if params is None:
params = get_transform_params()
if params['flag']['degradate']:
img = degradater.degradate(img,params['rate']['degradate'])
if test_flag:
return img_crop,mask_crop
if params['flag']['crop']:
h,w = img.shape[:2]
h_move = int((h-finesize)*params['rate']['crop'][0])
w_move = int((w-finesize)*params['rate']['crop'][1])
img = img[h_move:h_move+finesize,w_move:w_move+finesize]
#random rotation
if random.random()<0.2:
h,w = img_crop.shape[:2]
M = cv2.getRotationMatrix2D((w/2,h/2),90*int(4*random.random()),1)
img = cv2.warpAffine(img_crop,M,(w,h))
mask = cv2.warpAffine(mask_crop,M,(w,h))
else:
img,mask = img_crop,mask_crop
if test_flag:
return img
#random color
img = impro.color_adjust(img,ran=True)
if params['flag']['rotat']:
h,w = img.shape[:2]
M = cv2.getRotationMatrix2D((w/2,h/2),90*int(4*params['rate']['rotat']),1)
img = cv2.warpAffine(img,M,(w,h))
#random flip
if random.random()<0.5:
if random.random()<0.5:
img = img[:,::-1,:]
mask = mask[:,::-1]
else:
img = img[::-1,:,:]
mask = mask[::-1,:]
if params['flag']['color']:
img = impro.color_adjust(img,params['rate']['color'][0],params['rate']['color'][1],
params['rate']['color'][2],params['rate']['color'][3],params['rate']['color'][4])
if params['flag']['flip']:
img = img[:,::-1]
#random blur
if random.random()<0.5:
img = impro.dctblur(img,random.randint(1,15))
# interpolations = [cv2.INTER_LINEAR,cv2.INTER_CUBIC,cv2.INTER_LANCZOS4]
# size_ran = random.uniform(0.7,1.5)
# img = cv2.resize(img, (int(finesize*size_ran),int(finesize*size_ran)),interpolation=interpolations[random.randint(0,2)])
# img = cv2.resize(img, (finesize,finesize),interpolation=interpolations[random.randint(0,2)])
#check shape
if img.shape[0]!= finesize or img.shape[1]!= finesize or mask.shape[0]!= finesize or mask.shape[1]!= finesize:
if img.shape[0]!= finesize or img.shape[1]!= finesize:
img = cv2.resize(img,(finesize,finesize))
mask = cv2.resize(mask,(finesize,finesize))
print('warning! shape error.')
return img,mask
def load_train_video(videoname,img_index,opt):
N = opt.N
input_img = np.zeros((opt.loadsize,opt.loadsize,3*N+1), dtype='uint8')
# this frame
this_mask = impro.imread(os.path.join(opt.dataset,videoname,'mask','%05d'%(img_index)+'.png'),'gray',loadsize=opt.loadsize)
input_img[:,:,-1] = this_mask
#print(os.path.join(opt.dataset,videoname,'origin_image','%05d'%(img_index)+'.jpg'))
ground_true = impro.imread(os.path.join(opt.dataset,videoname,'origin_image','%05d'%(img_index)+'.jpg'),loadsize=opt.loadsize)
mosaic_size,mod,rect_rat,feather = mosaic.get_random_parameter(ground_true,this_mask)
start_pos = mosaic.get_random_startpos(num=N,bisa_p=0.3,bisa_max=mosaic_size,bisa_max_part=3)
# merge other frame
for i in range(0,N):
img = impro.imread(os.path.join(opt.dataset,videoname,'origin_image','%05d'%(img_index+i-int(N/2))+'.jpg'),loadsize=opt.loadsize)
mask = impro.imread(os.path.join(opt.dataset,videoname,'mask','%05d'%(img_index+i-int(N/2))+'.png'),'gray',loadsize=opt.loadsize)
img_mosaic = mosaic.addmosaic_base(img, mask, mosaic_size,model = mod,rect_rat=rect_rat,feather=feather,start_point=start_pos[i])
input_img[:,:,i*3:(i+1)*3] = img_mosaic
# to tensor
input_img,ground_true = random_transform_video(input_img,ground_true,opt.finesize,N)
input_img = im2tensor(input_img,bgr2rgb=False,use_gpu=-1,use_transform = False,is0_1=False)
ground_true = im2tensor(ground_true,bgr2rgb=False,use_gpu=-1,use_transform = False,is0_1=False)
return input_img,ground_true
return img
def random_transform_pair_image(img,mask,finesize,test_flag = False):
params = get_transform_params()
img = random_transform_single_image(img,finesize,params)
params['flag']['degradate'] = False
params['flag']['color'] = False
mask = random_transform_single_image(mask,finesize,params)
return img,mask
def showresult(img1,img2,img3,name,is0_1 = False):
size = img1.shape[3]
......
import os
import random
import numpy as np
from multiprocessing import Process, Queue
from . import image_processing as impro
from . import mosaic,data
class VideoLoader(object):
    """Load a single video (converted to image frames).
    How to use:
    1. Init VideoLoader as loader
    2. Read the current clip from loader.ori_stream / loader.mosaic_stream
    3. Call loader.next() to advance to the next clip
    """
def __init__(self, opt, video_dir, test_flag=False):
super(VideoLoader, self).__init__()
self.opt = opt
self.test_flag = test_flag
self.video_dir = video_dir
self.t = 0
self.n_iter = self.opt.M -self.opt.S*(self.opt.T+1)
self.transform_params = data.get_transform_params()
self.ori_load_pool = []
self.mosaic_load_pool = []
self.previous_pred = None
feg_ori = impro.imread(os.path.join(video_dir,'origin_image','00001.jpg'),loadsize=self.opt.loadsize,rgb=True)
feg_mask = impro.imread(os.path.join(video_dir,'mask','00001.png'),mod='gray',loadsize=self.opt.loadsize)
self.mosaic_size,self.mod,self.rect_rat,self.feather = mosaic.get_random_parameter(feg_ori,feg_mask)
self.startpos = [random.randint(0,self.mosaic_size),random.randint(0,self.mosaic_size)]
self.loadsize = self.opt.loadsize
#Init load pool
for i in range(self.opt.S*self.opt.T):
_ori_img = impro.imread(os.path.join(video_dir,'origin_image','%05d' % (i+1)+'.jpg'),loadsize=self.loadsize,rgb=True)
_mask = impro.imread(os.path.join(video_dir,'mask','%05d' % (i+1)+'.png' ),mod='gray',loadsize=self.loadsize)
_mosaic_img = mosaic.addmosaic_base(_ori_img, _mask, self.mosaic_size,0, self.mod,self.rect_rat,self.feather,self.startpos)
_ori_img = data.random_transform_single_image(_ori_img,opt.finesize,self.transform_params)
_mosaic_img = data.random_transform_single_image(_mosaic_img,opt.finesize,self.transform_params)
self.ori_load_pool.append(self.normalize(_ori_img))
self.mosaic_load_pool.append(self.normalize(_mosaic_img))
self.ori_load_pool = np.array(self.ori_load_pool)
self.mosaic_load_pool = np.array(self.mosaic_load_pool)
        #Init first stream
self.ori_stream = self.ori_load_pool [np.linspace(0, (self.opt.T-1)*self.opt.S,self.opt.T,dtype=np.int64)].copy()
self.mosaic_stream = self.mosaic_load_pool[np.linspace(0, (self.opt.T-1)*self.opt.S,self.opt.T,dtype=np.int64)].copy()
# stream B,T,H,W,C -> B,C,T,H,W
self.ori_stream = self.ori_stream.reshape (1,self.opt.T,opt.finesize,opt.finesize,3).transpose((0,4,1,2,3))
self.mosaic_stream = self.mosaic_stream.reshape(1,self.opt.T,opt.finesize,opt.finesize,3).transpose((0,4,1,2,3))
        #Init first previous frame
self.previous_pred = self.ori_load_pool[self.opt.S*self.opt.N-1].copy()
# previous B,C,H,W
self.previous_pred = self.previous_pred.reshape(1,opt.finesize,opt.finesize,3).transpose((0,3,1,2))
def normalize(self,data):
'''
normalize to -1 ~ 1
'''
return (data.astype(np.float32)/255.0-0.5)/0.5
def anti_normalize(self,data):
return np.clip((data*0.5+0.5)*255,0,255).astype(np.uint8)
def next(self):
# random
if np.random.random()<0.05:
self.startpos = [random.randint(0,self.mosaic_size),random.randint(0,self.mosaic_size)]
if np.random.random()<0.02:
self.transform_params['rate']['crop'] = [np.random.random(),np.random.random()]
if np.random.random()<0.02:
self.loadsize = np.random.randint(self.opt.finesize,self.opt.loadsize)
if self.t != 0:
self.previous_pred = None
self.ori_load_pool [:self.opt.S*self.opt.T-1] = self.ori_load_pool [1:self.opt.S*self.opt.T]
self.mosaic_load_pool[:self.opt.S*self.opt.T-1] = self.mosaic_load_pool[1:self.opt.S*self.opt.T]
#print(os.path.join(self.video_dir,'origin_image','%05d' % (self.opt.S*self.opt.T+self.t)+'.jpg'))
_ori_img = impro.imread(os.path.join(self.video_dir,'origin_image','%05d' % (self.opt.S*self.opt.T+self.t)+'.jpg'),loadsize=self.loadsize,rgb=True)
_mask = impro.imread(os.path.join(self.video_dir,'mask','%05d' % (self.opt.S*self.opt.T+self.t)+'.png' ),mod='gray',loadsize=self.loadsize)
_mosaic_img = mosaic.addmosaic_base(_ori_img, _mask, self.mosaic_size,0, self.mod,self.rect_rat,self.feather,self.startpos)
_ori_img = data.random_transform_single_image(_ori_img,self.opt.finesize,self.transform_params)
_mosaic_img = data.random_transform_single_image(_mosaic_img,self.opt.finesize,self.transform_params)
_ori_img,_mosaic_img = self.normalize(_ori_img),self.normalize(_mosaic_img)
self.ori_load_pool [self.opt.S*self.opt.T-1] = _ori_img
self.mosaic_load_pool[self.opt.S*self.opt.T-1] = _mosaic_img
self.ori_stream = self.ori_load_pool [np.linspace(0, (self.opt.T-1)*self.opt.S,self.opt.T,dtype=np.int64)].copy()
self.mosaic_stream = self.mosaic_load_pool[np.linspace(0, (self.opt.T-1)*self.opt.S,self.opt.T,dtype=np.int64)].copy()
# stream B,T,H,W,C -> B,C,T,H,W
self.ori_stream = self.ori_stream.reshape (1,self.opt.T,self.opt.finesize,self.opt.finesize,3).transpose((0,4,1,2,3))
self.mosaic_stream = self.mosaic_stream.reshape(1,self.opt.T,self.opt.finesize,self.opt.finesize,3).transpose((0,4,1,2,3))
self.t += 1
class VideoDataLoader(object):
"""VideoDataLoader"""
def __init__(self, opt, videolist, test_flag=False):
super(VideoDataLoader, self).__init__()
self.videolist = []
self.opt = opt
self.test_flag = test_flag
for i in range(self.opt.n_epoch):
self.videolist += videolist.copy()
random.shuffle(self.videolist)
self.each_video_n_iter = self.opt.M -self.opt.S*(self.opt.T+1)
self.n_iter = len(self.videolist)//self.opt.load_thread//self.opt.batchsize*self.each_video_n_iter*self.opt.load_thread
self.queue = Queue(self.opt.load_thread)
self.ori_stream = np.zeros((self.opt.batchsize,3,self.opt.T,self.opt.finesize,self.opt.finesize),dtype=np.float32)# B,C,T,H,W
self.mosaic_stream = np.zeros((self.opt.batchsize,3,self.opt.T,self.opt.finesize,self.opt.finesize),dtype=np.float32)# B,C,T,H,W
self.previous_pred = np.zeros((self.opt.batchsize,3,self.opt.finesize,self.opt.finesize),dtype=np.float32)
self.load_init()
def load(self,videolist):
for load_video_iter in range(len(videolist)//self.opt.batchsize):
iter_videolist = videolist[load_video_iter*self.opt.batchsize:(load_video_iter+1)*self.opt.batchsize]
videoloaders = [VideoLoader(self.opt,os.path.join(self.opt.dataset,iter_videolist[i]),self.test_flag) for i in range(self.opt.batchsize)]
for each_video_iter in range(self.each_video_n_iter):
for i in range(self.opt.batchsize):
self.ori_stream[i] = videoloaders[i].ori_stream
self.mosaic_stream[i] = videoloaders[i].mosaic_stream
if each_video_iter == 0:
self.previous_pred[i] = videoloaders[i].previous_pred
videoloaders[i].next()
if each_video_iter == 0:
self.queue.put([self.ori_stream.copy(),self.mosaic_stream.copy(),self.previous_pred])
else:
self.queue.put([self.ori_stream.copy(),self.mosaic_stream.copy(),None])
def load_init(self):
ptvn = len(self.videolist)//self.opt.load_thread #pre_thread_video_num
for i in range(self.opt.load_thread):
p = Process(target=self.load,args=(self.videolist[i*ptvn:(i+1)*ptvn],))
p.daemon = True
p.start()
def get_data(self):
return self.queue.get()
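
# A possible training-loop usage (an added sketch; the opt fields follow the
# attributes referenced above):
#
#   loader = VideoDataLoader(opt, videolist)
#   for it in range(loader.n_iter):
#       ori_stream, mosaic_stream, previous = loader.get_data()
#       # ori_stream / mosaic_stream: (batchsize, 3, T, finesize, finesize)
#       # previous: (batchsize, 3, finesize, finesize) on the first clip of a video, else None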
'''
Image degradation utilities, adapted from https://github.com/sonack/GFRNet_pytorch_new
'''
import random
import cv2
import numpy as np
def gaussian_blur(img, sigma=3, size=13):
if sigma > 0:
if isinstance(size, int):
size = (size, size)
img = cv2.GaussianBlur(img, size, sigma)
return img
def down(img, scale, shape):
if scale > 1:
h, w, _ = shape
scaled_h, scaled_w = int(h / scale), int(w / scale)
img = cv2.resize(img, (scaled_w, scaled_h), interpolation = cv2.INTER_CUBIC)
return img
def up(img, scale, shape):
if scale > 1:
h, w, _ = shape
img = cv2.resize(img, (w, h), interpolation = cv2.INTER_CUBIC)
return img
def awgn(img, level):
if level > 0:
noise = np.random.randn(*img.shape) * level
img = (img + noise).clip(0,255).astype(np.uint8)
return img
def jpeg_compressor(img,quality):
if quality > 0: # 0 indicating no lossy compression (i.e losslessly compression)
encode_param = [int(cv2.IMWRITE_JPEG_QUALITY), quality]
img = cv2.imdecode(cv2.imencode('.jpg', img, encode_param)[1], 1)
return img
def get_random_degenerate_params(mod='strong'):
'''
mod : strong | only_downsample | only_4x | weaker_1 | weaker_2
'''
params = {}
gaussianBlur_size_list = list(range(3,14,2))
if mod == 'strong':
gaussianBlur_sigma_list = [1 + x for x in range(3)]
gaussianBlur_sigma_list += [0]
downsample_scale_list = [1 + x * 0.1 for x in range(0,71)]
awgn_level_list = list(range(1, 8, 1))
jpeg_quality_list = list(range(10, 41, 1))
jpeg_quality_list += int(len(jpeg_quality_list) * 0.33) * [0]
elif mod == 'only_downsample':
gaussianBlur_sigma_list = [0]
downsample_scale_list = [1 + x * 0.1 for x in range(0,71)]
awgn_level_list = [0]
jpeg_quality_list = [0]
elif mod == 'only_4x':
gaussianBlur_sigma_list = [0]
downsample_scale_list = [4]
awgn_level_list = [0]
jpeg_quality_list = [0]
elif mod == 'weaker_1': # 0.5 trigger prob
gaussianBlur_sigma_list = [1 + x for x in range(3)]
gaussianBlur_sigma_list += int(len(gaussianBlur_sigma_list)) * [0] # 1/2 trigger this degradation
downsample_scale_list = [1 + x * 0.1 for x in range(0,71)]
downsample_scale_list += int(len(downsample_scale_list)) * [1]
awgn_level_list = list(range(1, 8, 1))
awgn_level_list += int(len(awgn_level_list)) * [0]
jpeg_quality_list = list(range(10, 41, 1))
jpeg_quality_list += int(len(jpeg_quality_list)) * [0]
elif mod == 'weaker_2': # weaker than weaker_1, jpeg [20,40]
gaussianBlur_sigma_list = [1 + x for x in range(3)]
gaussianBlur_sigma_list += int(len(gaussianBlur_sigma_list)) * [0] # 1/2 trigger this degradation
downsample_scale_list = [1 + x * 0.1 for x in range(0,71)]
downsample_scale_list += int(len(downsample_scale_list)) * [1]
awgn_level_list = list(range(1, 8, 1))
awgn_level_list += int(len(awgn_level_list)) * [0]
jpeg_quality_list = list(range(20, 41, 1))
jpeg_quality_list += int(len(jpeg_quality_list)) * [0]
params['blur_sigma'] = random.choice(gaussianBlur_sigma_list)
params['blur_size'] = random.choice(gaussianBlur_size_list)
params['updown_scale'] = random.choice(downsample_scale_list)
params['awgn_level'] = random.choice(awgn_level_list)
params['jpeg_quality'] = random.choice(jpeg_quality_list)
return params
def degradate(img,params,jpeg_last = True):
    shape = img.shape
    if not params:
        params = get_random_degenerate_params('strong')  # fall back to the strongest preset
if jpeg_last:
img = gaussian_blur(img,params['blur_sigma'],params['blur_size'])
img = down(img,params['updown_scale'],shape)
img = awgn(img,params['awgn_level'])
img = up(img,params['updown_scale'],shape)
img = jpeg_compressor(img,params['jpeg_quality'])
else:
img = gaussian_blur(img,params['blur_sigma'],params['blur_size'])
img = down(img,params['updown_scale'],shape)
img = awgn(img,params['awgn_level'])
img = jpeg_compressor(img,params['jpeg_quality'])
img = up(img,params['updown_scale'],shape)
return img
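
# Example usage (an added sketch; 'input.jpg' is a placeholder path):
#
#   img = cv2.imread('input.jpg')
#   params = get_random_degenerate_params(mod='weaker_2')
#   degraded = degradate(img, params)  # blur -> downsample -> noise -> upsample -> jpeg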
import os,json
import subprocess
# ffmpeg 3.4.6
def args2cmd(args):
......@@ -32,10 +32,11 @@ def run(args,mode = 0):
return sout
def video2image(videopath, imagepath, fps=0, start_time='00:00:00', last_time='00:00:00'):
args = ['ffmpeg', '-i', '"'+videopath+'"']
args = ['ffmpeg']
if last_time != '00:00:00':
args += ['-ss', start_time]
args += ['-t', last_time]
args += ['-i', '"'+videopath+'"']
if fps != 0:
args += ['-r', str(fps)]
args += ['-f', 'image2','-q:v','-0',imagepath]
......
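# Note on the change above: placing '-ss'/'-t' before '-i' makes ffmpeg seek on the
# input side, which is far faster on long videos than decoding from the start. A
# hypothetical call such as
#   video2image('in.mp4', './tmp/%05d.jpg', start_time='00:10:00', last_time='00:00:30')
# now assembles: ffmpeg -ss 00:10:00 -t 00:00:30 -i "in.mp4" -f image2 -q:v -0 ./tmp/%05d.jpg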
......@@ -8,16 +8,7 @@ system_type = 'Linux'
if 'Windows' in platform.platform():
system_type = 'Windows'
DCT_Q = np.array([[8,16,19,22,26,27,29,34],
[16,16,22,24,27,29,34,37],
[19,22,26,27,29,34,34,38],
[22,22,26,27,29,34,37,40],
[22,26,27,29,32,35,40,48],
[26,27,29,32,35,40,48,58],
[26,27,29,34,38,46,56,59],
[27,29,35,38,46,56,69,83]])
def imread(file_path,mod = 'normal',loadsize = 0):
def imread(file_path,mod = 'normal',loadsize = 0, rgb=False):
'''
mod: 'normal' | 'gray' | 'all'
loadsize: 0->original
......@@ -43,6 +34,9 @@ def imread(file_path,mod = 'normal',loadsize = 0):
if loadsize != 0:
img = resize(img, loadsize, interpolation=cv2.INTER_CUBIC)
if rgb and img.ndim==3:
img = img[:,:,::-1]
return img
def imwrite(file_path,img):
......@@ -110,6 +104,12 @@ def color_adjust(img,alpha=0,beta=0,b=0,g=0,r=0,ran = False):
return (np.clip(img,0,255)).astype('uint8')
def CAdaIN(src,dst):
    '''
    Re-normalize src to match dst's statistics, i.e. give src dst's style
    '''
    return np.std(dst)*((src-np.mean(src))/np.std(src))+np.mean(dst)
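
# Worked example (added for illustration): with mean(src)=100, std(src)=10,
# mean(dst)=50, std(dst)=20, a src value of 110 maps to 20*((110-100)/10)+50 = 70,
# i.e. one standard deviation above src's mean becomes one standard deviation
# above dst's mean.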
def makedataset(target_image,orgin_image):
target_image = resize(target_image,256)
orgin_image = resize(orgin_image,256)
......@@ -118,34 +118,6 @@ def makedataset(target_image,orgin_image):
img[0:256,0:256] = target_image[0:256,int(w/2-256/2):int(w/2+256/2)]
img[0:256,256:512] = orgin_image[0:256,int(w/2-256/2):int(w/2+256/2)]
return img
def block_dct_and_idct(g,QQF,QQF_16):
return cv2.idct(np.round(16.0*cv2.dct(g)/QQF)*QQF_16)
def image_dct_and_idct(I,QF):
h,w = I.shape
QQF = DCT_Q*QF
QQF_16 = QQF/16.0
for i in range(h//8):
for j in range(w//8):
I[i*8:(i+1)*8,j*8:(j+1)*8] = cv2.idct(np.round(16.0*cv2.dct(I[i*8:(i+1)*8,j*8:(j+1)*8])/QQF)*QQF_16)
#I[i*8:(i+1)*8,j*8:(j+1)*8] = block_dct_and_idct(I[i*8:(i+1)*8,j*8:(j+1)*8],QQF,QQF_16)
return I
def dctblur(img,Q):
'''
Q: 1~20, 1->best
'''
h,w = img.shape[:2]
img = img[:8*(h//8),:8*(w//8)]
img = img.astype(np.float32)
if img.ndim == 2:
img = image_dct_and_idct(img, Q)
if img.ndim == 3:
h,w,ch = img.shape
for i in range(ch):
img[:,:,i] = image_dct_and_idct(img[:,:,i], Q)
return (np.clip(img,0,255)).astype(np.uint8)
def find_mostlikely_ROI(mask):
contours,hierarchy=cv2.findContours(mask, cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
......@@ -211,6 +183,30 @@ def mask_area(mask):
except:
area = 0
return area
import time
def replace_mosaic(img_origin,img_fake,mask,x,y,size,no_feather):
img_fake = cv2.resize(img_fake,(size*2,size*2),interpolation=cv2.INTER_CUBIC)
if no_feather:
img_origin[y-size:y+size,x-size:x+size]=img_fake
return img_origin
else:
# #color correction
# RGB_origin = img_origin[y-size:y+size,x-size:x+size].mean(0).mean(0)
# RGB_fake = img_fake.mean(0).mean(0)
# for i in range(3):img_fake[:,:,i] = np.clip(img_fake[:,:,i]+RGB_origin[i]-RGB_fake[i],0,255)
#eclosion
eclosion_num = int(size/10)+2
mask_crop = cv2.resize(mask,(img_origin.shape[1],img_origin.shape[0]))[y-size:y+size,x-size:x+size]
mask_crop = ch_one2three(mask_crop)
mask_crop = (cv2.blur(mask_crop, (eclosion_num, eclosion_num)))
mask_crop = mask_crop/255.0
img_crop = img_origin[y-size:y+size,x-size:x+size]
img_origin[y-size:y+size,x-size:x+size] = np.clip((img_crop*(1-mask_crop)+img_fake*mask_crop),0,255).astype('uint8')
return img_origin
def Q_lapulase(resImg):
......@@ -225,31 +221,25 @@ def Q_lapulase(resImg):
score = res.var()
return score
def replace_mosaic(img_origin,img_fake,mask,x,y,size,no_feather):
    img_fake = cv2.resize(img_fake,(size*2,size*2),interpolation=cv2.INTER_LANCZOS4)
    if no_feather:
        img_origin[y-size:y+size,x-size:x+size]=img_fake
        img_result = img_origin
    else:
        #color correction
        RGB_origin = img_origin[y-size:y+size,x-size:x+size].mean(0).mean(0)
        RGB_fake = img_fake.mean(0).mean(0)
        for i in range(3):img_fake[:,:,i] = np.clip(img_fake[:,:,i]+RGB_origin[i]-RGB_fake[i],0,255)
        #eclosion (feathering)
        eclosion_num = int(size/5)
        entad = int(eclosion_num/2+2)
        mask = cv2.resize(mask,(img_origin.shape[1],img_origin.shape[0]))
        mask = ch_one2three(mask)
        mask = (cv2.blur(mask, (eclosion_num, eclosion_num)))
        mask_tmp = np.zeros_like(mask)
        mask_tmp[y-size:y+size,x-size:x+size] = mask[y-size:y+size,x-size:x+size]# Fix edge overflow
        mask = mask_tmp/255.0
        img_tmp = np.zeros(img_origin.shape)
        img_tmp[y-size:y+size,x-size:x+size]=img_fake
        img_result = (img_origin*(1-mask)+img_tmp*mask).astype('uint8')
    return img_result

def psnr(img1,img2):
    mse = np.mean((img1/255.0-img2/255.0)**2)
    if mse < 1e-10:
        return 100
    psnr_v = 20*np.log10(1/np.sqrt(mse))
    return psnr_v

def splice(imgs,splice_shape):
    '''Stitch multiple images into one grid; all imgs must have the same size
    imgs : [img1,img2,img3,img4]
    splice_shape: (2,2)
    '''
    h,w,ch = imgs[0].shape
    output = np.zeros((h*splice_shape[0],w*splice_shape[1],ch),np.uint8)
    cnt = 0
    for i in range(splice_shape[0]):
        for j in range(splice_shape[1]):
            if cnt < len(imgs):
                output[h*i:h*(i+1),w*j:w*(j+1)] = imgs[cnt]
                cnt += 1
    return output
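
# Example usage (an added sketch; img_a..img_d are placeholder same-size uint8 images):
#
#   grid = splice([img_a, img_b, img_c, img_d], (2, 2))  # 2x2 montage
#   print(psnr(img_a, img_b))  # 100 for identical images, lower otherwise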
import os
import random
import string
import shutil
def Traversal(filedir):
......@@ -10,6 +12,9 @@ def Traversal(filedir):
Traversal(dir)
return file_list
def randomstr(num):
return ''.join(random.sample(string.ascii_letters + string.digits, num))
def is_img(path):
ext = os.path.splitext(path)[1]
ext = ext.lower()
......