### Abstract
The knowledge graph is a successful application of knowledge engineering, an important branch of artificial intelligence, in the big-data setting. Together with big data and deep learning, it has become one of the core driving forces behind the development of the Internet and artificial intelligence. To give a clearer picture of the field, this report introduces knowledge graphs from four angles: technical implementation, recent research results, application examples, and future trends.
Keywords: knowledge graph; recommender system; deep learning
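### 1 Introduction
&emsp;&emsp;The concept of the Knowledge Graph was formally introduced by Google in 2012 [1], mainly to support next-generation search and online advertising. Since 2013, knowledge graphs have spread through academia and industry and now play an important role in search, intelligent question answering, intelligence analysis, finance, and other fields.
&emsp;&emsp;A knowledge graph is essentially a semantic network [2] that describes the concepts, entities, and events of the objective world and the relationships among them. It is an inevitable outcome of artificial intelligence's need for knowledge, yet its development has also benefited from many other research areas, including expert systems, linguistics, the Semantic Web, databases, and information extraction. It is a product of cross-disciplinary fusion rather than of a single line of descent. Its evolution is shown in Figure 1-1.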
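### 2 Implementing a Knowledge Graph
#### 2.1 Basic Concepts of Knowledge Graphs
#### 2.2 Steps for Building a Knowledge Graph
##### 2.2.1 Knowledge Extraction
a. Named Entity Recognition
b. Relation Extraction
c. Entity Resolution
d. Coreference Resolution
##### 2.2.2 Knowledge Linking and Fusion
##### 2.2.3 Knowledge Quality Improvement and Knowledge Update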
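&emsp;&emsp;(1) Knowledge correction has two stages: error detection and error correction. Xu et al. [4] propose two error-detection methods. The first is rule-based: it uses a set of predefined rules and flags facts that violate them. The second relies on user feedback. After detection, Xu et al. [4] also propose a crowdsourcing-based correction method: erroneous facts are assigned to different crowd contributors for correction, and a simple but effective majority vote then aggregates the multiple, noisy contributor inputs into a consistent output.
&emsp;&emsp;(2) Knowledge graph completion covers type inference and infobox completion. Current type inference methods fall into three groups: basic type inference, attribute-driven type inference, and cross-lingual type inference. Infobox completion: an infobox explicitly holds structured facts about an entity in the form of triples. When the infoboxes obtained by knowledge extraction are incomplete they have to be completed, and the mainstream approach at present is an NLP method based on long short-term memory (LSTM) recurrent neural networks.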
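
The majority-vote aggregation mentioned in (1) can be illustrated with a few lines of Python. This is only a sketch under simple assumptions, not the procedure of Xu et al. [4]: the function name `majority_vote`, the tie-handling rule (return `None`), and the example answers are all hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer, or None for empty input or a tie.

    `answers` is a list of corrections proposed by different contributors
    for one erroneous fact (hypothetical data format).
    """
    if not answers:
        return None
    ranked = Counter(answers).most_common(2)
    # If the two most frequent answers have the same count, there is no majority.
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Three contributors correct the same fact; two of them agree.
votes = ["Beijing", "Beijing", "Shanghai"]
print(majority_vote(votes))
```

Returning `None` on a tie is one possible design choice; a real system might instead ask additional contributors or weight votes by contributor reliability.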
##### 2.2.4 Knowledge Application
### 3 Recent Research Results
#### 3.1 GraphIE: A Graph-Based Framework for Information Extraction
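&emsp;&emsp;This algorithm propagates information between connected nodes through a graph convolutional network, producing richer representations that are then used to improve word-level predictions. The paper evaluates three tasks (textual, social media, and visual information extraction), and the results consistently show GraphIE outperforming state-of-the-art information extraction models.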
#### 3.2 KG-BERT: BERT for Knowledge Graph Completion
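&emsp;&emsp;In this work the authors use a pre-trained language model for knowledge graph completion. Triples in the knowledge graph are treated as text sequences, and a new framework, Knowledge Graph Bidirectional Encoder Representations from Transformer (KG-BERT), is proposed to model them. The method takes the entity descriptions and the relation description of a triple as input and computes the triple's scoring function with the KG-BERT language model. Experiments on several benchmark knowledge graphs show that the method achieves state-of-the-art performance on triple classification, link prediction, and relation prediction.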
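
To make "treating a triple as a text sequence" more concrete, the sketch below packs a head entity, a relation, and a tail entity into one BERT-style token sequence separated by `[SEP]` markers. It is an illustration only: the function name `triple_to_sequence` is hypothetical, whitespace splitting stands in for BERT's WordPiece tokenizer, and the example triple is made up. The scoring model itself is not shown.

```python
def triple_to_sequence(head_text, relation_text, tail_text):
    """Pack a (head, relation, tail) triple into one token sequence.

    Layout: [CLS] head tokens [SEP] relation tokens [SEP] tail tokens [SEP].
    Whitespace tokenization is a simplifying assumption.
    """
    return (
        ["[CLS]"]
        + head_text.split()
        + ["[SEP]"]
        + relation_text.split()
        + ["[SEP]"]
        + tail_text.split()
        + ["[SEP]"]
    )


# Hypothetical triple used only for illustration.
print(triple_to_sequence("Steve Jobs", "founded", "Apple Inc"))
```

A sequence built this way would then be fed to the language model, whose output for the `[CLS]` position is used to score how plausible the triple is.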
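### 4 Application Examples
#### 4.1 Shenma Knowledge Graph - Recommender System
&emsp;&emsp;To support this application, Alibaba built the Shenma knowledge graph, a semantic network containing 10 million entities, 1,000 types, and billions of triples. It covers a wide range of domains, such as people, education, film, television, music, sports, technology, books, apps, food, plants, and animals. It is rich in content and covers a large number of entities about real-world facts. Entities in the knowledge graph are connected by various relations. On top of the Shenma knowledge graph, a cognitive concept graph containing millions of instances and concepts was constructed. Unlike the Shenma knowledge graph, the cognitive concept graph is a probabilistic graph that focuses mainly on 'is-a' relations; for example, "robin" is a bird and "penguin" is a bird. The cognitive concept graph helps with entity conceptualization and query understanding.
#### 4.2 AliMe Chat - Dialogue System
&emsp;&emsp;AliMe Chat, launched by Alibaba in 2015 and shown in Figure 4-2, has served billions of users and now averages 10 million user visits per day. AliMe services fall roughly into assistance services, customer service, and chat services. In [9], the Alibaba team designs a knowledge-graph-based method for handling high-frequency chat questions in the AliMe service. To meet the online system's high queries-per-second (QPS) requirement, the paper designs several solutions that improve AliMe's chat capability.
#### 4.3 Text Generation - NLP
&emsp;&emsp;This work focuses on generating coherent multi-sentence text from the output of information extraction, in particular from knowledge graphs. The authors note that graph-structured knowledge representations are ubiquitous in computing, but their non-hierarchical nature, long-distance dependencies, and structural variety make graph-based text generation a major challenge. To remove the linear or hierarchical constraints otherwise imposed during graph representation learning, and to make effective use of the relational structure in the graph, the authors propose a new Graph Transformer encoder [10], whose structure is shown in Figure 4-3.
&emsp;&emsp;The Graph Transformer follows an idea similar to that of graph attention networks (GAT) [11]: it uses the attention mechanism [12] to build the hidden-state representation of a target node from the information of its neighbouring nodes. However, GAT only considers neighbours that already appear in the graph, whereas the global node introduced in the paper lets the model use more global information (latent information about possible entity associations that do not appear in the knowledge subgraph).
&emsp;&emsp;Text generated with the Graph Transformer is shown in Figure 4-4. In the figure, "Title" is the input title, and the text generated by the Graph Transformer is on the right.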
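### 5 Challenges and Opportunities
#### 5.1 Challenges Facing Knowledge Graphs
#### 5.2 Prospects for Knowledge Graphs
&emsp;&emsp;At present, directions such as knowledge bases, information retrieval, data mining, knowledge representation, and social networks remain consistently popular within the knowledge graph field. In addition, research interest in information extraction, query answering, question answering, machine learning, probabilistic logic, entity disambiguation, entity recognition, query processing, and decision support has risen in recent years, while interest in concept maps, search engines, and information systems has gradually faded.
&emsp;&emsp;As knowledge graph technologies develop, we have reason to believe that knowledge graph construction will become increasingly automated, and that knowledge graphs will find truly practical application scenarios in more and more fields, freeing up productivity across industries and supporting business transformation.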
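### Basic Concepts of Knowledge Graphs
&emsp;&emsp;The concept of the Knowledge Graph was formally introduced by Google in 2012, mainly to support next-generation search and online advertising. Since 2013, knowledge graphs have spread through academia and industry and now play an important role in search, intelligent question answering, intelligence analysis, finance, and other fields.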
<img src="./images/知识图谱的一部分的一个例子.jpg" alt="image" style="zoom:50%;" />
### Application Examples
#### Shenma Knowledge Graph - Recommender System
<img src="./images/神马搜索引擎实例.jpg" alt="image" style="zoom:50%;" />
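&emsp;&emsp;To support this application, Alibaba built the Shenma knowledge graph, a semantic network containing 10 million entities, 1,000 types, and billions of triples. It covers a wide range of domains, such as people, education, film, television, music, sports, technology, books, apps, food, plants, and animals. It is rich in content and covers a large number of entities about real-world facts. Entities in the knowledge graph are connected by various relations. On top of the Shenma knowledge graph, a cognitive concept graph containing millions of instances and concepts was constructed. Unlike the Shenma knowledge graph, the cognitive concept graph is a probabilistic graph that focuses mainly on 'is-a' relations; for example, "robin" is a bird and "penguin" is a bird. The cognitive concept graph helps with entity conceptualization and query understanding.
#### AliMe Chat - Dialogue System
&emsp;&emsp;AliMe Chat, launched by Alibaba in 2015 and shown in the figure below, has served billions of users and now averages 10 million user visits per day. AliMe services fall roughly into assistance services, customer service, and chat services. In [5], the Alibaba team designs a knowledge-graph-based method for handling high-frequency chat questions in the AliMe service. To meet the online system's high queries-per-second (QPS) requirement, the paper designs several solutions that improve AliMe's chat capability.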
<img src="./images/AliMe_Chat演示.jpg" alt="image" style="zoom:50%;" />
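#### Text Generation - NLP
&emsp;&emsp;This work focuses on generating coherent multi-sentence text from the output of information extraction, in particular from knowledge graphs. The authors note that graph-structured knowledge representations are ubiquitous in computing, but their non-hierarchical nature, long-distance dependencies, and structural variety make graph-based text generation a major challenge. To remove the linear or hierarchical constraints otherwise imposed during graph representation learning, and to make effective use of the relational structure in the graph, the authors propose a new Graph Transformer encoder [6], whose structure is shown in the figure below.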
<img src="./images/Graph_Transformers结构.jpg" alt="image" style="zoom:67%;" />
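&emsp;&emsp;The Graph Transformer follows an idea similar to that of graph attention networks (GAT) [7]: it uses the attention mechanism [8] to build the hidden-state representation of a target node from the information of its neighbouring nodes. However, GAT only considers neighbours that already appear in the graph, whereas the global node introduced in the paper lets the model use more global information (latent information about possible entity associations that do not appear in the knowledge subgraph).
&emsp;&emsp;Text generated with the Graph Transformer is shown in the figure below. In the figure, "Title" is the input title, and the text generated by the Graph Transformer is on the right.
### Reflections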
## Motion Compensation
In this project, I use Python to implement a YUV reader and three motion estimation methods: full search, three-step search, and diamond search. Finally, I compare their performance in terms of accuracy and complexity.
### Prerequisites
- Linux, Mac OS, Windows
- Python 3.6+
- numpy, matplotlib, opencv-python
### Getting Started
python plot.py
then you will get:
python main.py
then you will get:<br>
Read dragon_video.yuv done!
Read gas_video.yuv done!
-------Full Search-------
dragon_video rmse:5.691 psnr:33.034 time:190.468
gas_video rmse:1.267 psnr:47.109 time:190.697
-------Three-step search-------
dragon_video rmse:5.561 psnr:33.234 time:60.568
gas_video rmse:1.218 psnr:47.214 time:59.848
-------Multi-Step search-------
dragon_video rmse:5.287 psnr:33.673 time:80.428
gas_video rmse:1.289 psnr:46.433 time:79.131
-------Diamond Search-------
dragon_video rmse:5.540 psnr:33.267 time:48.877
gas_video rmse:1.188 psnr:47.492 time:34.231
import os

import cv2
import numpy as np


def load(filename, width, height, startfrm=0, endfrm=None, type='YUV'):
    """
    Read a planar YUV 4:2:0 (I420) video file.

    :param filename: YUV video name
    :param width: YUV video width
    :param height: YUV video height
    :param startfrm: start frame
    :param endfrm: end frame
    :param type: output type 'YUV' 'BGR'
    :return: array like [frame,h,w,ch]
    """
    framesize = height * width * 3 // 2  # bytes per frame: full-size Y plane plus quarter-size U and V planes
    h_h = height // 2
    h_w = width // 2

    with open(filename, 'rb') as fp:
        fp.seek(0, 2)  # move the file pointer to the end of the file
        ps = fp.tell()  # current pointer position, i.e. the file size in bytes
        numfrm = ps // framesize  # number of frames in the file
        fp.seek(framesize * startfrm, 0)
        if endfrm is not None:
            numfrm = endfrm

        # Allocate only the frames that are actually read (startfrm .. numfrm-1).
        output = np.zeros(shape=(numfrm - startfrm, height, width, 3), dtype='uint8', order='C')

        for i in range(numfrm - startfrm):
            # The planes are stored one after another: Y, then U, then V.
            Yt = np.fromfile(fp, dtype='uint8', count=height * width).reshape(height, width)
            Ut = np.fromfile(fp, dtype='uint8', count=h_h * h_w).reshape(h_h, h_w)
            Vt = np.fromfile(fp, dtype='uint8', count=h_h * h_w).reshape(h_h, h_w)

            if type == 'YUV':
                output[i, :, :, 0] = Yt
                output[i, :, :, 1] = cv2.resize(Ut, (width, height))
                output[i, :, :, 2] = cv2.resize(Vt, (width, height))
            elif type == 'BGR':
                img = np.concatenate((Yt.reshape(-1), Ut.reshape(-1), Vt.reshape(-1)))
                img = img.reshape((height * 3 // 2, width)).astype('uint8')  # planar I420 layout (YYYY U V)
                # OpenCV cannot read raw YUV files directly, so the frame is converted explicitly
                bgr_img = cv2.cvtColor(img, cv2.COLOR_YUV2BGR_I420)  # must match the YUV storage format
                output[i] = bgr_img
                if os.path.isdir('./yuv2bgr'):
                    cv2.imwrite('yuv2bgr/%d.jpg' % (i + 1), bgr_img)
                # print("Extract frame %d " % (i + 1))

    print('Read ' + filename + ' done!')
    return output


if __name__ == '__main__':
    _ = load(filename='dragon_video.yuv', width=640, height=480, startfrm=0)
Algorithm 1 FullSearch
:param fr: the frame to predict
:param fr_ref: previous frame
:param window_size: full search window size
Function FullSearch(fr, fr_ref, window_size)
    for each block in fr:
        move the block within the search area of fr_ref:  # based on window_size and block location
            calculate error()
        find the best match block
        replace the fr_block with fr_ref_block
Algorithm 2 ThreeStepSearch
Function ThreeStepSearch(fr, fr_ref)
    for each block in fr:
        original_point = center of block
        for S in [4,2,1]:
            points = get search point(S)  # 8 locations +/- S pixels around original_point, plus original_point itself
            for each point in points:
                calculate error()
            original_point = the minimum cost point
        # replace
        original_point is the center of the best match block
        replace the fr_block with fr_ref_block
Algorithm 3 DiamondSearch
Function DiamondSearch(fr, fr_ref)
    for each block in fr:
        original_point = center of block
        LDSP:
            S = 2
            search 9 locations (X,Y):  # the points with |X|+|Y|=S around original_point, plus original_point itself
                calculate error()
            original_point = the minimum cost point
            if original_point is found at center of search window:
                goto SDSP
            else:
                goto LDSP
        SDSP:
            S = 1
            search 5 locations (X,Y):  # the points with |X|+|Y|=S around original_point, plus original_point itself
                calculate error()
            original_point = the minimum cost point
        # replace
        original_point is the center of the best match block
        replace the fr_block with fr_ref_block
Algorithm 4 MultiStepSearch
Function MultiStepSearch(fr, fr_ref, step)
    for each block in fr:
        original_point = center of block
        for S in get_S(step):  # S = [2^(step-1), 2^(step-2), ..., 8, 4, 2, 1]
            points = get search point(S)  # 8 locations +/- S pixels around original_point, plus original_point itself
            for each point in points:
                calculate error()
            original_point = the minimum cost point
        # replace
        original_point is the center of the best match block
        replace the fr_block with fr_ref_block
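
As a concrete companion to Algorithm 2, here is a minimal single-block sketch of three-step search. It is not the project's `motion_compensation.py`: the names `sad` and `three_step_search`, the sum-of-absolute-differences cost, and the synthetic random frame are assumptions made for illustration.

```python
import numpy as np


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())


def three_step_search(cur, ref, top, left, block=16, steps=(4, 2, 1)):
    """Find the motion vector of one block with a coarse-to-fine search.

    Returns (dy, dx, cost) such that the block of `cur` at (top, left)
    is best matched by the block of `ref` at (top + dy, left + dx).
    """
    h, w = ref.shape
    target = cur[top:top + block, left:left + block]
    best_y, best_x = top, left
    best_cost = sad(target, ref[top:top + block, left:left + block])
    for s in steps:
        centre_y, centre_x = best_y, best_x
        # Evaluate the 3x3 grid of candidates spaced s pixels apart.
        for dy in (-s, 0, s):
            for dx in (-s, 0, s):
                y, x = centre_y + dy, centre_x + dx
                if 0 <= y <= h - block and 0 <= x <= w - block:
                    cost = sad(target, ref[y:y + block, x:x + block])
                    if cost < best_cost:
                        best_cost, best_y, best_x = cost, y, x
    return best_y - top, best_x - left, best_cost


# Synthetic example: the current frame is the reference frame shifted by a known offset.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(-4, 4), axis=(0, 1))  # cur[y, x] == ref[y + 4, x - 4]
print(three_step_search(cur, ref, top=24, left=24))
```

Each step evaluates about nine candidates, so a block costs a few dozen comparisons instead of the 225 positions of an exhaustive ±7 search. The price is that the greedy refinement can settle on a local minimum when the error surface is not smooth, which is why the example uses an offset that the first step can reach exactly.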
import numpy as np
import cv2
import time
from matplotlib import pyplot as plt
import YUVreader
import motion_compensation as mc

# load YUV
dragon_video = YUVreader.load('dragon_video.yuv', 640, 480, 0, type='YUV')
gas_video = YUVreader.load('gas_video.yuv', 640, 480, 0, type='YUV')


def core(mc_mod, video):
    """Return the average RMSE, average PSNR, and elapsed time for one search method."""
    rmse_error = 0
    psnr_error = 0
    t1 = time.time()
    for i in range(len(video) - 1):
        ref_frame = video[i]
        curr_frame = video[i + 1]
        if mc_mod == 'full':
            curr_frame_predicted = mc.full(curr_frame, ref_frame, 5)
        elif mc_mod == 'three_step':
            curr_frame_predicted = mc.multi_step(curr_frame, ref_frame, 3)
        elif mc_mod == 'multi_step':
            curr_frame_predicted = mc.multi_step(curr_frame, ref_frame, 4)
        elif mc_mod == 'diamond':
            curr_frame_predicted = mc.diamond(curr_frame, ref_frame)
        else:
            raise ValueError('unknown motion compensation mode: ' + mc_mod)
        rmse_error = rmse_error + mc.rmse(curr_frame_predicted, curr_frame)
        psnr_error = psnr_error + mc.psnr(curr_frame_predicted, curr_frame)
    t2 = time.time()
    return rmse_error / (len(video) - 1), psnr_error / (len(video) - 1), t2 - t1


# Run every search method on both videos and print the averaged metrics.
methods = [('Full Search', 'full'),
           ('Three-step search', 'three_step'),
           ('Multi-Step search', 'multi_step'),
           ('Diamond Search', 'diamond')]
videos = [('dragon_video', dragon_video), ('gas_video', gas_video)]

for title, mode in methods:
    print('-------{}-------'.format(title))
    for name, video in videos:
        _rmse, _psnr, _time = core(mode, video)
        print('{} rmse:{:.3f} psnr:{:.3f} time:{:.3f}'.format(name, _rmse, _psnr, _time))
function YUV = YUVread(fname,dim,frnum)
% This function reads a frame #frnum (0..n-1) from YUV file into an
% 3D array with Y, U and V components
f = fopen(fname,'r');
% Read Y-component
% Read Y-component
YUV = [];
% Read U-component
if length(U)<dim(1)*dim(2)/4
YUV = [];
% Read V-component
if length(V)<dim(1)*dim(2)/4
YUV = [];
% Combine Y, U, and V
predict_error = 0
timer_start
for i = 0 to num_frames-2
    ref_frame = read_frame(i)
    curr_frame = read_frame(i+1)
    curr_frame_predicted = motion_estimation(ref_frame, curr_frame)
    predict_error = predict_error + rmse(curr_frame, curr_frame_predicted)
time_used = timer_stop
avg_prediction_error = predict_error/(num_frames-1)
0.6895990682998346 cost time: 0.152 s
# dragon_video = YUVreader.yuv2bgr('dragon_video.yuv', 640, 480, 0, 2,type='YUV')
# t1 = time.time()
# fr_out,loss = mc.full(dragon_video[1],dragon_video[0],4)
# t2 = time.time()
# print(loss,' cost time:','%.3f'%(t2-t1),'s')
dragon_video = YUVreader.yuv2bgr('dragon_video.yuv', 640, 480, 0, 2,type='YUV')
t1 = time.time()
fr_out,loss = mc.multi_step(dragon_video[1],dragon_video[0],5)
t2 = time.time()
print(loss,' cost time:','%.3f'%(t2-t1),'s')
cv2.namedWindow('image', cv2.WINDOW_NORMAL)
import numpy as np


def mse(predictions, targets):
    # Cast to float first: subtracting uint8 frames directly wraps around modulo 256.
    diff = predictions.astype(np.float64) - targets.astype(np.float64)
    return np.mean(diff ** 2)


def rmse(predictions, targets):
    return np.sqrt(mse(predictions, targets))


def psnr(predictions, targets):
    return 10 * np.log10((255 * 255) / mse(predictions, targets))


# full search
def full(fr, fr_ref, search_win):
    """
    :param fr: the frame to predict
    :param fr_ref: previous frame
    :param search_win: full search window size
    :return fr_out: predicted frame
    """
    height, width = fr.shape[:2]
    fr_out = np.zeros_like(fr, dtype='uint8')
    for i in range(0, height, 16):
        for j in range(0, width, 16):
            loss = 1e10
            ref_h = i
            ref_w = j
            blk = fr[i:i+16, j:j+16]
            for y in range(i - search_win, i + search_win):
                for x in range(j - search_win, j + search_win):
                    # keep the 16x16 candidate block inside the reference frame
                    if 0 <= x <= width - 16 and 0 <= y <= height - 16:
                        ref_blk = fr_ref[y:y+16, x:x+16]
                        loss_this = rmse(ref_blk, blk)
                        if loss_this < loss:
                            loss = loss_this
                            ref_h = y
                            ref_w = x
            fr_out[i:i+16, j:j+16] = fr_ref[ref_h:ref_h+16, ref_w:ref_w+16]
    return fr_out


# Three-step search
def get_multi_step_point(h, w, stride):
    # 3x3 grid of candidates spaced `stride` pixels apart, centred on (h, w)
    h = h - stride
    w = w - stride
    points = []
    for i in range(3):
        for j in range(3):
            points.append([h + i * stride, w + j * stride])
    return points


def multi_step_substep(ref_h, ref_w, loss, fr_ref, blk, stride):
    height, width = fr_ref.shape[:2]
    points = get_multi_step_point(ref_h, ref_w, stride)
    for point in points:
        h, w = point
        if 0 <= h <= height - 16 and 0 <= w <= width - 16:
            ref_blk = fr_ref[h:h+16, w:w+16]
            loss_this = rmse(ref_blk, blk)
            if loss_this < loss:
                loss = loss_this
                ref_h = h
                ref_w = w
    return loss, ref_h, ref_w


def multi_step(fr, fr_ref, step_num):
    """
    :param fr: the frame to predict
    :param fr_ref: previous frame
    :param step_num: number of steps; 3 gives three-step search
    :return fr_out: predicted frame
    """
    height, width = fr.shape[:2]
    fr_out = np.zeros_like(fr, dtype='uint8')
    for h in range(0, height, 16):
        for w in range(0, width, 16):
            loss = 1e10
            ref_h = h
            ref_w = w
            blk = fr[h:h+16, w:w+16]
            for step_cnt in range(step_num):
                # the stride halves at every step: 2^(step_num-1), ..., 4, 2, 1
                loss, ref_h, ref_w = multi_step_substep(ref_h, ref_w, loss, fr_ref, blk, int(2**(step_num-step_cnt-1)))
            fr_out[h:h+16, w:w+16] = fr_ref[ref_h:ref_h+16, ref_w:ref_w+16]
    return fr_out


# Diamond search
def get_diamond_point(h, w, wide, stride=1):
    points = []
    if wide == 2:  # small diamond search pattern (SDSP): centre plus 4 neighbours
        points = [[h, w], [h+1, w], [h-1, w], [h, w+1], [h, w-1]]
    elif wide == 3:  # large diamond search pattern (LDSP): centre plus 8 points
        points = [[h, w], [h+2*stride, w], [h-2*stride, w], [h, w+2*stride], [h, w-2*stride],
                  [h+stride, w+stride], [h-stride, w-stride], [h+stride, w-stride], [h-stride, w+stride]]
    return points


def multi_diamond_substep(ref_h, ref_w, loss, used_points, fr_ref, blk, wide):
    height, width = fr_ref.shape[:2]
    points = get_diamond_point(ref_h, ref_w, wide)
    for point in points:
        if point not in used_points:
            used_points.append(point)  # remember scored candidates so they are not evaluated twice
            h, w = point
            if 0 <= h <= height - 16 and 0 <= w <= width - 16:
                ref_blk = fr_ref[h:h+16, w:w+16]
                loss_this = rmse(ref_blk, blk)
                if loss_this < loss:
                    loss = loss_this
                    ref_h = h
                    ref_w = w
    return ref_h, ref_w, loss, used_points


def diamond(fr, fr_ref):
    """
    :param fr: the frame to predict
    :param fr_ref: previous frame
    :return fr_out: predicted frame
    """
    height, width = fr.shape[:2]
    fr_out = np.zeros_like(fr, dtype='uint8')
    for h in range(0, height, 16):
        for w in range(0, width, 16):
            loss = 1e10
            ref_h = h
            ref_w = w
            blk = fr[h:h+16, w:w+16]
            used_points = []
            # step1 LDSP: repeat until the best match stays at the centre of the pattern
            cnt = 0
            while True:
                cnt += 1
                input_h = ref_h
                input_w = ref_w
                ref_h, ref_w, loss, used_points = multi_diamond_substep(ref_h, ref_w, loss, used_points, fr_ref, blk, 3)
                if (input_h == ref_h) and (input_w == ref_w):
                    break
            # step2 SDSP: one final refinement with the small pattern
            ref_h, ref_w, loss, used_points = multi_diamond_substep(ref_h, ref_w, loss, used_points, fr_ref, blk, 2)
            fr_out[h:h+16, w:w+16] = fr_ref[ref_h:ref_h+16, ref_w:ref_w+16]
    return fr_out
def main():
if __name__ == "__main__":
import numpy as np
import cv2
import time
from matplotlib import pyplot as plt
import YUVreader
import motion_compensation as mc
#load YUV
dragon_video = YUVreader.load('dragon_video.yuv', 640, 480, 0, 2,type='YUV')
gas_video = YUVreader.load('gas_video.yuv', 640, 480, 0,2,type='YUV')
#Find the right parameters
#Full Search
losss = []
windows = []
times = []
for window_size in range(3,30,1):
t1 = time.time()
pred = mc.full(dragon_video[1],dragon_video[0],window_size)
t2 = time.time()
plt.title('window size & cost time - Full Search')
plt.xlabel('window size')
plt.ylabel('cost time')
plt.title('window size & rmse- Full Search')
plt.xlabel('window size')
# predict_error = 0
# t1 = time.time
# plot image
dragon_video = YUVreader.load('dragon_video.yuv', 640, 480, 0, 2,type='BGR')
pred = mc.full(dragon_video[1],dragon_video[0],20)
img = np.zeros((480,640*3,3),dtype='uint8')
## my_conv
This code is written in Python 3 and uses NumPy for array operations.
@Hypo,雷海波 1910273011
## Testing the code
In test.py, I compare the result of my_conv with that of np.convolve and find that they are the same.
python test.py
conv with my_conv: [ 1. 2. 4. 7. 10. 13. 9. 11. 6.]
conv with np.convolve: [ 1 2 4 7 10 13 9 11 6]
conv with my_conv: [0.3319973 0.68744623 0.39637609 0.44753119 0.53135132 0.17920317
0.22541965 0.26269674 0.03181879]
conv with np.convolve: [0.3319973 0.68744623 0.39637609 0.44753119 0.53135132 0.17920317
0.22541965 0.26269674 0.03181879]
[Finished in 0.2s]
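
The comparison above can also be automated instead of being checked by eye. The sketch below is an independent illustration, not the repository's `my_conv.py`: `direct_conv` is a hypothetical helper that implements the convolution sum by accumulating shifted, scaled copies of `v`, and `np.allclose` does the comparison against `np.convolve`.

```python
import numpy as np


def direct_conv(u, v):
    """Full linear convolution: out[n] = sum_k u[k] * v[n - k]."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    out = np.zeros(len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        # Each sample of u adds a scaled copy of v, shifted by its index.
        out[i:i + len(v)] += ui * v
    return out


x = np.array([1, 0, 1, 1])
y = np.array([1, 2, 3, 4, 5, 6])
print(np.allclose(direct_conv(x, y), np.convolve(x, y)))
```

Using `np.allclose` rather than `==` matters for the random-vector test, where the two implementations may differ by floating-point rounding.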
import numpy as np


def conv(u, v):
    """
    @Hypo 1910273011
    Convolve two vectors.

    u : array_like
    v : array_like
    out : array_like
    """
    out_length = len(u) + len(v) - 1
    w_length = 2 * (len(u) - 1) + len(v)
    out = np.zeros(out_length)
    u_flip = u[::-1]  # flip u
    v_pad = np.zeros(w_length)
    v_pad[len(u)-1:len(u)+len(v)-1] = v  # zero-pad v on both sides
    # shift, multiply, sum
    for i in range(out_length):
        out[i] = np.sum(u_flip * v_pad[i:i+len(u)])
    return out
import numpy as np
import my_conv
"""
@Hypo 1910273011
A script to test my_conv
"""
x = np.array([1,0,1,1])
y = np.array([1,2,3,4,5,6])
out_my_conv = my_conv.conv(x, y)
out_np_conv = np.convolve(x, y)
print('conv with my_conv:',out_my_conv)
print('conv with np.convolve:',out_np_conv)
x = np.random.rand(3)
y = np.random.rand(7)
out_my_conv = my_conv.conv(x, y)
out_np_conv = np.convolve(x, y)
print('conv with my_conv:',out_my_conv)
print('conv with np.convolve:',out_np_conv)
## my_dft & my_idft
This code is written in Python 3 and uses NumPy for array operations.
@Hypo,雷海波 1910273011
## Test code
my_dft and my_idft are implemented in my_dft.py.
In test.py, I compare the results of my_dft/my_idft with those of np.fft.fft/np.fft.ifft and find that they are the same.
python test.py
Xn: [1 2 3 4 5 6]
DFT Xn with my_dft: [21.+0.j -3.+5.196j -3.+1.732j -3.-0.j -3.-1.732j -3.-5.196j]
IDFT my_Xk with my_idft: [1.-0.j 2.+0.j 3.-0.j 4.+0.j 5.+0.j 6.-0.j]
FFT Xn with np.fft.fft: [21.+0.j -3.+5.196j -3.+1.732j -3.-0.j -3.-1.732j -3.-5.196j]
IFFT np_Xk with np.fft.ifft: [1.-0.j 2.+0.j 3.-0.j 4.+0.j 5.-0.j 6.+0.j]
Xn: [1 1 1 1 1 1]
DFT Xn with my_dft: [ 6.+0.j -0.-0.j 0.-0.j 0.-0.j -0.-0.j -0.-0.j]
IDFT my_Xk with my_idft: [1.-0.j 1.+0.j 1.-0.j 1.+0.j 1.+0.j 1.-0.j]
FFT Xn with np.fft.fft: [6.+0.j 0.+0.j 0.+0.j 0.+0.j 0.+0.j 0.+0.j]
IFFT np_Xk with np.fft.ifft: [1.+0.j 1.+0.j 1.+0.j 1.+0.j 1.+0.j 1.+0.j]
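
For comparison with the loop-based implementation, the DFT can also be written as a single matrix product. The sketch below is an alternative formulation, not the code in `my_dft.py`; the names `dft_matrix`, `dft`, and `idft` are chosen here for illustration.

```python
import numpy as np


def dft_matrix(n):
    """n x n DFT matrix W with W[k, m] = exp(-2j*pi*k*m/n)."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n)


def dft(x):
    x = np.asarray(x, dtype=complex)
    return dft_matrix(len(x)) @ x


def idft(X):
    X = np.asarray(X, dtype=complex)
    # The inverse transform uses the conjugate matrix and a 1/N factor.
    return np.conj(dft_matrix(len(X))) @ X / len(X)


xn = np.array([1, 2, 3, 4, 5, 6])
print(np.allclose(dft(xn), np.fft.fft(xn)))
print(np.allclose(idft(dft(xn)), xn))
```

Both this version and the double loop cost O(N^2) operations, whereas `np.fft.fft` uses a fast Fourier transform, so the agreement checked here concerns the values and not the speed.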
import numpy as np


def dft(Xn):
    """
    @Hypo 1910273011
    Discrete Fourier Transform (DFT)

    Xn : array_like, type = complex
    out : array_like, type = complex
    """
    Xk = []
    Xk_i = 0
    N = len(Xn)
    for k in range(N):
        for n in range(N):
            Xk_i += Xn[n] * np.exp(-1j*k*2*np.pi*n/N)
        Xk.append(Xk_i)  # store the k-th coefficient before resetting the accumulator
        Xk_i = 0
    return np.array(Xk)


def idft(Xk):
    """
    @Hypo 1910273011
    Inverse Discrete Fourier Transform (IDFT)

    Xk : array_like, type = complex
    out : array_like, type = complex
    """
    # use dft to calculate idft
    return dft(Xk.conjugate()).conjugate() / len(Xk)

    # alternative: use the formula directly
    # Xn = []; Xn_i = 0
    # N = len(Xk)
    # for n in range(N):
    #     for k in range(N):
    #         Xn_i += Xk[k]*np.exp(1j*k*2*np.pi*n/N)
    #     Xn.append(Xn_i)
    #     Xn_i = 0
    # return np.array(Xn)/N
import numpy as np
import my_dft
"""
@Hypo 1910273011
A script to test my_dft and my_idft
"""
Xn = np.array([1,2,3,4,5,6]) #change test vectors here
my_Xk = my_dft.dft(Xn)
my_Xn = my_dft.idft(my_Xk)
np_Xk = np.fft.fft(Xn)
np_Xn = np.fft.ifft(np_Xk)
print('My:\nDFT Xn with my_dft:',np.around(my_Xk,3))
print('IDFT my_Xk with my_idft:',np.around(my_Xn,3))
print('Numpy:\nFFT Xn with np.fft.fft:',np.around(np_Xk,3))
print('IFFT np_Xk with np.fft.ifft:',np.around(np_Xn,3))
Xn = np.array([1,1,1,1,1,1])
my_Xk = my_dft.dft(Xn)
my_Xn = my_dft.idft(my_Xk)
np_Xk = np.fft.fft(Xn)
np_Xn = np.fft.ifft(np_Xk)
print('My:\nDFT Xn with my_dft:',np.around(my_Xk,3))
print('IDFT my_Xk with my_idft:',np.around(my_Xn,3))
print('Numpy:\nFFT Xn with np.fft.fft:',np.around(np_Xk,3))
print('IFFT np_Xk with np.fft.ifft:',np.around(np_Xn,3))
@Hypo,雷海波 1910273011
* run my_Huffman
python my_Huffman.py
* result:
Origin input: ACBDEAAABCDE
Encode huffman: 1011001001111010100111000111
Decode huffman: ACBDEAAABCDE
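
The script uses a fixed codebook. As a complement, the sketch below shows one way a prefix-free codebook could be derived from the symbol frequencies with the Huffman procedure. It is not part of `my_Huffman.py`; `build_codebook`, `encode`, and `decode` are hypothetical names, and the individual codes it produces may differ from the hard-coded ones because ties can be broken in several ways.

```python
import heapq
from collections import Counter


def build_codebook(text):
    """Build a Huffman codebook {symbol: bit string} from symbol counts."""
    counts = Counter(text)
    # Heap entries: (weight, tie_breaker, {symbol: code built so far}).
    heap = [(w, i, {s: ''}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct symbol still needs one bit
        return {s: '0' for s in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)
        w1, _, right = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: '0' + c for s, c in left.items()}
        merged.update({s: '1' + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codebook):
    return ''.join(codebook[ch] for ch in text)


def decode(bits, codebook):
    reverse = {code: sym for sym, code in codebook.items()}
    out, current = [], ''
    for bit in bits:
        current += bit
        if current in reverse:  # prefix-free codes make the first match unambiguous
            out.append(reverse[current])
            current = ''
    return ''.join(out)


text = 'ACBDEAAABCDE'
book = build_codebook(text)
bits = encode(text, book)
print(book, len(bits), decode(bits, book) == text)
```

For this input the encoded length is 28 bits, the same as with the fixed codebook above, since that codebook is already optimal for these symbol counts.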
codebook = {'A': '10', 'B': '01', 'C': '110', 'D': '00', 'E': '111'}
uncodebook = dict((value, key) for key, value in codebook.items())


def write_huffman_code(uncodedata):
    codedata = ''
    for char in uncodedata:
        codedata += codebook[char]
    return codedata


def read_huffman_code(codedata):
    uncodedata = ''
    flag, i = 0, 0
    while i <= len(codedata):
        # emit a symbol as soon as the bits since the last match form a valid code
        if codedata[flag:i] in uncodebook:
            uncodedata += uncodebook[codedata[flag:i]]
            flag = i
        i += 1
    return uncodedata


text = 'ACBDEAAABCDE'  # input string consisting of A, B, C, D, E
print('Origin input:', text)
print('Encode huffman:', write_huffman_code(text))
print('Decode huffman:', read_huffman_code(write_huffman_code(text)))
## my_dct
This code is written in Python 3 and uses NumPy and opencv-python.
@Hypo,雷海波 1910273011
## Run code
python image_dct_and_idct.py
## Result:
QF=1 | QF=20 | QF=50 | QF=100
:-: | :-: | :-: | :-:
![image](./images/QF=1.png) | ![image](./images/QF=20.png) | ![image](./images/QF=50.png) | ![image](./images/QF=100.png)
* Image quality gets worse as QF increases.
* A larger QF means a coarser quantization step for the DCT blocks, so more information is lost.
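
The effect of QF can be reproduced on a single 8x8 block without OpenCV. The sketch below is an illustration under stated assumptions, not the repository's code: it builds the orthonormal DCT-II matrix in NumPy, uses a flat quantization matrix of 16 as a stand-in for the real table, and keeps the same scaling `16*T/(Q*QF)` as the script. The names `dct_matrix`, `quantize_roundtrip`, and `block_mse` are hypothetical.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that T = C @ X @ C.T and X = C.T @ T @ C."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


def quantize_roundtrip(block, qf, q=None):
    """DCT -> quantize -> dequantize -> inverse DCT for one square block."""
    c = dct_matrix(block.shape[0])
    if q is None:
        q = np.full(block.shape, 16.0)  # assumed flat table, not the real matrix
    t = c @ block.astype(np.float64) @ c.T
    qt = np.round(16.0 * t / (q * qf))   # larger QF -> coarser quantization step
    it = c.T @ (qt * q * qf / 16.0) @ c
    return np.clip(np.round(it), 0, 255).astype(np.uint8)


def block_mse(a, b):
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))


# A smooth diagonal ramp as a test block (values 50..162).
block = (50 + 8 * np.add.outer(np.arange(8), np.arange(8))).astype(np.uint8)
for qf in (1, 20, 50, 100):
    print(qf, block_mse(quantize_roundtrip(block, qf), block))
```

Clipping to the 0..255 range before converting back to `uint8` avoids wrap-around when the reconstructed values overshoot, which becomes likely at large QF.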
import numpy as np
import cv2
def block_dct_and_idct(g,QF):
Q = np.array([[8,16,19,22,26,27,29,34],
T = cv2.dct(g.astype(np.float32))
QT = np.round(16.0*T/(Q*QF))
IQT = np.round(QT*Q*QF/16)
IT = np.round(cv2.idct(IQT))
return IT.astype(np.uint8)
def image_dct_and_idct(I, QF):
    h, w = I.shape
    I = I[:8 * (h // 8), :8 * (w // 8)]  # crop to a whole number of 8x8 blocks
    output = np.zeros_like(I)
    for i in range(h // 8):
        for j in range(w // 8):
            output[i*8:(i+1)*8, j*8:(j+1)*8] = block_dct_and_idct(I[i*8:(i+1)*8, j*8:(j+1)*8], QF)
    return output


img = cv2.imread('./images/lena.jpg')
img_y = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)[:, :, 0]  # cv2.imread returns BGR; keep the Y component
# QF = 1
img_QF1 = image_dct_and_idct(img_y, 1)
# QF = 20
img_QF20 = image_dct_and_idct(img_y, 20)
# QF = 50
img_QF50 = image_dct_and_idct(img_y, 50)
# QF = 100
img_QF100 = image_dct_and_idct(img_y, 100)
k = cv2.waitKey(0)
if k == 27: # wait for ESC key to exit
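* SCI is the Science Citation Index. EI is the Engineering Index.
SCI (Science Citation Index) is a world-renowned journal literature retrieval tool published by the Institute for Scientific Information (ISI) in the United States; it selects its source journals through strict selection criteria and evaluation procedures. It mainly covers the natural sciences.
EI (Engineering Index) does not index basic theoretical research articles. It is a well-known comprehensive retrieval tool for engineering and technology, published by Engineering Information Inc. in the United States.
* Single-blind (reviewers know the authors' names and affiliations; authors do not know the reviewers) and double-blind (reviewers do not know the authors' names and affiliations; authors do not know the reviewers)
* Publishers
ACM: Association for Computing Machinery
AAAI: Association for the Advancement of Artificial Intelligence
* China Computer Federation (CCF)
CCF ranks journals and conferences into classes A, B, and C
* Journal zones (partitions)
Minor-category subjects: the 176 subject areas defined by the JCR Journal Ranking subject classification system. A journal can belong to only one major-category subject, but it may belong to several minor-category subjects.
The top 5% of a category is zone 1, 6%-20% is zone 2, 21%-50% is zone 3, and the rest is zone 4.
* Impact Factor (IF)
The total number of citations received in the report year (JCR year) by the papers a journal published in the previous two years, divided by the total number of papers the journal published in those two years.
* Reviewer recusal
* Copyright and fees
Copyright generally belongs to the publisher. Sign the copyright transfer agreement (Assignment of copyright) as required, paying attention to which parts must be printed, which must be handwritten, and the deadline for returning the agreement.
* LaTeX
MiKTeX, proTeXt, or TeX Live distribution
* Field: human-computer interaction and ubiquitous computing
Journals: TOCHI ACM Transactions on Computer-Human Interaction CCF A
IJHCS International Journal of Human-Computer Studies CCF A
Conferences: UbiComp ACM International Joint Conference on Pervasive and Ubiquitous Computing CCF A
CHI ACM Conference on Human Factors in Computing Systems CCF A
CSCW ACM Conference on Computer Supported Cooperative Work and Social Computing
* UbiComp:
collocated with the ACM International Symposium on Wearable Computers (ISWC'19)
Submissions are divided into papers, posters, and demos. Papers are mostly long papers, generally 10 to 20 pages, and are grouped into sessions by topic; the chance of acceptance depends on the topic. The conference emphasizes ideas and requires detailed experiments and conclusions. Review is single-blind. The conference receives 400-500 submissions per year on average, and the acceptance rate over the past few years has been 20-25%.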
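* Writing conventions for papers
Figures and tables: fonts, spacing, etc. References: use one consistent format
1. "the": titles do not need "the"; a singular countable noun cannot appear on its own without an article; use "the" when referring to one specific thing or when the noun is followed by a modifying phrase; there are also special cases.
2. Single quotes (‘’) or double quotes (“”): British English is the opposite of Chinese usage; single quotes are used by default, and double quotes only inside single quotes.
5. Commas in coordinate structures: for items joined by and, nor, but, or, yet, or so, when there are three or more items the last one does not need to be separated by a comma (a, b and c); with fewer than three items a comma may be used.
6. e.g. and i.e.: generally used only inside parentheses; in formal running text the full forms "for example" and "that is" are preferred.
9. In the introduction, use the simple present tense when describing what you are going to do. When stating what others did, use the past tense or past perfect, but use the simple present when stating specific findings. In the data analysis section, use the simple past for your own work, for example "we adopted data from ...", "we analyzed ...". In the discussion (main findings, implications, limitations, and so on), use the simple past for what you did. In the conclusion, use the simple present when summarizing the work of the paper.
* Other points to note
1. Does the journal charge page charges? If the paper is accepted, can you afford the journal's full publication fees?
5. Do the figures and tables meet the journal's requirements on number, size, and resolution? How many colour figures are there? Figures that work in greyscale, such as statistical plots, are best converted to greyscale, because colour charges are expensive. Are there requirements on the figure file format? Generally only EPS or TIFF is accepted. Are there requirements on the colour mode? In the past CMYK was usually required, whereas many journals now require RGB.
* Figures and tables in papers
3. Does the number of figures meet the requirements? Which figures can be merged? Which can be moved to the Supporting Material? Which can be combined into panels (A, B, C)?
* LaTeX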
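\documentclass[UTF8]{ctexart} % Chinese version of the article class, with UTF-8 encoding
\usepackage{amsmath} % load a package (here the math package); several packages can be loaded
\begin{document} % start of the document
\title{杂谈勾股定理} % the title goes inside the braces
\author{张三} % the author's name goes inside the braces
\date{\today} % \today inserts the current date automatically
\maketitle % required after the lines above: it typesets the title block, otherwise they have no effect
\begin{thebibliography}{99} % start of the reference list
\bibitem{1}失野健太郎.几何的有名定理.上海科学技术出版社,1986. % reference 1
\bibitem{quanjing}曲安金.商高、赵爽与刘辉关于勾股定理的证明.数学传播,20(3),1998. % reference 2
\end{thebibliography} % end of the reference list
\begin{appendix} % start of the appendix
\small 勾股定理又叫商高定理,国外也称百牛定理。
\end{appendix} % end of the appendix
\end{document} % end of the document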
