Project: sfewfsaf / Synonyms
Commit 4a44eff8
Authored May 28, 2018 by Hai Liang Wang

#60 compare: support swapping sentences

Parent: c580b3d8
Showing 5 changed files with 32 additions and 21 deletions (+32 -21)
Changed files:
CHANGELOG.md (+2 -0)
Requirements.txt (+1 -1)
demo.py (+9 -0)
setup.py (+1 -1)
synonyms/synonyms.py (+19 -19)
CHANGELOG.md
+# 3.6
+* Fix Bug: compare now returns the same score when the two sentences are swapped [#60](https://github.com/huyingxi/Synonyms/issues/60)
 # 3.5
 * Based on real-world results, reduce the influence of vector distance on the similarity score
...
Requirements.txt
-synonyms>=3.5
\ No newline at end of file
+synonyms>=3.6
\ No newline at end of file
demo.py
...
@@ -114,6 +114,15 @@ class Test(unittest.TestCase):
         r = synonyms.compare(sen1, sen2, seg=False)
         print("%s vs %s" % (sen1, sen2), r)

+    def test_swap_sent(self):
+        print("test_swap_sent")
+        s1 = synonyms.compare("教学", "老师")
+        s2 = synonyms.compare("老师", "教学")
+        print('"教学", "老师": %s ' % s1)
+        print('"老师", "教学": %s ' % s2)
+        assert s1 == s2, "Scores should be the same after swap sents"
+
     def test_nearby(self):
         synonyms.display("奥运")  # synonyms.display calls synonyms.nearby
         synonyms.display("北新桥")  # synonyms.display calls synonyms.nearby
...
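The added test checks one word pair with exact equality. Below is a minimal sketch of how the same swap check could be run over several inputs at once. `assert_symmetric` and `jaccard` are hypothetical names that do not appear in this commit, and `jaccard` is only a stand-in scorer so the snippet runs without the synonyms package. Assuming the package is installed, `synonyms.compare` could be passed in its place.

```python
import itertools
import math
from typing import Callable, Iterable


def assert_symmetric(
    compare: Callable[[str, str], float],
    sentences: Iterable[str],
    rel_tol: float = 1e-9,
) -> None:
    """Check compare(a, b) against compare(b, a) for every unordered pair."""
    for a, b in itertools.combinations(list(sentences), 2):
        forward = compare(a, b)
        backward = compare(b, a)
        # A tolerance is used here instead of == (an assumption of this sketch,
        # the committed test uses exact equality).
        assert math.isclose(forward, backward, rel_tol=rel_tol, abs_tol=1e-12), (
            "Scores differ after swap: %r vs %r -> %s / %s" % (a, b, forward, backward)
        )


def jaccard(a: str, b: str) -> float:
    """Hypothetical stand-in scorer: character-level Jaccard similarity."""
    chars_a, chars_b = set(a), set(b)
    if not chars_a and not chars_b:
        return 1.0
    return len(chars_a & chars_b) / len(chars_a | chars_b)


if __name__ == "__main__":
    assert_symmetric(jaccard, ["教学", "老师", "教师", "北新桥"])
    print("all pairs symmetric")
```

Checking every unordered pair covers both the equal-length and the unequal-length branches of the comparison once the input list mixes lengths.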
setup.py
...
@@ -13,7 +13,7 @@ Welcome
 setup(
     name='synonyms',
-    version='3.5.0',
+    version='3.6.0',
     description='Chinese Synonyms for Natural Language Processing and Understanding',
     long_description=LONGDOC,
     author='Hai Liang Wang, Hu Ying Xi',
...
synonyms/synonyms.py
...
@@ -211,28 +211,28 @@ def _nearby_levenshtein_distance(s1, s2):
     Use words that are close in vector space to optimize the edit-distance computation
     '''
     s1_len, s2_len = len(s1), len(s2)
-    maxlen = max(s1_len, s2_len)
-    first, second = (s2, s1) if s1_len == maxlen else (s1, s2)
-    ft_1 = set()  # all related words with first sentence
+    maxlen = s1_len
+    if s1_len == s2_len:
+        first, second = sorted([s1, s2])
+    elif s1_len < s2_len:
+        first = s1
+        second = s2
+        maxlen = s2_len
+    else:
+        first = s2
+        second = s1
+    ft = set()  # all related words with first sentence
     for x in first:
-        ft_1.add(x)
+        ft.add(x)
         n, _ = nearby(x)
-        for o in n[:5]:
-            ft_1.add(o)
-    ft_2 = set()  # all related words with second sentence
-    for x in second:
-        ft_2.add(x)
-        n, _ = nearby(x)
-        for o in n[:5]:
-            ft_2.add(0)
+        for o in n[:10]:
+            ft.add(o)
     scores = []
-    if len(ft_1) == 0 or len(ft_2) == 0: return 0.0  # invalid length
-    for x in ft_1:
-        for y in ft_2:
-            scores.append([_levenshtein_distance(x, y)])
-    s = np.sum(scores) / (s1_len * s2_len)
+    for x in second:
+        scores.append(max([_levenshtein_distance(x, y) for y in ft]))
+    s = np.sum(scores) / maxlen
     return s

 def _similarity_distance(s1, s2, ignore):
...
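The hunk above makes the result independent of argument order by choosing `first` and `second` canonically: the shorter token list becomes `first`, and ties in length are broken with `sorted`. The following is a minimal, self-contained sketch of that selection logic. `toy_nearby`, `toy_similarity`, `_TOY_NEIGHBORS` and `nearby_similarity` are hypothetical stand-ins for `nearby`, `_levenshtein_distance` and the patched function, the neighbor table is invented for illustration, and the empty-input guard is an assumption of this sketch that is not part of the commit.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical neighbor table standing in for the word-vector lookup.
_TOY_NEIGHBORS: Dict[str, List[str]] = {
    "教学": ["教师", "授课"],
    "老师": ["教师"],
}


def toy_nearby(word: str) -> Tuple[List[str], List[float]]:
    """Stand-in for nearby(): returns (related_words, scores)."""
    related = _TOY_NEIGHBORS.get(word, [])
    return related, [1.0] * len(related)


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def toy_similarity(a: str, b: str) -> float:
    """Stand-in for _levenshtein_distance(): 1 - distance / longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def nearby_similarity(s1: List[str], s2: List[str]) -> float:
    s1_len, s2_len = len(s1), len(s2)
    maxlen = s1_len
    if s1_len == s2_len:
        first, second = sorted([s1, s2])  # canonical order for ties
    elif s1_len < s2_len:
        first, second, maxlen = s1, s2, s2_len
    else:
        first, second = s2, s1
    ft: Set[str] = set()  # words related to the first sentence
    for x in first:
        ft.add(x)
        related, _ = toy_nearby(x)
        ft.update(related[:10])
    if not ft:  # guard added in this sketch only
        return 0.0
    scores = [max(toy_similarity(x, y) for y in ft) for x in second]
    return sum(scores) / maxlen


if __name__ == "__main__":
    print(nearby_similarity(["教学"], ["老师"]))
    print(nearby_similarity(["老师"], ["教学"]))
```

Because the choice of `first` and `second` depends only on the unordered pair of inputs, swapping the arguments reaches the same code path with the same values, which is the property `test_swap_sent` asserts.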