Commit a2d3860d
Authored by dyonghan on Aug 29, 2020; committed by zhengnengjin on Sep 08, 2020.

!32 modified gcn and lstm experiments
Merge pull request !32 from dyonghan/gcn
Parents: 58a87f49, 29bdaea5

Showing 13 changed files with 330 additions and 315 deletions (+330 −315).
- graph_convolutional_network/README.md (+184 −156)
- lstm/README.md (+111 −127)
- lstm/config.py (+6 −6)
- lstm/images/LSTM1.png … lstm/images/LSTM9.png (+0 −0)
- lstm/main.py (+29 −26)
graph_convolutional_network/README.md @ a2d3860d
(This diff is collapsed.)
nlp_lstm/README.md → lstm/README.md @ a2d3860d
...
...
```diff
@@ -14,17 +14,13 @@ RNN is a chain of many repeating neural-network modules; in a standard RNN
 ![LSTM1](./images/LSTM1.png)
 **The repeating module in a standard RNN contains a single tanh layer**
 LSTM has a similar chain structure, but its repeating module is different: it contains four neural-network layers that interact in a special way.
 ![LSTM2](./images/LSTM2.png)
 **LSTM diagram**
 First, let's look at the symbols in the figure:
-![LSTM3](./IMAGES/LSTM3.png)
+![LSTM3](./images/LSTM3.png)
 In the diagram, each line carries an entire vector from the output of one node to the inputs of others. The pink circles represent pointwise operations, such as vector addition, while the yellow boxes are learned neural-network layers. Merging lines denote concatenation; a forking line denotes its content being copied, with the copies going to different locations.
```
...
...
```diff
@@ -66,19 +62,10 @@ The sigmoid layer outputs numbers between 0 and 1; the pointwise multiplication decides how much information passes through.
 ### Dataset
-IMDB is a movie website similar to Douban, and the dataset used in this experiment consists of user reviews from that site. The IMDB dataset contains 50,000 movie reviews, 25,000 for training and 25,000 for testing, each labeled as a positive or a negative review, so this experiment can be treated as a binary classification problem.
-- From Huawei Cloud Object Storage Service (OBS): the dataset can be downloaded directly via the OBS link.
-  [Dataset link]: https://obs-deeplearning.obs.cn-north-1.myhuaweicloud.com/obs-80d2/aclImdb_v1.tar.gz
-- From the Stanford University website:
-  [Dataset link]: http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
+IMDB is a movie website similar to Douban, and the dataset used in this experiment consists of user reviews from that site. The IMDB dataset contains 50,000 movie reviews, 25,000 for training and 25,000 for testing, each labeled as a positive or a negative review, so this experiment can be treated as a binary classification problem. IMDB dataset homepage: [Large Movie Review Dataset](http://ai.stanford.edu/~amaas/data/sentiment/).
+- Option 1: download [aclImdb_v1.tar.gz](http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz) from the Stanford site and extract it.
+- Option 2: download [aclImdb_v1.tar.gz](https://obs-deeplearning.obs.cn-north-1.myhuaweicloud.com/obs-80d2/aclImdb_v1.tar.gz) from Huawei Cloud OBS and extract it.
```
...
...
```diff
@@ -101,7 +88,7 @@
 ### Dataset preparation
-Use the [IMDB movie review dataset](http://ai.stanford.edu/~amaas/data/sentiment/) as the experiment data. We also need to download the [GloVe](http://nlp.stanford.edu/data/glove.6B.zip) file and add a new line `400000 300` at the beginning of glove.6B.300d.txt, meaning that 400,000 words are read in total, each represented by a 300-dimensional word vector.
+Use the [IMDB movie review dataset](http://ai.stanford.edu/~amaas/data/sentiment/) as the experiment data. We also need to download the [GloVe](http://nlp.stanford.edu/data/glove.6B.zip) file and add a new line `400000 200` at the beginning of glove.6B.200d.txt, meaning that 400,000 words are read in total, each represented by a 200-dimensional word vector.
```
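The header line makes a GloVe file readable by gensim's word2vec loader. A minimal sketch of that prepend step, not taken from the course code: file names and counts below are toy stand-ins for glove.6B.200d.txt and its `400000 200` header.

```python
# Sketch: prepend the "<word_count> <dim>" header a word2vec-format loader
# expects at the top of a GloVe file. A two-word, 3-dimensional toy file
# stands in for the real glove.6B.200d.txt.

def prepend_glove_header(src_path, dst_path, vocab_size, dim):
    """Write dst_path as a header line followed by the original GloVe lines."""
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(f"{vocab_size} {dim}\n")
        for line in src:
            dst.write(line)

# Build the toy input file, then convert it.
with open("toy_glove.txt", "w", encoding="utf-8") as f:
    f.write("the 0.1 0.2 0.3\nmovie 0.4 0.5 0.6\n")
prepend_glove_header("toy_glove.txt", "toy_glove.word2vec.txt", 2, 3)

with open("toy_glove.word2vec.txt", encoding="utf-8") as f:
    print(f.readline().strip())  # prints: 2 3
```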
### Define the evaluation standard
...
...
@@ -155,29 +142,23 @@ experiment
Import the MindSpore modules and auxiliary modules:
```python
import os
import shutil
import math
import argparse
import json
from itertools import chain
import numpy as np
from config import lstm_cfg as cfg
from easydict import EasyDict as edict

import mindspore.nn as nn
import mindspore.context as context
import mindspore.dataset as ds
from mindspore import Tensor
from mindspore.ops import operations as P
from mindspore.common.initializer import initializer
from mindspore.common.parameter import Parameter
from mindspore.mindrecord import FileWriter
from mindspore.train import Model
from mindspore.train.callback import Callback
from mindspore.nn.metrics import Accuracy
from mindspore.train.serialization import load_checkpoint, load_param_into_net
# Install gensim with 'pip install gensim'
import gensim
from mindspore.train.callback import CheckpointConfig, ModelCheckpoint, TimeMonitor, LossMonitor
```
### Preprocess the dataset
...
...
```diff
@@ -282,7 +263,7 @@ class ImdbParser():
             encoded_features.append(encoded_sentence)
         self.__features[seg] = encoded_features

-    def __padding_features(self, seg, maxlen=500, pad=0):
+    def __padding_features(self, seg, maxlen=200, pad=0):
         """ pad all features to the same length """
         padded_features = []
         for feature in self.__features[seg]:
```
...
...
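The hunk above shrinks the fixed feature length from 500 to 200 tokens, matching the smaller embedding and sequence length in the new config. A stand-alone sketch of what such a padding step does; the helper below is illustrative, not the class's private method:

```python
def pad_feature(feature, maxlen=200, pad=0):
    """Truncate a token-id list to maxlen, or right-pad it with `pad`."""
    if len(feature) >= maxlen:
        return feature[:maxlen]
    return feature + [pad] * (maxlen - len(feature))

short = pad_feature([3, 1, 4], maxlen=5)
too_long = pad_feature([1, 2, 3, 4, 5, 6], maxlen=5)
print(short)     # [3, 1, 4, 0, 0]
print(too_long)  # [1, 2, 3, 4, 5]
```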
@@ -374,7 +355,7 @@ def convert_to_mindrecord(embed_size, aclimdb_path, preprocess_path, glove_path)
```python
    _convert_to_mindrecord(preprocess_path, test_features, test_labels, training=False)
```
Define the dataset-creation function `lstm_create_dataset`, then create the training set `ds_train` and the validation set `ds_eval`.
```python
def lstm_create_dataset(data_home, batch_size, repeat_num=1, training=True):
    ...
    data_set = data_set.repeat(count=repeat_num)
    return data_set

ds_train = lstm_create_dataset(args.preprocess_path, cfg.batch_size)
ds_eval = lstm_create_dataset(args.preprocess_path, cfg.batch_size, training=False)
```
### Define the network
...
...
@@ -399,6 +383,7 @@ def lstm_create_dataset(data_home, batch_size, repeat_num=1, training=True):
Define the `lstm_default_state` function to initialize the network parameters and network state.
```python
# Initialize short-term memory (h) and long-term memory (c) to 0
def lstm_default_state(batch_size, hidden_size, num_layers, bidirectional):
    """init default input."""
    num_directions = 1
    ...
```
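The function body is elided in this diff view; presumably it doubles `num_directions` for a bidirectional LSTM and zero-fills the `h`/`c` tensors. A sketch of just the shape arithmetic, stated as an assumption rather than the function's actual code:

```python
def lstm_state_shape(batch_size, hidden_size, num_layers, bidirectional):
    """Shape of the zero-initialized h and c states of an LSTM."""
    num_directions = 2 if bidirectional else 1
    return (num_layers * num_directions, batch_size, hidden_size)

# With the updated config values (batch_size=32, num_hiddens=100,
# num_layers=1, bidirectional=False):
print(lstm_state_shape(32, 100, 1, False))  # (1, 32, 100)
```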
@@ -431,6 +416,7 @@ def lstm_default_state(batch_size, hidden_size, num_layers, bidirectional):
```python
class SentimentNet(nn.Cell):
    """Sentiment network structure."""
    def __init__(self,
                 vocab_size,
                 embed_size,
                 ...
                 weight,
                 batch_size):
        super(SentimentNet, self).__init__()
        # Map words to vectors
        self.embedding = nn.Embedding(vocab_size, embed_size, embedding_table=weight)
        ...
```
...
@@ -463,16 +450,38 @@ class SentimentNet(nn.Cell):
```python
        self.decoder = nn.Dense(num_hiddens * 2, num_classes)

    def construct(self, inputs):
        # input: (64,500,300)
        embeddings = self.embedding(inputs)
        embeddings = self.trans(embeddings, self.perm)
        output, _ = self.encoder(embeddings, (self.h, self.c))
        # states[i] size(64,200) -> encoding.size(64,400)
        encoding = self.concat((output[0], output[199]))
        outputs = self.decoder(encoding)
        return outputs
```
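With `output` laid out as (seq_len, batch, hidden), the concat above joins the first and last time steps along the feature axis. A NumPy sketch of the shape effect, taking the updated config values (seq_len 200, batch 32, hidden 100) as assumptions:

```python
import numpy as np

seq_len, batch, hidden = 200, 32, 100  # assumed from the updated config
output = np.zeros((seq_len, batch, hidden), dtype=np.float32)

# First time step + last time step, concatenated along the feature axis.
encoding = np.concatenate((output[0], output[seq_len - 1]), axis=1)
print(encoding.shape)  # (32, 200)
```

This matches the decoder's expected input width of `num_hiddens * 2`.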
### Define the callback function
Define the callback function `EvalCallBack`: while training, validate the model's accuracy every fixed number of epochs. After training finishes, inspecting how the accuracy evolved makes it easy to pick the relatively best model, so training and validation run in sync.
```python
class EvalCallBack(Callback):
    def __init__(self, model, eval_dataset, eval_per_epoch, epoch_per_eval):
        self.model = model
        self.eval_dataset = eval_dataset
        self.eval_per_epoch = eval_per_epoch
        self.epoch_per_eval = epoch_per_eval

    def epoch_end(self, run_context):
        cb_param = run_context.original_args()
        cur_epoch = cb_param.cur_epoch_num
        if cur_epoch % self.eval_per_epoch == 0:
            acc = self.model.eval(self.eval_dataset, dataset_sink_mode=False)
            self.epoch_per_eval["epoch"].append(cur_epoch)
            self.epoch_per_eval["acc"].append(acc["acc"])
            print(acc)
```
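After training, `epoch_per_eval` holds one accuracy per evaluated epoch, so the best checkpoint can be read off directly. A small illustrative sketch with made-up accuracy values:

```python
# Hypothetical record, filled in the way EvalCallBack appends entries.
epoch_per_eval = {"epoch": [1, 2, 3, 4], "acc": [0.71, 0.80, 0.83, 0.82]}

best_epoch, best_acc = max(
    zip(epoch_per_eval["epoch"], epoch_per_eval["acc"]),
    key=lambda pair: pair[1],
)
print(best_epoch, best_acc)  # 3 0.83
```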
### Configure runtime information
Use the `parser` module to pass in the information needed at runtime, such as the dataset path and the GloVe path. The benefit is that frequently changing configuration can be supplied when running the code, which is more flexible.
...
...
@@ -486,45 +495,39 @@ class SentimentNet(nn.Cell):
- device_target: specifies the GPU or CPU environment.
```python
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='MindSpore LSTM Example')
    parser.add_argument('--preprocess', type=str, default='false', choices=['true', 'false'],
                        help='whether to preprocess data.')
    parser.add_argument('--aclimdb_path', type=str, default="./aclImdb",
                        help='path where the dataset is stored.')
    parser.add_argument('--glove_path', type=str, default="./glove",
                        help='path where the GloVe is stored.')
    parser.add_argument('--preprocess_path', type=str, default="./preprocess",
                        help='path where the pre-process data is stored.')
    parser.add_argument('--ckpt_path', type=str, default="./",
                        help='the path to save the checkpoint file.')
    parser.add_argument('--pre_trained', type=str, default=None,
                        help='the pretrained checkpoint file path.')
    parser.add_argument('--device_target', type=str, default="GPU", choices=['GPU', 'CPU'],
                        help='the target device to run, support "GPU", "CPU". Default: "GPU".')
    args = parser.parse_args(['--device_target', 'CPU', '--preprocess', 'true'])

    context.set_context(mode=context.GRAPH_MODE, save_graphs=False, device_target=args.device_target)

    if args.preprocess == "true":
        print("============== Starting Data Pre-processing ==============")
        convert_to_mindrecord(cfg.embed_size, args.aclimdb_path, args.preprocess_path, args.glove_path)
        print("======================= Successful =======================")

    # Instantiate SentimentNet to create the network.
    embedding_table = np.loadtxt(os.path.join(args.preprocess_path, "weight.txt")).astype(np.float32)
    network = SentimentNet(vocab_size=embedding_table.shape[0],
                           embed_size=cfg.embed_size,
                           num_hiddens=cfg.num_hiddens,
                           num_layers=cfg.num_layers,
                           bidirectional=cfg.bidirectional,
                           num_classes=cfg.num_classes,
                           weight=Tensor(embedding_table),
                           batch_size=cfg.batch_size)
```
Create a dictionary iterator via the `create_dict_iterator` method and read data from the created dataset `ds_train`.
...
...
@@ -542,27 +545,32 @@ print(f"The feature of the first item in the first batch is below vector:\n{firs
### Define the optimizer and loss function
```python
loss = nn.SoftmaxCrossEntropyWithLogits(is_grad=False, sparse=True)
opt = nn.Momentum(network.trainable_params(), cfg.learning_rate, cfg.momentum)
loss_cb = LossMonitor()
```
### Train and validate the model in sync
Load the training dataset (`ds_train`), configure `CheckPoint` generation, and use the `model.train` interface to train the model. This step takes about 7 minutes on GPU and longer on CPU. The output shows the loss decreasing steadily as training proceeds.
```python
model = Model(network, loss, opt, {'acc': Accuracy()})
print("============== Starting Training ==============")
config_ck = CheckpointConfig(save_checkpoint_steps=ds_train.get_dataset_size(),
                             keep_checkpoint_max=cfg.keep_checkpoint_max)
ckpoint_cb = ModelCheckpoint(prefix="lstm", directory=args.ckpt_path, config=config_ck)
time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())
if args.device_target == "CPU":
    epoch_per_eval = {"epoch": [], "acc": []}
    eval_cb = EvalCallBack(model, ds_eval, 1, epoch_per_eval)
    model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb, eval_cb],
                dataset_sink_mode=False)
else:
    epoch_per_eval = {"epoch": [], "acc": []}
    eval_cb = EvalCallBack(model, ds_eval, 1, epoch_per_eval)
    model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb, eval_cb])
print("============== Training Success ==============")
```
```
epoch: 1 step: 7, loss is 0.6856
epoch: 1 step: 8, loss is 0.6819
epoch: 1 step: 9, loss is 0.7372
epoch: 1 step: 10, loss is 0.6948
...
epoch: 10 step 774, loss is 0.3010297119617462
epoch: 10 step 775, loss is 0.4418136477470398
epoch: 10 step 776, loss is 0.29638347029685974
epoch: 10 step 777, loss is 0.38901057839393616
epoch: 10 step 778, loss is 0.3772362470626831
epoch: 10 step 779, loss is 0.4098552167415619
epoch: 10 step 780, loss is 0.41440871357917786
epoch: 10 step 781, loss is 0.2255304455757141
Epoch time: 63056.078, per step time: 80.738, avg loss: 0.354
************************************************************
{'acc': 0.8312996158770807}
============== Training Success ==============
```
### Model validation
Create and load the validation dataset (`ds_eval`), load the CheckPoint file saved during **training**, and run validation to check the model quality. This step takes about 30 seconds.
```python
args.ckpt_path = f'./lstm-{cfg.num_epochs}_390.ckpt'
print("============== Starting Testing ==============")
ds_eval = lstm_create_dataset(args.preprocess_path, cfg.batch_size, training=False)
param_dict = load_checkpoint(args.ckpt_path)
load_param_into_net(network, param_dict)
if args.device_target == "CPU":
    acc = model.eval(ds_eval, dataset_sink_mode=False)
else:
    acc = model.eval(ds_eval)
print("============== {} ==============".format(acc))
```
```
============== Starting Testing ==============
============== {'acc': 0.8495592948717948} ==============
```
### Evaluating the training result
From the output above, after 10 epochs the sentiment-classification accuracy on the validation dataset is around 83%, a basically satisfactory result.
## Experiment summary
...
...
nlp_lstm/config.py → lstm/config.py @ a2d3860d
...
...
```diff
@@ -8,12 +8,12 @@ lstm_cfg = edict({
     'num_classes': 2,
     'learning_rate': 0.1,
     'momentum': 0.9,
-    'num_epochs': 1,
-    'batch_size': 64,
-    'embed_size': 300,
+    'num_epochs': 10,
+    'batch_size': 32,
+    'embed_size': 200,
     'num_hiddens': 100,
-    'num_layers': 2,
-    'bidirectional': True,
-    'save_checkpoint_steps': 390,
+    'num_layers': 1,
+    'bidirectional': False,
+    'save_checkpoint_steps': 390 * 5,
     'keep_checkpoint_max': 10
 })
```
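For quick reference, the updated hyperparameters written as a plain dict; the real lstm/config.py wraps these same values in an `edict`:

```python
# Plain-dict stand-in for the edict defined in lstm/config.py after this commit.
lstm_cfg = {
    'num_classes': 2,
    'learning_rate': 0.1,
    'momentum': 0.9,
    'num_epochs': 10,
    'batch_size': 32,
    'embed_size': 200,
    'num_hiddens': 100,
    'num_layers': 1,
    'bidirectional': False,
    'save_checkpoint_steps': 390 * 5,
    'keep_checkpoint_max': 10,
}
print(lstm_cfg['save_checkpoint_steps'])  # 1950
```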
nlp_lstm/images/LSTM1.png → lstm/images/LSTM1.png (file moved)
nlp_lstm/images/LSTM2.png → lstm/images/LSTM2.png (file moved)
nlp_lstm/images/LSTM3.png → lstm/images/LSTM3.png (file moved)
nlp_lstm/images/LSTM4.png → lstm/images/LSTM4.png (file moved)
nlp_lstm/images/LSTM5.png → lstm/images/LSTM5.png (file moved)
nlp_lstm/images/LSTM6.png → lstm/images/LSTM6.png (file moved)
nlp_lstm/images/LSTM7.png → lstm/images/LSTM7.png (file moved)
nlp_lstm/images/LSTM8.png → lstm/images/LSTM8.png (file moved)
nlp_lstm/images/LSTM9.png → lstm/images/LSTM9.png (file moved)
nlp_lstm/main.py → lstm/main.py @ a2d3860d
```diff
 import os
 import shutil
 import math
 import argparse
 import json
 from itertools import chain
 import numpy as np
 from config import lstm_cfg as cfg
@@ -12,10 +9,9 @@ import mindspore.context as context
 import mindspore.dataset as ds
 from mindspore.ops import operations as P
 from mindspore import Tensor
 from mindspore.common.initializer import initializer
 from mindspore.common.parameter import Parameter
 from mindspore.mindrecord import FileWriter
 from mindspore.train import Model
 from mindspore.train.callback import Callback
 from mindspore.nn.metrics import Accuracy
 from mindspore.train.serialization import load_checkpoint, load_param_into_net
 from mindspore.train.callback import ModelCheckpoint, CheckpointConfig, LossMonitor, TimeMonitor
```
...
...
```diff
@@ -119,7 +115,7 @@ class ImdbParser():
             encoded_features.append(encoded_sentence)
         self.__features[seg] = encoded_features

-    def __padding_features(self, seg, maxlen=500, pad=0):
+    def __padding_features(self, seg, maxlen=200, pad=0):
         """ pad all features to the same length """
         padded_features = []
         for feature in self.__features[seg]:
```
...
...
```diff
@@ -287,11 +283,27 @@ class SentimentNet(nn.Cell):
         embeddings = self.trans(embeddings, self.perm)
         output, _ = self.encoder(embeddings, (self.h, self.c))
         # states[i] size(64,200) -> encoding.size(64,400)
-        encoding = self.concat((output[0], output[499]))
+        encoding = self.concat((output[0], output[199]))
         outputs = self.decoder(encoding)
         return outputs
+
+class EvalCallBack(Callback):
+    def __init__(self, model, eval_dataset, eval_per_epoch, epoch_per_eval):
+        self.model = model
+        self.eval_dataset = eval_dataset
+        self.eval_per_epoch = eval_per_epoch
+        self.epoch_per_eval = epoch_per_eval
+
+    def epoch_end(self, run_context):
+        cb_param = run_context.original_args()
+        cur_epoch = cb_param.cur_epoch_num
+        if cur_epoch % self.eval_per_epoch == 0:
+            acc = self.model.eval(self.eval_dataset, dataset_sink_mode=False)
+            self.epoch_per_eval["epoch"].append(cur_epoch)
+            self.epoch_per_eval["acc"].append(acc["acc"])
+            print(acc)
```
```diff
 if __name__ == '__main__':
     parser = argparse.ArgumentParser(description='MindSpore LSTM Example')
     parser.add_argument('--preprocess', type=str, default='true', choices=['true', 'false'],
```
...
...
```diff
@@ -310,10 +322,7 @@ if __name__ == '__main__':
                         help='the target device to run, support "GPU", "CPU". Default: "GPU".')
     args = parser.parse_args(['--device_target', 'CPU', '--preprocess', 'true'])
     context.set_context(mode=context.GRAPH_MODE, save_graphs=False, device_target=args.device_target)
     if args.preprocess == "true":
         print("============== Starting Data Pre-processing ==============")
```
...
...
@@ -321,6 +330,7 @@ if __name__ == '__main__':
print
(
"======================= Successful ======================="
)
ds_train
=
lstm_create_dataset
(
args
.
preprocess_path
,
cfg
.
batch_size
)
ds_eval
=
lstm_create_dataset
(
args
.
preprocess_path
,
cfg
.
batch_size
,
training
=
False
)
iterator
=
ds_train
.
create_dict_iterator
().
get_next
()
first_batch_label
=
iterator
[
"label"
]
...
...
```diff
@@ -344,23 +354,16 @@ if __name__ == '__main__':
     model = Model(network, loss, opt, {'acc': Accuracy()})
     print("============== Starting Training ==============")
-    config_ck = CheckpointConfig(save_checkpoint_steps=cfg.save_checkpoint_steps,
+    config_ck = CheckpointConfig(save_checkpoint_steps=ds_train.get_dataset_size(),
                                  keep_checkpoint_max=cfg.keep_checkpoint_max)
     ckpoint_cb = ModelCheckpoint(prefix="lstm", directory=args.ckpt_path, config=config_ck)
     time_cb = TimeMonitor(data_size=ds_train.get_dataset_size())
     if args.device_target == "CPU":
-        model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb], dataset_sink_mode=False)
-    else:
-        model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb])
-    print("============== Training Success ==============")
-    args.ckpt_path = f'./lstm-{cfg.num_epochs}_390.ckpt'
-    print("============== Starting Testing ==============")
-    ds_eval = lstm_create_dataset(args.preprocess_path, cfg.batch_size, training=False)
-    param_dict = load_checkpoint(args.ckpt_path)
-    load_param_into_net(network, param_dict)
-    if args.device_target == "CPU":
-        acc = model.eval(ds_eval, dataset_sink_mode=False)
-    else:
-        acc = model.eval(ds_eval)
-    print("============== {} ==============".format(acc))
\ No newline at end of file
+        epoch_per_eval = {"epoch": [], "acc": []}
+        eval_cb = EvalCallBack(model, ds_eval, 1, epoch_per_eval)
+        model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb, eval_cb], dataset_sink_mode=False)
+    else:
+        epoch_per_eval = {"epoch": [], "acc": []}
+        eval_cb = EvalCallBack(model, ds_eval, 1, epoch_per_eval)
+        model.train(cfg.num_epochs, ds_train, callbacks=[time_cb, ckpoint_cb, loss_cb, eval_cb])
+    print("============== Training Success ==============")
\ No newline at end of file
```