Skip to content
体验新版
项目
组织
正在加载...
登录
切换导航
打开侧边栏
Hypo
candock
提交
6043de95
C
candock
项目概览
Hypo
/
candock
通知
1
Star
0
Fork
0
代码
文件
提交
分支
Tags
贡献者
分支图
Diff
Issue
0
列表
看板
标记
里程碑
合并请求
0
Wiki
0
Wiki
分析
仓库
DevOps
项目成员
Pages
C
candock
项目概览
项目概览
详情
发布
仓库
仓库
文件
提交
分支
标签
贡献者
分支图
比较
Issue
0
Issue
0
列表
看板
标记
里程碑
合并请求
0
合并请求
0
Pages
分析
分析
仓库分析
DevOps
Wiki
0
Wiki
成员
成员
收起侧边栏
关闭侧边栏
动态
分支图
创建新Issue
提交
Issue看板
前往新版Gitcode,体验更适合开发者的 AI 搜索 >>
提交
6043de95
编写于
4月 04, 2020
作者:
H
hypox64
浏览文件
操作
浏览文件
下载
电子邮件补丁
差异文件
Update simple_test.py
上级
97149ce5
变更
11
显示空白变更内容
内联
并排
Showing
11 changed file
with
438 addition
and
164 deletion
+438
-164
.gitignore
.gitignore
+0
-1
README.md
README.md
+46
-82
checkpoints/pretrained/pretrained_model_URL
checkpoints/pretrained/pretrained_model_URL
+4
-1
datasets/simple_test/test_data_URL
datasets/simple_test/test_data_URL
+2
-1
docs/how_to_run(sleep_stage).md
docs/how_to_run(sleep_stage).md
+0
-0
docs/log_eg.txt
docs/log_eg.txt
+270
-0
docs/sleep_stage.md
docs/sleep_stage.md
+71
-0
heatmap.py
heatmap.py
+9
-3
options.py
options.py
+9
-9
simple_test.py
simple_test.py
+18
-62
train.py
train.py
+9
-5
未找到文件。
.gitignore
浏览文件 @
6043de95
...
...
@@ -137,5 +137,4 @@ dmypy.json
/python_test.py
*.pth
*.edf
*log*
*.png
\ No newline at end of file
README.md
浏览文件 @
6043de95
# candock
[
这原本是一个用于记录毕业设计的日志仓库
](
<
https://github.com/HypoX64/candock/tree/Graduation_Project
>
)
,其目的是尝试多种不同的深度神经网络结构(如LSTM,ResNet,DFCNN等)对单通道EEG进行自动化睡眠阶段分期.
<br>
目前,
毕业设计已经完成,我将继续跟进这个项目。项目重点将转变为如何将代码进行实际应用,我们将考虑运算量与准确率之间的平衡。另外,将提供一些预训练的模型便于使用。
<br>
同时我们相信这些代码也可以用于其他生理信号(如ECG,EMG等)的分类.希望这将有助于您的研究或项目
.
<br>
![
image
](
https://github.com/HypoX64/candock/blob/master/image/compare.png
)
[
这原本是一个用于记录毕业设计的日志仓库
](
<
https://github.com/HypoX64/candock/tree/Graduation_Project
>
)
,其目的是尝试多种不同的深度神经网络结构(如LSTM,ResNet,DFCNN等)对单通道EEG进行自动化睡眠阶段分期.
<br>
目前,
项目重点将转变为如何建立一个通用的一维时序信号分析,分类框架.
<br>
它将包含多种网络结构,并提供数据预处理,读取,训练,评估,测试等功能
.
<br>
一些训练时的输出样例:
[
heatmap
](
./image/heatmap_eg.png
)
[
running_err
]
(./image/running_err_eg.png)
[
log.txt
](
./docs/log_eg.txt
)
## 注意
为了适应新的项目,代码已被大幅更改,不能确保仍然能正常运行如sleep-edfx等睡眠数据集,如果仍然需要运行,请按照输入格式标准自行加载数据,如果有时间我会修复这个问题。
当然,也可以直接使用
[
老的版本
](
https://github.com/HypoX64/candock/tree/f24cc44933f494d2235b3bf965a04cde5e6a1ae9
)
```
python
'''
#数据输入格式
change your own data to train
but the data needs meet the following conditions:
1.type numpydata signals:np.float16 labels:np.int16
2.shape signals:[num,ch,length] labels:[num]
'''
为了适应新的项目,代码已被大幅更改,不能确保仍然能正常运行如sleep-edfx等睡眠数据集,如果仍然需要运行,请参照下文按照输入格式标准自行加载数据,如果有时间我会修复这个问题。
当然,如果需要加载睡眠数据集也可以直接使用
[
老的版本
](
https://github.com/HypoX64/candock/tree/f24cc44933f494d2235b3bf965a04cde5e6a1ae9
)
## Getting Started
### Prerequisites
-
Linux, Windows,mac
-
CPU or NVIDIA GPU + CUDA CuDNN
-
Python 3
-
Pytroch 1.0+
### Dependencies
This code depends on torchvision, numpy, scipy , matplotlib,available via pip install.
<br>
For example:
<br>
```
bash
pip3
install
matplotlib
```
## 如何运行
如果你需要运行这些代码(训练自己的模型或者使用预训练模型对自己的数据进行预测)请进入以下页面
<br>
[
How to run codes
](
https://github.com/HypoX64/candock/blob/master/how_to_run.md
)
<br>
## 数据集
使用了两个公开的睡眠数据集进行训练,分别是:
[
[CinC Challenge 2018]
](
https://physionet.org/physiobank/database/challenge/2018/#files
)
[
[sleep-edfx
]
](https://www.physionet.org/physiobank/database/sleep-edfx/)
<br>
对于CinC Challenge 2018数据集,我们仅使用其C4-M1通道, 对于sleep-edfx与sleep-edf数据集,使用Fpz-Cz通道
<br>
注意:
<br>
1.
如果需要获得其他EEG通道的预训练模型,这需要下载这两个数据集并使用train.py完成训练。当然,你也可以使用自己的数据训练模型。
<br>
2.
对于sleep-edfx数据集,我们仅仅截取了入睡前30分钟到醒来后30分钟之间的睡眠区间作为读入数据(实验结果中用select sleep time 进行标注),目的是平衡各睡眠时期的比例并加快训练速度.
<br>
## 一些说明
*
数据预处理
<br>
1.
降采样:CinC Challenge 2018数据集的EEG信号将被降采样到100HZ
<br>
2.
归一化处理:我们推荐每个受试者的EEG信号均采用5th-95th分位数归一化,即第5%大的数据为0,第95%大的数据为1。注意:所有预训练模型均按照这个方法进行归一化后训练得到
<br>
3.
将读取的数据分割为30s/Epoch作为一个输入,每个输入包含3000个数据点。睡眠阶段标签为5个分别是N3,N2,N1,REM,W.每个Epoch的数据将对应一个标签。标签映射:N3(S4+S3)->0 N2->1 N1->2 REM->3 W->4
<br>
4.
数据集扩充:训练时,对每一个Epoch的数据均要进行随机切,随机翻转,随机改变信号幅度等操作
<br>
5.
对于不同的网络结构,对原始eeg信号采取了预处理,使其拥有不同的shape:
<br>
LSTM:将30s的eeg信号进行FIR带通滤波,获得θ,σ,α,δ,β波,并将它们进行连接后作为输入数据
<br>
CNN_1d类(标有1d的网络):没有什么特别的操作,其实就是把图像领域的各种模型换成Conv1d之后拿过来用而已
<br>
DFCNN类(就是科大讯飞的那种想法,先转化为频谱图,然后直接用图像分类的各种模型):将30s的eeg信号进行短时傅里叶变换,并生成频谱图作为输入,并使用图像分类网络进行分类。我们不推荐使用这种方法,因为转化为频谱图需要耗费较大的运算资源。
<br>
*
EEG频谱图
<br>
这里展示5个睡眠阶段对应的频谱图,它们依次是Wake, Stage 1, Stage 2, Stage 3, REM
<br>
!
[
image
](
https://github.com/HypoX64/candock/blob/master/image/spectrum_Wake.png
)
!
[
image
](
https://github.com/HypoX64/candock/blob/master/image/spectrum_Stage1.png
)
!
[
image
](
https://github.com/HypoX64/candock/blob/master/image/spectrum_Stage2.png
)
!
[
image
](
https://github.com/HypoX64/candock/blob/master/image/spectrum_Stage3.png
)
!
[
image
](
https://github.com/HypoX64/candock/blob/master/image/spectrum_REM.png
)
<br>
*
multi_scale_resnet_1d 网络结构
<br>
该网络参考
[
geekfeiw / Multi-Scale-1D-ResNet
](
https://github.com/geekfeiw/Multi-Scale-1D-ResNet
)
这个网络将被我们命名为micro_multi_scale_resnet_1d
<br>
修改后的
[
网络结构
](
https://github.com/HypoX64/candock/blob/master/image/multi_scale_resnet_1d_network.png
)
<br>
*
关于交叉验证
<br>
为了更好的进行实际应用,我们将使用受试者交叉验证。即训练集和验证集的数据来自于不同的受试者。值得注意的是sleep-edfx数据集中每个受试者均有两个样本,我们视两个样本为同一个受试者,很多paper忽略了这一点,手动滑稽。
<br>
*
关于评估指标
<br>
对于各睡眠阶段标签: Accuracy = (TP+TN)/(TP+FN+TN+FP) Recall = sensitivity = (TP)/(TP+FN)
<br>
对于总体: Top1 err. Kappa 另外对Acc与Re做平均
<br>
特别说明:这项分类任务中样本标签分布及不平衡,为了更具说服力,我们的平均并不加权。
## 部分实验结果
该部分将持续更新... ...
<br>
[
[Confusion matrix]
](
https://github.com/HypoX64/candock/blob/master/confusion_mat
)
<br>
#### Subject Cross-Validation Results
特别说明:这项分类任务中样本标签分布及不平衡,我们对分类损失函数中的类别权重进行了魔改,这将使得Average Recall得到小幅提升,但同时整体error也将提升.若使用默认权重,Top1 err.至少下降5%,但这会导致数据占比极小的N1时期的recall猛跌20%,这绝对不是我们在实际应用中所希望看到的。下面给出的结果均是使用魔改后的权重得到的。
<br>
*
[
sleep-edfx
](
https://www.physionet.org/physiobank/database/sleep-edfx/
)
->sample size = 197, select sleep time
| Network | Parameters | Top1.err. | Avg. Acc. | Avg. Re. | Need to extract feature |
| --------------------------- | ---------- | --------- | --------- | -------- | ----------------------- |
| lstm | 1.25M | 26.32% | 89.47% | 68.57% | Yes |
| micro_multi_scale_resnet_1d | 2.11M | 25.33% | 89.87% | 72.61% | No |
| resnet18_1d | 3.85M | 24.21% | 90.31% | 72.87% | No |
| multi_scale_resnet_1d | 8.42M | 24.01% | 90.40% | 72.37% | No |
*
[
CinC Challenge 2018
](
https://physionet.org/physiobank/database/challenge/2018/#files
)
->sample size = 994
| Network | Parameters | Top1.err. | Avg. Acc. | Avg. Re. | Need to extract feature |
| --------------------------- | ---------- | --------- | --------- | -------- | ----------------------- |
| lstm | 1.25M | 26.85% | 89.26% | 71.39% | Yes |
| micro_multi_scale_resnet_1d | 2.11M | 27.01% | 89.20% | 73.12% | No |
| resnet18_1d | 3.85M | 25.84% | 89.66% | 73.32% | No |
| multi_scale_resnet_1d | 8.42M | 25.27% | 89.89% | 73.63% | No |
### Clone this repo:
```
bash
git clone https://github.com/HypoX64/candock
cd
candock
```
### Download dataset and pretrained-model
[
[Google Drive]
](
https://drive.google.com/open?id=1NTtLmT02jqlc81lhtzQ7GlPK8epuHfU5
)
[
[百度云,y4ks
]
](https://pan.baidu.com/s/1WKWZL91SekrSlhOoEC1bQA)
*
This datasets consists of signals.npy(shape:18207, 1, 2000) and labels.npy(shape:18207) which can be loaded by "np.load()".
*
samples:18207, channel:1, length of each sample:2000, class:50
*
Top1 err: 2.09%
### Train
```
bash
python3 train.py
--label
50
--input_nc
1
--dataset_dir
./datasets/simple_test
--save_dir
./checkpoints/simple_test
--model_name
micro_multi_scale_resnet_1d
--gpu_id
0
--batchsize
64
--k_fold
5
```
*
For more
[
options
](
./options.py
)
.
#### Use your own data to train
*
step1: Generate signals.npy and labels.npy in the following format.
```
python
#1.type:numpydata signals:np.float64 labels:np.int64
#2.shape signals:[num,ch,length] labels:[num]
#num:samples_num, ch :channel_num, num:length of each sample
#for example:
signals
=
np
.
zeros
((
10
,
1
,
10
),
dtype
=
'np.float64'
)
labels
=
np
.
array
([
0
,
0
,
0
,
0
,
0
,
1
,
1
,
1
,
1
,
1
])
#0->class0 1->class1
```
*
step2: input
```--dataset_dir your_dataset_dir```
when running code.
### Test
```
bash
python3 simple_test.py
--label
50
--input_nc
1
--model_name
micro_multi_scale_resnet_1d
--gpu_id
0
```
\ No newline at end of file
checkpoints/pretrained/pretrained_model_URL
浏览文件 @
6043de95
https://drive.google.com/open?id=1pup2_tZFGQQwB-hoXRjpMxiD4Vmpn0Lf
\ No newline at end of file
google drive: https://drive.google.com/open?id=1pup2_tZFGQQwB-hoXRjpMxiD4Vmpn0Lf
百度云: https://pan.baidu.com/s/1WKWZL91SekrSlhOoEC1bQA key:y4ks
datasets/simple_test/test_data_URL
浏览文件 @
6043de95
https://drive.google.com/open?id=1pup2_tZFGQQwB-hoXRjpMxiD4Vmpn0Lf
\ No newline at end of file
google drive: https://drive.google.com/open?id=1pup2_tZFGQQwB-hoXRjpMxiD4Vmpn0Lf
百度云: https://pan.baidu.com/s/1WKWZL91SekrSlhOoEC1bQA key:y4ks
how_to_run
.md
→
docs/how_to_run(sleep_stage)
.md
浏览文件 @
6043de95
文件已移动
docs/log_eg.txt
0 → 100644
浏览文件 @
6043de95
Sun Mar 22 00:30:40 2020
----------------- Options ---------------
BID: not-supported [default: 5_95_th]
batchsize: 16 [default: 64]
continue_train: False
dataset_dir: /home/hypo/MyProject/Ear_AU/datasets/emotion/candock_6class_60s_pad_selectlabel [default: ./datasets/sleep-edfx/]
dataset_name: preload
epochs: 150 [default: 20]
gpu_id: 1 [default: 0]
input_nc: 5 [default: 3]
k_fold: 5 [default: 0]
label: 6 [default: 5]
label_name: ['Amus', 'Neut', 'Sadn', 'Tend', 'Disg', 'Fear'] [default: auto]
lr: 0.001
model_name: multi_scale_resnet_1d [default: lstm]
network_save_freq: 1000 [default: 5]
no_cuda: False
no_cudnn: False
no_shuffle: False
pretrained: False
sample_num: not-supported [default: 20]
save_dir: ./checkpoints/EMDB_5ch_6class_last60s_pad_weightauto_selectlabel_multiscale [default: ./checkpoints/]
select_sleep_time: not-supported [default: False]
separated: False
signal_name: not-supported [default: EEG Fpz-Cz]
weight_mod: auto [default: normal]
----------------- End -------------------
network:
Multi_Scale_ResNet(
(pre_conv): Sequential(
(0): Conv1d(5, 64, kernel_size=(15,), stride=(2,), padding=(7,), bias=False)
(1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): MaxPool1d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
)
(Route1): Route(
(block1): ResidualBlock(
(conv): Sequential(
(0): Conv1d(64, 64, kernel_size=(3,), stride=(1,), padding=(1,), bias=False)
(1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(64, 64, kernel_size=(3,), stride=(1,), padding=(1,), bias=False)
(4): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(64, 64, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(block2): ResidualBlock(
(conv): Sequential(
(0): Conv1d(64, 128, kernel_size=(3,), stride=(2,), padding=(1,), bias=False)
(1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(128, 128, kernel_size=(3,), stride=(1,), padding=(1,), bias=False)
(4): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(64, 128, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(block3): ResidualBlock(
(conv): Sequential(
(0): Conv1d(128, 256, kernel_size=(3,), stride=(2,), padding=(1,), bias=False)
(1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(256, 256, kernel_size=(3,), stride=(1,), padding=(1,), bias=False)
(4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(128, 256, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(block4): ResidualBlock(
(conv): Sequential(
(0): Conv1d(256, 512, kernel_size=(3,), stride=(2,), padding=(1,), bias=False)
(1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(512, 512, kernel_size=(3,), stride=(1,), padding=(1,), bias=False)
(4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(256, 512, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(avgpool): AdaptiveAvgPool1d(output_size=1)
)
(Route2): Route(
(block1): ResidualBlock(
(conv): Sequential(
(0): Conv1d(64, 64, kernel_size=(5,), stride=(1,), padding=(2,), bias=False)
(1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(64, 64, kernel_size=(5,), stride=(1,), padding=(2,), bias=False)
(4): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(64, 64, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(block2): ResidualBlock(
(conv): Sequential(
(0): Conv1d(64, 128, kernel_size=(5,), stride=(2,), padding=(2,), bias=False)
(1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(128, 128, kernel_size=(5,), stride=(1,), padding=(2,), bias=False)
(4): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(64, 128, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(block3): ResidualBlock(
(conv): Sequential(
(0): Conv1d(128, 256, kernel_size=(5,), stride=(2,), padding=(2,), bias=False)
(1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(256, 256, kernel_size=(5,), stride=(1,), padding=(2,), bias=False)
(4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(128, 256, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(block4): ResidualBlock(
(conv): Sequential(
(0): Conv1d(256, 512, kernel_size=(5,), stride=(2,), padding=(2,), bias=False)
(1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(512, 512, kernel_size=(5,), stride=(1,), padding=(2,), bias=False)
(4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(256, 512, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(avgpool): AdaptiveAvgPool1d(output_size=1)
)
(Route3): Route(
(block1): ResidualBlock(
(conv): Sequential(
(0): Conv1d(64, 64, kernel_size=(7,), stride=(1,), padding=(3,), bias=False)
(1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(64, 64, kernel_size=(7,), stride=(1,), padding=(3,), bias=False)
(4): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(64, 64, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(block2): ResidualBlock(
(conv): Sequential(
(0): Conv1d(64, 128, kernel_size=(7,), stride=(2,), padding=(3,), bias=False)
(1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(128, 128, kernel_size=(7,), stride=(1,), padding=(3,), bias=False)
(4): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(64, 128, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(block3): ResidualBlock(
(conv): Sequential(
(0): Conv1d(128, 256, kernel_size=(7,), stride=(2,), padding=(3,), bias=False)
(1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(256, 256, kernel_size=(7,), stride=(1,), padding=(3,), bias=False)
(4): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(128, 256, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(block4): ResidualBlock(
(conv): Sequential(
(0): Conv1d(256, 512, kernel_size=(7,), stride=(2,), padding=(3,), bias=False)
(1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
(3): Conv1d(512, 512, kernel_size=(7,), stride=(1,), padding=(3,), bias=False)
(4): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(shortcut): Sequential(
(0): Conv1d(256, 512, kernel_size=(1,), stride=(2,), bias=False)
(1): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(avgpool): AdaptiveAvgPool1d(output_size=1)
)
(fc): Linear(in_features=1536, out_features=6, bias=True)
)
net parameters: 8.42M
label statistics: [715 643 518 254 517 436]
Loss_weight:[0.81885882 0.87833147 0.99945861 1.39877569 1.00054139 1.09602268]
------------------------------ k-fold:1 ------------------------------
>>> per epoch cost time:99.97s
fold -> macro-prec,reca,F1,err,kappa: (0.3178, 0.3181, 0.2995, 0.6234, 0.2222)
confusion_mat:
[[64 42 4 3 16 6]
[18 82 8 5 20 5]
[24 41 21 2 16 6]
[12 18 4 0 7 2]
[ 8 24 7 0 50 10]
[ 8 28 6 2 27 12]]
------------------------------ k-fold:2 ------------------------------
>>> per epoch cost time:96.49s
fold -> macro-prec,reca,F1,err,kappa: (0.3149, 0.3155, 0.3127, 0.6464, 0.2092)
confusion_mat:
[[71 21 18 12 14 11]
[13 56 22 6 7 11]
[17 28 30 7 9 10]
[15 11 4 4 8 7]
[ 9 23 15 0 33 27]
[ 9 16 15 5 23 21]]
------------------------------ k-fold:3 ------------------------------
>>> per epoch cost time:95.27s
fold -> macro-prec,reca,F1,err,kappa: (0.3436, 0.3481, 0.3369, 0.6036, 0.2566)
confusion_mat:
[[86 17 18 8 13 2]
[16 53 24 4 15 7]
[24 20 38 3 9 5]
[15 16 14 3 7 2]
[10 18 12 4 49 12]
[11 23 16 2 20 12]]
------------------------------ k-fold:4 ------------------------------
>>> per epoch cost time:50.3s
fold -> macro-prec,reca,F1,err,kappa: (0.349, 0.354, 0.3469, 0.6102, 0.2523)
confusion_mat:
[[73 21 13 12 19 4]
[27 44 18 9 12 14]
[26 14 33 5 15 11]
[15 17 7 4 5 6]
[12 12 3 4 61 11]
[ 9 5 7 5 33 22]]
------------------------------ k-fold:5 ------------------------------
>>> per epoch cost time:49.65s
fold -> macro-prec,reca,F1,err,kappa: (0.3306, 0.3352, 0.3303, 0.6217, 0.2363)
confusion_mat:
[[68 18 18 9 20 7]
[17 61 29 5 13 12]
[18 23 30 5 14 7]
[ 9 14 13 2 5 3]
[11 14 5 3 45 19]
[ 9 17 11 3 27 24]]
------------------------------ final result ------------------------------
final -> macro-prec,reca,F1,err,kappa: (0.3299, 0.3345, 0.3284, 0.6211, 0.2357)
confusion_mat:
[[362 119 71 44 82 30]
[ 91 296 101 29 67 49]
[109 126 152 22 63 39]
[ 66 76 42 13 32 20]
[ 50 91 42 11 238 79]
[ 46 89 55 17 130 91]]
docs/sleep_stage.md
0 → 100644
浏览文件 @
6043de95
# candock
[
这原本是一个用于记录毕业设计的日志仓库
](
<
https://github.com/HypoX64/candock/tree/Graduation_Project
>
)
,其目的是尝试多种不同的深度神经网络结构(如LSTM,ResNet,DFCNN等)对单通道EEG进行自动化睡眠阶段分期.
<br>
目前,毕业设计已经完成,我将继续跟进这个项目。项目重点将转变为如何将代码进行实际应用,我们将考虑运算量与准确率之间的平衡。另外,将提供一些预训练的模型便于使用。
<br>
同时我们相信这些代码也可以用于其他生理信号(如ECG,EMG等)的分类.希望这将有助于您的研究或项目.
<br>
![
image
](
../image/compare.png
)
## 如何运行
如果你需要运行这些代码(训练自己的模型或者使用预训练模型对自己的数据进行预测)请进入以下页面
<br>
[
How to run codes
](
./how_to_run(sleep_stage
)
.md)
<br>
## 数据集
使用了两个公开的睡眠数据集进行训练,分别是:
[
[CinC Challenge 2018]
](
https://physionet.org/physiobank/database/challenge/2018/#files
)
[
[sleep-edfx
]
](https://www.physionet.org/physiobank/database/sleep-edfx/)
<br>
对于CinC Challenge 2018数据集,我们仅使用其C4-M1通道, 对于sleep-edfx与sleep-edf数据集,使用Fpz-Cz通道
<br>
注意:
<br>
1.
如果需要获得其他EEG通道的预训练模型,这需要下载这两个数据集并使用train.py完成训练。当然,你也可以使用自己的数据训练模型。
<br>
2.
对于sleep-edfx数据集,我们仅仅截取了入睡前30分钟到醒来后30分钟之间的睡眠区间作为读入数据(实验结果中用select sleep time 进行标注),目的是平衡各睡眠时期的比例并加快训练速度.
<br>
## 一些说明
*
数据预处理
<br>
1.
降采样:CinC Challenge 2018数据集的EEG信号将被降采样到100HZ
<br>
2.
归一化处理:我们推荐每个受试者的EEG信号均采用5th-95th分位数归一化,即第5%大的数据为0,第95%大的数据为1。注意:所有预训练模型均按照这个方法进行归一化后训练得到
<br>
3.
将读取的数据分割为30s/Epoch作为一个输入,每个输入包含3000个数据点。睡眠阶段标签为5个分别是N3,N2,N1,REM,W.每个Epoch的数据将对应一个标签。标签映射:N3(S4+S3)->0 N2->1 N1->2 REM->3 W->4
<br>
4.
数据集扩充:训练时,对每一个Epoch的数据均要进行随机切,随机翻转,随机改变信号幅度等操作
<br>
5.
对于不同的网络结构,对原始eeg信号采取了预处理,使其拥有不同的shape:
<br>
LSTM:将30s的eeg信号进行FIR带通滤波,获得θ,σ,α,δ,β波,并将它们进行连接后作为输入数据
<br>
CNN_1d类(标有1d的网络):没有什么特别的操作,其实就是把图像领域的各种模型换成Conv1d之后拿过来用而已
<br>
DFCNN类(就是科大讯飞的那种想法,先转化为频谱图,然后直接用图像分类的各种模型):将30s的eeg信号进行短时傅里叶变换,并生成频谱图作为输入,并使用图像分类网络进行分类。我们不推荐使用这种方法,因为转化为频谱图需要耗费较大的运算资源。
<br>
*
EEG频谱图
<br>
这里展示5个睡眠阶段对应的频谱图,它们依次是Wake, Stage 1, Stage 2, Stage 3, REM
<br>
!
[
image
](
../image/spectrum_Wake.png
)![
image
](
../image/spectrum_Stage1.png
)![
image
](
../image/spectrum_Stage2.png
)![
image
](
../image/spectrum_Stage3.png
)![
image
](
../image/spectrum_REM.png
)
<br>
*
multi_scale_resnet_1d 网络结构
<br>
该网络参考
[
geekfeiw / Multi-Scale-1D-ResNet
](
https://github.com/geekfeiw/Multi-Scale-1D-ResNet
)
这个网络将被我们命名为micro_multi_scale_resnet_1d
<br>
修改后的
[
网络结构
](
https://github.com/HypoX64/candock/blob/master/image/multi_scale_resnet_1d_network.png
)
<br>
*
关于交叉验证
<br>
为了更好的进行实际应用,我们将使用受试者交叉验证。即训练集和验证集的数据来自于不同的受试者。值得注意的是sleep-edfx数据集中每个受试者均有两个样本,我们视两个样本为同一个受试者,很多paper忽略了这一点,手动滑稽。
<br>
*
关于评估指标
<br>
对于各睡眠阶段标签: Accuracy = (TP+TN)/(TP+FN+TN+FP) Recall = sensitivity = (TP)/(TP+FN)
<br>
对于总体: Top1 err. Kappa 另外对Acc与Re做平均
<br>
特别说明:这项分类任务中样本标签分布及不平衡,为了更具说服力,我们的平均并不加权。
## 部分实验结果
该部分将持续更新... ...
<br>
[
[Confusion matrix]
](
../confusion_mat
)
<br>
#### Subject Cross-Validation Results
特别说明:这项分类任务中样本标签分布及不平衡,我们对分类损失函数中的类别权重进行了魔改,这将使得Average Recall得到小幅提升,但同时整体error也将提升.若使用默认权重,Top1 err.至少下降5%,但这会导致数据占比极小的N1时期的recall猛跌20%,这绝对不是我们在实际应用中所希望看到的。下面给出的结果均是使用魔改后的权重得到的。
<br>
*
[
sleep-edfx
](
https://www.physionet.org/physiobank/database/sleep-edfx/
)
->sample size = 197, select sleep time
| Network | Parameters | Top1.err. | Avg. Acc. | Avg. Re. | Need to extract feature |
| --------------------------- | ---------- | --------- | --------- | -------- | ----------------------- |
| lstm | 1.25M | 26.32% | 89.47% | 68.57% | Yes |
| micro_multi_scale_resnet_1d | 2.11M | 25.33% | 89.87% | 72.61% | No |
| resnet18_1d | 3.85M | 24.21% | 90.31% | 72.87% | No |
| multi_scale_resnet_1d | 8.42M | 24.01% | 90.40% | 72.37% | No |
*
[
CinC Challenge 2018
](
https://physionet.org/physiobank/database/challenge/2018/#files
)
->sample size = 994
| Network | Parameters | Top1.err. | Avg. Acc. | Avg. Re. | Need to extract feature |
| --------------------------- | ---------- | --------- | --------- | -------- | ----------------------- |
| lstm | 1.25M | 26.85% | 89.26% | 71.39% | Yes |
| micro_multi_scale_resnet_1d | 2.11M | 27.01% | 89.20% | 73.12% | No |
| resnet18_1d | 3.85M | 25.84% | 89.66% | 73.32% | No |
| multi_scale_resnet_1d | 8.42M | 25.27% | 89.89% | 73.63% | No |
```
```
\ No newline at end of file
heatmap.py
浏览文件 @
6043de95
...
...
@@ -131,16 +131,22 @@ def annotate_heatmap(im, data=None, valfmt="{x:.2f}",
def
draw
(
mat
,
opt
,
name
=
'train'
):
if
'merge'
in
name
:
label_name
=
opt
.
mergelabel_name
else
:
label_name
=
opt
.
label_name
mat
=
mat
.
astype
(
float
)
for
i
in
range
(
mat
.
shape
[
0
]):
mat
[
i
,:]
=
mat
[
i
,:]
/
np
.
sum
(
mat
[
i
])
*
100
if
len
(
mat
)
>
8
:
fig
,
ax
=
plt
.
subplots
(
figsize
=
(
len
(
mat
)
+
2.5
,
len
(
mat
)))
else
:
fig
,
ax
=
plt
.
subplots
()
ax
.
set_ylabel
(
'True'
,
fontsize
=
12
)
ax
.
set_xlabel
(
'Pred'
,
fontsize
=
12
)
im
,
cbar
=
create_heatmap
(
mat
,
opt
.
label_name
,
opt
.
label_name
,
ax
=
ax
,
im
,
cbar
=
create_heatmap
(
mat
,
label_name
,
label_name
,
ax
=
ax
,
cmap
=
"Blues"
,
cbarlabel
=
"percentage"
)
texts
=
annotate_heatmap
(
im
,
valfmt
=
"{x:.1f}%"
)
...
...
options.py
浏览文件 @
6043de95
...
...
@@ -3,9 +3,6 @@ import os
import
time
import
util
# python3 train.py --dataset_dir '/media/hypo/Hypo/physionet_org_train' --dataset_name cc2018 --signal_name 'C4-M1' --sample_num 20 --model_name lstm --batchsize 64 --epochs 20 --lr 0.0005 --no_cudnn
# python3 train.py --dataset_dir './datasets/sleep-edfx/' --dataset_name sleep-edfx --signal_name 'EEG Fpz-Cz' --sample_num 50 --model_name lstm --batchsize 64 --network_save_freq 5 --epochs 25 --lr 0.0005 --BID 5_95_th --select_sleep_time --no_cudnn --select_sleep_time
class
Options
():
def
__init__
(
self
):
self
.
parser
=
argparse
.
ArgumentParser
(
formatter_class
=
argparse
.
ArgumentDefaultsHelpFormatter
)
...
...
@@ -19,7 +16,7 @@ class Options():
self
.
parser
.
add_argument
(
'--label'
,
type
=
int
,
default
=
5
,
help
=
'number of labels'
)
self
.
parser
.
add_argument
(
'--input_nc'
,
type
=
int
,
default
=
3
,
help
=
'# of input channels'
)
self
.
parser
.
add_argument
(
'--label_name'
,
type
=
str
,
default
=
'auto'
,
help
=
'name of labels,example:"a,b,c,d,e,f"'
)
self
.
parser
.
add_argument
(
'--model_name'
,
type
=
str
,
default
=
'
lstm
'
,
help
=
'Choose model lstm | multi_scale_resnet_1d | resnet18 | micro_multi_scale_resnet_1d...'
)
self
.
parser
.
add_argument
(
'--model_name'
,
type
=
str
,
default
=
'
micro_multi_scale_resnet_1d
'
,
help
=
'Choose model lstm | multi_scale_resnet_1d | resnet18 | micro_multi_scale_resnet_1d...'
)
self
.
parser
.
add_argument
(
'--pretrained'
,
action
=
'store_true'
,
help
=
'if input, use pretrained models'
)
self
.
parser
.
add_argument
(
'--continue_train'
,
action
=
'store_true'
,
help
=
'if input, continue train'
)
self
.
parser
.
add_argument
(
'--lr'
,
type
=
float
,
default
=
0.001
,
help
=
'learning rate'
)
...
...
@@ -30,6 +27,7 @@ class Options():
self
.
parser
.
add_argument
(
'--k_fold'
,
type
=
int
,
default
=
0
,
help
=
'fold_num of k-fold.if 0 or 1,no k-fold'
)
self
.
parser
.
add_argument
(
'--mergelabel'
,
type
=
str
,
default
=
'None'
,
help
=
'merge some labels to one label and give the result, example:"[[0,1,4],[2,3,5]]" , label(0,1,4) regard as 0,label(2,3,5) regard as 1'
)
self
.
parser
.
add_argument
(
'--mergelabel_name'
,
type
=
str
,
default
=
'None'
,
help
=
'name of labels,example:"a,b,c,d,e,f"'
)
self
.
parser
.
add_argument
(
'--dataset_dir'
,
type
=
str
,
default
=
'./datasets/sleep-edfx/'
,
help
=
'your dataset path'
)
...
...
@@ -76,12 +74,14 @@ class Options():
names
.
append
(
str
(
i
))
self
.
opt
.
label_name
=
names
else
:
names
=
self
.
opt
.
label_name
names
=
names
.
replace
(
" "
,
""
)
names
=
names
.
split
(
","
)
self
.
opt
.
label_name
=
names
self
.
opt
.
label_name
=
self
.
opt
.
label_name
.
replace
(
" "
,
""
).
split
(
","
)
self
.
opt
.
mergelabel
=
eval
(
self
.
opt
.
mergelabel
)
if
self
.
opt
.
mergelabel_name
!=
'None'
:
self
.
opt
.
mergelabel_name
=
self
.
opt
.
mergelabel_name
.
replace
(
" "
,
""
).
split
(
","
)
"""Print and save options
...
...
simple_test.py
浏览文件 @
6043de95
...
...
@@ -10,73 +10,29 @@ from options import Options
from
creatnet
import
CreatNet
'''
--------------------------------preload data--------------------------------
@hypox64
19/05/18
download pretrained model and test data here:
https://drive.google.com/open?id=1NTtLmT02jqlc81lhtzQ7GlPK8epuHfU5
2020/04/03
'''
opt
=
Options
().
getparse
()
#choose and creat model
net
=
CreatNet
(
opt
.
model_name
)
net
=
CreatNet
(
opt
)
if
not
opt
.
no_cuda
:
net
.
cuda
()
if
not
opt
.
no_cudnn
:
import
torch.backends.cudnn
as
cudnn
cudnn
.
benchmark
=
True
#load data
signals
=
np
.
load
(
'./datasets/simple_test/signals.npy'
)
labels
=
np
.
load
(
'./datasets/simple_test/labels.npy'
)
#load prtrained_model
net
.
load_state_dict
(
torch
.
load
(
'./checkpoints/pretrained/
'
+
opt
.
dataset_name
+
'/'
+
opt
.
model_name
+
'
.pth'
))
net
.
load_state_dict
(
torch
.
load
(
'./checkpoints/pretrained/
micro_multi_scale_resnet_1d_50class
.pth'
))
net
.
eval
()
if
not
opt
.
no_cuda
:
net
.
cuda
()
def
runmodel
(
eeg
):
eeg
=
eeg
.
reshape
(
1
,
-
1
)
eeg
=
transformer
.
ToInputShape
(
eeg
,
opt
.
model_name
,
test_flag
=
True
)
eeg
=
transformer
.
ToTensor
(
eeg
,
no_cuda
=
opt
.
no_cuda
)
out
=
net
(
eeg
)
pred
=
torch
.
max
(
out
,
1
)[
1
]
pred_stage
=
pred
.
data
.
cpu
().
numpy
()
return
pred_stage
[
0
]
'''
you can change your input data here.
but the data needs meet the following conditions:
1.fs = 100Hz
2.collect by uv
3.type numpydata signals:np.float16 stages:np.int16
4.shape signals:[?,3000] stages:[?]
'''
eegdata
=
np
.
load
(
'./datasets/simple_test/sleep_edfx_Fpz_Cz_test.npy'
)
true_stages
=
np
.
load
(
'./datasets/simple_test/sleep_edfx_stages_test.npy'
)
print
(
'shape of eegdata:'
,
eegdata
.
shape
)
print
(
'shape of true_stage:'
,
true_stages
.
shape
)
#Normalize
eegdata
=
transformer
.
Balance_individualized_differences
(
eegdata
,
'5_95_th'
)
#run pretrained model
pred_stages
=
[]
for
i
in
range
(
len
(
eegdata
)):
pred_stages
.
append
(
runmodel
(
eegdata
[
i
]))
pred_stages
=
np
.
array
(
pred_stages
)
print
(
'err:'
,
sum
((
true_stages
[
i
]
!=
pred_stages
[
i
])
for
i
in
range
(
len
(
pred_stages
)))
/
len
(
true_stages
)
*
100
,
'%'
)
#plot result
plt
.
subplot
(
211
)
plt
.
plot
(
true_stages
+
1
)
plt
.
xlim
((
0
,
len
(
true_stages
)))
plt
.
ylim
((
0
,
6
))
plt
.
yticks
([
1
,
2
,
3
,
4
,
5
],[
'N3'
,
'N2'
,
'N1'
,
'REM'
,
'W'
])
plt
.
xticks
([],[])
plt
.
title
(
'Manually scored hypnogram'
)
plt
.
subplot
(
212
)
plt
.
plot
(
pred_stages
+
1
)
plt
.
xlim
((
0
,
len
(
true_stages
)))
plt
.
ylim
((
0
,
6
))
plt
.
yticks
([
1
,
2
,
3
,
4
,
5
],[
'N3'
,
'N2'
,
'N1'
,
'REM'
,
'W'
])
plt
.
xlabel
(
'Epoch number'
)
plt
.
title
(
'Auto scored hypnogram'
)
plt
.
show
()
for
signal
,
true_label
in
zip
(
signals
,
labels
):
signal
=
signal
.
reshape
(
1
,
1
,
-
1
)
#batchsize,ch,length
true_label
=
true_label
.
reshape
(
1
,
-
1
)
#batchsize,label
signal
,
true_label
=
transformer
.
ToTensor
(
signal
,
true_label
,
no_cuda
=
opt
.
no_cuda
)
out
=
net
(
signal
)
pred_label
=
torch
.
max
(
out
,
1
)[
1
]
pred_label
=
pred_label
.
data
.
cpu
().
numpy
()
true_label
=
true_label
.
data
.
cpu
().
numpy
()
print
((
"true:{0:d} predict:{1:d}"
).
format
(
true_label
[
0
][
0
],
pred_label
[
0
]))
train.py
浏览文件 @
6043de95
...
...
@@ -45,9 +45,11 @@ util.writelog('network:\n'+str(net),opt,True)
util
.
show_paramsnumber
(
net
,
opt
)
weight
=
np
.
ones
(
opt
.
label
)
if
opt
.
weight_mod
==
'auto'
:
weight
=
np
.
log
(
1
/
label_cnt_per
)
weight
=
weight
/
np
.
median
(
weight
)
weight
=
np
.
clip
(
weight
,
0.8
,
2
)
weight
=
1
/
label_cnt_per
weight
=
weight
/
np
.
min
(
weight
)
# weight = np.log(1/label_cnt_per)
# weight = weight/np.median(weight)
# weight = np.clip(weight, 0.8, 2)
util
.
writelog
(
'label statistics: '
+
str
(
label_cnt
),
opt
,
True
)
util
.
writelog
(
'Loss_weight:'
+
str
(
weight
),
opt
,
True
)
weight
=
torch
.
from_numpy
(
weight
).
float
()
...
...
@@ -149,6 +151,7 @@ for fold in range(opt.k_fold):
final_confusion_mat
=
confusion_mats
[
pos
]
if
opt
.
k_fold
==
1
:
statistics
.
statistics
(
final_confusion_mat
,
opt
,
'final'
,
'final_test'
)
np
.
save
(
os
.
path
.
join
(
opt
.
save_dir
,
'confusion_mat.npy'
),
final_confusion_mat
)
else
:
fold_final_confusion_mat
+=
final_confusion_mat
util
.
writelog
(
'fold -> macro-prec,reca,F1,err,kappa: '
+
str
(
statistics
.
report
(
final_confusion_mat
)),
opt
,
True
)
...
...
@@ -157,7 +160,8 @@ for fold in range(opt.k_fold):
if
opt
.
k_fold
!=
1
:
statistics
.
statistics
(
fold_final_confusion_mat
,
opt
,
'final'
,
'k-fold-final_test'
)
np
.
save
(
os
.
path
.
join
(
opt
.
save_dir
,
'confusion_mat.npy'
),
fold_final_confusion_mat
)
if
opt
.
mergelabel
:
mat
=
statistics
.
mergemat
(
fold_final_confusion_mat
,
opt
.
mergelabel
)
statistics
.
statistics
(
mat
,
opt
,
'merge'
,
'mergelabel_
test
'
)
statistics
.
statistics
(
mat
,
opt
,
'merge'
,
'mergelabel_
final
'
)
编辑
预览
Markdown
is supported
0%
请重试
或
添加新附件
.
添加附件
取消
You are about to add
0
people
to the discussion. Proceed with caution.
先完成此消息的编辑!
取消
想要评论请
注册
或
登录