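diff --git a/docs/Lecture4/Lecture 4.md b/docs/Lecture4/Lecture 4.md
new file mode 100644
index 0000000000000000000000000000000000000000..565abba01a844df6fd8921631a67af285bdf6baf
--- /dev/null
+++ b/docs/Lecture4/Lecture 4.md
@@ -0,0 +1,58 @@
+# I. Preface
+
+This chapter builds on the named entity recognition (NER) model from Lecture 3 and covers backpropagation through it. Material from the previous chapter is not repeated here; it is used directly.
+
+# II. Main Text
+
+Backpropagation is a method that uses the chain rule of differentiation to compute the gradient of the loss with respect to any parameter of the model. To make it easier to understand, we first look at the network in the figure below.
+
+![](media/1.png)
+
+First, the notation:
+Xi is the input to the neural network and S is its output. Every neuron in every layer receives an input and produces an output. The j-th neuron of layer k receives the scalar input ![](media/2.png) and produces the scalar activation output ![](media/3.png). We denote the backpropagated error computed at ![](media/2.png) by ![](media/4.png). Layer 1 refers to the input layer, not the first hidden layer. For the input layer, ![](media/5.png).
+
+Suppose the loss ![](media/14.png) is positive and we want to update the parameter ![](media/15.png). Note that ![](media/15.png) contributes only to the computation of ![](media/16.png) and ![](media/17.png). This is crucial for understanding backpropagation: the backpropagated gradient of a parameter depends only on the values that the parameter contributes to in the forward pass. ![](media/17.png) is then multiplied by ![](media/18.png) in the subsequent forward computation of the score. From the max-margin loss we can see that: ![](media/19.png)
+
+### Bias Update
+
+A bias term is mathematically equivalent to any other weight, except that the value it multiplies when computing the next-layer neuron input ![](media/16.png) is the constant 1. The gradient of the bias for neuron i at layer k is therefore ![](media/20.png). For instance, if in the example above we were updating ![](media/21.png) instead of ![](media/15.png), the gradient would be ![](media/22.png).
+
+We have the error ![](media/20.png) propagating backwards from ![](media/23.png), as shown in the figure below: ![](media/24.png)
+
+We propagate this error backwards to ![](media/27.png) by multiplying ![](media/20.png) by the weight ![](media/26.png) on the path. The error received at ![](media/27.png) is therefore ![](media/28.png). However, as the figure below shows, ![](media/27.png) may have been forwarded to several neurons in the next layer. In that case the error from neuron m at layer k must also be propagated back to ![](media/27.png) by the same mechanism.
+
+![](media/25.png)
+
+The error received at ![](media/27.png) is therefore now ![](media/29.png). In fact, this sum can be written more generally as ![](media/30.png). Now that we have the correct error at ![](media/27.png), we multiply it by the local gradient ![](media/31.png) to move the error back across neuron j at layer k-1. The error that reaches ![](media/33.png) is therefore ![](media/32.png).
+
+## Dropout Layer
+
+Dropout is a powerful regularization technique. During training, a subset of neurons is randomly "dropped" with some probability. During testing, the full network is used to make predictions. As a result, the network typically learns more meaningful information from the data, is less likely to overfit, and usually achieves higher overall performance. One intuitive reason the technique is so effective is that dropout essentially trains exponentially many smaller networks at once and averages their predictions. To introduce dropout, take the output h of each layer of neurons, keep each neuron with probability p, and otherwise set it to 0. Then, during backpropagation, pass gradients only through the neurons that were kept alive during the forward pass. Finally, during testing, compute the forward pass using all neurons in the network. For dropout to work well, the expected output of a neuron at test time should be roughly the same as it was during training. For this reason, the output of each neuron usually has to be divided by a certain value at test time.
+
+## Neuron Units
+
+The neural networks above contain sigmoid neurons to introduce nonlinearity. In many applications, networks can instead be designed with other activation functions.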
+
+### sigmoid
+
+![](media/6.png)
+
+![](media/7.png)
+
+### Tanh
+
+![](media/8.png)
+
+![](media/9.png)
+
+### Hard tanh
+
+![](media/10.png)
+
+![](media/11.png)
+
+### ReLU
+
+![](media/12.png)
+
+![](media/13.png)
\ No newline at end of file
diff --git a/docs/Lecture4/cs224n-2019-lecture04-backprop.pdf b/docs/Lecture4/cs224n-2019-lecture04-backprop.pdf
new file mode 100644
index 0000000000000000000000000000000000000000..ad0293fdc5ce17668bb09e3dc842ae02dd35e37b
Binary files /dev/null and b/docs/Lecture4/cs224n-2019-lecture04-backprop.pdf differ
diff --git a/docs/Lecture4/media/1.png b/docs/Lecture4/media/1.png
new file mode 100644
index 0000000000000000000000000000000000000000..74d5df42867cccaf0d2a5ac665e4fbb140f9a09c
Binary files /dev/null and b/docs/Lecture4/media/1.png differ
diff --git a/docs/Lecture4/media/10.png b/docs/Lecture4/media/10.png
new file mode 100644
index 0000000000000000000000000000000000000000..8fdc2032039e94581e3160db241f39bd900ab442
Binary files /dev/null and b/docs/Lecture4/media/10.png differ
diff --git a/docs/Lecture4/media/11.png b/docs/Lecture4/media/11.png
new file mode 100644
index 0000000000000000000000000000000000000000..b98ab254709a877741da35a1fafeffbcf8263491
Binary files /dev/null and b/docs/Lecture4/media/11.png differ
diff --git a/docs/Lecture4/media/12.png b/docs/Lecture4/media/12.png
new file mode 100644
index 0000000000000000000000000000000000000000..1ab3f2fc19f158d60eeb08ed23fbb636058d7b60
Binary files /dev/null and b/docs/Lecture4/media/12.png differ
diff --git a/docs/Lecture4/media/13.png b/docs/Lecture4/media/13.png
new file mode 100644
index 0000000000000000000000000000000000000000..275b5fc4aa2cd83da1456491a9209ed7c4513992
Binary files /dev/null and b/docs/Lecture4/media/13.png differ
diff --git a/docs/Lecture4/media/14.png b/docs/Lecture4/media/14.png
new file mode 100644
index 0000000000000000000000000000000000000000..4c725d40da48c0b973362518793be41987c38605
Binary files /dev/null and b/docs/Lecture4/media/14.png differ
diff --git a/docs/Lecture4/media/15.png b/docs/Lecture4/media/15.png
new file mode 100644
index 0000000000000000000000000000000000000000..278953cbdca1907128a9333a1c2ee9efa0cacd27
Binary files /dev/null and b/docs/Lecture4/media/15.png differ
diff --git a/docs/Lecture4/media/16.png b/docs/Lecture4/media/16.png
new file mode 100644
index 0000000000000000000000000000000000000000..9ef298e89815ec4e10785982a20a8659d297323c
Binary files /dev/null and b/docs/Lecture4/media/16.png differ
diff --git a/docs/Lecture4/media/17.png b/docs/Lecture4/media/17.png
new file mode 100644
index 0000000000000000000000000000000000000000..c92582b151220a5a6eddd7e8235b5865ea4181a5
Binary files /dev/null and b/docs/Lecture4/media/17.png differ
diff --git a/docs/Lecture4/media/18.png b/docs/Lecture4/media/18.png
new file mode 100644
index 0000000000000000000000000000000000000000..e1066148d28ff91f11b12985e8bffd168d5aee74
Binary files /dev/null and b/docs/Lecture4/media/18.png differ
diff --git a/docs/Lecture4/media/19.png b/docs/Lecture4/media/19.png
new file mode 100644
index 0000000000000000000000000000000000000000..ff9dada7a7783629483a343e3be5fddc2c099618
Binary files /dev/null and b/docs/Lecture4/media/19.png differ
diff --git a/docs/Lecture4/media/2.png b/docs/Lecture4/media/2.png
new file mode 100644
index 0000000000000000000000000000000000000000..7b957a1b55bb4857a6996fa14f09eeb42734c87f
Binary files /dev/null and b/docs/Lecture4/media/2.png differ
diff --git a/docs/Lecture4/media/20.png b/docs/Lecture4/media/20.png
new file mode 100644
index 0000000000000000000000000000000000000000..b85aa33f8a2b5e445433a796e82bdf49c949e6bd
Binary files /dev/null and b/docs/Lecture4/media/20.png differ
diff --git a/docs/Lecture4/media/21.png b/docs/Lecture4/media/21.png
new file mode 100644
index 0000000000000000000000000000000000000000..d4d7920ca35c951d9c16dae383ab3e90f579ff8a
Binary files /dev/null and b/docs/Lecture4/media/21.png differ
diff --git a/docs/Lecture4/media/22.png b/docs/Lecture4/media/22.png
new file mode 100644
index 0000000000000000000000000000000000000000..13322092daaf55d2b0e2c418a1caa19225d0bbc4
Binary files /dev/null and b/docs/Lecture4/media/22.png differ
diff --git a/docs/Lecture4/media/23.png b/docs/Lecture4/media/23.png
new file mode 100644
index 0000000000000000000000000000000000000000..acdf331eb4a28d3f4977e47a35283f98bc4bdd91
Binary files /dev/null and b/docs/Lecture4/media/23.png differ
diff --git a/docs/Lecture4/media/24.png b/docs/Lecture4/media/24.png
new file mode 100644
index 0000000000000000000000000000000000000000..5672a5c8e0d4f9fbfa8e175f014fa21981ce00f5
Binary files /dev/null and b/docs/Lecture4/media/24.png differ
diff --git a/docs/Lecture4/media/25.png b/docs/Lecture4/media/25.png
new file mode 100644
index 0000000000000000000000000000000000000000..04b1712406347b93b967f7ee3a4db73a2b97b92f
Binary files /dev/null and b/docs/Lecture4/media/25.png differ
diff --git a/docs/Lecture4/media/26.png b/docs/Lecture4/media/26.png
new file mode 100644
index 0000000000000000000000000000000000000000..6369f05b9d774c8264874a3a764a201dc72c77e0
Binary files /dev/null and b/docs/Lecture4/media/26.png differ
diff --git a/docs/Lecture4/media/27.png b/docs/Lecture4/media/27.png
new file mode 100644
index 0000000000000000000000000000000000000000..6e5f1dc549dded8cf80cb6d13b4bdd88b58e5a63
Binary files /dev/null and b/docs/Lecture4/media/27.png differ
diff --git a/docs/Lecture4/media/28.png b/docs/Lecture4/media/28.png
new file mode 100644
index 0000000000000000000000000000000000000000..81f08efde70fe2eac33b94c1aae2ade8eb829b33
Binary files /dev/null and b/docs/Lecture4/media/28.png differ
diff --git a/docs/Lecture4/media/29.png b/docs/Lecture4/media/29.png
new file mode 100644
index 0000000000000000000000000000000000000000..dcf0e1d55ee0b791252a864e9a7c0c0a3c179283
Binary files /dev/null and b/docs/Lecture4/media/29.png differ
diff --git a/docs/Lecture4/media/3.png b/docs/Lecture4/media/3.png
new file mode 100644
index 0000000000000000000000000000000000000000..96b35cf128ae2ea0d7fbc0da63885d5f7e521fbc
Binary files /dev/null and b/docs/Lecture4/media/3.png differ
diff --git a/docs/Lecture4/media/30.png b/docs/Lecture4/media/30.png
new file mode 100644
index 0000000000000000000000000000000000000000..e7a3062688c92ade85d178b528c80a8c81da9e4d
Binary files /dev/null and b/docs/Lecture4/media/30.png differ
diff --git a/docs/Lecture4/media/31.png b/docs/Lecture4/media/31.png
new file mode 100644
index 0000000000000000000000000000000000000000..b3059f2e3ec26810625dd317a60b508b9989cea6
Binary files /dev/null and b/docs/Lecture4/media/31.png differ
diff --git a/docs/Lecture4/media/32.png b/docs/Lecture4/media/32.png
new file mode 100644
index 0000000000000000000000000000000000000000..f1911e840e493dc7660b106f6f1a1497c6230404
Binary files /dev/null and b/docs/Lecture4/media/32.png differ
diff --git a/docs/Lecture4/media/33.png b/docs/Lecture4/media/33.png
new file mode 100644
index 0000000000000000000000000000000000000000..d2329dd3aeacead79dac8e55cea2b0e682c392ba
Binary files /dev/null and b/docs/Lecture4/media/33.png differ
diff --git a/docs/Lecture4/media/4.png b/docs/Lecture4/media/4.png
new file mode 100644
index 0000000000000000000000000000000000000000..dfd22068f1ac78daf6e0119ad39f7def06e527c7
Binary files /dev/null and b/docs/Lecture4/media/4.png differ
diff --git a/docs/Lecture4/media/5.png b/docs/Lecture4/media/5.png
new file mode 100644
index 0000000000000000000000000000000000000000..078e5e8ec18346eb625572fdc6ec953dba6c61ab
Binary files /dev/null and b/docs/Lecture4/media/5.png differ
diff --git a/docs/Lecture4/media/6.png b/docs/Lecture4/media/6.png
new file mode 100644
index 0000000000000000000000000000000000000000..17616a6316f668f7aa959c399f8181f3c8990677
Binary files /dev/null and b/docs/Lecture4/media/6.png differ
diff --git a/docs/Lecture4/media/7.png b/docs/Lecture4/media/7.png
new file mode 100644
index 0000000000000000000000000000000000000000..43461d566265935ca0e4bd5058b0831961713e51
Binary files /dev/null and b/docs/Lecture4/media/7.png differ
diff --git a/docs/Lecture4/media/8.png b/docs/Lecture4/media/8.png
new file mode 100644
index 0000000000000000000000000000000000000000..4032b06bcfe47eb2742a14366398bd77ab93b727
Binary files /dev/null and b/docs/Lecture4/media/8.png differ
diff --git a/docs/Lecture4/media/9.png b/docs/Lecture4/media/9.png
new file mode 100644
index 0000000000000000000000000000000000000000..3bb752f24c5c1885556976b58fd81e77172850b9
Binary files /dev/null and b/docs/Lecture4/media/9.png differ