1 - Hizdm Blog

                                
        <div class="c-contents" style="">
            <div class="contents-inner">
                                    <h6 class="contents-title"><span>Contents</span></h6>
                                <ul class="contents-chapters"><li class="m-element-type-h2"><a href="#jiazhi">Why do we need RNN? What is its unique value?</a></li><li class="m-element-type-h2"><a href="#yuanli">Basic principles of RNN</a></li><li class="m-element-type-h2"><a href="#youhua">Optimized algorithms for RNN</a></li><li class="m-element-type-h2"><a href="#yingyong">Applications and use cases of RNN</a></li><li class="m-element-type-h2"><a href="#zongjie">Summary</a></li><li class="m-element-type-h2"><a href="#baidu">Baidu Baike + Wikipedia</a></li><li class="m-element-type-h2"><a href="#links">Further reading</a></li></ul></div>
        </div>

                        
            <p><picture class="alignnone">

<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/7fe86-2019-07-04-yiwen.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/7fe86-2019-07-04-yiwen.png.webp";>
Recurrent neural networks (RNN) explained in one article
</picture>

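Convolutional neural networks (CNN) are already powerful, so why do we still need RNN? This article explains, in plain language, the unique value of RNN: processing sequence data. It also covers some of RNN's shortcomings and its variant algorithms, and finishes with RNN's practical value and use cases.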

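<h2 id="jiazhi">Why do we need RNN? What is its unique value?</h2>
Convolutional neural networks (CNN) and most ordinary algorithms map inputs to outputs one to one: one input produces one output, and different inputs are unrelated to each other.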
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/46b63-2019-07-04-input-output.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/46b63-2019-07-04-input-output.png.webp";>
Most algorithms map each input to one output
</picture>
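But in some scenarios a single input is not enough!
To fill in the blank below, no single preceding word is enough on its own. We need to know all of the preceding words and also the order in which they appear.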
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/b68f4-2019-07-04-tiankong.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/b68f4-2019-07-04-tiankong.png.webp";>
Processing sequence data
</picture>
Scenarios like this, which involve "sequence data" (a stream of interdependent data), are where RNN is needed.
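To make the point about word order concrete, here is a minimal sketch. The two example sentences and the helper name `bag_of_words` are illustrative assumptions, not taken from the article. A model that only counts words cannot tell the two sentences apart, even though they mean different things.

```python
from collections import Counter

def bag_of_words(sentence):
    """Count words while ignoring their order."""
    return Counter(sentence.lower().split())

a = "dog bites man"
b = "man bites dog"

# Same word counts, so an order-blind model sees identical inputs.
same_counts = bag_of_words(a) == bag_of_words(b)
# As sequences, the two sentences are different.
same_sequence = a.split() == b.split()
print(same_counts, same_sequence)
```

A sequence model needs access to the second view, where position is part of the input.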
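Several typical kinds of sequence data:

Text in an article
Audio in speech
Price movements in the stock market
…

RNN handles sequence data effectively mainly because of the rather special way it runs. The basic operating principle of RNN is introduced below.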
 
<h2 id="yuanli">Basic principles of RNN</h2>
A traditional neural network has a fairly simple structure: input layer, hidden layer, output layer, as shown below:
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/6015f-2019-07-02-chuantong.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/6015f-2019-07-02-chuantong.png.webp";>

Traditional neural network
</picture>
The biggest difference between RNN and a traditional neural network is that each step carries the result of the previous step into the hidden layer of the next step, where it is processed and trained together with the new input, as shown below:
(Figure: how RNN differs)
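The idea of carrying the previous result forward can be sketched as a single recurrent step. This is a minimal illustration, assuming the common formulation in which the carried-over value is the previous hidden state and the activation is `tanh`. The scalar weights `w_x`, `w_h`, and `b` are hypothetical values chosen only for the example.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    """One recurrent step for a scalar input and scalar hidden state.

    h_t = tanh(w_x * x_t + w_h * h_prev + b)
    """
    return math.tanh(w_x * x + w_h * h_prev + b)

# Hypothetical weights chosen only for illustration.
w_x, w_h, b = 0.5, 0.8, 0.0

h0 = 0.0                              # no history before the first input
h1 = rnn_step(1.0, h0, w_x, w_h, b)   # depends on the first input only
h2 = rnn_step(1.0, h1, w_x, w_h, b)   # depends on the second input and on h1

# The same input (1.0) gives two different results, because h1 is
# carried into the second step.
print(h1, h2)
```

A plain feed-forward layer would return the same value for both calls, since it has no state to carry over.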
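Let's look at a concrete example of how RNN works:
Suppose we need to determine the intent behind what a user says (asking about the weather, asking the time, setting an alarm…). The user says "what time is it?", and we first need to split this sentence into tokens:
(Figure: tokenizing the input)
We then feed the tokens into the RNN in order. First, "what" is the input to the RNN and produces the output "01".
(Figure: input "what", output "01")
Next, following the order, we feed "time" into the RNN and get the output "02".
In this step we can see that when "time" is the input, the earlier output for "what" also has an effect (half of the hidden layer is black).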
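By analogy, all earlier inputs affect later outputs. You can see that the circular hidden layer contains all of the earlier colors, as shown below:
(Figure: how RNN "remembers" earlier inputs)
When we determine the intent, we only need the output of the last layer, "05", as shown below:
(Figure: the output of the last layer is what we ultimately want)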
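The walk-through above can be sketched as a loop over the tokens. This is a toy illustration under stated assumptions: the scalar values in `TOKEN_VALUES`, the weights, and the function name `run_rnn` are all made up for the example, and a real model would use learned vector embeddings and weight matrices.

```python
import math

# Hypothetical toy encoding: each token gets a made-up scalar value.
TOKEN_VALUES = {"what": 0.1, "time": 0.4, "is": -0.2, "it": 0.3, "?": -0.5}

def run_rnn(tokens, w_x=1.0, w_h=0.9):
    """Feed tokens in order and return the hidden state after each one."""
    h = 0.0
    states = []
    for token in tokens:
        h = math.tanh(w_x * TOKEN_VALUES[token] + w_h * h)
        states.append(h)
    return states

tokens = ["what", "time", "is", "it", "?"]
states = run_rnn(tokens)

# One state per token ("01" to "05" in the walk-through); only the last
# one would be passed on to an intent classifier.
final_state = states[-1]

# Feeding the same tokens in reverse order ends in a different state.
reversed_final = run_rnn(tokens[::-1])[-1]
print(len(states), final_state, reversed_final)
```

Because every state feeds into the next, the final state depends on all five tokens and on their order.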
The drawbacks of RNN are also fairly obvious.
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/697a8-2019-07-02-010144.jpg.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/697a8-2019-07-02-010144.jpg.webp";>
Color distribution in the hidden layer
</picture>
From the example above we can already see that short-term memory has a large influence (the orange area), while long-term memory has very little (the black and green areas). This is the short-term memory problem of RNN.

RNN has a short-term memory problem and cannot handle very long input sequences
Training an RNN is very costly
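The fading of early inputs can be illustrated with a deliberately simplified model. The sketch below assumes a linear scalar recurrence `h_t = w_h * h_{t-1} + x_t` with `|w_h| < 1`. Real RNNs are nonlinear and use weight matrices, so this only shows the general tendency, and the function name is hypothetical.

```python
def influence_on_last_state(seq_len, w_h=0.5):
    """Weight with which each input reaches the final state.

    Assumes a simplified linear recurrence h_t = w_h * h_{t-1} + x_t,
    so the input at position t is scaled by w_h ** (seq_len - 1 - t).
    """
    return [w_h ** (seq_len - 1 - t) for t in range(seq_len)]

weights = influence_on_last_state(5)
print(weights)  # [0.0625, 0.125, 0.25, 0.5, 1.0]

# For a longer sequence the first input has almost no weight left.
print(influence_on_last_state(50)[0])
```

The most recent input keeps its full weight, while each step further back is halved again, which matches the color distribution described above.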

Because of RNN's short-term memory problem, optimized algorithms based on RNN appeared later. They are briefly introduced below.
 
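<h2 id="youhua">Optimized algorithms for RNN</h2>
<h3>From RNN to LSTM (long short-term memory networks)</h3>
RNN follows a rigid logic: the later an input arrives, the greater its influence, and the earlier it arrives, the smaller its influence. This logic cannot be changed.
The biggest change LSTM makes is to break this rigid logic and replace it with a flexible one: keep only the important information.
Put simply: focus on the key points!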
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/933c5-2019-07-04-rnn-lstm.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/933c5-2019-07-04-rnn-lstm.png.webp";>
From RNN's sequential logic to LSTM's focus-on-key-points logic
</picture>
For example, quickly read through the passage below:
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/4e81a-2019-07-03-pinglun.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/4e81a-2019-07-03-pinglun.png.webp";>
Quickly read this passage
</picture>
After a quick read, we may only remember the following key points:
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/5a1a2-2019-07-03-pinglun-hzd.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/5a1a2-2019-07-03-pinglun-hzd.png.webp";>

Highlighting the key points
</picture>
LSTM works like the highlighting above: it can retain the "important information" in a long sequence and ignore the unimportant information. This is how it addresses RNN's short-term memory problem.
The technical details of how this is implemented are not covered here. If you are interested, see the detailed introduction to LSTM, "Long Short-Term Memory Networks (LSTM)".
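For readers who want a rough feel for the mechanism, here is a minimal scalar sketch of one LSTM step in its commonly described form (forget, input, and output gates plus a cell state). It is not taken from the article. The parameter dictionary `p` and the values in `keep` are hypothetical, chosen so that the gates keep the stored value and ignore new inputs.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step with scalar input, hidden state and cell state.

    p maps each gate name to hypothetical (w_x, w_h, bias) parameters.
    """
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much old memory to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much memory to expose
    g = math.tanh(pre("candidate"))   # proposed new content
    c = f * c_prev + i * g            # updated cell state (long-term memory)
    h = o * math.tanh(c)              # updated hidden state
    return h, c

# Hypothetical parameters: forget gate open, input gate closed.
keep = {"forget": (0.0, 0.0, 10.0), "input": (0.0, 0.0, -10.0),
        "output": (0.0, 0.0, 10.0), "candidate": (1.0, 0.0, 0.0)}

h, c = 0.0, 1.0   # pretend an important value was stored earlier
for x in [0.3, -0.7, 0.5] * 10:
    h, c = lstm_step(x, h, c, keep)
print(c)
```

After 30 steps the cell state is still close to its starting value, which is the "keep what matters" behavior in miniature. In a trained network the gate values are learned and depend on the input.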
 
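<h3>From LSTM to GRU</h3>
The Gated Recurrent Unit (GRU) is a variant of LSTM. It keeps LSTM's ability to highlight key points and forget unimportant information, and that information is not lost during long-term propagation either.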
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/6839b-2019-07-03-lstm-gru.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/6839b-2019-07-03-lstm-gru.png.webp";>
GRU mainly simplifies and adjusts the LSTM model
</picture>
GRU mainly simplifies and adjusts the LSTM model, which can save a lot of time when the training dataset is large.
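One way to see the simplification is that the commonly described GRU uses two gates (update and reset) and a single hidden state, with no separate cell state. The scalar sketch below is an illustration under that assumption and is not from the article. The parameter names and values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(x, h_prev, p):
    """One GRU step with scalar input and hidden state.

    p maps "update", "reset" and "candidate" to hypothetical
    (w_x, w_h, bias) parameters.
    """
    wz_x, wz_h, bz = p["update"]
    wr_x, wr_h, br = p["reset"]
    wc_x, wc_h, bc = p["candidate"]

    z = sigmoid(wz_x * x + wz_h * h_prev + bz)   # update gate
    r = sigmoid(wr_x * x + wr_h * h_prev + br)   # reset gate
    h_cand = math.tanh(wc_x * x + wc_h * (r * h_prev) + bc)
    # Blend old state and candidate; conventions differ on which side z weights.
    return (1.0 - z) * h_prev + z * h_cand

params = {"update": (0.0, 0.0, -10.0),    # gate almost closed: keep old state
          "reset": (0.0, 0.0, 0.0),
          "candidate": (1.0, 1.0, 0.0)}

h = 0.8
for x in [0.3, -0.7, 0.5]:
    h = gru_step(x, h, params)
print(h)
```

With the update gate nearly closed the state barely moves from 0.8. Fewer gates mean fewer parameters per unit than in an LSTM, which is one plausible reason for the training-time savings mentioned above.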
 
<h2 id="yingyong">Applications and use cases of RNN</h2>
RNN can be used for any problem that involves processing sequence data. NLP is a typical application area.
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/b3243-2019-07-04-yingyong.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/b3243-2019-07-04-yingyong.png.webp";>
Applications and use cases of RNN
</picture>
Text generation: similar to the fill-in-the-blank exercise above, given the surrounding context, predict the word in the blank.
Machine translation: translation is also a typical sequence problem, since word order directly affects the result.
Speech recognition: determine the corresponding text from the input audio.
Image captioning: like describing a picture in words, given an image, describe its content. This is usually a combination of RNN and CNN.
<picture>
<source type="image/webp" data-lazy-srcset="https://easyai.tech/wp-content/uploads/2022/08/69b01-2019-07-04-kantu.png.webp"; srcset="https://easyai.tech/wp-content/uploads/2022/08/69b01-2019-07-04-kantu.png.webp";>
Image captioning
</picture>
Video tagging: the video is split into frames, and image captioning is then used to describe the content of each frame.
 
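<h2 id="zongjie">Summary</h2>
The unique value of RNN is that it can process sequence data effectively, for example article text, speech audio, stock price movements…
It can process sequence data because earlier inputs in the sequence also affect later outputs, which amounts to a "memory". However, RNN has a serious short-term memory problem: data from long ago has very little influence, even when it is important.
This led to RNN-based variants such as LSTM and GRU. These variants share a few main characteristics:

Long-term information can be retained effectively
Important information is selected and kept, while unimportant information is "forgotten"

Some typical applications of RNN:

Text generation
Speech recognition
Machine translation
Image captioning
Video tagging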

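<h2 id="baidu">Baidu Baike + Wikipedia</h2>
<div class="su-box su-box-style-default" id="" style="border-color:#007870;border-radius:5px"><div class="su-box-title" style="background-color:#28aba3;color:#FFFFFF;border-top-left-radius:3px;border-top-right-radius:3px">Baidu Baike version</div><div class="su-box-content su-u-clearfix su-u-trim" style="border-bottom-left-radius:3px;border-bottom-right-radius:3px">
A recurrent neural network (RNN) is a class of recursive neural network that takes sequence data as input, recurses along the direction in which the sequence evolves, and whose nodes (recurrent units) are all connected in a chain to form a closed loop.
Research on recurrent neural networks began in the 1980s and 1990s, and in the early 21st century they developed into an important class of deep learning algorithms. Bidirectional RNN (Bi-RNN) and long short-term memory networks (LSTM) are common recurrent neural networks.
Recurrent neural networks have memory, share parameters, and are Turing complete, so they can learn the nonlinear features of sequences very efficiently. They have important applications in natural language processing (NLP), such as speech recognition, language modeling, and machine translation, and are also used for various kinds of time series forecasting or combined with convolutional neural networks (CNN) to handle computer vision problems.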
</div></div>
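<div class="su-box su-box-style-default" id="" style="border-color:#007870;border-radius:5px"><div class="su-box-title" style="background-color:#28aba3;color:#FFFFFF;border-top-left-radius:3px;border-top-right-radius:3px">Wikipedia version</div><div class="su-box-content su-u-clearfix su-u-trim" style="border-bottom-left-radius:3px;border-bottom-right-radius:3px">
A recurrent neural network (RNN) is a class of neural network in which the connections between nodes form a directed graph along a sequence. This allows it to exhibit temporal dynamic behavior for a time sequence. Unlike feedforward neural networks, RNNs can use their internal state (memory) to process sequences of inputs. This makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition.
The term "recurrent neural network" is used indiscriminately to refer to two broad classes of networks with a similar general structure, where one is finite impulse and the other is infinite impulse. Both classes of networks exhibit temporal dynamic behavior. A finite impulse recurrent network is a directed acyclic graph that can be unrolled and replaced with a strictly feedforward neural network, while an infinite impulse recurrent network is a directed cyclic graph that cannot be unrolled.
Both finite impulse and infinite impulse recurrent networks can have additional stored state, and the storage can be under direct control of the neural network. The storage can also be replaced by another network or graph if it incorporates time delays or has feedback loops. Such controlled states are referred to as gated state or gated memory, and are part of long short-term memory networks (LSTM) and gated recurrent units.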
</div></div>
 
<h2 id="links">Further reading</h2>
<div class="su-spoiler su-spoiler-style-fancy su-spoiler-icon-arrow-circle-1 su-spoiler-closed" data-scroll-offset="0" data-anchor-in-url="no"><div class="su-spoiler-title" tabindex="0" role="button">Introductory articles (2)</div><div class="su-spoiler-content su-u-clearfix su-u-trim">
How to understand RNN? (Theory)
Two simple ways to ease RNN optimization problems
</div></div>
<div class="su-spoiler su-spoiler-style-fancy su-spoiler-icon-arrow-circle-1 su-spoiler-closed" data-scroll-offset="0" data-anchor-in-url="no"><div class="su-spoiler-title" tabindex="0" role="button">Hands-on articles (1)</div><div class="su-spoiler-content su-u-clearfix su-u-trim">
ACL 2018 | Northwestern University: sampling important training data for RNN language models
</div></div>

        </div>
