Slide 1: Recurrent Neural Networks

Slide 2: Classification of NNs
- Neural Networks: Feedforward NNs and Recurrent NNs

Slide 3: The retina as a basic information-processing system
- The retina consists of three layers of nerve cells (from bottom to top): an outer layer, a middle layer, and a final layer.
- Light information travels from the photoreceptors via the bipolar cells to the ganglion cells; the axons of the ganglion cells converge into the optic nerve, which exits the eyeball.
- Horizontal cells and amacrine cells modulate the responses of the bipolar cells and ganglion cells through lateral connections.

Slide 4: Feedforward NNs
- A three-layer neural network: ganglion cell layer - inner nuclear layer - outer nuclear layer
- No connections among the neurons within a layer
- Each layer finishes its computation and passes the result to the next layer

Slide 5: Feedforward NNs
[Figure: a feedforward network with inputs x_1, ..., x_R, weights w_11, ..., w_3R, and activation functions f_1, f_2, f_3; in matrix form, a = f(Wx).]

Slide 6: Recurrent NNs
- Contain feedback among neurons

Slides 7-8: Recurrent NNs
- How to derive math models of RNNs?

Slides 9-13: Recurrent NNs
[Figure: two neurons f_1, f_2 with states x_1(k), x_2(k) and feedback weights w_11, w_12, w_21, w_22.]
- Net inputs:
  n_1 = w_11 x_1(k) + w_12 x_2(k)
  n_2 = w_21 x_1(k) + w_22 x_2(k)
- State updates:
  x_1(k+1) = f_1(w_11 x_1(k) + w_12 x_2(k))
  x_2(k+1) = f_2(w_21 x_1(k) + w_22 x_2(k))

Slides 14-15: Recurrent NNs
- In general, for n neurons: x_i(k+1) = f(sum_{j=1}^{n} w_ij x_j(k))

Slide 16: Recurrent NNs
- In matrix form: x(k+1) = f(Wx(k))

Slide 17: Recurrent NNs
- Add a bias vector b.

Slides 18-19: Discrete Time RNNs
- x(k+1) = f(Wx(k) + b)
- Network computing? Starting from x(0), the network generates the sequence x(1), x(2), ..., x(8), ...
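The discrete-time update x(k+1) = f(Wx(k) + b) can be sketched numerically. A minimal pure-Python sketch; the two-neuron weights, the bias, and the choice f = tanh are illustrative assumptions, not values from the lecture:

```python
import math

# Two-neuron discrete-time RNN: x(k+1) = f(W x(k) + b).
# W, b, and the squashing function f = tanh are illustrative
# assumptions, not values taken from the lecture.
W = [[0.0, 0.5],
     [0.5, 0.0]]
b = [0.1, -0.2]

def step(x):
    """One network update: x(k) -> x(k+1) = f(W x(k) + b)."""
    return [math.tanh(W[i][0] * x[0] + W[i][1] * x[1] + b[i])
            for i in range(2)]

def network_computing(x0, iters=100):
    """Feed x(0) in and iterate; the output is the limit state x(oo)."""
    x = list(x0)
    for _ in range(iters):
        x = step(x)
    return x

x_inf = network_computing([1.0, -1.0])
residual = max(abs(x_inf[i] - step(x_inf)[i]) for i in range(2))
print(x_inf, residual)  # residual ~ 0: x(oo) is a fixed point
```

Because the assumed weights are small and tanh is 1-Lipschitz, the iteration is a contraction, so the sequence x(0), x(1), x(2), ... settles to a state satisfying x = f(Wx + b).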
Slide 20: Discrete Time RNNs
- x(k+1) = f(Wx(k) + b)
- Network computing: the RNN takes x(0) as its input; its output is the limit state x(oo) of the iteration.

Slide 21: Computing: Discrete or Continuous?

Slide 22: Discrete vs Continuous
- Discrete time computing
- Continuous time computing

Slide 23: Discrete vs Continuous
- Continuous time computing
- How to derive continuous time computing math models of RNNs?

Slide 24: From Discrete Computing to Continuous Computing
- Changing time steps

Slides 25-28: From Discrete Computing to Continuous Computing
- Start from the discrete update: x(t+1) = f(Wx(t) + b)
- Rewrite it as x(t+1) = x(t) - x(t) + f(Wx(t) + b), i.e. x(t+1) - x(t) = -x(t) + f(Wx(t) + b)
- Shrink the time step from 1 to Dt: x(t+Dt) - x(t) = Dt [-x(t) + f(Wx(t) + b)], so (x(t+Dt) - x(t)) / Dt = -x(t) + f(Wx(t) + b)

Slide 29: From Discrete Computing to Continuous Computing
- Let Dt -> 0: dx(t)/dt = -x(t) + f(Wx(t) + b)

Slide 30: Continuous Computing RNNs
- dx(t)/dt = -x(t) + f(Wx(t) + b)
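The derivation above can be checked numerically: Euler integration of dx/dt = -x + f(Wx + b) with step dt reproduces the discrete rule exactly at dt = 1, and for small dt the continuous trajectory settles to a state with the same fixed-point condition x = f(Wx + b). The weights, bias, and f = tanh below are illustrative assumptions:

```python
import math

# Continuous-time RNN dx/dt = -x(t) + f(W x(t) + b), integrated with
# Euler steps x(t + dt) = x(t) + dt * (-x(t) + f(W x(t) + b)).
# W, b, and f = tanh are illustrative assumptions.
W = [[0.0, 0.5], [0.5, 0.0]]
b = [0.1, -0.2]

def f_net(x):
    return [math.tanh(W[i][0] * x[0] + W[i][1] * x[1] + b[i])
            for i in range(2)]

def euler(x0, dt, T):
    x = list(x0)
    for _ in range(int(T / dt)):
        fx = f_net(x)
        x = [x[i] + dt * (-x[i] + fx[i]) for i in range(2)]
    return x

# With dt = 1 a single Euler step is exactly the discrete rule
# x(k+1) = f(W x(k) + b); with a small dt the trajectory converges
# to a state satisfying the same equilibrium condition x = f(Wx + b).
x_cont = euler([1.0, -1.0], dt=0.01, T=50.0)
print(x_cont)
```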
Slide 31: Recurrent NNs
- RNN model: da(t)/dt = g(a(t), p(t), t), t >= 0
- a(t): network state; p(t): network input; t: network time

Slide 32: Recurrent NNs
- What's the output of a RNN?
- da(t)/dt = g(a(t), p(t), t); network state a(t), network input p(t), network time t
- Network output: a(oo) = lim_{t->oo} a(t)

Slide 33: Convergence of RNNs
- da(t)/dt = g(a(t), p(t), t); output a(oo) = lim_{t->oo} a(t)
- Does the network converge?
- Equilibrium point: a* such that g(a*, p, t) = 0 for all t >= 0

Slide 34: Trajectories
- da(t)/dt = g(a(t), p(t), t)
- Given any initial condition a(0) = a_0 in R^n, there is a trajectory a(t; a_0), t >= 0, in the state space.

Slides 35-36: Trajectories
- da(t)/dt = g(a(t), p(t), t)
- If a_1 != a_2, then a(t; a_1) != a(t; a_2) for any t >= 0: trajectories starting from different initial conditions never meet.

Slide 37: A Simple Example
- da(t)/dt = -a(t) + p
- a(t) = e^{-t} a(0) + (1 - e^{-t}) p

Slide 38: Equilibrium Points
- da(t)/dt = g(a(t), p(t), t)
- Equilibrium point: g(a*, p, t) = 0 for all t >= 0

Slide 39: Equilibrium Points
- da(t)/dt = -a(t) + p, with a(t) = e^{-t} a(0) + (1 - e^{-t}) p
- The equilibrium point is a* = p, and a(t) -> p as t -> oo

Slide 40: Convergence of RNNs
- da(t)/dt = g(a(t), p(t), t); output a(oo) = lim_{t->oo} a(t)
- Attractors

Slide 41: Convergence of RNNs
- Does each trajectory of a RNN converge to an equilibrium?
- Methods: 1. Solving the differential equation directly; 2. Energy method.
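The closed-form solution of the simple example da/dt = -a(t) + p is easy to verify numerically. In this sketch a(0) = 2 and p = 0.5 are arbitrary illustrative values:

```python
import math

# Numerical check of the simple example da/dt = -a(t) + p, whose
# closed-form solution is a(t) = e^{-t} a(0) + (1 - e^{-t}) p.
# a0 = 2.0 and p = 0.5 are arbitrary illustrative values.
p, a0 = 0.5, 2.0
dt, T = 1e-4, 10.0

a = a0
for _ in range(int(T / dt)):
    a += dt * (-a + p)      # forward Euler step

closed_form = math.exp(-T) * a0 + (1 - math.exp(-T)) * p
print(a, closed_form)       # both close to the equilibrium a* = p
```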
Slide 42: Method One
- Solving Differential Equations

Slide 43: A Simple Example
- da(t)/dt = -a(t) + p
- a(t) = e^{-t} a(0) + (1 - e^{-t}) p -> p as t -> oo

Slide 44: Linear RNNs
- da(t)/dt = -a(t) + Wa(t) + p
- Network output: a(oo) = lim_{t->oo} a(t)

Slide 45: Linear RNNs
- A two-neuron example:
  da_1(t)/dt = -a_1(t) + a_2(t) + p_1
  da_2(t)/dt = a_1(t) - a_2(t) + p_2
- Network output: a(oo) = lim_{t->oo} a(t)

Slides 46-47: http://hebb.mit.edu/people/seung/index.html

Slide 48: Linear RNNs
- H.S. Seung, "How the brain keeps the eyes still," Proc. Natl. Acad. Sci. USA, vol. 93, pp. 13339-13344, 1996.
- v_i = v_i^0 + k_i E, where v_i is the firing rate, v_i^0 is the firing rate at central gaze (E = 0), and k_i is the position sensitivity.
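The two-neuron linear RNN above can be integrated directly. The input p = (0.3, -0.3) is an illustrative choice with p_1 + p_2 = 0, which is needed here for equilibria to exist because the connection matrix -I + W is singular:

```python
# Euler simulation of the two-neuron linear RNN
#   da1/dt = -a1 + a2 + p1,   da2/dt = a1 - a2 + p2,
# with the illustrative input p = (0.3, -0.3); p1 + p2 = 0 is needed
# for equilibria to exist because the matrix [[-1, 1], [1, -1]] is singular.
p1, p2 = 0.3, -0.3
dt, T = 1e-3, 30.0

a1, a2 = 1.0, 0.0
for _ in range(int(T / dt)):
    d1 = -a1 + a2 + p1
    d2 = a1 - a2 + p2
    a1, a2 = a1 + dt * d1, a2 + dt * d2

# a1 - a2 converges to (p1 - p2)/2 = 0.3, while a1 + a2 is conserved,
# so the limit depends on the initial condition: a whole line of equilibria.
print(a1, a2)   # -> approximately (0.65, 0.35)
```

Because one eigenvalue of -I + W is exactly zero, the sum a_1 + a_2 never decays; the network "remembers" its initial condition, which is the line-attractor mechanism discussed next.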
Slide 49: How the brain keeps the eyes still
- H.S. Seung, "How the brain keeps the eyes still," Proc. Natl. Acad. Sci. USA, vol. 93, pp. 13339-13344, 1996.
- ABSTRACT: The brain can hold the eyes still because it stores a memory of eye position. The brain's memory of horizontal eye position appears to be represented by persistent neural activity in a network known as the neural integrator, which is localized in the brainstem and cerebellum. Existing experimental data are reinterpreted as evidence for an "attractor hypothesis" that the persistent patterns of activity observed in this network form an attractive line of fixed points in its state space. Line attractor dynamics can be produced in linear or nonlinear neural networks by learning mechanisms that precisely tune positive feedback.

Slide 50: Line Attractor
- dx_i(t)/dt = -x_i(t) + sum_{j=1}^{n} w_ij x_j + b_i
- At the fixed points the persistent activities encode the eye position E, with x_i = k_i E (Seung 1996).
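Line-attractor dynamics can be sketched with a toy weight matrix. W = [[0.5, 0.5], [0.5, 0.5]] below is an illustrative assumption (not taken from Seung's paper): it has eigenvalue 1 along (1, 1), so every state with x_1 = x_2 is a fixed point of dx/dt = -x + Wx, i.e. a line of equilibria:

```python
# Sketch of line-attractor dynamics dx/dt = -x + Wx for the
# illustrative weight matrix W = [[0.5, 0.5], [0.5, 0.5]].
# W has eigenvalue 1 along (1, 1), so every point with x1 = x2
# is a fixed point: a whole line of equilibria.
W = [[0.5, 0.5], [0.5, 0.5]]
dt, T = 1e-3, 20.0

def settle(x0):
    """Integrate dx/dt = -x + Wx with forward Euler steps."""
    x = list(x0)
    for _ in range(int(T / dt)):
        wx = [W[i][0] * x[0] + W[i][1] * x[1] for i in range(2)]
        x = [x[i] + dt * (-x[i] + wx[i]) for i in range(2)]
    return x

# Different initial conditions relax to *different* fixed points on the
# line x1 = x2 - persistent activity that stores a memory (eye position).
print(settle([2.0, 0.0]))   # -> approximately [1.0, 1.0]
print(settle([0.0, 0.4]))   # -> approximately [0.2, 0.2]
```

The component along (1, -1) decays, while the mean (x_1 + x_2)/2 is conserved, so the final resting point on the line stores where the state started.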
Slide 51: Reading text
- How the eyes move during reading: although people always read text in a fixed order, eye-tracking of readers' gaze shows that visual attention jumps around; only when the brain finds a shape close to past experience and memory does the gaze settle on the specific content.
- Gestalt theory: human perception has a powerful ability to "complete" incomplete patterns.
  1. "研表究明,汉字序顺并不定一影阅响读!事证实明了当你看这完句话之后才发字现都乱是的" - a deliberately scrambled Chinese sentence ("Research shows that character order does not necessarily affect reading! The proof: only after finishing this sentence do you notice its characters are scrambled.")
  2. "Hvae a ncie day. Hpoe you konw the ifnomariton." - the same effect with scrambled English words.

Slide 52: Reading experiment

Slide 53: Linear RNNs
- H.S. Seung, "Pattern analysis and synthesis in attractor neural networks," 1997.
- Analysis; Synthesis
Slide 54: Pattern analysis and synthesis in attractor neural networks
- H.S. Seung. "Pattern analysis and synthesis in attractor neural networks." In K.-Y.M. Wong, I. King, and D.-Y. Yeung, editors, Theoretical Aspects of Neural Computation: A Multidisciplinary Perspective, Singapore, 1997. Springer-Verlag.
- Abstract: The representation of hidden variable models by attractor neural networks is studied. Memories are stored in a dynamical attractor that is a continuous manifold of fixed points, as illustrated by linear and nonlinear networks with hidden neurons. Pattern analysis and synthesis are forms of pattern completion by recall of a stored memory. Analysis and synthesis in the linear network are performed by bottom-up and top-down connections. In the nonlinear network, the analysis computation additionally requires rectification nonlinearity and inner product inhibition between hidden neurons.

Slide 55: Pattern analysis and synthesis in attractor neural networks
- Energy function
Slide 56: Pattern analysis and synthesis in attractor neural networks
- H.S. Seung. "Pattern analysis and synthesis in attractor neural networks." In K.-Y.M. Wong, I. King, and D.-Y. Yeung, editors, Theoretical Aspects of Neural Computation: A Multidisciplinary Perspective, Singapore, 1997. Springer-Verlag.
- Energy function

Slide 57: Representing part-whole relationships in recurrent neural networks
- V. Jain, V. Zhigulin, and H.S. Seung. "Representing part-whole relationships in recurrent neural networks." Adv. Neural Info. Proc. Syst. 18, 563-70 (2006).
- Abstract: There is little consensus about the computational function of top-down synaptic connections in the visual system. Here we explore the hypothesis that top-down connections, like bottom-up connections, reflect part-whole relationships. We analyze a recurrent network with bidirectional synaptic interactions between a layer of neurons representing parts and a layer of neurons representing wholes. Within each layer, there is lateral inhibition. When the network detects a whole, it can rigorously enforce part-whole relationships by ignoring parts that do not belong. The network can complete the whole by filling in missing parts. The network can refuse to recognize a whole, if the activated parts do not conform to a stored part-whole relationship. Parameter regimes in which these behaviors happen are identified using the theory of permitted and forbidden sets. The network behaviors are illustrated by recreating Rumelhart and McClelland's "interactive activation" model.

Slide 58: Representing part-whole relationships in recurrent neural networks
- V. Jain, V. Zhigulin, and H.S. Seung. "Representing part-whole relationships in recurrent neural networks." Adv. Neural Info. Proc. Syst. 18, 563-70 (2006).

Slide 61: Method Two
- Energy Functions Method
Slide 62: Energy Function Method
- Lyapunov Method (A.M. Lyapunov); stability theory; LaSalle Invariance Principle
- da(t)/dt = g(a(t), p(t), t)

Slide 63: Energy Function Method
- Recurrent neural network model: dx(t)/dt = f(x(t))
- Theorem: Suppose that each trajectory of the RNN is bounded, and that there exists an energy function V(x) with dV(x(t))/dt <= 0 along every trajectory. Then each trajectory converges to an equilibrium point.

Slides 64-65: Example
- dx(t)/dt = -x(t) + (5 ln 2 / 3) g(x(t)), where g(x) = (e^x - e^{-x}) / (e^x + e^{-x})
- Equilibrium points: 0, ln 2, -ln 2

Slide 66: Example
- Energy function: V(x) = (1/2) x^2 - (5 ln 2 / 3) int_0^x g(s) ds
- V'(x) = x - (5 ln 2 / 3) g(x) = -dx/dt
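A numerical sanity check of the energy function in this example: using the closed form of the integral, int_0^x g(s) ds = ln cosh(x), so V(x) = x^2/2 - (5 ln 2 / 3) ln cosh(x). The sketch below (initial condition, step size, and horizon are illustrative) confirms that V never increases along a simulated trajectory, which escapes the unstable equilibrium 0 and converges to ln 2:

```python
import math

# Energy-method check for dx/dt = -x + (5 ln 2 / 3) * g(x), g = tanh.
# With int_0^x g(s) ds = ln cosh(x), the energy function is
#   V(x) = x**2 / 2 - (5 ln 2 / 3) * ln cosh(x),
# so V'(x) = x - (5 ln 2 / 3) * tanh(x) = -dx/dt.
c = 5 * math.log(2) / 3

def xdot(x):
    return -x + c * math.tanh(x)

def V(x):
    return 0.5 * x * x - c * math.log(math.cosh(x))

dt, T = 1e-3, 80.0
x = 0.1                       # start near the unstable equilibrium 0
energies = [V(x)]
for _ in range(int(T / dt)):
    x += dt * xdot(x)         # forward Euler step
    energies.append(V(x))

# V never increases along the trajectory, and x converges to the
# stable equilibrium ln 2 (a symmetric start x(0) < 0 would go to -ln 2).
print(x, math.log(2))
```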
Slides 67-69: Example
- V'(x) = x - (5 ln 2 / 3) (e^x - e^{-x}) / (e^x + e^{-x}); V''(x) = 1 - (5 ln 2 / 3) * 4 / (e^x + e^{-x})^2
- V'(0) = 0 and V''(0) = 1 - 5 ln 2 / 3 < 0, so x = 0 is a local maximum of V (an unstable equilibrium); at x = +-ln 2, V' = 0 and V'' > 0, so these are local minima of V (stable equilibria).
- Along every trajectory, dV(x(t))/dt = V'(x(t)) * dx(t)/dt = -(dx(t)/dt)^2 <= 0.

Slide 70: Example
- For dx(t)/dt = -x(t) + (5 ln 2 / 3)(e^{x(t)} - e^{-x(t)}) / (e^{x(t)} + e^{-x(t)}): every trajectory is bounded and V decreases along it, so every trajectory converges to an equilibrium point; +-ln 2 are the attractors.

Slide 71: Reference Books
1. Zhang Yi and K.K. Tan, Convergence Analysis of Recurrent Neural Networks, Kluwer Academic Publishers, ISBN 1-4020-7694-0, 2004.
2. H.J. Tang, K.C. Tan and Zhang Yi, Neural Networks: Computational Models and Applications, Springer-Verlag, ISBN 978-3-540-69225-6, 2007.