Pattern Recognition & Artificial Intelligence
Lecture 11: Clustering Algorithms (VII)

Outline:
- Artificial neural networks: biological and artificial networks
- Competitive learning networks: competitive learning, Self-Organizing Map (SOM), Adaptive Resonance Theory (ART)
- Relationship between k-means, FCM, and competitive learning networks
- Model-based clustering (2)

Biological and artificial networks

Biology: [Figure: a biological neuron with dendrites, cell body, axon, and synapses.]

Artificial: an artificial neural network is a group of neurons organized in several layers:
- Input layer: receives inputs from sources external to the network.
- Output layer: generates outputs to the external world.
- Hidden layer(s): layers between the input and output layers, not visible from outside the network.
- Learning laws: mathematical rules for modifying the weights of the network iteratively according to the inputs (and the outputs, if the learning is supervised).

Mathematical explanation

A neuron is modeled mathematically as follows.

Activation (the net input signal): the net input to the i-th node is a weighted sum of all its inputs,

    net_i = \sum_j w_{ij} x_j,

where x_j is the input signal from the j-th node and w_{ij} is the synaptic connectivity between the j-th node and the i-th node:

    w_{ij} > 0: excitatory;  w_{ij} < 0: inhibitory;  w_{ij} = 0: no connection.

Output signal: the output of the i-th node is a function of its net input,

    y_i = g(net_i) = g(\sum_j w_{ij} x_j),

where g(.) is an activation function, typically a sigmoid. The sigmoid can be either one-sided, 0 <= g(x) <= 1, or two-sided, -1 <= g(x) <= 1.

One-sided:

    g(x, a) = 1 / (1 + exp(-a x)),
    g'(x, a) = a exp(-a x) / (1 + exp(-a x))^2 = a g(x, a) (1 - g(x, a)).

Two-sided:

    g(x, a) = 2 / (1 + exp(-a x)) - 1,
    g'(x, a) = 2 a exp(-a x) / (1 + exp(-a x))^2 = (a/2) (1 - g(x, a)^2).

Here a > 0 is a parameter that controls the slope of g. As a -> 0, g(x, a) becomes (locally) linear, but as a -> infinity, g(x, a) becomes a threshold function:

    lim_{a -> infinity} g(x, a) = 1 if x > 0;  0 (one-sided) or -1 (two-sided) if x < 0.

[Figure: curves of the sigmoid function for different values of the slope parameter a.]
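For concreteness, the two activation functions above can be written as a short Python sketch (the function names are mine, not from the lecture):

```python
import math

def sigmoid_one_sided(x, a=1.0):
    """One-sided sigmoid: output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a * x))

def sigmoid_two_sided(x, a=1.0):
    """Two-sided sigmoid: output in (-1, 1); equal to tanh(a*x/2)."""
    return 2.0 / (1.0 + math.exp(-a * x)) - 1.0

# A large slope a approximates a threshold function:
# sigmoid_one_sided(0.1, a=100) is very close to 1,
# sigmoid_one_sided(-0.1, a=100) is very close to 0.
```

Note the identity sigmoid_two_sided(x, a) == tanh(a*x/2), which is why the two-sided derivative simplifies to (a/2)(1 - g^2).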
Competitive Learning Networks

Competitive learning

Competitive learning is a typical unsupervised learning network, similar to statistical clustering analysis methods such as k-means. Its purpose is to discover groups (clusters) of similar patterns, represented by vectors in an n-dimensional space.

The competitive learning network has two layers:
- the input layer, composed of n nodes to which an input pattern x = (x_1, ..., x_n)^T is presented, and
- the output layer, composed of m nodes with outputs y_i (i = 1, ..., m).

There are different variations in the computation of the output, depending on the specific data and purpose. In the simplest case, the net input (the activation) of the i-th output node is just the inner product of that node's weight vector and the current input vector:

    net_i = \sum_{j=1}^{n} w_{ij} x_j = w_i^T x.

The outputs of the network are determined by a winner-take-all competition, such that only the output node receiving the maximal net input outputs 1 while all others output 0:

    y_k = 1 if net_k = max_{i=1..m} net_i;  0 otherwise.

The input pattern can be either a binary vector (x_i = 0 or 1) or a real n-dimensional vector x = (x_1, ..., x_n)^T. All weights are real (or positive), and each weight vector is normalized in a certain sense:

    \sum_{j=1}^{n} w_{ij} = 1 (with w_{ij} >= 0),  or  |w_i|^2 = \sum_{j=1}^{n} w_{ij}^2 = 1.

The net input can then be considered as the inner product of the two vectors w_i and x,

    net_i = \sum_j w_{ij} x_j = w_i^T x = |w_i| |x| cos θ_i,

which is closely related to the difference between the two vectors:

    |x - w_i|^2 = |x|^2 + |w_i|^2 - 2 |w_i| |x| cos θ_i,

where θ_i is the angle between the two vectors. We see that, for normalized vectors, the weight vector w_k of the winning node is the closest to the current input pattern x in the n-D feature space, with the smallest angle and therefore the smallest Euclidean distance d_i = |x - w_i|. The competition can therefore also be carried out with distances:

    y_i = 1 if d_i <= d_j for all j = 1, ..., m;  0 otherwise.

The competitive learning law is

    w_i^{new} = w_i^{old} + Δw_i = w_i^{old} + η u_i (x - w_i^{old}),

where u_i = 1 if the i-th node is the winner and u_i = 0 otherwise, and η is the learning rate, which is reduced from some set initial value (e.g., 0.8) toward 0 by a certain scaling factor (e.g., 0.996) at each step.
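A minimal sketch of one winner-take-all step, which also checks that, for unit-length vectors, the inner-product competition and the distance competition pick the same winner (the helper names are mine):

```python
import math

def winner(ws, x):
    """Winner-take-all: index of the weight vector with maximal net input w.x."""
    nets = [sum(wi * xi for wi, xi in zip(w, x)) for w in ws]
    return nets.index(max(nets))

def update(ws, x, k, eta):
    """Learning law: pull the winner's weights toward x by a fraction eta."""
    ws[k] = [wi + eta * (xi - wi) for wi, xi in zip(ws[k], x)]

# For unit-length vectors, maximal inner product = minimal Euclidean distance:
ws = [[1.0, 0.0], [0.0, 1.0]]
x = [0.6, 0.8]
k = winner(ws, x)
dists = [math.dist(w, x) for w in ws]
assert k == dists.index(min(dists))  # both criteria choose the same node
update(ws, x, k, eta=0.5)            # winner moves halfway toward x
```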
Note that w_k^{new} lies between w_k^{old} and x; i.e., the effect of this learning process is to pull the weight vector w_k of the winning node closer to the current input pattern x.

Learning steps:
1. Generate a set of output nodes with random weights.
2. Choose a random input pattern from the high-dimensional vector space and calculate the activations of all output nodes.
3. Find the winning node j* and update its weights:

       W_{j*}(t+1) = W_{j*}(t) + η(t) [X_p(t) - W_{j*}(t)].

4. Go back to step 2 until the weights are no longer changing, or a set maximum number of iterations is reached.

[Figure: geometric illustration of the update W_{j*}(t+1) = W_{j*}(t) + η(t)[X_p(t) - W_{j*}(t)], pulling W_{j*} toward X_p.]

Example: clustering using competitive learning networks

Five input patterns (unit vectors):

    x1 = (0.8, 0.6)^T, x2 = (0.1736, -0.9848)^T, x3 = (0.707, 0.707)^T,
    x4 = (0.342, -0.9397)^T, x5 = (0.6, 0.8)^T.

For convenience, we convert the data into polar coordinates (angles):

    x1: 36.87°, x2: -80°, x3: 45°, x4: -70°, x5: 53.13°.

Assume that we have two weight vectors, with initial values

    W1(0) = (1, 0)^T (angle 0°),  W2(0) = (-1, 0)^T (angle -180°).
[Figure: the five input vectors x1, ..., x5 and the two weight vectors w1, w2 on the unit circle, shown at successive stages of training; w1 moves into the upper cluster {x1, x3, x5} and w2 into the lower cluster {x2, x4}.]

Angles (in degrees) of the two weight vectors over 20 training steps:

    Step:  1      2      3     4     5     6     7     8     9     10
    W1:    18.43  -30.8  7     -32   11    24    24    34    34    44
    W2:    -180   -180   -180  -180  -180  -180  -130  -130  -100  -100

    Step:  11    12    13    14    15    16    17     18     19    20
    W1:    40.5  40.5  43    43    47.5  42    42     43.5   43.5  48.5
    W2:    -100  -90   -90   -81   -81   -81   -80.5  -80.5  -75   -75

By step 20, w2 has converged to -75°, the mean direction of x2 (-80°) and x4 (-70°), while w1 oscillates around the centre (about 45°) of the cluster {x1, x3, x5}.
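The slides do not state the presentation order or the learning rate. Assuming the patterns are presented cyclically x1, ..., x5 with a constant rate η = 0.5 (so each update moves the winner to the angular bisector after renormalization), the tabulated angles are reproduced closely:

```python
import math

# Input patterns (unit vectors) and initial weights from the example above.
patterns = [(0.8, 0.6), (0.1736, -0.9848), (0.707, 0.707),
            (0.342, -0.9397), (0.6, 0.8)]
w = [(1.0, 0.0), (-1.0, 0.0)]  # W1(0) at 0 deg, W2(0) at -180 deg
eta = 0.5                      # assumed constant learning rate

for t in range(20):
    x = patterns[t % len(patterns)]           # assumed cyclic presentation
    nets = [wi[0] * x[0] + wi[1] * x[1] for wi in w]
    k = nets.index(max(nets))                 # winner-take-all competition
    v = (w[k][0] + eta * (x[0] - w[k][0]),    # learning law: move winner toward x
         w[k][1] + eta * (x[1] - w[k][1]))
    n = math.hypot(v[0], v[1])
    w[k] = (v[0] / n, v[1] / n)               # keep the weight on the unit circle

angles = [math.degrees(math.atan2(wi[1], wi[0])) for wi in w]
# After 20 steps W1 is near 48.5 deg and W2 near -75 deg, as in the table.
```

Small differences from the table (e.g., 34.4 vs. 34) come from rounding on the slides; the trajectory and the final positions match.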
Competitive Learning Networks

Problems of the original competitive learning network (CLN):
- The output neurons are independent of one another.
- How should the learning rate be determined? In the learning law

      w_i^{new} = w_i^{old} + Δw_i = w_i^{old} + η u_i (x - w_i^{old}),

  the rate must satisfy η(t+1) <= η(t), i.e., it must decrease over time.
- The initialization of the weight vectors may be of critical importance for the performance of the competitive learning process, and the results also depend on the sequence in which the samples are presented. For example, if w1 always wins no matter which class a sample comes from, then w2 is stuck and never participates in learning (a "dead" unit).

Competitive Learning Networks

Self-Organizing Map (SOM): Kohonen Network
Our brain is dominated by the cerebral cortex, a very complex structure of billions of neurons and hundreds of billions of synapses. The cortex includes areas that are responsible for different human activities (motor, visual, auditory, somatosensory, etc.) and are associated with different sensory inputs. We can say that each sensory input is mapped onto a corresponding area of the cerebral cortex: the cortex is a self-organizing computational map in the human brain.

What is a self-organizing feature map? The self-organizing map (SOM), also referred to as self-organizing feature mapping (SOFM), is a process that maps input patterns in a high-dimensional vector space to a low-dimensional (typically 2-D) output space, the feature map, so that nodes in a neighborhood of this map respond similarly to a group of similar input patterns.

[Figure: (a) a 1-D SOM and (b) a 2-D SOM: topologies of the output grid.]

Architecture of SOM: an input layer (x1, x2, ...) fully connected to an output layer of grid nodes (y1, y2, y3, ...).

Learning process: competition, cooperation, adaptation. Why cooperation? The goal of SOM is to find values of the weight vectors from the input layer to the lattice (output) nodes in such a way that adjacent neurons in the output layer have similar weight vectors; this is the cooperation.
Competitive Learning Networks

Self-Organizing Map (SOM): Kohonen Network

Learning process of SOM: simple description
- Given an input pattern (vector) x, find the neuron i with the closest weight vector by competition (the inner product w_i^T x is the highest).
- For each neuron j in the neighborhood N(i) of the winning neuron i, update the weight vector of j.
- Neurons that are not in the neighborhood are left unchanged.

Competition: the input vector X is compared with the node parameters W; "similar" means minimal distance or maximal scalar product. Find the node j = c whose W is most similar to X:

    c = argmin_j || X - W^{(j)} ||.

Node c is the most similar to the input vector X. It is the winner, and it will learn to become still more similar to X; hence this is a "competitive learning" procedure. In the brain, the neurons that react to a signal pick it up and learn.

Cooperation: nodes on the grid close to the winner c should behave similarly. Define the neighborhood function

    h(r, r_c, t) = h0(t) exp( -||r - r_c||^2 / σ^2(t) ),

where t is the iteration number (time), r_c is the position of the winning node c in the grid (physical space, usually 2-D), ||r - r_c|| is the distance from the winning node, scaled by σ(t), and h0(t) is a slowly decreasing multiplicative factor. Typical schedules are

    σ(t) = σ_0 exp(-t / T1),   h0(t) = η_0 exp(-t / T2).

[Figure: the Gaussian neighborhood function h (between 0 and 1.0) versus the grid distance d_ij from the winner.]

Adaptation: take the winner node c, and those in its neighborhood O(r_c), and change their parameters, making them more similar to the data X:

    W^{(i)}(t+1) = W^{(i)}(t) + h(r_i, r_c, t) [ X(t) - W^{(i)}(t) ],  for i in O(c).

Then select randomly a new sample vector X and repeat.
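The cooperation and adaptation formulas above can be sketched directly (σ_0, η_0, T1, T2 below are assumed values, not from the slides):

```python
import math

def neighborhood(dist2, t, sigma0=2.0, eta0=0.5, T1=1000.0, T2=1000.0):
    """h(r, r_c, t) = h0(t) * exp(-|r - r_c|^2 / sigma(t)^2)."""
    sigma = sigma0 * math.exp(-t / T1)   # sigma(t) = sigma0 exp(-t/T1): shrinking width
    h0 = eta0 * math.exp(-t / T2)        # h0(t) = eta0 exp(-t/T2): decaying factor
    return h0 * math.exp(-dist2 / sigma ** 2)

def adapt(W, X, c, t, grid_pos):
    """One adaptation step: move every node toward X, weighted by its
    squared grid distance to the winner c."""
    for i in range(len(W)):
        d2 = sum((a - b) ** 2 for a, b in zip(grid_pos[i], grid_pos[c]))
        h = neighborhood(d2, t)
        W[i] = [wi + h * (xi - wi) for wi, xi in zip(W[i], X)]
```

Note that h is maximal at the winner itself (dist2 = 0) and decreases both with grid distance and with time, which is what makes adjacent nodes develop similar weight vectors.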
Decrease h0(t) slowly until there are no more changes. Result: the W^{(i)} point to the centers of local clusters in the X feature space, and nodes in the neighborhood point to adjacent areas in X space.

SOM algorithm

Data: samples X^T = (X1, X2, ..., Xd) from the feature space. Create a grid with nodes i = 1..K in 1-D, 2-D or 3-D, each node with a d-dimensional weight vector W^{(i)T} = (W1^{(i)}, W2^{(i)}, ..., Wd^{(i)}), where W^{(i)} = W^{(i)}(t) changes with discrete time t.

1. Initialize: random small W^{(i)}(0) for all i = 1..K. Define the parameters of the neighborhood function h(||r_i - r_c|| / σ(t), t).
2. Iterate: select randomly an input vector X.
3. Calculate the distances d(X, W^{(i)}) and find the winner node W^{(c)} most similar (closest) to X.
4. Update the weights of all neurons in the neighborhood O(r_c).
5. Decrease the influence h0(t) and shrink the neighborhood σ(t).
6. If in the last T steps all W^{(i)} changed by less than ε, then stop.
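Steps 1 to 6 can be put together as a small 1-D SOM sketch; the parameter schedules below are illustrative assumptions, not values from the slides:

```python
import math
import random

def som_train(samples, K, T_max=2000, sigma0=None, eta0=0.5, seed=0):
    """Train a 1-D SOM over K grid nodes (a sketch of the algorithm above).

    samples: list of d-dimensional vectors. Returns the K weight vectors,
    ordered along the 1-D grid.
    """
    rng = random.Random(seed)
    d = len(samples[0])
    if sigma0 is None:
        sigma0 = K / 2.0
    # Step 1: small random initial weights.
    W = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(K)]
    for t in range(T_max):
        # Step 2: select randomly an input vector.
        X = rng.choice(samples)
        # Step 3: winner = node with minimal Euclidean distance to X.
        c = min(range(K), key=lambda i: sum((X[j] - W[i][j]) ** 2 for j in range(d)))
        # Step 5 schedules: shrink the neighborhood and decrease the influence.
        sigma = max(sigma0 * math.exp(-t / (T_max / 4)), 0.5)
        h0 = eta0 * math.exp(-t / (T_max / 2))
        # Step 4: update all nodes, weighted by grid distance to the winner.
        for i in range(K):
            h = h0 * math.exp(-((i - c) ** 2) / sigma ** 2)
            W[i] = [W[i][j] + h * (X[j] - W[i][j]) for j in range(d)]
    return W
```

A simple stopping rule (step 6) is omitted for brevity; here training simply runs for T_max iterations while h0(t) decays toward 0.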
Competitive Learning Networks

Adaptive Resonance Theory (ART)

A fundamental problem of perception and cognition: how do we continue to learn throughout life? Is it possible to design a system that learns continuously without forgetting the past?

Plasticity: the ability to learn new patterns.
Stability: the ability to generate appropriate outputs without the weight values changing.

The plasticity/stability dilemma: the current weights are destroyed when new patterns are trained on, so the weights will not be stable. Stability is required for retaining what has been learned; plasticity is required for learning. A network cannot have both plasticity and stability at the same time.