Basic Science Research Center courseware (基础科学研究中心课件.ppt)

Uploader (seller): 晟晟文业   Document ID: 3733106   Uploaded: 2022-10-07   Format: PPT   Pages: 33   Size: 3.63 MB

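Some Theoretical Problems in Machine Learning
Zongben Xu (Xi'an Jiaotong University)
Email:  Homepage: http:/

Outline
- Universality theory of linear learning machines
- Regularization theory based on error modeling
- New models and new theory for sparse information processing

Universality Theory of Linear Learning Machines
A New Learning Paradigm: LtDAHP (Learning through Deterministic Assignment of Hidden Parameters)
Zongben Xu (Xi'an Jiaotong University, Xi'an, China)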

Email:  Homepage: http:/

- A supervised learning problem: difficult or easy?
- Can a difficult learning problem be solved more simply?
- Is a linear machine universal?

Outline
- Some Related Concepts
- LtRAHP: Learning through Random Assignment of Hidden Parameters
- LtDAHP: Learning through Deterministic Assignment of Hidden Parameters
- Concluding Remarks

Some Related Concepts: Supervised Learning
Supervised learning: given a finite number of input/output samples, find a function f in a machine H that approximates the unknown relation between the input and output spaces.
[Figure: a black box mapping inputs x_1, x_2, ..., x_m to outputs y_1, y_2, ..., y_m]
Examples: face recognition, social networks, stock index tracking.

Data and loss: $D=\{(x_i,y_i)\}_{i=1}^{m}$, $x_i\in\mathbb{R}^{d}$, $f\in H$, $l: H\times D\to\mathbb{R}$.
ERM: $$f^{*}=\arg\min_{f\in H}E_{emp}(f)=\frac{1}{m}\sum_{i=1}^{m}l\big(f(x_i),y_i\big)$$
Machine: $$H=\Big\{\sum_{i=1}^{N}a_iT_i(x):\ T_i\in\mathcal{T},\ i=1,2,\dots,N\Big\}$$
FNNs: $$f_{FNN}(a,W)(x)=\sum_{i=1}^{N}a_i\,\sigma(W_i^{T}x)$$
[Figure: feed-forward network with inputs x(1), x(2), ..., x(m), outputs y(1), y(2), ..., y(m), and hidden nodes 1, 2, ..., N]

Some Related Concepts: HP vs BP
- Hidden parameter (HP): determines the hidden predictors (the non-linear mechanism). In the machine H these are the $T_i$; in an FNN they are the inner weights $W$.
- Bright parameter (BP): determines how the hidden predictors are linearly combined (the linear mechanism). In both cases these are the coefficients $a_i$.

Some Related Concepts: OSL vs TSL
- One-Stage Learning (OSL): HPs and BPs are trained simultaneously in one stage.
- Two-Stage Learning (TSL): HPs and BPs are trained separately in two stages.

OSL: $$(a,T)=\arg\min_{a_i\in\mathbb{R},\,T_i\in\mathcal{T}}\sum_{j=1}^{m}l\Big(\sum_{i=1}^{N}a_iT_i(x_j),\,y_j\Big)$$
TSL, Stage 1 (hidden parameters): $$T=(T_1,T_2,\dots,T_N)=\mathrm{assign}()$$
TSL, Stage 2 (bright parameters): $$a=\arg\min_{a_i\in\mathbb{R}}\sum_{j=1}^{m}l\Big(\sum_{i=1}^{N}a_iT_i(x_j),\,y_j\Big)$$

Some Related Concepts: Main Concerns
Q1: How to specify the assign function?
- $T=\mathrm{assign}(a)=\arg\min_{T_i\in\mathcal{T}}\sum_{j=1}^{m}l\big(\sum_{i=1}^{N}a_iT_i(x_j),\,y_j\big)$: ADM
- T = assign() = random assignment: LtRAHP
- T = assign(n) = deterministic assignment: LtDAHP
Q2: Can TSL work?
- Universal approximation?
- Does it degrade the generalization ability?
- Consistency/convergence?
- Effectiveness and efficiency?
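The two-stage scheme above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's implementation: the tanh activation, the bias vector `b`, the Gaussian sampling, the squared loss, and all function names (`assign_random`, `fit_bright`, `predict`) are hypothetical choices made for the example. Stage 1 fixes the hidden parameters with a random assign function; Stage 2 solves a linear least-squares problem for the bright parameters only.

```python
import numpy as np

def hidden_features(X, W, b):
    """Hidden predictors T_i(x) = tanh(W_i . x + b_i); tanh is an assumed activation."""
    return np.tanh(X @ W.T + b)

def assign_random(n_hidden, d, rng):
    """Stage 1 (random assign function): draw the hidden parameters once and freeze them."""
    W = rng.standard_normal((n_hidden, d))
    b = rng.standard_normal(n_hidden)
    return W, b

def fit_bright(X, y, W, b):
    """Stage 2: least-squares fit of the bright parameters a (squared loss assumed)."""
    H = hidden_features(X, W, b)
    a, *_ = np.linalg.lstsq(H, y, rcond=None)
    return a

def predict(X, W, b, a):
    """Linear combination of the frozen hidden predictors."""
    return hidden_features(X, W, b) @ a

# Synthetic regression problem, used only to exercise the two stages.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(np.pi * X[:, 0]) * X[:, 1]

W, b = assign_random(50, 2, rng)   # Stage 1: hidden parameters
a = fit_bright(X, y, W, b)         # Stage 2: bright parameters
train_rmse = float(np.sqrt(np.mean((predict(X, W, b, a) - y) ** 2)))
print(f"train RMSE: {train_rmse:.4f}")
```

Because Stage 2 is linear in `a`, no iterative training of the hidden layer is needed; swapping `assign_random` for another assign function leaves the rest of the sketch unchanged.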

LtRAHP: An Overview
LtRAHP typicals:
- Random vector functional-link networks (RVFLs) (Y.H. Pao, Adaptive Pattern Recognition and Neural Networks, Reading, MA: Addison-Wesley, 1989.)
- Echo-state neural networks (ESNs) (H. Jaeger and H. Haas, Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication, Science, 304:78-80, 2004.)
- Extreme learning machine (ELM) (G.B. Huang, Q.Y. Zhu and C.K. Siew, Extreme learning machine: Theory and applications, Neurocomputing, 70:489-501, 2006.)

LtRAHP Training
[Figure: network with inputs x(1), x(2), ..., x(5), inner weights W, outer weights a, hidden nodes 1, 2, ..., N, and output y]
Stage 1: the hidden parameters (W and the remaining hidden-node parameters) are set by random assignment through the assign function.
Stage 2: $$a=\arg\min_{a_i\in\mathbb{R}}\sum_{j=1}^{m}l\Big(\sum_{i=1}^{N}a_i\,\sigma(W_i^{T}x_j),\,y_j\Big)$$

LtRAHP: Experimental Evidences
Application support:
- Face recognition (Marques et al. 2012)
- Handwritten character recognition (Chacko et al. 2012)
- Object recognition (Xu et al. 2012)
Experimental support (Huang et al. 2006):

Test RMSE on UCI data
Data sets    BP       SVM      ELM
Triazines    0.2197   0.1289   0.2002
Housing      0.1285   0.1180   0.1267
Abalone      0.0874   0.0784   0.0824
Ailerons     0.0481   0.0429   0.0431
Census       0.0685   0.0746   0.0660

Training time
Data sets    BP       SVM      ELM
Triazines    0.5484   0.0086

LtDAHP: Mathematical Foundations (II)
... = d-1) configuration problem can be approximately solved by:
- Equal-area partition (EAP)
- Recursive zonal sphere partition (RZSP) http:/
Complexity: $n\log n$

LtDAHP: FNN Instance
Architecture of FNN
[Equations: conventional FNNs use a single sum over the N hidden nodes, with bright parameters $a_j$ and inner weights $W_j$; LtDAHP-based FNNs use a double sum over $j=1,\dots,n$ and $k=1,\dots,l$, with bright parameters $a_{jk}$ and $N=nl$]

LtDAHP: Learning procedure (FNN instance)
Architecture of LtDAHP
Stage 1 (deterministic assignment of the hidden parameters):
- $\{W_1,\dots,W_n\}$: minimal Riesz $(d-1)$-energy points on $S^{d-1}$ (EZSP)
- the second parameter set, indexed $1,\dots,l$: best packing points on $S^{1}$
Stage 2 (bright parameters): a least-squares problem, $$a^{*}=\arg\min_{a\in\mathbb{R}^{N}}\sum_{i=1}^{m}\big|y_i-f_{FNN}(x_i)\big|^{2}=\arg\min_{a\in\mathbb{R}^{N}}\|Y-Ha\|^{2}$$
LtDAHP Algorithm

LtDAHP: Theoretical assessment (FNN instance)
Generalization capability
LtDAHP: if $f\in W_2^{r}$, $l\sim m^{\frac{1}{2r+d}}$ and $n\sim m^{\frac{d-1}{2r+d}}$, then
$$C_1m^{-\frac{2r}{2r+d}}\le\sup_{f\in W_2^{r}}E\big(\|f_{LDHP}-f\|^{2}\big)\le C_2m^{-\frac{2r}{2r+d}}\log m$$
OSL:
$$C_1m^{-\frac{2r}{2r+d}}\le\sup_{f\in W_2^{r}}E\big(\|f_{OSL}-f\|^{2}\big)\le C_2m^{-\frac{2r}{2r+d}}\log m$$
ELM: if T is randomly fixed according to a distribution $\mu$ and $n\sim m^{\frac{d}{2r+d}}$, then
$$C_1m^{-\frac{2r}{2r+d}}\le\sup_{f\in W_2^{r}}E_{\mu}E\big(\|f_{ELM}-f\|^{2}\big)\le C_2m^{-\frac{2r}{2r+d}}\log m$$
Because the ELM bound holds only in expectation over the random assignment, multiple trials are required.
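The deterministic Stage 1 can be illustrated for the special case d = 2, where the sphere S^{d-1} is the circle S^1 and equally spaced points are the best-packing configuration. The sketch below is an assumption-laden illustration, not the author's construction: the thresholds `t` (equally spaced in [-1, 1]) stand in for the second parameter set, whose exact role is not specified here, and the tanh activation and all function names are hypothetical.

```python
import numpy as np

def circle_points(n):
    """n equally spaced unit vectors on S^1; fully deterministic."""
    angles = 2.0 * np.pi * np.arange(n) / n
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)

def assign_deterministic(n, l):
    """Stage 1: n directions and l thresholds give N = n * l hidden nodes.
    The equally spaced thresholds are an illustrative assumption."""
    W = circle_points(n)
    t = np.linspace(-1.0, 1.0, l)
    return W, t

def hidden_matrix(X, W, t):
    """H[i, j * l + k] = tanh(W_j . x_i - t_k); tanh is an assumed activation."""
    proj = X @ W.T                                     # shape (m, n)
    H = np.tanh(proj[:, :, None] - t[None, None, :])   # shape (m, n, l)
    return H.reshape(X.shape[0], -1)                   # shape (m, n * l)

def fit_bright(X, y, W, t):
    """Stage 2: least-squares solution of min_a ||Y - H a||^2."""
    H = hidden_matrix(X, W, t)
    a, *_ = np.linalg.lstsq(H, y, rcond=None)
    return a

# Synthetic data in d = 2, used only to exercise the procedure.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(300, 2))
y = np.cos(2.0 * X[:, 0]) + X[:, 1]

n, l = 8, 6
W, t = assign_deterministic(n, l)
a = fit_bright(X, y, W, t)
rmse = float(np.sqrt(np.mean((hidden_matrix(X, W, t) @ a - y) ** 2)))
print(f"N = {n * l} hidden nodes, train RMSE = {rmse:.4f}")
```

Since nothing in Stage 1 is random, repeated runs give the same hidden layer, so a single fit suffices. For d > 2 the directions would have to come from a sphere-partition routine such as the EAP or RZSP constructions named above, which this sketch does not implement.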

LtDAHP: Toy simulations (FNN instance)
[Figure: training time and test error of ELM (LtRAHP) and LtDAHP, plotted against the number of hidden nodes (N) and the number of samples (m)]

LtDAHP: Simulations on UCI data sets

Data sets          Training samples   Testing samples   Attributes
Auto_Price         106                53                15
Stock              633                317               9
Bank (Bank8FM)     2999               1500              8
Delta_ailerons     3565               3564              5
Delta_Elevators    4759               4758              6

                   TestRMSE                    TrainMT                  MSparsity
Data sets          SVM      ELM      LtDAHP    SVM     ELM     LtDAHP   SVM     ELM     LtDAHP
Auto_price         0.0427   0.0324   0.0357    160     3.22    3.22     116.2   240.1   72.2
Stock              0.0478   0.0347   0.0306    5.64    0.325   0.325    26.7    108.1   148.3
Bank8FM            0.0454   0.0446   0.0421    82.1    1.42    1.42     112.9   88.4    60.5
Delta_ailerons     0.0422   0.0387   0.0399    60.1    2.32    2.32     169.3   56.2    48.1
Delta_Elevators    0.0534   0.0535   0.0537    684     3.10    3.10     597.6   52.6    52.1

LtDAHP: Real world data experiments

Million Song dataset
Methods   TestRMSE   TrainMT   MSparsity
ELM       10.89      1989      601
LtDAHP    9.21       1989      512

The Million Song Dataset (Bertin et al., 2011) describes a learning task of predicting the year in which a song was released from audio features associated with the song. The dataset consists of 463,715 training examples and 51,630 testing examples with d=90. Each example is a song released between 1922 and 2011, represented as a vector of timbre information computed from the song.

Buzz in Social Media
Methods   TestRMSE   TrainMT   MSparsity
ELM       0.0037     1523      534
LtDAHP    0.0017     1523      186

The Buzz Prediction dataset was collected from Twitter, a well-known social network and micro-blogging platform with exponential growth and extremely fast dynamics. The task is to predict the mean number of active discussions (NAD) from d=77 primary features, including the number of created discussions, the average number of author interactions, and the average discussion length. The dataset contains m=583,250 samples, which makes it a genuinely large-scale problem.

Concluding Remarks
- LtDAHP provides a very efficient way of overcoming both the high computational burden of OSL and the uncertainty difficulty of LtRAHP.
- LtDAHP establishes a new paradigm in which supervised learning problems can be solved very simply, yet still effectively, by preassigning the hidden parameters and solving for the bright parameters only, without sacrificing generalization capability.
- Many problems on LtDAHP remain open and deserve further study.

Thank You!
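The "uncertainty difficulty" of random assignment mentioned in the concluding remarks can be made concrete with a small experiment: refit the same random-feature model under several seeds and look at the spread of the test RMSE. This is a hedged sketch on synthetic data; the model size, the tanh activation, the Gaussian sampling, and the function name `random_feature_rmse` are assumptions for illustration and do not reproduce the experiments reported above.

```python
import numpy as np

def random_feature_rmse(seed, X_train, y_train, X_test, y_test, n_hidden=30):
    """Fit a random-hidden-layer model for one seed and return its test RMSE."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_hidden, X_train.shape[1]))  # random hidden weights
    b = rng.standard_normal(n_hidden)                      # random hidden biases
    H_train = np.tanh(X_train @ W.T + b)
    a, *_ = np.linalg.lstsq(H_train, y_train, rcond=None)  # bright parameters
    pred = np.tanh(X_test @ W.T + b) @ a
    return float(np.sqrt(np.mean((pred - y_test) ** 2)))

# Fixed synthetic data so that only the hidden-parameter seed varies.
data_rng = np.random.default_rng(123)
X = data_rng.uniform(-1.0, 1.0, size=(300, 2))
y = np.sin(np.pi * X[:, 0]) + 0.5 * X[:, 1] ** 2
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

scores = [random_feature_rmse(s, X_train, y_train, X_test, y_test) for s in range(10)]
mean_rmse = float(np.mean(scores))
std_rmse = float(np.std(scores))
print(f"test RMSE over 10 seeds: mean = {mean_rmse:.4f}, std = {std_rmse:.4f}")
```

A nonzero spread across seeds is the reason a random assignment is typically run several times and averaged or selected, whereas a deterministic assignment yields one reproducible model.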
