ImageVerifierCode 换一换
格式:PPT , 页数:61 ,大小:676.52KB ,
文档编号:3228742      下载积分:28 文币
快捷下载
登录下载
邮箱/手机:
温馨提示:
系统将以此处填写的邮箱或者手机号生成账号和密码,方便再次下载。 如填写123,账号和密码都是123。
支付方式: 支付宝    微信支付   
验证码:   换一换

优惠套餐
 

温馨提示:若手机下载失败,请复制以下地址【https://www.163wenku.com/d-3228742.html】到电脑浏览器->登陆(账号密码均为手机号或邮箱;不要扫码登陆)->重新下载(不再收费)。

已注册用户请登录:
账号:
密码:
验证码:   换一换
  忘记密码?
三方登录: 微信登录  
下载须知

1: 试题类文档的标题没说有答案,则无答案;主观题也可能无答案。PPT的音视频可能无法播放。 请谨慎下单,一旦售出,概不退换。
2: 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。
3: 本文为用户(三亚风情)主动上传,所有收益归该用户。163文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。 若此文所含内容侵犯了您的版权或隐私,请立即通知163文库(点击联系客服),我们立即给予删除!。
4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
5. 本站仅提供交流平台,并不能对任何下载内容负责。
6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

版权提示 | 免责声明

1,本文(A-Hierarchical-Approach-to-POMDP-Planning-and-ExecutionPOMDP规划与执行一个分层的方法.ppt)为本站会员(三亚风情)主动上传,163文库仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。
2,用户下载本文档,所消耗的文币(积分)将全额增加到上传者的账号。
3, 若此文所含内容侵犯了您的版权或隐私,请立即通知163文库(发送邮件至3464097650@qq.com或直接QQ联系客服),我们立即给予删除!

A-Hierarchical-Approach-to-POMDP-Planning-and-ExecutionPOMDP规划与执行一个分层的方法.ppt

1、Hierarchical Methods forPlanning under UncertaintyThesis ProposalJoelle PineauThesis Committee:Sebastian Thrun,ChairMatthew MasonAndrew MooreCraig Boutilier,U.of TorontoJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyIntegrating robots in living environmentsThe robots

2、 role:-Social interaction-Mobile manipulation-Intelligent reminding-Remote-operation-Data collection/monitoringJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyA broad perspectiveGOAL=Selecting appropriate actionsUSER+WORLD+ROBOTACTIONS OBSERVATIONSBeliefstateSTATEJoel

3、le PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyCause#1:Non-deterministic effects of actionsCause#2:Partial and noisy sensor informationCause#3:Inaccurate model of the world and the userWhy is this a difficult problem?UNCERTAINTYJoelle PineauThesis Proposal:Hierarchical M

4、ethods for Planning under UncertaintyCause#1:Non-deterministic effects of actionsCause#2:Partial and noisy sensor informationCause#3:Inaccurate model of the world and the userWhy is this a difficult problem?UNCERTAINTYA solution:Partially Observable MarkovDecision Processes(POMDPs)S3o1,o2S1o1,o2S2o1

5、,o2a1a2Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe truth about POMDPs Bad news:Finding an optimal POMDP action selection policy is computationally intractable for complex problems.Joelle PineauThesis Proposal:Hierarchical Methods for Planning under Uncertainty

6、The truth about POMDPs Bad news:Finding an optimal POMDP action selection policy is computationally intractable for complex problems.Good news:Many real-world decision-making problems exhibit structure inherent to the problem domain.By leveraging structure in the problem domain,I propose an algorith

7、m that makes POMDPs tractable,even for large domains.Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyHow is it done?Use a“Divide-and-conquer”approach:We decompose a large monolithic problem into a collection of loosely-related smaller problems.DialoguemanagerHealthman

8、agerSocialmanagerRemindingmanagerJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThesis statementDecision-making under uncertaintycan be made tractable for complex problemsby exploiting hierarchical structurein the problem domain.Joelle PineauThesis Proposal:Hierarchi

9、cal Methods for Planning under UncertaintyOutline Problem motivationPartially observable Markov decision processes The hierarchical POMDP algorithm Proposed researchJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyPOMDPs within the family of Markov modelsMarkov ChainHi

10、dden Markov Model(HMM)Markov Decision Process(MDP)Partially Observable MDP(POMDP)Uncertainty in sensor input?nonoControlproblem?yesyesJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyPOMDP parameters:Initial belief:b0(s)=Pr(so=s)Observation probabilities:O(s,a,o)=Pr(o|

11、s,a)Transition probabilities:T(s,a,s)=Pr(s|s,a)Rewards:R(s,a)HMMWhat are POMDPs?Components:Set of states:sSSet of actions:aASet of observations:oO 0.50.51MDPS2Pr(o1)=0.9Pr(o2)=0.1S1Pr(o1)=0.5Pr(o2)=0.5a1a2S3Pr(o1)=0.2Pr(o2)=0.8Joelle PineauThesis Proposal:Hierarchical Methods for Planning under Unce

12、rtaintyA POMDP example:The tiger problemS1“tiger-left”Pr(o=growl-left)=0.85Pr(o=growl-right)=0.15S2“tiger-right”Pr(o=growl-left)=0.15Pr(o=growl-right)=0.85Actions=listen,open-left,open-rightReward Function:R(a=listen)=-1R(a=open-right,s=tiger-left)=10R(a=open-left,s=tiger-left)=-100Joelle PineauThes

13、is Proposal:Hierarchical Methods for Planning under UncertaintyWhat can we do with POMDPs?1)State tracking:After an action,what is the state of the world,st?2)Computing a policy:Which action,aj,should the controller apply next?Very hard!Not so hard.bt-1?at-1otRobot:St-1stWorld:Control layer:.?Joelle

14、 PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe tiger problem:State trackingS1“tiger-left”S2“tiger-right”Belief vectorb0BeliefJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe tiger problem:State trackingS1“tiger-left”S2“tiger-right”Bel

15、ief vectorb0Beliefobs=growl-leftaction=listenJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe tiger problem:State trackingb1obs=growl-leftS1“tiger-left”S2“tiger-right”Belief vectorBeliefb0action=listen baoPsbassPasoPsbSsjjiiij,|,|,|01Joelle PineauThesis Proposal:Hi

16、erarchical Methods for Planning under UncertaintyPolicy OptimizationWhich action,aj,should the controller apply next?In MDPs:Policy is a mapping from state to action,:si aj In POMDPs:Policy is a mapping from belief to action,:b ajRecursively calculate expected long-term reward for each state/belief:

17、Find the action that maximizes the expected reward:)(),|Pr(),(max)(1jNjijiaisVassasRsV)(),|Pr(),(maxarg)(1jNjijiaisVassasRsJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe tiger problem:Optimal policyBelief vector:open-leftopen-rightlistenS1“tiger-left”S2“tiger-rig

18、ht”Optimal policy:Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyFinite-horizon POMDPs are in worse-case doubly exponential:Infinite-horizon undiscounted stochastic POMDPs are EXPTIME-hard,and may not be decidable(|n|).POMDPComplexity(per step ofvalue iteration)MDPre

19、cursiveupper-boundTimeSpaceComplexity of policy optimizationnOAS|2|nOA|12|OnAS|1|OnA|2AS|S|nJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe essence of the problem How can we find good policies for complex POMDPs?Is there a principled way to provide near-optimal po

20、licies in reasonable time?Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyOutline Problem motivation Partially observable Markov decision processesThe hierarchical POMDP algorithm Proposed researchJoelle PineauThesis Proposal:Hierarchical Methods for Planning under Un

21、certaintyA hierarchical approach to POMDP planningKey Idea:Exploit hierarchical structure in the problem domain to break a problem into many“related”POMDPs.What type of structure?Action set partitioningActInvestigateHealthMoveNavigateCheckPulseAskWhereLeft Right Forward BackwardCheckMedssubtaskabstr

22、actactionJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyAssumptionsEach POMDP controller has a subset of Ao.Each POMDP controller has full state set S0,observation set O0.Each controller includes discriminative reward information.We are given the action set partition

23、ing graph.We are given a full POMDP model of the problem:So,Ao,Oo,Mo.Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe tiger problem:An action hierarchyPinvestigate=S0,Ainvestigate,O0,MinvestigateAinvestigate=listen,open-rightactopen-leftinvestigateopen-rightlistenJ

24、oelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyOptimizing the“investigate”controllerS1“tiger-left”S2“tiger-right”Locally optimal policy:Belief vector:open-rightlistenJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe tiger problem:An a

25、ction hierarchyPact=S0,Aact,O0,MactAact=open-left,investigateactopen-leftinvestigateopen-rightlistenBut.R(s,a=investigate)is not defined!Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyModeling abstract actionsInsight:Use the local policy of corresponding low-level co

26、ntroller.General form:R(si,ak)=R(si,Policy(controllerk,si)Example:R(s=tiger-left,ak=investigate)=open-right listen open-lefttiger-left 10-1 -100tiger-right -100-1 10Policy(investigate,s=tiger-left)=open-rightJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyOptimizing t

27、he“act”controllerS1“tiger-left”S2“tiger-right”Locally optimal policy:investigateBelief vector:open-leftJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe complete hierarchical policyS1“tiger-left”S2“tiger-right”Hierarchical policy:Belief vector:open-leftopen-rightlis

28、tenJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyThe complete hierarchical policyS1“tiger-left”S2“tiger-right”Hierarchical policy:open-leftopen-rightlistenOptimal policy:Belief vector:Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyRe

29、sults for larger simulation domainsPOMDPH-POMDPMDPNavigation Problem:|S|=11,|A|=6,|O|=6 CPU Time(secs):1119.932.840.000654 Average Reward:12.512.20.0Dialogue Problem:|S|=20,|A|=30,|O|=27 CPU Time(secs):24hrs77.996.46 Average Reward:64.4353.33%Correct actions:93.280.0Joelle PineauThesis Proposal:Hier

30、archical Methods for Planning under UncertaintyRelated work on hierarchical methodsHierarchical HMMsFine et al.,2019Hierarchical MDPsDayan&Hinton,1993;Dietterich,2019;McGovern et al.,2019;Parr&Russell,2019;Singh,1992.Loosely-coupled MDPsBoutilier et al.,2019;Dean&Lin,2019;Meuleau et al.2019;Singh&Co

31、hn,2019;Wang&Mahadevan,2019.Factored state POMDPsBoutilier et al.,2019;Boutilier&Poole,2019;Hansen&Feng,2000.Hierarchical POMDPsCastanon,2019;Hernandez-Gardiol&Mahadevan,2019;Theocharous et al.,2019;Wiering&Schmidhuber,2019.Joelle PineauThesis Proposal:Hierarchical Methods for Planning under Uncerta

32、intyOutline Problem motivation Partially observable Markov decision processes The hierarchical POMDP algorithmProposed researchJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyProposed research1)Algorithmic design2)Algorithmic analysis3)Model learning4)System developme

33、nt and application Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyResearch block#1:Algorithmic designGoal 1.1:Developing/implementing hierarchical POMDP algorithm.Goal 1.2:Extending H-POMDP for factorized state representation.Goal 1.3:Using state/observation abstract

34、ion.Goal 1.4:Planning for controllers with no local reward information.Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyAssumption#2:“Each POMDP controller has full state set S0,and observation set O0.”Can we reduce the number of states/observations,|S|and|O|?Goal 1.3:

35、State/observation abstractionJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyAssumption#2:“Each POMDP controller has full state set S0,and observation set O0.”Can we reduce the number of states/observations,|S|and|O|?Yes!Each controller only needs subset of state/obse

36、rvation features.What is the computational speed-up?Goal 1.3:State/observation abstractionNavigateLeftRight Forward BackwardInvestigateHealthCheckPulseCheckMeds POMDP recursiveupper-boundTime complexity:nOAS|2|12|OnASJoelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyGoa

37、l 1.4:Local controller reward informationAssumption#3:“Each controller includes some amount of discriminative reward information.”Can we relax this assumption?Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyGoal 1.4:Local controller reward informationAssumption#3:“Eac

38、h controller includes some amount of discriminative reward information.”Can we relax this assumption?Possibly.Use reward shaping to select policy-invariant reward function.What is the benefit?H-POMDP could solve problems with sparse reward functions.Joelle PineauThesis Proposal:Hierarchical Methods

39、for Planning under UncertaintyResearch block#2:Algorithmic analysisGoal 2.1:Evaluating performance of the H-POMDP algorithm.Goal 2.2:Quantifying the loss due to the hierarchy.Goal 2.3:Comparing different possible decompositions of a problem.Joelle PineauThesis Proposal:Hierarchical Methods for Plann

40、ing under UncertaintyGoal 2.1:Performance evaluationHow does the hierarchical POMDP algorithm compare to:Exact value function methods Sondik,1971;Monahan,1982;Littman,2019;Cassandra et al,2019.Policy search methods Hansen,2019;Kearns et al.,2019;Ng&Jordan,2000;Baxter&Bartlett,2000.Value approximatio

41、n methods Parr&Russell,2019;Thrun,2000.Belief approximation methods Nourbakhsh,2019;Koenig&Simmons,2019;Hauskrecht,2000;Roy&Thrun,2000.Memory-based methods McCallum,2019.Consider problems from POMDP literature and dialogue management domain.Joelle PineauThesis Proposal:Hierarchical Methods for Plann

42、ing under UncertaintyGoal 2.2:Quantifying the lossThe hierarchical POMDP planning algorithm provides an approximately-optimal policy.How“near-optimal”is the policy?Subject to some(very restrictive)conditions:“The value function of top-level controlleris an upper-bound on the valueof the approximatio

43、n.”Can we loosen the restrictions?Tighten the bound?Find a lower-bound?AtopA1.Vtop(b)Vactual(b)Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyGoal 2.3:Comparing different decompositionAssumption#4:“We are given an action set partitioning graph.”What makes a good hier

44、archical action decomposition?Comparing decompositions is the first step towards automatic decomposition.ManufactureExamineInspectReplacea1a2ManufactureReplaceExamine Inspecta1a2a3Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyResearch block#3:Model learningGoal 3.1:

45、Automatically generating good action hierarchies.Assumption#4:“We are given an action set partitioning graph.”Can we automatically generate a good hierarchical decomposition?Maybe.It is being done for hierarchical MDPs.Goal 3.2:Including parameter learning.Assumption#5:“We are given a full POMDP mod

46、el of the problem.”Can we introduce parameter learning?Yes!Maximum-likelihood parameter optimization(Baum-Welch)can be used for POMDPs.Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyTouchscreen inputSpeech utteranceResearch block#4:System development and applicationG

47、oal 4.1:Building an extensive dialogue manager Touchscreen messageSpeech utteranceDialogue ManagerRemindermessageRobot sensor readingsMotion commandStatus informationFacemail operationsRobot moduleReminding moduleTeleoperation moduleUserRemote-controlcommandJoelle PineauThesis Proposal:Hierarchical

48、Methods for Planning under UncertaintyAn implemented scenarioPhysiotherapyPatientroomRobothomeProblem size:|S|=288,|A|=14,|O|=15State Features:RobotLocation,UserLocation,UserStatus,ReminderGoal,UserMotionGoal,UserSpeechGoalTest subjects:3 elderly residents in assisted living facilityJoelle PineauThe

49、sis Proposal:Hierarchical Methods for Planning under UncertaintyContributions Algorithmic contribution:A novel POMDP algorithm based on hierarchical structure.Enables use of POMDPs for much larger problems.Application contribution:Application of POMDPs to dialogue management is novel.Allows design o

50、f robust robot behavioural managers.Joelle PineauThesis Proposal:Hierarchical Methods for Planning under UncertaintyResearch schedule1)Algorithmic design/implementation2)Algorithmic analysis3)Model learning4)System development and application5)Thesis writingfall 01spring/summer 02spring/summer/fall

侵权处理QQ:3464097650--上传资料QQ:3464097650

【声明】本站为“文档C2C交易模式”,即用户上传的文档直接卖给(下载)用户,本站只是网络空间服务平台,本站所有原创文档下载所得归上传人所有,如您发现上传作品侵犯了您的版权,请立刻联系我们并提供证据,我们将在3个工作日内予以改正。


163文库-Www.163Wenku.Com |网站地图|