Distributed Systems: Complete Course Slides

Course Scope
The textbook has 13 chapters; this course covers 12 of them.
- Chapter 1: introduction
- Chapters 2 to 9: principles
- Chapters 10 to 13: examples

Course Contents
Chapter 1   Introduction
Chapter 2   Communication (RPC/RMI/MOM)
Chapter 3   Distributed Computing Paradigm
Chapter 4   Processes (virtualization; code migration models; process/processor allocation methods)
Chapter 5   Naming
Chapter 6   Synchronization
Chapter 7   Consistency and Replication (data-centric/client-centric consistency models)
Chapter 8   Fault Tolerance
Chapter 9   Security
Chapter 10  Distributed File Systems
Chapter 11  Distributed Object-based Systems
Chapter 12  Distributed Web-Based Systems

Basic Requirements
- Basic concepts
- Basic principles
- Application examples

Chapter 1 Introduction

1.1 Definition of a Distributed System
A distributed system is a collection of independent computers that appears to its users as a single coherent system.
- Hardware view: the individual computers are autonomous and interconnected by a network.
- Software view: users see a single logical computer.
(Interconnection, cooperation, single view)

1.1 Concept of Distributed Computing
Distributed computing is computation carried out on a distributed system, such as network services and network applications. Examples include the WWW, email, FTP, enterprise computing, chat rooms, and online games.
(Concurrent, parallel, distributed)

The earliest form of distributed computing appeared at Intel in the late 1980s, where idle time on LAN workstations was used to run computations for chip design.
With the rapid growth and spread of the Internet, research on distributed computing peaked from the 1990s onward. Applications on the Internet are themselves a form of distributed computing.
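Background: application drivers and enabling technology.

I. Application drivers
1. Organizational activity in the real world is inherently distributed, for example in manufacturing, sales, banking, warehousing, and the Internet world.
2. Good price/performance, resource sharing, high robustness, and scalability.

II. Enabling technology
In the mid-1980s, two advances provided the hardware basis for distributed systems:
1. The arrival of (inexpensive) microprocessors.
2. The arrival of high-speed networks.

Characteristics of the Internet Platform
Open, changeable, dynamic:
- Truly distributed, with no unified control
- Nodes that are highly autonomous and unpredictable
- Open and flexible links between nodes
- Diverse ways of connecting to the network
- Personalized and diverse usage
- Multiple heterogeneity of people, devices, and software

[Figure: four stages of computing, plotted against time (horizontal axis) and impact on the national economy and social development (vertical axis). Environments shown: mainframe-terminal, client-server, microcomputer-LAN, Internet, on-demand/pervasive computing. The figure labels Stage 1 to Stage 4 and marks the current stage.]
- Expert stage: computers are used only by a few experts and professionals; the general public can hardly use them.
- Early adoption stage: more experts and ordinary users begin to use computers, but using them well requires in-depth computer knowledge.
- Public awareness stage: the era of the Internet and the Web; information technology has become popular with the public, but users still need some knowledge of computers and information technology.
- Wide adoption stage: users no longer see the technology and do not need to know about it; they only see the benefits it brings.
- Others: cloud computing, the Internet of Things, and so on.

[Figure: Evolution of computing environments.]
- Ratio: M-to-1, 1-to-1, 1-to-M
- Users: professionals; people with specialized needs; anyone
- Platforms: Mainframe, PCs/Handhelds, LAN, Internet, IoT
- Focus: calculations, information, services (computation-centric, information-centric, user-centric)
- Computing model: from closed and controllable to open and uncertain

Gartner: cloud computing tops the ten strategic technologies for 2011. An explosion of computing models.

The Role of System Software
[Figure: the Turing machine model, with three layers: application systems, system software, computer hardware/network.]
- Operating systems: mainframe OS, desktop OS, network OS, middleware systems, WebOS, VMM, Internetware.
- Programming: machine language, assembly language, procedural, object-oriented, component-oriented, Eclipse, Agent, and others.
- Other figure labels: computation-centric, information-centric, user-centric; GAP; real; virtual.

What Stays the Same and What Changes
However the computing environment changes, the two basic roles of system software remain:
- Run-time support
- Design-time support
User experience has become a key factor in whether system software succeeds:
- Run-time support environment: high performance, high reliability, high availability, trustworthiness, low cost, low energy use, user interface.
- Design support: ever higher levels of abstraction.

Example 1: A system consists of several workstations (terminals) and one shared pool of processors, connected by a network. The system offers users a single file system, and idle workstations and processors are allocated dynamically. Viewed as a whole, and in operation, it looks like a typical uniprocessor timesharing system, but it is in fact a distributed system (distributed parallel computing).

Example 2: A large bank has hundreds of branches around the world. Each branch has a host computer that stores local accounts and handles local transactions. In addition, each computer can talk to the computers of other branches and of headquarters. If transactions can be carried out regardless of where the customer and the account are, and users notice no difference between this system and the old centralized mainframe it replaced, then this system is also considered a distributed system (distributed transaction processing).

Example 3: Distributed document processing, such as the WWW. URLs give location-transparent access to distributed documents.

Main drawbacks of distributed computing:
- Multiple points of failure: the failure of one or more computers (or network links) can cause problems for the distributed system.
- Security: decentralized management makes security policies harder to implement and enforce.

Ways to implement a distributed system:
- Low-level programming
- (Distributed) operating systems
- Middleware

Low-level programming:
Distributed applications are built directly on the low-level services offered by network protocols and the operating system. Development efficiency is low and software quality is hard to guarantee.

(Distributed) operating systems: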
Such systems fall short in compatibility with existing software, in adapting to open and heterogeneous environments, and in adapting to diverse applications.

Middleware:
Middleware applies well to a broad class of problems, and supports efficient, high-quality development of a broad class of application software.

Figure 1.1: A distributed system organized as middleware. Note that the middleware layer extends over multiple machines.

1.2 Goals
1. Connecting users and resources: make it easy for users to access remote resources, and to share them with other users in a controlled way.
2. Transparency: hide the fact that processes and resources are physically distributed across multiple computers.
3. Openness: offer services according to standard rules that describe the syntax and semantics of those services (e.g. IDL, the Interface Definition Language).
4. Scalability: in size, geographically, and administratively.

Transparency in a Distributed System
Different forms of transparency in a distributed system.

Transparency  Description
Access        Hide differences in data representation and how a resource is accessed
Location      Hide where a resource is located
Migration     Hide that a resource may move to another location
Relocation    Hide that a resource may be moved to another location while in use
Replication   Hide that a resource is replicated
Concurrency   Hide that a resource may be shared by several competitive users
Failure       Hide the failure and recovery of a resource
Persistence   Hide whether a (software) resource is in memory or on disk

Scalability Problems
Examples of scalability limitations.

Concept                 Example
Centralized services    A single server for all users
Centralized data        A single on-line telephone book
Centralized algorithms  Doing routing based on complete information

There are basically only three techniques for scaling:
1. Hiding communication latencies (asynchronous communication)
2. Distribution (e.g. DNS)
3. Replication (leads to consistency problems)

Scaling Techniques (1)
Figure 1.4: The difference between letting (a) a server or (b) a client check forms as they are being filled in.

Scaling Techniques (2)
Figure 1.5: An example of dividing the DNS name space into zones.

1.3 Hardware Concepts
Classification of computer systems (Flynn's taxonomy):
1. SISD (Single Instruction stream, Single Data stream): uniprocessor computers.
2. SIMD (Single Instruction stream, Multiple Data stream): some supercomputers.
3. MISD (Multiple Instruction stream, Single Data stream): several instruction streams and one data stream. No current computers belong to this class.
4. MIMD (Multiple Instruction stream, Multiple Data stream): distributed systems belong to this class.

Classification of computer systems (continued):
- Tightly coupled systems: the delay in sending messages between computers is short and the data rate is high.
- Loosely coupled systems: the delay in sending messages between machines is long and the data rate is low.

Figure 1.6: Different basic organizations and memories in distributed computer systems.

Multiprocessors (1)
Figure 1.7: A bus-based multiprocessor. Memory coherence (coherent).

Multiprocessors (2)
Figure 1.8: A crossbar switch; a switching network.

Homogeneous Multicomputer Systems
Figure 1.9: Grid; hypercube.

Heterogeneous Multicomputer Systems
a. Bus-based
b. Switch-based

Taxonomy of parallel and distributed computer systems:

Parallel and distributed computers (MIMD)
    Multiprocessors (shared memory), tightly coupled
        Bus-based: Sequent, Encore
        Switched: supercomputer, RP3
    Multicomputers (private memory), loosely coupled
        Bus-based: LAN workstations
        Switched: hypercube, Transputer

Bus-based: broadcast. Switch-based: routing.

1.4 Software Concepts
Software systems:
- Tightly coupled systems: a single-system image. These are usually called distributed operating systems.
- Loosely coupled systems: loosely coupled software lets the machines and users of a distributed system remain essentially independent, while still interacting to a limited degree when necessary. These are usually called network operating systems.

An overview of DOS (Distributed Operating Systems), NOS (Network Operating Systems), and middleware:

DOS
    Description: Tightly-coupled operating system for multiprocessors and homogeneous multicomputers
    Main goal: Hide and manage hardware resources
NOS
    Description: Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN)
    Main goal: Offer local services to remote clients
Middleware
    Description: Additional layer atop NOS implementing general-purpose services
    Main goal: Provide distribution transparency

Distributed Operating Systems
(1. Uniprocessor operating systems) A DOS offers the same functionality as a uniprocessor OS.
2. Multiprocessor operating systems
3. Multicomputer operating systems

Uniprocessor Operating Systems
A uniprocessor operating system can be structured in several ways. Its purpose is to manage the resources of the whole computer system and to make the system convenient for users.
Operating system structures: modular; layered; microkernel; object-oriented; and so on.
Figure 1.11: Separating applications from operating system code through a microkernel.

Multiprocessor Operating Systems (1)
Multiprocessor operating systems aim to support high performance through multiple CPUs. An important goal is to make the number of CPUs transparent to the application. Achieving such transparency is relatively easy because the communication between different parts of an application uses the same primitives as those in multitasking uniprocessor operating systems. Two important primitives are semaphores and monitors. (Data only needs to be protected against simultaneous access.)

A monitor to protect an integer against concurrent access:

    monitor Counter {
    private:
        int count = 0;
    public:
        int value() { return count; }
        void incr() { count = count + 1; }
        void decr() { count = count - 1; }
    }

Multiprocessor Operating Systems (2)
A monitor to protect an integer against concurrent access, but blocking a process:

    monitor Counter {
    private:
        int count = 0;
        int blocked_procs = 0;      // number of processes waiting in decr()
        condition unblocked;
    public:
        int value() { return count; }
        void incr() {
            if (blocked_procs == 0)
                count = count + 1;
            else
                signal(unblocked);  // hand the increment to a waiting process
        }
        void decr() {
            if (count == 0) {
                blocked_procs = blocked_procs + 1;
                wait(unblocked);    // block until incr() signals
                blocked_procs = blocked_procs - 1;
            }
            else
                count = count - 1;
        }
    }

Multicomputer Operating Systems (1)
Figure 1.14: General structure of a multicomputer operating system (through message passing).
Tightly coupled software on loosely coupled hardware. The goal is to give users the impression that the whole network is a single timesharing system rather than a collection of distinct machines. (There is no shared memory; only message passing is available.)

Multicomputer Operating Systems (2)
Figure 1.15: Alternatives for blocking and buffering in message passing.
(There are four possible synchronization points. If the operating system blocks a sender until messages arrive at either S3 or S4, it must guarantee reliable communication, which is more complex.)
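A minimal Python sketch of one possible synchronization point: the sender blocks only while its send buffer is full. The buffer size and the names `send_buffer`, `sender`, `receiver`, and `run` are assumptions made for illustration, and two threads in one process stand in for two machines; none of this comes from the slides.

```python
import queue
import threading

# Assumption: a bounded in-process queue stands in for the sender-side buffer.
BUFFER_SLOTS = 2
send_buffer = queue.Queue(maxsize=BUFFER_SLOTS)
received = []


def sender(messages):
    for msg in messages:
        # put() blocks while the buffer is full: the sender is held
        # only until a buffer slot becomes free.
        send_buffer.put(msg)
    send_buffer.put(None)  # sentinel: no more messages


def receiver():
    while True:
        msg = send_buffer.get()  # blocks while the buffer is empty
        if msg is None:
            break
        received.append(msg)


def run(messages):
    """Send all messages through the bounded buffer and return what arrived."""
    received.clear()
    t_send = threading.Thread(target=sender, args=(messages,))
    t_recv = threading.Thread(target=receiver)
    t_send.start()
    t_recv.start()
    t_send.join()
    t_recv.join()
    return list(received)


if __name__ == "__main__":
    print(run(["m1", "m2", "m3", "m4", "m5"]))
```

Because the sender is released as soon as a message is buffered, this variant says nothing about whether the message was ever received. Blocking until arrival or delivery would need an acknowledgement from the receiving side, which is where reliable communication comes in.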
Multicomputer Operating Systems (3)
Relation between blocking, buffering, and reliable communications.

Synchronization point                   Send buffer  Reliable comm. guaranteed?
Block sender until buffer not full      Yes          Not necessary
Block sender until message sent         No           Not necessary
Block sender until message received     No           Necessary
Block sender until message delivered    No           Necessary

(In the first case, the sending process blocks when the buffer is full.)
Understanding synchronization points and message-passing semantics (message reliability).

Distributed Shared Memory Systems (1)
a) Pages of address space distributed among four machines
b) Situation after CPU 1 references page 10
c) Situation if page 10 is read only and replication is used
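The three situations can be mimicked with a small Python sketch. The class `PageDirectory`, the page layout in `LAYOUT`, and the rule that read-only pages are copied while all other pages migrate are illustrative assumptions; a real DSM system does this work inside the operating system.

```python
# Assumption: a hypothetical layout of 16 pages over four CPUs.
LAYOUT = {
    1: {0, 2, 5, 9},
    2: {1, 3, 6, 8, 10},
    3: {4, 7, 11, 12, 14},
    4: {13, 15},
}


class PageDirectory:
    """Toy model of pages spread over several machines (illustrative only)."""

    def __init__(self, layout, read_only=()):
        # holders maps each CPU to the set of pages it currently holds.
        self.holders = {cpu: set(pages) for cpu, pages in layout.items()}
        self.read_only = set(read_only)
        self.transfers = 0  # number of pages sent over the network

    def reference(self, cpu, page):
        """CPU touches a page; fetch it first if it is not present locally."""
        if page in self.holders[cpu]:
            return "hit"
        owner = next((c for c, pages in self.holders.items() if page in pages), None)
        if owner is None:
            raise KeyError(f"page {page} is not held by any machine")
        self.transfers += 1
        if page in self.read_only:
            self.holders[cpu].add(page)   # replicate: both copies stay valid
            return "replicated"
        self.holders[owner].remove(page)  # migrate: the page moves
        self.holders[cpu].add(page)
        return "migrated"


if __name__ == "__main__":
    dsm = PageDirectory(LAYOUT)
    print(dsm.reference(1, 10), sorted(dsm.holders[1]))
    dsm_ro = PageDirectory(LAYOUT, read_only={10})
    print(dsm_ro.reference(1, 10), sorted(dsm_ro.holders[2]))
```

In the migrating case only one machine holds page 10 at a time, so a later reference from another CPU costs another transfer. In the read-only case both machines keep a copy and further reads are local.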
When a processor references an address that is not present locally, a trap occurs, and the operating system fetches the page containing the address and restarts the faulting instruction.

Distributed Shared Memory Systems (2)
Figure 1.18: False sharing of a page between two independent processes.
Choosing the page size is a key problem in designing an efficient DSM system:
- Too small: possibly many interrupts and many transfers.
- Too large: possibly many transfers as well, for example because of false sharing.

Network Operating System (1)
Figure 1.19: General structure of a network operating system.
Loosely coupled software on loosely coupled hardware (the underlying hardware and kernels may differ).

Network Operating System (2)
Each user has a dedicated workstation (with or without a disk), and each workstation always has its own operating system. Commands normally run locally.
Remote resources are used in a non-transparent way, for example:
    rlogin machine
    rcp machine1:file1 machine2:file2
(The choice of machine is made entirely by hand.)

Figure 1.20: Two clients and a server in a network operating system.
(One way to provide a global file system: file servers.)

Network Operating System (3)
Figure 1.21: Different clients may mount the servers in different places.
Drawback: lack of transparency.

Positioning Middleware
A distributed operating system is not intended to handle a collection of independent (heterogeneous) computers.
A network operating system does not provide a view of a single coherent system.
The solution is to be found in an additional layer of software that is used in network operating systems to more or less hide the heterogeneity of the collection of underlying platforms, but also to improve distribution transparency: middleware.
Figure 1.22: General structure of a distributed system as middleware.

Middleware Models
To make development and integration of distributed applications as simple as possible, most middleware is based on some model, or paradigm, for describing distribution and communication. Some models are as follows:
- Based on distributed file systems
- Based on RPCs (Remote Procedure Calls)
- Based on distributed objects
- Based on distributed documents

Middleware Services
In an effort to achieve access transparency, a number of services are common to many middleware systems:
- Naming
- Persistence
- Distributed transactions

Middleware and Openness
To ensure openness and interoperability, use the same middleware protocols and the same middleware interfaces.
Figure 1.23

Comparison between Systems

Item                      Distributed OS    Distributed OS        Network OS  Middleware-based OS
                          (multiproc.)      (multicomp.)
Degree of transparency    Very high         High                  Low         High
Same OS on all nodes      Yes               Yes                   No          No
Number of copies of OS    1                 N                     N           N
Basis for communication   Shared memory     Messages              Files       Model specific
Resource management       Global, central   Global, distributed   Per node    Per node
Scalability               No                Moderately            Yes         Varies
Openness                  Closed            Closed                Open        Open

Chapter 2 Distributed Computing Paradigm

2.1 Paradigm and Abstraction
Important characteristics that distinguish distributed applications from conventional applications which run on a single machine:
- Interprocess communication: a distributed application requires the participation of two or more independent entities (processes). To do so, the processes must have the ability to exchange data among themselves. (Multiple participants)
- Event synchronization: in a distributed application, the sending and receiving of data among the participants must be synchronized.

Paradigm:
Paradigm means “a pattern, example, or model.” In the study of any subject of great complexity, it is useful to identify the basic patterns or models, and classify the detail according to these models. A paradigm is an abstract model that describes how to accomplish a particular task.

Abstractions:
- Arguably the most fundamental concept in computer science, abstraction is the idea of detail hiding. We often use abstraction when it is not necessary to know the exact details of how something works or is represented, because we can still make use of it in its simplified form. Getting involved with the detail often tends to obscure what we are trying to understand, rather than illuminate it. (Too much detail gets in the way of understanding the main problem.)
- Abstraction plays a very important role in programming because we often want to model, in software, simplified versions of things that exist in the real world without having to build the real things. (Programming is abstraction of the things we care about.)
- In software engineering, abstraction is realized with the provision of tools or facilities which allow software to be built without the developer having to be cognizant of some of the underlying complexities. (Tools and facilities provide abstraction.)
- This holds even more for the research process.

2.1 Overview of Presented Paradigms
- Message passing
- Remote Procedure Call
- Client-server
- Peer-to-Peer
- Message system:
    - Point-to-point
    - Publish/Subscribe
- Distributed objects:
    - Remote method invocation
    - Network services
    - Object Request Broker
    - Object space
- Component Based Technologies
- Mobile agents
- Collaborative applications

- Message passing
- Remote Procedure Call
- Client-server
- Peer-to-Peer
- Message system:
    - Point-to-point
    - Publish/Subscribe
- Distributed objects:
    - Remote method invocation
    - Network services
    - Object Request Broker
    - Object space