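Interprocess communication is at the heart of every distributed system. Communication in a distributed system always rests on the low-level message passing offered by the underlying network, and expressing communication through message passing is harder than expressing it with primitives based on shared memory.

Communication models covered:
- Remote procedure call (RPC)
- Remote method invocation (RMI)
- Message-oriented middleware (MOM)
- Stream-oriented communication

Because there is no shared memory, all communication in a distributed system is based on exchanging (low-level) messages.
- Protocols are agreements (rules) on communication
- Protocols can be connection-oriented or connectionless

[Figure: the seven-layer reference model. From layer 7 down to layer 1: Application, Presentation, Session, Transport, Network, Data link, Physical. Each layer has its own protocol (application protocol, presentation protocol, and so on), all running over the network.]

Figure 2-2: A typical message as it appears on the network.
Figure 2-3: Discussion between a receiver and a sender in the data link layer.
Figure 2-4: (a) Normal operation of TCP; much of the overhead of normal TCP goes into connection management. (b) Transactional TCP, a more economical approach.

Annotations on Figure 2-4:
- Normal TCP: the client starts connection setup (messages 1, 2, 3), sends its request, and tells the server to close the connection; the server acknowledges receipt of the request, performs the requested operation, and asks the client to release the connection; the termination is then acknowledged.
- Transactional TCP: sending the request doubles as connection setup (one message carries three pieces of information), and returning the result doubles as connection close.

Middleware:
- An application that logically lives in the application layer
- Contains many general-purpose protocols that warrant their own layers
- The session and presentation layers are replaced by a single middleware layer

[Figure: the adapted reference model. From layer 6 down to layer 1: Application, Middleware, Transport, Network, Data link, Physical, each with its own protocol, over the network.]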
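The layering in the reference model determines what a message looks like on the network (Figure 2-2). The sketch below is only an illustration under simple assumptions: each layer prepends one header, the data link layer also appends a trailer, and the bracketed header strings are invented placeholders rather than real protocol formats.

```python
# Layers that add a header, from top to bottom of the reference model.
# The physical layer only transmits bits, so it is not listed.
LAYERS = ["application", "presentation", "session",
          "transport", "network", "data link"]
TRAILER = b"[data link trl]"   # hypothetical trailer format


def encapsulate(payload: bytes) -> bytes:
    """Wrap the payload with one header per layer, top layer first."""
    message = payload
    for layer in LAYERS:
        header = f"[{layer} hdr]".encode()   # hypothetical header format
        message = header + message           # lower layers end up outermost
    return message + TRAILER


def decapsulate(frame: bytes) -> bytes:
    """Strip the trailer, then the headers from the outermost layer inward."""
    if not frame.endswith(TRAILER):
        raise ValueError("missing data link trailer")
    message = frame[:-len(TRAILER)]
    for layer in reversed(LAYERS):
        header = f"[{layer} hdr]".encode()
        if not message.startswith(header):
            raise ValueError(f"missing {layer} header")
        message = message[len(header):]
    return message
```

Calling `encapsulate(b"hi")` yields a frame whose outermost header belongs to the data link layer and whose innermost header belongs to the application layer; `decapsulate` reverses the process on the receiving side.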
Client-server structure: a group of servers offering services to clients
- Servers: offer services to users, called "clients"
- Clients: applications requiring services from servers
- Example: Web server and its clients, file server

Why use the client-server model?
- Simplicity
- Low(er) overheads

[Figure: the client-server model. A client, a file server, a process server, and a terminal server, each running on top of its own kernel.]
Client-server interaction is based on a request/response paradigm:
- Clients send a request asking for service (e.g., a file block)
- The server processes the request and replies with a result (or an error)

Techniques:
- Sockets, remote procedure calls (RPC), remote method invocation (RMI)
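The request/response paradigm can be sketched with plain TCP sockets, the first technique listed. This is a minimal illustration, not a production server: the `BLOCKS` table, the `OK`/`ERR` reply format, and all function names are assumptions made for the example, and it handles a single request over the loopback interface.

```python
import socket
import threading

# Hypothetical in-memory "file" split into blocks; purely illustrative.
BLOCKS = {0: b"first block", 1: b"second block"}


def handle_request(request: bytes) -> bytes:
    """Server side: process one request and build the reply (result or error)."""
    try:
        block_no = int(request.decode())
        return b"OK " + BLOCKS[block_no]
    except (ValueError, KeyError):
        return b"ERR no such block"


def serve_once(listener: socket.socket) -> None:
    """Accept one client, answer one request, then close the connection."""
    conn, _ = listener.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(handle_request(request))


def request_block(address, block_no: int) -> bytes:
    """Client side: send a request and block until the reply arrives."""
    with socket.create_connection(address) as sock:
        sock.sendall(str(block_no).encode())
        return sock.recv(1024)


def demo(block_no: int) -> bytes:
    """Run one request/response exchange over the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        listener.listen(1)
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        reply = request_block(listener.getsockname(), block_no)
        server.join()
        return reply
```

Note that the client must spell out the communication itself (connect, send, receive), which is exactly the explicit send/receive burden that RPC aims to hide.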
The client-server model provides a mechanism for services in distributed systems, BUT
- it requires explicit communication (send/receive)

Q: How do we make distributed computing look like traditional (centralized) computing? Can we use procedure calls?

Local procedure call: A calls B
- A is suspended, B executes
- B returns, A resumes
- Information from A (caller) to B (callee) is transferred using parameters
- Somewhat easier, since caller and callee execute in the same address space

Figure: (a) Parameter passing in a local procedure call: the stack before the call to read(fd, buf, nbytes). (b) The stack while the called procedure is active. The caller (main program) pushes the parameters onto the stack in reverse order, followed by the return address.

In distributed systems, the callee may be on a different system:
- Remote procedure call (RPC): lets a program call procedures located on other machines
- NO EXPLICIT MESSAGE PASSING is visible to the programmer; hiding the communication (send and receive) gives the distributed system access transparency

Goal: make RPC look (as much as possible) like a local procedure call
- Allow remote services to be called as procedures
- The caller should not be aware that the callee is executing on a different machine (or vice versa)

Although there is no message passing at user level, parameters must still be passed and results must still be returned!

Figure: Principle of RPC between a client and server program.
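To see what "looks like a local procedure call" means in practice, here is a small sketch that uses the XML-RPC modules from Python's standard library as a stand-in for a generic RPC system. XML-RPC is not discussed in these slides; it is chosen only because it is readily available. The `add` procedure and the `call_remote_add` helper are illustrative names.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def add(a: int, b: int) -> int:
    """The remote service: an ordinary procedure on the server side."""
    return a + b


def call_remote_add(a: int, b: int) -> int:
    """Start a one-shot server on loopback and call add() through RPC."""
    # Port 0 lets the OS pick a free port for this demo.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.timeout = 5                      # do not wait forever for a call
    server.register_function(add, "add")
    host, port = server.server_address
    worker = threading.Thread(target=server.handle_request)  # serve one call
    worker.start()
    try:
        with ServerProxy(f"http://{host}:{port}") as proxy:
            # Reads like a local call; request and reply messages are hidden.
            return proxy.add(a, b)
    finally:
        worker.join()
        server.server_close()
```

The caller writes `proxy.add(a, b)` and never touches send or receive, yet the parameters still travel to the server and the result still travels back.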
The client accesses a remote service by making an ordinary (local) procedure call. It never calls send and receive directly; all details of the message passing are hidden in library procedures on both sides (the client stub and the server stub, or skeleton).

1. Client procedure calls client stub in the normal way
2. Client stub builds message, calls local OS
3. Client's OS sends message to remote OS
4. Remote OS gives message to server stub
5. Server stub unpacks parameters, calls server
6. Server does work, returns result to the stub
7. Server stub packs it in message, calls local OS
8. Server's OS sends message to client's OS
9. Client's OS gives message to client stub
10. Stub unpacks result, returns to client

Figure 2-8: Steps involved in doing remote computation through RPC.

Problem: different machines have different data formats
- Intel: little endian, SPARC: big endian
Solution: use a standard representation
- Example: external data representation (XDR)

Problem: how do we pass pointers? (A pointer is meaningful only within the address space of the process in which it is used.)
- If it points to a well-defined data structure, pass a copy to the server; the server stub then calls the server procedure with a pointer to the local copy
What about data structures containing pointers?
- Prohibit them
- Copy/restore, with a machine-independent representation

Call by copy/restore: the caller first copies the variable onto the stack; when the call completes, the value on the stack is copied back, overwriting the caller's original value. In RPC terms, the parameter is sent from the client to the server, modified there, and then sent back to the client, where it overwrites the original value.

Marshalling (packing the parameters into a message to be sent to the server): transform parameters/results into a byte stream.

Problem: how does a client locate a server? (To send a message to a server, the client needs to know the server's address.)
- Use bindings

Server
- Exports the server interface during initialization
- Sends name, version number, unique identifier, and handle (address) to the binder

Client (calling a remote procedure for the first time, not yet bound to a server)
- First RPC: sends a message to the binder to import the server interface
- Binder: checks whether some server has exported the interface, and returns the handle and unique identifier to the client

The binder acts as a registry server.

The method is flexible:
- It can handle multiple servers with the same interface
- The binder can poll a server to see if it is up, and may deregister it if it is down (fault tolerance)
- It can enforce authentication (not giving the interface to a user who is not on the list)

Exporting and importing incur overhead.
The binder can become a bottleneck
- Use multiple binders
The binder can do load balancing, spreading clients randomly over the servers.

Figure 2-12 (a): The interconnection between client and server in a traditional RPC.
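The ten RPC steps, marshalling into a standard representation, and binding can be tied together in one small sketch. Everything here is an assumption made for illustration: the `Binder` and `AddClientStub` classes, the interface name `"adder"`, and the wire format of 32-bit big-endian integers (in the spirit of XDR, which covers far more types). The network and both operating systems are replaced by a direct function call, and the binder's handle is a callable rather than a real address; the unique identifier is omitted.

```python
import struct
from typing import Callable, Dict, Optional, Tuple

# Standard wire representation (assumption): 32-bit signed integers in
# big-endian order, so little- and big-endian machines agree on the bytes.
REQUEST = struct.Struct(">ii")   # two parameters
REPLY = struct.Struct(">i")      # one result

Handle = Callable[[bytes], bytes]


def add(a: int, b: int) -> int:
    """The server procedure; it knows nothing about messages."""
    return a + b


def server_stub(message: bytes) -> bytes:
    """Steps 5-7: unpack parameters, call the server, pack the result."""
    a, b = REQUEST.unpack(message)
    return REPLY.pack(add(a, b))


class Binder:
    """Toy registry mapping (interface name, version) to a server handle."""

    def __init__(self) -> None:
        self._exports: Dict[Tuple[str, int], Handle] = {}

    def export(self, name: str, version: int, handle: Handle) -> None:
        self._exports[(name, version)] = handle

    def lookup(self, name: str, version: int) -> Handle:
        # Raises KeyError if no server has exported the interface.
        return self._exports[(name, version)]


class AddClientStub:
    """Client stub: looks like a local procedure, hides the message exchange."""

    def __init__(self, binder: Binder) -> None:
        self._binder = binder
        self._handle: Optional[Handle] = None   # not bound until first call

    def add(self, a: int, b: int) -> int:
        if self._handle is None:                # first RPC: import interface
            self._handle = self._binder.lookup("adder", 1)
        request = REQUEST.pack(a, b)            # step 2: build the message
        reply = self._handle(request)           # steps 3-4 and 8-9, simulated
        (result,) = REPLY.unpack(reply)         # step 10: unpack the result
        return result


binder = Binder()
binder.export("adder", 1, server_stub)   # server exports during initialization
stub = AddClientStub(binder)
```

From the client's point of view, `stub.add(2, 40)` is an ordinary call. Swapping the direct call in `add` for a real network transfer would not change the caller's code, which is the point of the stub.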
Figure 2-12 (b): The interaction using asynchronous RPC.
Figure 2-13: A client and server interacting through two asynchronous RPCs.
Figure 2-14: The steps in writing a client and a server in DCE RPC.
Figure 2-15: Client-to-server binding in DCE.

Remote method invocation (RMI): RPCs applied to (distributed) objects, i.e., instances of a class
- Class: object-oriented abstraction; a module with data and operations
- Separation between interface and implementation
- The interface resides on one machine, the implementation on another

RMIs support system-wide object references
- Parameters can be object references

When a client binds to a distributed object, the object's interface (the "proxy") is loaded into the client's address space
- The proxy is analogous to a client stub
- The server stub is referred to as a skeleton

[Figure: organization of a remote object with a client-side proxy. On the client machine, the client invokes a method on the proxy, which offers the same interface as the object. The marshalled invocation is passed across the network to the server machine, where the skeleton invokes the same method at the object (state, methods, interface).]
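The proxy/skeleton arrangement can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `Counter` object, the JSON message format, and the use of a plain callable in place of the network are all invented for the example and do not reflect any particular RMI system.

```python
import json


class Counter:
    """The remote object: state plus methods, living on the 'server machine'."""

    def __init__(self) -> None:
        self.value = 0

    def increment(self, amount: int) -> int:
        self.value += amount
        return self.value


class Skeleton:
    """Server stub: unmarshals an invocation and invokes the same method
    at the object."""

    def __init__(self, obj) -> None:
        self._obj = obj

    def dispatch(self, message: bytes) -> bytes:
        call = json.loads(message.decode())
        result = getattr(self._obj, call["method"])(*call["args"])
        return json.dumps({"result": result}).encode()


class Proxy:
    """Client stub: offers the object's interface but only forwards calls."""

    def __init__(self, transport) -> None:
        self._transport = transport   # callable standing in for the network

    def __getattr__(self, method):
        # Called only for names not found on the proxy itself,
        # i.e. for the remote object's methods.
        def invoke(*args):
            message = json.dumps({"method": method, "args": list(args)}).encode()
            reply = self._transport(message)
            return json.loads(reply.decode())["result"]
        return invoke


remote_object = Counter()
skeleton = Skeleton(remote_object)
counter = Proxy(skeleton.dispatch)
```

The client calls `counter.increment(2)` as if `counter` were the object itself, while the state lives only in `remote_object` on the server side.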
Proxy: client stub
- Maintains server ID, endpoint, object ID
- Sets up and tears down the connection with the server
- Serializes local object parameters
- In practice, can be downloaded or constructed on the fly

Skeleton: server stub
- Deserializes the parameters, passes them to the server, and sends the result back to the proxy

(a) An example with implicit binding using only global references:

    Distr_object* obj_ref;      // Declare a systemwide object reference
    obj_ref = ...;              // Initialize the reference to a distributed object
    obj_ref->do_something();    // Implicitly bind and invoke a method

(b) An example with explicit binding using global and local references:

    Distr_object obj_ref;       // Declare a systemwide object reference
    Local_object* obj_ptr;      // Declare a pointer to local objects
    obj_ref = ...;              // Initialize the reference to a distributed object
    obj_ptr = bind(obj_ref);    // Explicitly bind and obtain a pointer to the local proxy
    obj_ptr->do_something();    // Invoke a method on the local proxy

Figure 2-18: The situation when passing an object by reference or by value.
Figure 2-19: (a) Distributed dynamic objects in DCE. (b) Distributed named objects.

- Remote procedure calls and remote object invocations both help hide communication in a distributed system; that is, they enhance access transparency.
- Unfortunately, neither mechanism is always appropriate. In particular, when there is no guarantee that the receiving side is executing at the time a request is issued, some other communication service is needed.
- The synchronous nature of RPC and RMI also blocks the client until its request has been processed, so an alternative is sometimes needed for that reason too.
- That alternative is messaging.

Message-oriented communication:
- Persistence and synchronicity
- Message-oriented transient communication
  - Berkeley sockets
  - MPI
- Message-oriented persistent communication
  - Message-queuing systems

- Persistent communication: a message submitted for transmission is stored by the communication system until it is delivered to the receiver. That is, the message stays at a communication server until it has been successfully handed to the next communication server. Email is the typical example.
- Transient communication: the communication system stores a message only while the sending and receiving applications are executing. More precisely, if a communication server cannot deliver a message to the next server or to the receiver, the message is simply discarded.
- Asynchronous communication: the sender continues immediately after submitting its message. The message is stored in a local buffer at the sending host, or in a buffer at the first communication server it reaches.
- Synchronous communication: the sender is blocked after submitting its message until the message has arrived and been stored in a local buffer at the receiving host or, in the stronger form, until it has actually been delivered to the receiver.

- Transient synchronous communication: response-based; weaker forms include delivery-based and receipt-based
- Transient asynchronous communication: message-passing systems
- Persistent communication: for developing middleware for large-scale interconnected networks; failure masking and recovery

Many distributed systems are built on top of a simple message-oriented model
- Example: Berkeley sockets
- A socket is an abstract representation of a communication endpoint

Socket primitives for TCP/IP:

| Primitive | Meaning |
|-----------|---------|
| Socket | Create a new communication endpoint |
| Bind | Attach a local address to a socket |
| Listen | Announce willingness to accept connections |
| Accept | Block caller until a connection request arrives |
| Connect | Actively attempt to establish a connection |
| Send | Send some data over the connection |
| Receive | Receive some data over the connection |
| Close | Release the connection |

Figure: Connection-oriented communication pattern using sockets.

Some of the most intuitive message-passing primitives of MPI:

| Primitive | Meaning |
|-----------|---------|
| MPI_bsend | Append outgoing message to a local send buffer |
| MPI_send | Send a message and wait until copied to local or remote buffer |
| MPI_ssend | Send a message and wait until receipt starts |
| MPI_sendrecv | Send a message and wait for reply |
| MPI_isend | Pass reference to outgoing message, and continue |
| MPI_issend | Pass reference to outgoing message, and wait until receipt starts |
| MPI_recv | Receive a message; block if there are none |
| MPI_irecv | Check if there is an incoming message, but do not block |

Message-queuing systems, or message-oriented middleware (MOM)
- Support asynchronous persistent communication
- Provide intermediate storage for messages while sender or receiver is inactive
- Example application: email
- In contrast to sockets and MPI, they target message transfers with relaxed timing requirements, such as transfers that are allowed to take minutes, rather than transfers that must complete within seconds or even microseconds

Applications communicate by inserting messages in queues. A message is forwarded over a series of communication servers and is eventually delivered to its destination, even if the receiver's machine was down when the message was sent. In principle, each application has its own private queue to which other applications can send messages. A queue can be read only by the application that owns it, although several applications may also share a single queue.

The sender is only guaranteed that its message will eventually be inserted in the recipient's queue
- No guarantees are given about when it arrives, or whether it will be read at all

Figure 2-26: Four combinations for loosely coupled communications using queues.

Loosely coupled communication: the receiver need not be executing when a message is inserted into its queue, and the receiver can read the messages sent to it even if the sender is not running. Sender and receiver can execute completely independently of each other.

Basic interface to a queue in a message-queuing system:

| Primitive | Meaning |
|-----------|---------|
| Put | Append a message to a specified queue |
| Get | Block until the specified queue is nonempty, and remove the first message |
| Poll | Check a specified queue for messages, and remove the first. Never block. |
| Notify | Install a handler to be called when a message is put into the specified queue |

Figure: General organization of a communication system in which hosts are connected through a network.

- Messages are transferred from a source queue (the sender's) to a destination queue (the receiver's).
- The collection of queues is distributed across multiple machines. A message-queuing system that has to transfer messages should therefore maintain a mapping from queues to their network locations, i.e., a database that records the network location (IP address) for each queue name.

Figure: The relationship between queue-level addressing and network-level addressing.

- Queues are managed by queue managers. Normally a queue manager interacts directly with the application that is sending or receiving a message. There are also special queue managers that act as routers or relays: they forward incoming messages to other queue managers.
- One solution is to use a few routers that know the network topology. When sender A puts a message for destination B in its local queue, the message is first transferred to the nearest router, which knows how to forward it in the direction of B.

Figure 2-29: The general organization of a message-queuing system with routers.

With the advent of multimedia distributed systems, the notion of a stream has to be introduced to support the communication of continuous media.
- Data stream
- Quality of service
- Stream synchronization

A data stream is a sequence of data units. Streams are discrete or continuous:
- Discrete stream: UNIX pipes or TCP/IP connections
- Continuous stream: audio or video (synchronization is critical)

For continuous streams, there are three transmission modes:
- Asynchronous transmission mode: the data items are transmitted one after the other, but there are no further timing constraints on when transmission should take place
- Synchronous transmission mode: there is a maximum end-to-end delay defined for each unit in a data stream
- Isochronous transmission mode: data units are transferred on time, with both a maximum and a minimum end-to-end delay

Figure: (a) Setting up a stream between two processes across a network. (b) Setting up a stream directly between two devices.
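The difference between the three transmission modes comes down to which bounds are placed on the end-to-end delay of each data unit. The checker below is a minimal sketch of that idea; the function name, the mode strings, and the use of milliseconds are assumptions made for the example.

```python
from typing import Iterable, Optional


def meets_mode(delays_ms: Iterable[float], mode: str,
               max_delay_ms: Optional[float] = None,
               min_delay_ms: Optional[float] = None) -> bool:
    """Check measured per-unit end-to-end delays against a transmission mode."""
    delays = list(delays_ms)
    if mode == "asynchronous":
        # Units follow one another; no timing constraint to check.
        return True
    if max_delay_ms is None:
        raise ValueError(f"{mode} mode needs a maximum delay")
    if mode == "synchronous":
        # Only an upper bound on the delay of each unit.
        return all(d <= max_delay_ms for d in delays)
    if mode == "isochronous":
        # Both an upper and a lower bound on the delay of each unit.
        if min_delay_ms is None:
            raise ValueError("isochronous mode needs a minimum delay")
        return all(min_delay_ms <= d <= max_delay_ms for d in delays)
    raise ValueError(f"unknown mode: {mode}")
```

For example, delays of 5, 40, and 12 ms satisfy a synchronous bound of 50 ms, but fail an isochronous window of 10 to 50 ms because the first unit arrives too early.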