The PTC scheme for designing loosely coupled recoverable processes: issues in realizing bounded recovery time
The Third Workshop on Future Trends of Distributed Computing Systems (1992)
April 14, 1992 to April 16, 1992
K.H. Kim, Dept. of Electr. & Comput. Eng., California Univ., Irvine, CA, USA
The technology for designing loosely coupled distributed computer systems (DCSs) required to tolerate propagated errors caused by software and/or hardware has remained in an immature state. This paper focuses on the type of DCS applications where a system is structured as a set of loosely coupled interacting processes distributed among multiple physical sites and each process is designed in the 'partitioned design' mode, i.e., designed with its interface specification only, rather than with full knowledge of interfaces between other processes (or sites). The thesis is that fault tolerance capabilities must be designed into loosely coupled processes without violating the design policy. The programmer-transparent coordination (PTC) scheme is one such approach that has been evolving since 1978. While the basic PTC scheme, called the PTC/OR (PTC with obedient receiver) scheme, facilitates various forms of cooperative backward recovery in systems of loosely coupled processes, it has one drawback: the difficulty of bounding worst-case recovery time. After discussing various possible solution approaches and their limitations, a promising approach called the PTC/SL (PTC with session leaders) scheme, which superimposes additional rules on structuring process interactions onto those of the PTC/OR scheme, is presented. Under the PTC/SL scheme, various flexible forms of process interactions are still allowed while the task of ensuring bounded recovery time is made a simple one. Several research issues related to the PTC/SL scheme, e.g., efficient implementation techniques, remain as subjects for future research.
distributed processing, fault tolerant computing, system recovery
K. Kim, "The PTC scheme for designing loosely coupled recoverable processes: issues in realizing bounded recovery time," The Third Workshop on Future Trends of Distributed Computing Systems (FTDCS), Taipei, Taiwan, 1992, pp. 287-296.