Proceedings of IEEE 24th International Symposium on Fault-Tolerant Computing (1994)
Austin, TX, USA
June 15, 1994 to June 17, 1994
ISBN: 0-8186-5520-8
pp: 186-195
D.K. Pradhan, Dept. of Comput. Sci., Texas A&M Univ., College Station, TX, USA
N.H. Vaidya, Dept. of Comput. Sci., Texas A&M Univ., College Station, TX, USA
ABSTRACT
Performance and reliability achieved by a modular redundant system depend on the recovery scheme used. Typically, gain in performance using comparable resources results in reduced reliability. Several high performance computers are noted for small mean time to failure. Performance is measured here in terms of mean and variance of the task completion time, reliability being a task-based measure defined as the probability that a task is completed correctly. Two roll-forward schemes are compared with two rollback schemes for achieving recovery in duplex systems. The roll-forward schemes discussed here are based on a roll-forward checkpointing concept. Roll-forward recovery schemes achieve significantly better performance than rollback schemes by avoiding rollback in most common fault scenarios. It is shown that the roll-forward schemes improve performance with only a small loss in reliability as compared to rollback schemes.
INDEX TERMS
performance evaluation, fault tolerant computing, reliability
CITATION

D. Pradhan and N. Vaidya, "Roll-forward and rollback recovery: performance-reliability trade-off," Proceedings of IEEE 24th International Symposium on Fault-Tolerant Computing (FTCS), Austin, TX, USA, 1994, pp. 186-195.
doi:10.1109/FTCS.1994.315642