Proceedings of IEEE 24th International Symposium on Fault-Tolerant Computing (1994)
Austin, TX, USA
June 15, 1994 to June 17, 1994
H. Madeira, Coimbra Univ., Portugal
J.G. Silva, Coimbra Univ., Portugal
Traditionally, fail-silent computers are implemented by using massive redundancy (hardware or software). In this research we investigate whether it is possible to obtain a high degree of fail-silent behavior from a computer without hardware or software replication, using only simple behavior-based error detection techniques. It is assumed that if the errors caused by a fault are detected in time, it will be possible to stop the erroneous computer behavior, thus preventing the violation of the fail-silent model. The evaluation technique used in this research is physical fault injection at the pin level. Results obtained by injecting about 20,000 different faults into two different target systems have shown that: in a system without error detection, up to 46% of the faults caused a violation of the fail-silent model; in a computer with behavior-based error detection, the percentage of faults that caused a violation of the fail-silent model was reduced to between 0.4% and 2.3%; and the results depend strongly on the target system, on the program under execution during the fault injection, and on the type of faults.
fault tolerant computing, redundancy, error handling, system recovery, software reliability, computer debugging
H. Madeira and J. Silva, "Experimental evaluation of the fail-silent behavior in computers without error masking," Proceedings of IEEE 24th International Symposium on Fault-Tolerant Computing (FTCS), Austin, TX, USA, 1994, pp. 350-359.