Experimental Evaluation of the Fail-Silent Behavior in Computers Without Error Masking
Henrique Madeira, Joao Gabriel Silva.
International Symposium on Fault-Tolerant Computing Systems, FTCS-24
June 1994.
Abstract
Traditionally, fail-silent computers are implemented by using
massive redundancy (hardware or software). In this research we investigate
whether a high degree of fail-silent behavior can be obtained from
a computer without hardware or software replication, using only simple
behavior-based error detection techniques. It is assumed that if the errors
caused by a fault are detected in time, it will be possible to stop the
erroneous computer behavior, thus preventing the violation of the fail-silent
model. The evaluation technique used in this research is physical fault
injection at the pin level. Results obtained by the injection of about
20000 different faults in two different target systems have shown that
1) in a system without error detection, up to 46% of the faults caused a
violation of the fail-silent model; 2) in a computer with behavior-based
error detection, the percentage of faults that caused a violation of the
fail-silent model was reduced to between 0.4% and 2.3%; 3) the results
depend strongly on the target system, on the program under execution
during the fault injection, and on the type of faults.
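
The percentages quoted above are ratios of fail-silent violations to injected faults. The following is a minimal sketch of that bookkeeping, assuming each injected fault is logged with a single outcome label. The label names, the function name, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical outcome labels; the abstract does not give the paper's
# actual classification scheme.
NO_EFFECT = "no_effect"   # fault had no observable effect
DETECTED = "detected"     # error detected in time, computer stopped
VIOLATION = "violation"   # erroneous output escaped: fail-silent model violated


def violation_percentage(outcomes):
    """Return the percentage of injected faults that violated the fail-silent model."""
    if not outcomes:
        raise ValueError("no fault outcomes recorded")
    counts = Counter(outcomes)
    return 100.0 * counts[VIOLATION] / len(outcomes)


# Illustrative data only, not results from the paper.
sample = [DETECTED] * 6 + [NO_EFFECT] * 3 + [VIOLATION] * 1
print(f"{violation_percentage(sample):.1f}% of faults violated the fail-silent model")
```

Computing this figure separately per target system, per program, and per fault type would expose the dependencies noted in point 3.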