Experimental Evaluation of the Fail-Silent Behavior in Programs with Consistency Checks

Mário Zenha Rela, Henrique Madeira, João G. Silva

IEEE 26th Fault Tolerant Computing Symposium, FTCS-26, Sendai, Japan, June 25-27, 1996

Abstract
Previous work has shown that when only simple behavior-based error detection mechanisms that are invisible to the programmer (e.g., memory protection) are used, the percentage of fail-silent violations can exceed 10%. Since the study of these errors showed that they were mostly pure data errors, in this paper we evaluate the effectiveness of software techniques that check the semantics of the data, such as algorithm-based fault tolerance (ABFT) and assertions, in detecting these remaining errors. The results of injecting physical pin-level faults show that these tests can prevent about 40% of the fail-silent model violations that escaped the simple hardware-based error detection techniques. Moreover, analysis of the remaining errors showed that most of them went undetected because of short-range control flow breaks. When very simple software-based control flow checking was combined with the semantic tests, the target system behaved according to the fail-silent model, without any dedicated error detection hardware, for more than 98% of all the faults injected.
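To make the idea of a semantic consistency check concrete, the following is a minimal sketch of a checksum-based ABFT test on a matrix product, wrapped so that the routine returns nothing when the check fails. It is an illustration only: the abstract does not say which computations or checks were used in the experiments, and the function names and the `corrupt` hook (which simulates a pure data error in the result) are hypothetical. Integer matrices are assumed so that checksums can be compared exactly; floating-point data would need a tolerance.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def abft_matmul(a, b, corrupt=None):
    """Multiply a and b, verifying row and column checksums of the result.

    Returns the product, or None (the fail-silent outcome) if a checksum
    does not match. `corrupt` is a hypothetical (row, col, delta) hook
    that simulates a data error in the computed result.
    """
    n, m = len(a), len(b[0])

    # Extend a with a row of column sums and b with a column of row sums.
    a_ext = a + [[sum(col) for col in zip(*a)]]
    b_ext = [row + [sum(row)] for row in b]

    # The product of the extended matrices carries its own checksums.
    full = matmul(a_ext, b_ext)

    if corrupt is not None:
        r, c, delta = corrupt
        full[r][c] += delta  # simulated data error

    # Consistency check: the last entry of each row must equal the sum
    # of the other entries in that row.
    for r in range(n + 1):
        if sum(full[r][:m]) != full[r][m]:
            return None

    # Same check for each column against the checksum row.
    for c in range(m + 1):
        if sum(full[r][c] for r in range(n)) != full[n][c]:
            return None

    # Checks passed: strip the checksums and deliver the result.
    return [row[:m] for row in full[:n]]
```

Calling `abft_matmul(a, b)` on uncorrupted data returns the ordinary product, while passing something like `corrupt=(0, 1, 1)` makes the checksums disagree and the function returns `None` instead of a wrong result. A check of this kind only sees data errors in the protected result; it does not by itself catch a control flow break that skips the check, which is the kind of gap the control flow checking mentioned above is meant to close.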