Experimental Validation of the Error Detection Mechanisms of the SSR Satellite On-Board Computer
A. C. Pereira Jr., H. Madeira, A. R. De Paula Jr.
ABSTRACT
This paper presents an experimental evaluation of the coverage and latency of the error detection mechanisms of the on-board computer of the Brazilian satellite SSR. The evaluation was carried out with the Rifle pin-level fault injection tool, which was adapted to the SSR on-board computer by building a new adaptation module. The injected faults consisted of very short bit flips (one memory cycle) on the processor pins. Two sets of experiments were performed: in the first, faults were injected at random along the program code; in the second, faults were injected at random with respect to execution time. A coverage from 31.5% to 56.4%, with short latency, was found, depending on the fault distribution. An analysis per error detection mechanism is presented, and we conclude that software error detection techniques provide better coverage than the simple hardware mechanisms existing in the system. An analysis of the impact of the injected faults on the system is also presented. This analysis has shown that from 41.3% to 68.4% of the faults have no impact on the macroscopic behavior of the system, as the errors caused by these faults are overwritten or discarded as a consequence of normal program execution. Only a small percentage of undetected faults (from 1.6% to 6.2%) caused the application to produce wrong results. Assuming that all detected faults can be recovered by the recovery mechanisms (currently under development), we found that the system can tolerate from 97.7% to 99.9% of the transient faults.