Proceedings 2001 Pacific Rim International Symposium on Dependable Computing

Abstract

In a multiprocessor under normal loading conditions, idle processors naturally offer spare capacity. Previous work has attempted to use this redundancy to overcome the limitations of classic diagnosability and modular redundancy techniques while still providing significant fault tolerance. A popular approach is task duplexing. Unfortunately, the usefulness of this approach for critical applications is seriously undermined by its susceptibility to agreement on faulty outcomes (malicious agreement). To assess the dependability of duplexing under malicious agreement, we propose a stochastic model that dynamically profiles behavior in the presence of malicious faults. The model uses a fairly typical policy that we call NMR on demand (NMROD): each task in a multiprocessor is duplicated, with additional processors allocated for recovery as needed. NMROD relies on a fault model that favors response correctness over actual fault status, and it integrates on-line repair to provide non-stop operation over an extended period.
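
As an illustration only, the duplicate-then-recover idea can be sketched in a few lines of Python. This is not the paper's model or its exact NMROD policy: the function name `nmr_on_demand`, the `quorum` parameter, and the rule "accept the first result reported by two replicas" are all assumptions made for the example. Each replica stands in for one processor executing the task.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence, Tuple


def nmr_on_demand(
    replicas: Sequence[Callable[[], Hashable]], quorum: int = 2
) -> Tuple[Hashable, int]:
    """Hypothetical sketch: run replicas in order and accept the first
    result that `quorum` of them agree on.

    With quorum=2 the first two replicas form the duplex pair; further
    replicas are consulted only when the pair disagrees.
    Returns (accepted result, number of replicas actually used).
    """
    votes: Counter = Counter()
    for used, run in enumerate(replicas, start=1):
        result = run()
        votes[result] += 1
        # Only the value just reported can have newly reached the quorum.
        if votes[result] >= quorum:
            return result, used
    raise RuntimeError("no agreement among the available processors")


# Illustrative replicas: `good` returns the correct answer, the others do not.
good = lambda: 42
bad = lambda: 13
other_bad = lambda: 7

# The duplex pair agrees, so the third processor is never used.
print(nmr_on_demand([good, good, bad]))
# The duplex pair disagrees, so a third processor breaks the tie.
print(nmr_on_demand([good, bad, good]))
# Malicious agreement: two faulty replicas agree on the same wrong value.
print(nmr_on_demand([bad, bad, good]))
```

The last call shows why malicious agreement matters: the comparison only sees matching responses, so two replicas that agree on a wrong value are accepted without any recovery being triggered.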