Proceedings 2001 Pacific Rim International Symposium on Dependable Computing

Abstract

While various checkpointing schemes have been widely used to reduce the recovery time when a fault occurs, the problem of evaluating the optimal checkpoint interval that maximizes the availability of the system has been a critical research issue for decades. The evaluation can be done by developing analytical models with restrictive assumptions. However, analytical models reach their limitations as checkpointing schemes become more complicated. This paper proposes using stochastic Petri net models for the evaluation and shows the effectiveness of the approach through case studies. The paper develops stochastic Petri net models and shows how to obtain the optimal checkpoint intervals for systems employing two widely used checkpointing schemes: the Checkpoint with Rollback Recovery scheme for uniprocessor systems and the Primary Site Approach for multiprocessor systems.