Title: Continuous checkpointing: Joining the checkpointing with virtual memory paging
Authors: Hsu, ST; Chang, RC
Department of Computer Science
Keywords: fault tolerance;incremental checkpoint;memory paging
Issue Date: 1-Sep-1997
Abstract: Checkpointing is a basic mechanism for backward error recovery in fault-tolerant systems. A checkpointed process periodically stops execution and saves its state to files. To reduce file size, only data modified between two consecutive checkpoints is saved. However, existing approaches do not consider operating system paging activity, which, if ignored, may double the number of disk accesses required to checkpoint non-resident dirty pages. In this paper, we propose continuous checkpointing, which combines the checkpoint facility with virtual memory paging operations. Thus, checkpointing is continuous during the lifetime of a process without extra overhead. Checkpoint size is no longer proportional to application size, but rather is bounded by the number of resident dirty pages. Experimental results show that disk accesses can be reduced by about 80% when checkpointing large applications. (C) 1997 by John Wiley & Sons, Ltd.
URI: http://hdl.handle.net/11536/338
ISSN: 0038-0644
Volume: 27
Issue: 9
Begin Page: 1103
End Page: 1120
Appears in Collections: Articles

Files in This Item:

  1. A1997XW08300007.pdf
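
Note on the abstract: the disk-access argument can be illustrated with a toy cost model. The sketch below is not the authors' implementation. It assumes, purely for illustration, that a conventional incremental checkpoint writes every dirty page to the checkpoint file and must first read each non-resident dirty page back from swap, while a continuous scheme treats the earlier page-out write as the checkpoint copy and writes only resident dirty pages. The function name, parameters, and page counts are hypothetical.

```python
def checkpoint_disk_accesses(resident_dirty: int, nonresident_dirty: int) -> tuple[int, int]:
    """Return (conventional, continuous) disk accesses at one checkpoint.

    Toy cost model; the per-page costs are assumptions, not figures from the paper.
    """
    # Conventional incremental checkpoint (assumed cost model):
    #   resident dirty page     -> 1 write to the checkpoint file
    #   non-resident dirty page -> 1 read from swap + 1 write to the checkpoint file
    conventional = resident_dirty + 2 * nonresident_dirty

    # Continuous checkpointing (assumed cost model): the page-out write already
    # counts as the checkpoint copy, so only resident dirty pages are written.
    continuous = resident_dirty

    return conventional, continuous


if __name__ == "__main__":
    # Hypothetical page counts, chosen only to exercise the model.
    conv, cont = checkpoint_disk_accesses(resident_dirty=100, nonresident_dirty=400)
    print(f"conventional: {conv} accesses, continuous: {cont} accesses")
```

Under this model the work done at checkpoint time depends only on the resident dirty pages, which is the bound the abstract describes. The numbers produced here are not a reproduction of the reported experimental results.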