Title: An implementation of using remote memory to checkpoint processes
Authors: Hsu, ST
Chang, RC
Department of Computer Science
Keywords: fault tolerance;remote memory;checkpoint
Issue Date: 1-Sep-1999
Abstract: Process checkpointing is a procedure that periodically saves process state to stable storage. Most checkpointing facilities use hard disks for archiving. However, disk seek time is limited by the speed of the read-write heads, so checkpointing a process to a local disk consumes extensive disk bandwidth. In this paper, we propose an approach that exploits the memory of idle workstations as faster storage for checkpointing. In our scheme, autonomous machines that submit jobs to the computation server offer their physical memory to the server for job checkpointing. We use eight applications to measure remote memory performance under four checkpointing policies. Experimental results show that remote memory reduces the overhead by at least 34.5 per cent for sequential checkpointing and by at least 32.1 per cent for incremental checkpointing. Additionally, checkpointing a running process into remote memory requires only 60 per cent of the latency of a local disk checkpoint. Copyright (C) 1999 John Wiley & Sons, Ltd.
URI: http://hdl.handle.net/11536/31107
http://dx.doi.org/10.1002/(SICI)1097-024X(199909)29:11<985
ISSN: 0038-0644
DOI: 10.1002/(SICI)1097-024X(199909)29:11<985
Journal: SOFTWARE-PRACTICE & EXPERIENCE
Volume: 29
Issue: 11
Start Page: 985
End Page: 1004
Appears in Collections: Articles


Files in This Item:

  1. 000082691100005.pdf