A New Concurrent Checkpoint Mechanism for Embedded Multi-Core Systems
Jianwei Liao; College of Computer and Information Science, Southwest University of China
Computing and Informatics, Volume 28 (2012), no. 1
This paper presents a new transparent, incremental, concurrent checkpoint mechanism for embedded multi-core systems. It allows the checkpointed process (also called the checkpointee) to keep running, to a large extent, while a checkpoint is being set. While memory pages are being dumped (the most time-consuming step in setting a checkpoint), the mechanism traces TLB misses to block the first access to each target memory page. At that point a kernel thread, called the checkpointer, copies the accessed target page to a designated memory buffer so that a consistent state of the checkpointee can be constructed, and then resumes the memory access. In the experiments, the proposed mechanism reduces the downtime of the checkpointed process by more than 10.1 % compared with a traditional concurrent checkpoint system. Incremental checkpointing has also been implemented in the new mechanism. Compared with full checkpointing, and with matrix multiplication as the benchmark, incremental checkpointing reduces the checkpoint time by more than 95.5 % at a checkpoint interval of 10 seconds and by more than 89.2 % at an interval of 20 seconds.
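The copy-on-first-access idea in the abstract can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the paper's kernel implementation: the class names `Checkpointee` and `Checkpoint` and all method names are hypothetical, an explicit `write()` call stands in for the access that the real mechanism traps through TLB misses, and only writes are modeled because reads do not change the saved state.

```python
class Checkpointee:
    """Toy process image: a list of memory pages (hypothetical model)."""

    def __init__(self, pages):
        self.pages = list(pages)
        # Pages modified since the last checkpoint; everything is dirty at start.
        self.dirty = set(range(len(self.pages)))
        self.active = None  # checkpoint currently in progress, if any

    def write(self, idx, data):
        # Stand-in for the trapped first access: if a checkpoint is running
        # and this page is not saved yet, save its old contents first.
        if self.active is not None:
            self.active.save_before_access(idx)
        self.pages[idx] = data
        self.dirty.add(idx)


class Checkpoint:
    """Plays the role of the checkpointer (assumed, simplified)."""

    def __init__(self, proc, incremental=False):
        self.proc = proc
        # A full checkpoint saves every page; an incremental one saves only
        # the pages dirtied since the previous checkpoint.
        if incremental:
            self.pending = set(proc.dirty)
        else:
            self.pending = set(range(len(proc.pages)))
        self.buffer = {}      # designated buffer: page index -> saved contents
        proc.dirty = set()    # start tracking changes for the next checkpoint
        proc.active = self

    def save_before_access(self, idx):
        # Copy the page only if it still belongs to the consistent snapshot.
        if idx in self.pending:
            self.buffer[idx] = self.proc.pages[idx]
            self.pending.discard(idx)

    def step(self):
        """Copy one pending page in the background; return False when done."""
        if not self.pending:
            self.proc.active = None
            return False
        self.save_before_access(min(self.pending))
        return True


proc = Checkpointee([b"a", b"b", b"c"])
full = Checkpoint(proc)
full.step()              # background copy of page 0
proc.write(2, b"C")      # page 2 is saved before the write goes through
while full.step():
    pass
print(full.buffer)       # snapshot as of the moment the checkpoint started
```

The checkpointee keeps writing while the checkpoint runs, yet the buffer holds the page contents as they were when the checkpoint began. A following `Checkpoint(proc, incremental=True)` would save only page 2, the single page written since then.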
Published: 2012-08-10
Classification: Concurrent checkpoint, reduced downtime, incremental checkpoint, embedded multi-core systems, 68N25, 68M25
@article{cai1015,
     author = {Jianwei Liao},
     title = {A New Concurrent Checkpoint Mechanism for Embedded Multi-Core Systems},
     journal = {Computing and Informatics},
     volume = {28},
     number = {1},
     year = {2012},
     language = {en},
     url = {http://dml.mathdoc.fr/item/cai1015}
}
Jianwei Liao; College of Computer and Information Science, Southwest University of China. A New Concurrent Checkpoint Mechanism for Embedded Multi-Core Systems. Computing and Informatics, Volume 28 (2012), no. 1. http://gdmltest.u-ga.fr/item/cai1015/