[1] Wang Hui, Wang Yun, Ma Junchi. Cache reliability evaluation method considering protective strategies[J]. Journal of Southeast University (Natural Science Edition), 2015, 45(1): 17-22. [doi:10.3969/j.issn.1001-0505.2015.01.004]

Cache reliability evaluation method considering protective strategies
Journal of Southeast University (Natural Science Edition) [ISSN: 1001-0505 / CN: 32-1178/N]

Volume:
45
Issue:
No. 1, 2015
Pages:
17-22
Section:
Computer Science and Engineering
Publication date:
2015-01-20

Article Info

Title:
Cache reliability evaluation method considering protective strategies
Author(s):
Wang Hui, Wang Yun, Ma Junchi
School of Computer Science and Engineering, Southeast University, Nanjing 210096, China
Keywords:
soft error; temporal/spatial multiple-bit upset; architectural vulnerability factor (AVF) analysis; Markov states
Classification number:
TP302.8
DOI:
10.3969/j.issn.1001-0505.2015.01.004
Abstract:
To improve the reliability of cache units while balancing the cost of soft-error protection against cache reliability, a cache reliability model based on Markov chains is proposed. First, the existing architectural vulnerability factor (AVF) and lifetime analysis methods for caches are improved. Then, the non-equiprobable characteristics of temporal and spatial single-bit and multi-bit upsets are analyzed together, and protective measures such as parity checking, single error correction double error detection (SECDED) and interleaved layout are incorporated into the cache reliability design. Finally, based on the temporal and spatial cumulative effect of single event upsets (SEUs) and the error detection/correction protection strategies, the evaluation method is verified experimentally using SPEC2000 standard benchmarks on the Sim-Alpha simulated processor. The results show that the proposed method predicts cache reliability well for a specific application. Compared with the traditional method based on Monte Carlo error injection, it has a lower time overhead and is more application-specific.
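
The Markov-chain idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the three states, the per-step upset probabilities `p_single` and `p_double`, and the access probability `p_access` are all assumptions introduced here for illustration. The sketch tracks a single SECDED-protected cache word, where an access corrects one latent flipped bit, and two or more accumulated flips are treated as uncorrectable.

```python
def secded_word_chain(p_single, p_double, p_access):
    """Build a 3-state transition matrix for one SECDED-protected word.

    States (assumed for this sketch):
      0 = clean, 1 = one latent flipped bit, 2 = uncorrectable (absorbing).
    p_single / p_double: per-step probability of a single-bit / double-bit upset.
    p_access: per-step probability that the word is accessed and corrected.
    """
    p_none = 1.0 - p_single - p_double
    return [
        # From clean: stay clean, gain one flip, or gain two flips at once.
        [p_none, p_single, p_double],
        # From one latent flip: with no new upset, an access corrects it;
        # any new upset is assumed to push the word past what SECDED corrects.
        [p_none * p_access, p_none * (1.0 - p_access), p_single + p_double],
        # Uncorrectable state is absorbing.
        [0.0, 0.0, 1.0],
    ]


def step(dist, matrix):
    """Advance a state distribution by one time step (row vector times matrix)."""
    n = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def failure_probability(matrix, steps):
    """Probability of being in the uncorrectable state after `steps` steps,
    starting from a clean word."""
    dist = [1.0, 0.0, 0.0]
    for _ in range(steps):
        dist = step(dist, matrix)
    return dist[2]


if __name__ == "__main__":
    # Hypothetical rates, chosen only to make the effect visible.
    rarely_accessed = secded_word_chain(1e-3, 1e-5, 0.01)
    often_accessed = secded_word_chain(1e-3, 1e-5, 0.50)
    print(failure_probability(rarely_accessed, 1000))
    print(failure_probability(often_accessed, 1000))
```

In this toy model, a word that is accessed more often spends less time holding a latent error, so temporal accumulation into an uncorrectable multi-bit error is less likely. That is the kind of application-dependent effect an analytical model can capture without repeated error-injection runs.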

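Memo

Received: 2014-07-14.
About the authors: Wang Hui (born 1988), male, Ph.D. candidate; Wang Yun (corresponding author), female, Ph.D., professor, doctoral supervisor, yunwang@seu.edu.cn
Cite this article: Wang Hui, Wang Yun, Ma Junchi. Cache reliability evaluation method considering protective strategies[J]. Journal of Southeast University (Natural Science Edition), 2015, 45(1): 17-22. [doi:10.3969/j.issn.1001-0505.2015.01.004]
Last Update: 2015-01-20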