Fault Recovery for Coarse-Grained TMR Soft-Core Processor Using Partial Reconfiguration and State Synchronization
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F19%3APU136526" target="_blank" >RIV/00216305:26230/19:PU136526 - isvavai.cz</a>
Výsledek na webu
<a href="https://www.fit.vut.cz/research/publication/12002/" target="_blank" >https://www.fit.vut.cz/research/publication/12002/</a>
DOI - Digital Object Identifier
—
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Fault Recovery for Coarse-Grained TMR Soft-Core Processor Using Partial Reconfiguration and State Synchronization
Popis výsledku v původním jazyce
SRAM FPGAs are being more commonly integrated into safety-critical systems nowadays. These digital circuits can provide suitable platform for a fault tolerant system implementation meeting the trade-offs between performance, reliability, cost and hardware resources. However, SRAM technology is vulnerable to radiation-induced faults and mainly to Single Event Upset (SEU) effect. The SEU can cause "bitflip" faults in SRAM memory cells which may affect internal FPGA routing (clock and reset signals), user memory (flip-flops, block RAM) and the functionality of implemented circuits. SEU mitigation must be implemented into the safety-critical design to achieve required system reliability in the harsh environment. SEU mitigation strategy may combine hardware redundancy and Partial Dynamic Reconfiguration (PDR) in order to implement error detection, self-repair ability and fault recovery mechanism into the system. With respect to the compromise between the system reliability and the resource overhead, various hardware redundancy schemes can be used. The most used form is Triple Modular Redundancy (TMR) which can be applied on different granularity levels in the system design. Coarse-grained TMR and PDR are often combined in one reconfigurable architecture. The time between SEU occurrence and the completion of fault recovery become a crucial parameter because the reliability of the TMR with one failed replica is worse than the reliability of an unprotected system. The fault recovery process can be generally divided into three phases: 1) fault detection, 2) fault removal by reconfiguration of a region containing replica identified as faulty, and 3) state synchronization bringing the reconfigured replica into the operating state consistent with other correctly operating replicas. Combination of TMR and PDR is the approach also often addressed by fault mitigation methods designed for soft-core processors. The processor state is stored in internal memor
Název v anglickém jazyce
Fault Recovery for Coarse-Grained TMR Soft-Core Processor Using Partial Reconfiguration and State Synchronization
Popis výsledku anglicky
SRAM FPGAs are being more commonly integrated into safety-critical systems nowadays. These digital circuits can provide suitable platform for a fault tolerant system implementation meeting the trade-offs between performance, reliability, cost and hardware resources. However, SRAM technology is vulnerable to radiation-induced faults and mainly to Single Event Upset (SEU) effect. The SEU can cause "bitflip" faults in SRAM memory cells which may affect internal FPGA routing (clock and reset signals), user memory (flip-flops, block RAM) and the functionality of implemented circuits. SEU mitigation must be implemented into the safety-critical design to achieve required system reliability in the harsh environment. SEU mitigation strategy may combine hardware redundancy and Partial Dynamic Reconfiguration (PDR) in order to implement error detection, self-repair ability and fault recovery mechanism into the system. With respect to the compromise between the system reliability and the resource overhead, various hardware redundancy schemes can be used. The most used form is Triple Modular Redundancy (TMR) which can be applied on different granularity levels in the system design. Coarse-grained TMR and PDR are often combined in one reconfigurable architecture. The time between SEU occurrence and the completion of fault recovery become a crucial parameter because the reliability of the TMR with one failed replica is worse than the reliability of an unprotected system. The fault recovery process can be generally divided into three phases: 1) fault detection, 2) fault removal by reconfiguration of a region containing replica identified as faulty, and 3) state synchronization bringing the reconfigured replica into the operating state consistent with other correctly operating replicas. Combination of TMR and PDR is the approach also often addressed by fault mitigation methods designed for soft-core processors. The processor state is stored in internal memor
Klasifikace
Druh
D - Stať ve sborníku
CEP obor
—
OECD FORD obor
20206 - Computer hardware and architecture
Návaznosti výsledku
Projekt
<a href="/cs/project/LQ1602" target="_blank" >LQ1602: IT4Innovations excellence in science</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)<br>S - Specificky vyzkum na vysokych skolach
Ostatní
Rok uplatnění
2019
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název statě ve sborníku
Proceedings of the 7th Prague Embedded Systems Workshop
ISBN
978-80-01-06607-2
ISSN
—
e-ISSN
—
Počet stran výsledku
2
Strana od-do
6-7
Název nakladatele
Faculty of Information Technology, Czech Technical University
Místo vydání
Roztoky u Prahy
Místo konání akce
Roztoky u Prahy
Datum konání akce
27. 6. 2019
Typ akce podle státní příslušnosti
EUR - Evropská akce
Kód UT WoS článku
—