Department of Computer Systems
|SZURMAN Karel, MIČULKA Lukáš and KOTÁSEK Zdeněk. Towards a State Synchronization Methodology for Recovery Process after Partial Reconfiguration of Fault Tolerant Systems. Proceedings of the 4th Prague Embedded Systems Workshop. Roztoky u Prahy, 2016.|
|Original title:||Towards a State Synchronization Methodology for Recovery Process after Partial Reconfiguration of Fault Tolerant Systems|
|Book:||Proceedings of the 4th Prague Embedded Systems Workshop|
|Conference:||The 4th Prague Embedded Systems Workshop|
|Place:||Roztoky u Prahy, CZ|
|Space and safety-critical applications are systems where the use of active fault tolerance and recovery techniques is of increasing importance. These systems often use SRAM FPGAs because of their high performance and flexibility. However, SRAM FPGAs contain configuration memory that is prone to radiation-induced faults (e.g. single event upsets), so specific fault mitigation strategies must be built into the system design. The most widely used way of increasing reliability in these fault tolerant systems is triple modular redundancy, which can easily be combined with partial dynamic reconfiguration to preserve the correct functionality of the system. Besides fault masking and partial reconfiguration, an integral part of the recovery process is the synchronization of the reconfigured circuit copy with the remaining components, which keep operating during the recovery process. The synchronization process is closely tied to the system architecture, its specific requirements and its functionality. Our aim is to propose a methodology for designing and implementing the most suitable synchronization procedure for online recovery of the target system, without the need to reset the system or stop its operation.|
In the paper, the basic principles of our methodology are described together with a generic architecture for the synchronization of any fault tolerant system. We also present results from our experiments, in which we developed a reconfigurable fault tolerant CAN bus control system and a synchronization method that combines finite-state machine synchronization with serial/parallel roll-forward data recovery.
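
The recovery flow summarized in the abstract (fault masking by triple modular redundancy, reconfiguration of one copy, then state synchronization without stopping the system) can be illustrated with a small software model. The sketch below is only an illustration under stated assumptions and is not the method from the paper: the `Replica` class, the 4-state counter, the `vote` and `run_demo` helpers, and the cycle numbers are all hypothetical, and reconfiguration is emulated simply by replacing one replica with a fresh one in its initial state.

```python
from collections import Counter


class Replica:
    """One copy of a tiny 4-state counter FSM (hypothetical stand-in for a TMR module)."""

    def __init__(self):
        self.state = 0

    def step(self):
        # Advance the FSM by one clock cycle and return the new state as output.
        self.state = (self.state + 1) % 4
        return self.state


def vote(values):
    """Return the majority value of the replica outputs, or None if no two agree."""
    value, count = Counter(values).most_common(1)[0]
    return value if count >= 2 else None


def run_demo():
    replicas = [Replica() for _ in range(3)]
    log = []
    for cycle in range(6):
        if cycle == 2:
            # Assumption: partial reconfiguration is emulated by restarting
            # replica 1 from its initial state while the others keep running.
            replicas[1] = Replica()
        outputs = [r.step() for r in replicas]
        voted = vote(outputs)  # the voter masks the out-of-sync copy
        log.append((outputs, voted))
        if cycle == 3 and voted is not None:
            # Assumption: synchronization copies the voted state into any
            # replica that disagrees, without pausing the healthy copies.
            for r in replicas:
                if r.state != voted:
                    r.state = voted
    return log


if __name__ == "__main__":
    for outputs, voted in run_demo():
        print(outputs, "->", voted)
```

In this model the voter keeps the system output correct while the restarted copy disagrees, and all three copies agree again from the cycle after the state copy. A real design would also have to recover data held outside the FSM state, which is where the roll-forward data recovery mentioned in the abstract comes in.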