Fault-Tolerance in VHDL Description: Transient-Fault Injection & Early Reliability Estimation

TIMA-INPG Lab

Fabian Vargas, Alexandre Amory (vargas@computer.org)
Catholic University PUCRS, Electrical Engineering Dept.
Av. Ipiranga, 6681
90619-900 Porto Alegre, Brazil

Raoul Velazco (Raoul.Velazco@imag.fr)
TIMA-INPG Laboratory
46, Av. Félix Viallet
38031 Grenoble, France

Summary

1. Motivation: Important issues on the design of FT circuits for space applications
2.1. The Proposed Approach:
  - Built-In Reliability Functions Library
  - Target Architecture: main blocks
2.2. Reliability Early-Estimation:
  - Main steps of the procedure and fault-coverage estimation
  - Fault-Injection Mechanism: LFSR to inject single/multiple faults
  - Example of fault injection in the VHDL: Generate Statement
3. Conclusions & Future Work

1. Motivation:

Important concerns of computer designers for space applications:
- Power consumption, area usage, weight, and dependability (availability, reliability, and testability).

Main Characteristics & Drawbacks:
- Application-specific systems (requirements change frequently from application to application): very expensive systems!
- Synthesis (EDA) tools do not represent effective development facilities:
  - the short time available for making remedial changes to a faulty application in time-critical systems is often not respected;
  - compilers are not optimized.
- There is a lack of commercial libraries with special components (incorporating FT facilities).
- Development of an FPGA/ASIC board to test/validate the FT strategies: takes time and money!
Fig. 1. Illustration of the charge collection mechanism that causes single-event upset: (a) particle strike and charge generation; (b) current pulse shape generated in the n+/p junction during the collection of the charge.
[Figure: (a) cross-section of an N-FET on a p substrate (source and gate at 0 V, drain at 5 V) showing the ion track and the drift, funneling, and diffusion of the generated charge; (b) current versus time (ns), with a prompt (drift + funneling) component and a delayed (diffusion) component.]

- Radiation causes Single-Event Upsets (SEUs) in memory elements:
  - Processor latches and cache memory cells are sensitive to SEUs.
  - FPGAs store logic/routing in latches.

2.1. The Proposed Approach:
Built-In Reliability Functions Library: achieving the desired circuit fault tolerance

Fig. 2. Block diagram of the FT-PRO tool being developed to automate the process of generating storage element transient-fault-tolerant complex circuits.
[Figure: flowchart with the blocks "VHDL Circuit Description", "Built-In Reliability Functions Library", "Generation of the Fault-Tolerant HW", "Circuit Reliability Verification Step" (with "VHDL Simulator" and "Fault-Injection Constraints"), "Transient-Fault Coverage", the YES/NO decision "Desired Reliability Level?" with the feedback path "Try to select different reliability functions", "HW Synthesis", and "High-Reliability HW Part".]

Fig. 3. Target block diagram generated by the FT-PRO tool: (a) for a single register; (b) for an n-register bank.

Fig. 4. Control block diagrams: (a) Parity Generator; (b) Checker/Corrector.

2.2. Reliability Early-Estimation: injecting transient faults (SEUs) in VHDL code

- Insert the transient (single or multiple) fault in the VHDL code according to a predefined MTBF.
- Simulate the circuit.
- After simulation, examine the primary outputs (POs) of the circuit to verify, for each injected transient fault, whether it affected the functional operation of the circuit.
- For each fault, one of three conclusions follows:
  - The fault was not propagated to the POs; it is then considered redundant.
  - The fault was propagated to the POs and was detected by the built-in reliability functions appended to the memory elements. (This can be verified by reading out the outputs of the comparators along with the VHDL code after simulation.) The reliability of the circuit is then maintained.
  - The fault produced an erroneous PO and was not detected by the appended hardware; the reliability of the circuit is then reduced. This happens because either the reliability functions used in the program fail to detect such a fault, or the choice of the memory elements to be made fault tolerant is not adequate (important blocks of storage elements remain in their original form).
- At the end of this process, compute the overall transient fault coverage as a function of the predefined MTBF for the target application:

  Transient_Fault_Coverage(MTBF) = K / (M - E)

  where:
  K is the number of detected transient faults;
  M is the total number of injected transient faults;
  E is the number of redundant transient faults in the VHDL code.
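As a worked illustration of the coverage computation, reading the formula as the ratio K / (M - E), assume a hypothetical fault-injection campaign (the numbers below are invented for this example and are not measured results): M = 1000 injected faults, of which E = 200 never reach the primary outputs and K = 760 are detected by the built-in reliability functions.

```latex
\[
\text{Transient\_Fault\_Coverage}(\text{MTBF})
  = \frac{K}{M - E}
  = \frac{760}{1000 - 200}
  = \frac{760}{800}
  = 0.95
\]
```

Under these assumed numbers, the remaining M - E - K = 40 faults are the ones that produced an erroneous PO without being detected, i.e. the third outcome in the classification of injected faults.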
Fig. 5. Approach used to inject faults in the VHDL code. (Example for a circuit that operates with 8 information bits plus 5 check bits.)

Three different operating modes:

(a) normal_mode. No fault injection is possible during the simulation process.

(b) precision_fault-injection_mode. Single or multiple faults can be injected into the selected memory register. The user defines which bits are flipped, and in which sequence, by loading specific seeds into the LFSR before clocking it. Clocking the LFSR then injects the fault(s) into the selected memory element. To inject another fault pattern, reset the LFSR, load another seed, and repeat the operation.

(c) random_fault-injection_mode. A single reset is performed at the beginning of the process in order to load the first seed. After this, every time the user wants to inject a fault into the selected memory element, the user only needs to generate the clock signal: each time the clock is activated, the LFSR pseudo-randomly injects a fault into the selected memory element.

- At the VHDL code level, the LFSR can be implemented by means of a Generate Statement.
- This mechanism can be used for the conditional elaboration of a portion of a VHDL description.
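The conditional elaboration described above can be sketched as follows. This is a minimal, simulation-only illustration and not the FT-PRO implementation: the entity names LFSR4 and REG_INJ, the SEED port, the 4-bit polynomial x^4 + x^3 + 1, the use of a generic instead of a package constant, and the choice to flip stored bits by XOR with the LFSR state are all assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 4-bit LFSR (x^4 + x^3 + 1); its state acts as a bit-flip mask.
entity LFSR4 is
  port (
    CLK_IN   : in  std_logic;
    RST_IN   : in  std_logic;                     -- '1': load SEED on next edge
    SEED     : in  std_logic_vector(3 downto 0);  -- must be non-zero
    LFSR_OUT : out std_logic_vector(3 downto 0));
end LFSR4;

architecture RTL of LFSR4 is
  signal STATE : std_logic_vector(3 downto 0) := "0001";
begin
  process (CLK_IN)
  begin
    if rising_edge(CLK_IN) then
      if RST_IN = '1' then
        STATE <= SEED;                            -- seed selects the first mask
      else
        -- shift left; feedback is bit 3 xor bit 2
        STATE <= STATE(2 downto 0) & (STATE(3) xor STATE(2));
      end if;
    end if;
  end process;
  LFSR_OUT <= STATE;
end RTL;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register; the injection logic exists only when
-- FAULT_INJECTION /= 0 (assumed generic, mirroring the mode constant).
entity REG_INJ is
  generic (FAULT_INJECTION : integer := 0);
  port (
    CLOCK, CE       : in  std_logic;
    D               : in  std_logic_vector(7 downto 0);
    CLK_LFSR, RESET : in  std_logic;
    SEED            : in  std_logic_vector(3 downto 0);
    Q               : out std_logic_vector(7 downto 0));
end REG_INJ;

architecture SIM of REG_INJ is
  signal INFO_REG : std_logic_vector(7 downto 0) := (others => '0');
  signal MASK     : std_logic_vector(3 downto 0) := (others => '0');
begin
  Q <= INFO_REG;

  -- Elaborated only in normal mode: a plain register, no LFSR at all.
  NORMAL_MODE : if FAULT_INJECTION = 0 generate
    process (CLOCK)
    begin
      if rising_edge(CLOCK) and CE = '1' then
        INFO_REG <= D;
      end if;
    end process;
  end generate;

  -- Elaborated only in a fault-injection mode.
  INJECT_MODE : if FAULT_INJECTION /= 0 generate
    U_LFSR : entity work.LFSR4
      port map (CLK_IN => CLK_LFSR, RST_IN => RESET,
                SEED => SEED, LFSR_OUT => MASK);

    process (CLOCK, CLK_LFSR)
    begin
      if rising_edge(CLOCK) and CE = '1' then
        INFO_REG <= D;
      end if;
      -- Each LFSR clock edge (outside reset) flips the 4 MSBs selected by MASK.
      if rising_edge(CLK_LFSR) and RESET = '0' then
        INFO_REG(7 downto 4) <= INFO_REG(7 downto 4) xor MASK;
      end if;
    end process;
  end generate;
end SIM;
```

In this sketch, loading a seed and then issuing one LFSR clock edge flips exactly the bits set in the seed, which corresponds to the precision mode; issuing further clock edges without reloading flips successive pseudo-random patterns, which corresponds to the random mode. Because the generate condition is evaluated at elaboration, the normal-mode design contains no injection hardware.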
    package FAULT_INJECTION_PKG is
      ...
      -- fault injection mode
      -- 0 = normal mode
      -- 1 = precision fault injection mode
      -- 2 = random fault injection mode
      constant FAULT_INJECTION : integer := 0;
      -- to allow fault injection in high data order, set this constant
      constant FAULT_DATA_HIGH : std_logic := '1';
      ...
    end FAULT_INJECTION_PKG;

    entity REG_FT is
      port (
        CLOCK, RESET,
        -- chip enable
        CE    : in  std_logic;
        -- input from data bus
        D     : in  std_logic_vector(7 downto 0);
        -- output to data bus
        Q     : out std_logic_vector(7 downto 0);
        ERROR : out std_logic_vector(1 downto 0));
    end REG_FT;

    architecture REG_FT of REG_FT is
      -- register (info + check bits)
      signal REG : std_logic_vector(12 downto 0);
      -- info bits
      alias INFO_REG : std_logic_vector(7 downto 0) is REG(12 downto 5);
      -- check bits
      alias CHECK_REG : std_logic_vector(4 downto 0) is REG(4 downto 0);
      ...
    begin
      ...
      NORMAL_MODE : if FAULT_INJECTION = 0 generate
        -- input data from data bus
        INFO_REG <= D;
        -- parity from parity generator
        CHECK_REG <= ...
      ...
            ... => INFO_REG(7 downto 4),
            LFSR_OUT => LFSR_OUT_DATA_HIGH,
            CLK_IN   => CLK_LFSR,
            RST_IN   => RESET);
          -- insert a fault in the 4 MSB bits
          INFO_REG <= LFSR_OUT_DATA_HIGH & INFO_REG(3 downto 0);
        end generate;
      end generate;
      ...
    end REG_FT;

Fig. 6. Pseudo VHDL code illustrating the high-level fault-injection mechanism.

Consider the clock signal C1 used to drive the LFSR. The goal of this control signal is to determine the moment when the LFSR evaluates, i.e. the exact moment when a fault is injected into the selected memory element.

Possible implementation at the VHDL code level: use the "after" clause to attach timing to the assignments of the signal that clocks the memory element:

    C1 <= '1' after 100 ms, '0' after 200 ms,
          '1' after 300 ms, '0' after 400 ms;
    ...

Note that the number of faults injected depends on the type of the seed placed in the LFSR.
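The timing of C1 can be tried out in a small self-checking testbench. The sketch below is an assumption-laden illustration rather than the authors' setup: the testbench name TB_C1, the inline 4-bit LFSR (x^4 + x^3 + 1), the seed "0001", and the register contents "1010" are all hypothetical. It shows that each rising edge of C1 marks the instant at which one fault pattern is applied.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity TB_C1 is
end TB_C1;

architecture SIM of TB_C1 is
  signal C1   : std_logic := '0';
  signal LFSR : std_logic_vector(3 downto 0) := "0001";  -- hypothetical seed
  signal DATA : std_logic_vector(3 downto 0) := "1010";  -- hypothetical contents
begin
  -- A single waveform assignment schedules every edge of C1.
  C1 <= '1' after 100 ms, '0' after 200 ms,
        '1' after 300 ms, '0' after 400 ms;

  -- Each rising edge of C1 flips the bits selected by the current LFSR
  -- state, then advances the LFSR to the next pseudo-random pattern.
  INJECT : process (C1)
  begin
    if rising_edge(C1) then
      DATA <= DATA xor LFSR;
      LFSR <= LFSR(2 downto 0) & (LFSR(3) xor LFSR(2));
    end if;
  end process;

  -- Check the register contents shortly after each injection instant.
  CHECK : process
  begin
    wait for 150 ms;   -- after the edge at 100 ms: "1010" xor "0001"
    assert DATA = "1011"
      report "unexpected contents after first injection" severity error;
    wait for 200 ms;   -- after the edge at 300 ms: "1011" xor "0010"
    assert DATA = "1001"
      report "unexpected contents after second injection" severity error;
    wait;              -- stop this process
  end process;
end SIM;
```

The edges are written as one waveform on purpose. If each edge were a separate sequential assignment in a process, the default inertial delay model would let a later assignment cancel the still-pending earlier transaction, so either a single waveform or the "transport" keyword is needed to keep all scheduled edges.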
3. Conclusions & Future Work:

- We presented a new approach to automate the process of generating fault-tolerant complex circuits described in VHDL. The approach uses coding techniques associated with registers or groups of registers to detect the occurrence of a bit flip (Single-Event Upset, SEU) and to locate the affected memory element (thus performing error correction).
- In a second step, this approach also estimates the reliability of such complex circuits with respect to SEUs. This procedure is also performed at an early stage of the design process, i.e., at the circuit VHDL specification level.
- This approach is being automated through the development of the FT-PRO tool.
- A test vehicle (a Z80-like microprocessor) is being implemented in a commercial FPGA to be exercised under radiation at the Lawrence Berkeley Lab facility (88-inch cyclotron). Experimental results will make it possible to verify the effectiveness of the reliability early-estimation procedure and will provide valuable feedback for future improvements of the built-in reliability functions database.