|Fig. 1: An Intel Haswell wafer with a pin sitting on it. (Source: Wikimedia Commons)|
Since Jack Kilby built the first integrated circuit (IC) in 1958, IC technology has evolved to the extent that we trust it with our lives. Autopilot software is frequently used to land commercial aircraft in low visibility situations. Intelligent pressure-sensing systems regulate the operation of therapeutic respiratory devices. A typical car contains tens of electronic control units, governing aspects of its operation from fuel injection to airbag deployment. Every time we get behind the wheel, we rely on the robustness of ICs for our survival.
IC reliability is an active research topic of steadily growing importance. Since the 1960s, the number of transistors integrated onto a single electronic device has increased exponentially. Without any error-correcting safeguards, the failure of a single transistor may cause an entire device to fail. In this situation, the failure rate of an IC would scale with the number of transistors. A processor such as the Intel Core i7 (Fig. 1) contains over 1 billion transistors, making it essential for engineers to understand a multitude of failure mechanisms at the single-transistor level, and to combat them with techniques for robust design at the circuit and system level. In this report we focus on a category of failure mechanisms due to ionizing radiation known as single event upsets (SEUs).
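The claim that the failure rate scales with the transistor count can be made explicit with a short calculation. This is a sketch under two assumptions that are not stated in the text: each of the N transistors fails independently with the same small probability p, and any single failure brings down the whole device. The symbols N, p, and P_fail are introduced here for illustration only.

```latex
P_{\text{fail}} = 1 - (1 - p)^{N} \approx N p \qquad \text{for } N p \ll 1
```

Under these assumptions, doubling the number of transistors roughly doubles the probability that the device fails.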
A single event upset occurs when an ionizing particle (such as an alpha particle originating from impurities in the IC packaging materials) strikes a piece of semiconductor and generates electron-hole pairs, resulting in charge injection into some node in a circuit. This causes a transient voltage spike at that node, with amplitude and decay time determined by the injected charge Q, the capacitive loading C on the node, and the effective resistance R of the transistor(s) driving the node. Each of these three parameters can be related to the feature size of a particular CMOS technology. A very simple model (Fig. 2) is enough to show that the impact of SEUs worsens with each technology generation, as the minimum feature size scales by a factor α = 0.7. More sophisticated models are necessary to size logic gates in order to create circuit-level immunity.
|Fig. 2: First order model of SEU voltage transient.|
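For reference, the transient implied by a first-order model of this kind can be written in closed form. The expression below is a sketch that assumes the charge Q arrives as an ideal current impulse at t = 0 onto a capacitance C, which then discharges through the resistance R. It follows from solving C dV/dt = -V/R with V(0+) = Q/C.

```latex
V(t) = \frac{Q}{C}\, e^{-t/(RC)}, \qquad t \ge 0
```

In this form the spike amplitude is Q/C and the decay time constant is RC, which are the two quantities tracked in the scaling argument.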
The injected charge Q is inversely proportional to the doping density of the semiconductor. Under constant field scaling, the doping density scales by a factor 1/α. Hence, in each technology generation, the injected charge due to an ionizing particle of the same linear energy transfer (LET) scales by α. However, the gate capacitance of a unit size transistor also scales by α. To first order, this implies that the amplitude of a voltage spike, Q/C, remains constant. Furthermore, the effective resistance R of a unit size transistor remains constant, and hence the decay time constant RC of the pulse scales by α. To first order, the time it takes the voltage spike to disappear therefore scales by the same factor as any gate delay. Overall, because the supply voltage and noise margins scale by α while the spike amplitude stays constant and its duration relative to a gate delay is unchanged, bit error rates rise with each successive technology node in the absence of safeguards.
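The scaling argument can be checked numerically with a few lines of Python. This is a minimal sketch: the starting values are normalized to 1 (an assumption made for illustration, not data from the source), and the function and key names are hypothetical. Only the scaling rules themselves come from the text.

```python
ALPHA = 0.7  # feature-size scaling factor per generation, as given in the text


def scale_generation(params, alpha=ALPHA):
    """Apply one generation of constant-field scaling to the first-order model."""
    return {
        "Q": params["Q"] * alpha,      # injected charge scales by alpha
        "C": params["C"] * alpha,      # unit gate capacitance scales by alpha
        "R": params["R"],              # unit-transistor resistance stays constant
        "Vdd": params["Vdd"] * alpha,  # supply voltage scales by alpha
    }


def spike_metrics(params):
    """Return spike amplitude, decay time constant, and amplitude relative to Vdd."""
    amplitude = params["Q"] / params["C"]
    return {
        "amplitude": amplitude,
        "tau": params["R"] * params["C"],
        "amplitude_over_vdd": amplitude / params["Vdd"],
    }


if __name__ == "__main__":
    # Normalized starting point (illustrative assumption).
    node = {"Q": 1.0, "C": 1.0, "R": 1.0, "Vdd": 1.0}
    for generation in range(4):
        print(generation, spike_metrics(node))
        node = scale_generation(node)
```

The amplitude stays fixed while the supply voltage shrinks, so the spike grows relative to the supply by a factor 1/α in every generation.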
During a real SEU, charge is not injected instantaneously. Rather, a current pulse with two time constants flows into the affected node. The first is the collection time constant of the junction, and the second is the ion-track establishment time constant. Both depend on process-related factors. If the circuit model in Fig. 2 were modified to show a current pulse of finite duration, rather than an ideal current impulse, it would be clear that the amplitude of the voltage spike depends on the effective resistance R driving the node, in addition to the capacitance loading the node. Lowering the effective resistance (i.e. using a larger drive strength) can help limit the amplitude of the voltage spike to remain within the allowed logic levels. A straightforward approach to mitigating the impact of SEUs is to increase the sizes of logic gates.  However, depending on the input pattern to the logic circuit, many logic gates may be able to tolerate voltage spikes with amplitude equal to the supply voltage without affecting the primary outputs. This condition is called logic masking. By selectively sizing up only the gates with low logic masking probability, the area and power overhead of radiation hardening can be significantly reduced. 
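The effect of drive strength on the spike amplitude can be illustrated with a small simulation. The sketch below assumes a double-exponential current pulse, a commonly used shape for a pulse with two time constants. The exact functional form, the component values, and all function names are assumptions for illustration and are not taken from the source. The node is modeled as a capacitance C discharged through a resistance R, integrated with a simple forward Euler step.

```python
import math


def injected_current(t, q, tau_collect, tau_track):
    """Assumed double-exponential pulse whose time integral equals q.

    Requires tau_collect != tau_track.
    """
    scale = q / (tau_collect - tau_track)
    return scale * (math.exp(-t / tau_collect) - math.exp(-t / tau_track))


def peak_voltage(q, r, c, tau_collect, tau_track, dt=1e-13, t_end=2e-9):
    """Integrate C dV/dt = I(t) - V/R with forward Euler and return the peak V."""
    v = 0.0
    peak = 0.0
    for step in range(int(t_end / dt)):
        t = step * dt
        current = injected_current(t, q, tau_collect, tau_track)
        v += (current - v / r) / c * dt
        peak = max(peak, v)
    return peak


if __name__ == "__main__":
    # Illustrative values only: 50 fC injected, 10 fF load, 200 ps and 50 ps time constants.
    q, c = 50e-15, 10e-15
    tau_collect, tau_track = 200e-12, 50e-12
    for r in (10e3, 5e3, 2.5e3):  # lower R corresponds to a larger drive strength
        print(r, peak_voltage(q, r, c, tau_collect, tau_track))
```

With a pulse of finite duration, the peak stays below the impulse-limit value Q/C and falls as R is reduced, which is the behavior gate sizing exploits.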
© Danny Bankman. The author grants permission to copy, distribute and display this work in unaltered form, with attribution to the author, for noncommercial purposes only. All other rights, including commercial rights, are reserved to the author.
 G. Moore, "Cramming More Components onto Integrated Circuits," Proc. IEEE 86, 82 (1998).
 A. Clark, "Radiation Hardening of Electronic Components," Physics 241, Stanford University, Winter 2015.
 Q. Zhou and K. Mohanram, "Gate Sizing to Radiation Harden Combinational Logic," IEEE Trans. Comput. Aid. D. 25, 155 (2006).
 R. Dennard et al., "Design of Ion-Implanted MOSFET's with Very Small Physical Dimensions," Proc. IEEE 87, 668 (1999).
 Q. Zhou and K. Mohanram, "Transistor Sizing for Radiation Hardening," IEEE 1315343, 25 Apr 04.