Dissertation Defense
Power, Interconnect, and Reliability Techniques for Large Scale Integrated Circuits
Historically, consumer computing products have moved to ever smaller form factors, from the personal computer to the laptop, and now to devices such as smartphones and tablets. These products are highly visible, but behind them are datacenters solving problems beyond the reach of consumer devices. One of the first computers, ENIAC, was a massive machine that weighed 30 tons and occupied a full room to calculate artillery firing tables. It is now possible to build a much more powerful system in a cubic-millimeter form factor, yet we still build warehouse-size systems, such as the "K computer", to solve complex problems like global weather simulation.
Although advances in computing technology have greatly improved system capabilities and form factors, they have introduced problems in heat dissipation, reliability, and yield. Large-scale systems are especially affected: their power usage is measured in megawatts, their reliability goal is to run a mere 30 hours without system failure, and their processor counts number in the hundreds of thousands.
This thesis addresses these issues on four fronts. First, 3D-stacking technology coupled with near-threshold computing (NTC) is used to address heat dissipation. A 3D-stacked NTC system, Centip3De, is presented as a demonstration of this strategy, with a 75x decrease in processor power and a 5.1x improvement in energy efficiency. Next, interconnect and system reliability are addressed with a failure-tolerant interconnect fabric, Vicis, which disables faulty components to maintain reliability, tolerating fault rates of over 1 in 2,000 gates. Third, system yield is addressed using a demonstrated in-situ performance monitoring technique, Safety Razor, which uses a novel time-to-digital converter with sub-picosecond calibration accuracy. Finally, stochastic computing is proposed as an error-tolerant form of computation for advanced VLSI processes, and an example application, an image sensor array with built-in edge detection, is investigated.
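To give a feel for why stochastic computing is described as error-tolerant, the following is a minimal Python sketch of the generic textbook (unipolar) encoding, not the design presented in the thesis. A value in [0, 1] is represented as the fraction of 1s in a random bitstream, so multiplying two independent streams reduces to a bitwise AND, and a flipped bit shifts the result only slightly. The function names, stream length, and fault rate are illustrative assumptions.

```python
import random

def to_stream(p, n, rng):
    """Encode p in [0, 1] as n random bits, each 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def from_stream(bits):
    """Decode a bitstream back to a value: the fraction of 1s."""
    return sum(bits) / len(bits)

def sc_multiply(a_bits, b_bits):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [a & b for a, b in zip(a_bits, b_bits)]

def flip_bits(bits, rate, rng):
    """Inject faults: flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

rng = random.Random(0)   # fixed seed so runs are repeatable
n = 10_000               # assumed stream length; longer streams give finer precision

a = to_stream(0.5, n, rng)
b = to_stream(0.4, n, rng)
product = sc_multiply(a, b)             # estimates 0.5 * 0.4
noisy = flip_bits(product, 0.01, rng)   # assumed 1% bit-flip fault rate

print("estimate without faults:", from_stream(product))
print("estimate with faults:   ", from_stream(noisy))
```

Because every bit carries equal weight, a single fault changes the decoded value by only 1/n, whereas a flipped high-order bit in a conventional binary word can change it by half its range. The cost is precision and latency: accuracy improves only slowly as the stream grows.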