Timing closure highlights the challenges of silicon design at 45nm and below
Innovation is the cornerstone of the semiconductor industry and has been responsible for massive changes in all parts of the industry, from design through fabrication, assembly and test. The foundational requirements of innovation in design are changing; they are expanding in scope. Point solutions that locally optimize a single design process by some metric, such as power, are more often than not proving to be a net disruption to design closure, rather than a benefit.
The necessarily expanded scope of innovation, especially true for advanced node design, means that the most significant innovations will come from large organizations that are willing to make bold investments.
We estimate the EDA investment for the move to 20nm and 14nm FinFET to be in the $1B range. For the size of our industry, this is indeed a bold, albeit necessary, investment. We can regard the issue of timing signoff as a microcosm of the way in which innovation in EDA has changed and how it is evolving now and into the future.
Cadence Design Systems has responded with the recent introduction of its Tempus Timing Signoff Solution, a new static timing analysis and closure tool that runs up to an order of magnitude faster than traditional timing analysis solutions. Although start-ups have developed new technologies that solve individual parts of the signoff problem, those innovations sometimes do not make it through to the implementation flows used by system-on-chip (SoC) engineering teams because they do not solve the overarching problems.
Over the past decade and a half, the physics of nanometer technology have taken an increasingly firm grasp on the design process and created a much more complex situation for signoff. The shift from ASIC to SoC design that began in earnest at the start of this millennium was accompanied by a dramatic change in methodology and in the way innovative design technologies came to market.
Just 15 years ago, signoff for digital logic-dominated designs was relatively straightforward thanks to the use of widely accepted approximations. Gate delay strongly outweighed wire delay, which could be treated as practically negligible. Signoff was largely a matter of performing timing analysis based on the results provided by the ASIC vendor’s ‘golden’ gate-level simulator.
Design teams gradually took over more of the signoff work from the ASIC vendors as they moved production to foundries. At the same time, layout-dependent effects played an increasing role in the performance of designs. Gate delay became less important on critical paths; wire delay took over as the key issue to solve. This called for a new generation of layout-aware tools developed by both large, broad-based EDA tool suppliers and start-ups.
Start-ups played a crucial role with their technology. Each could tackle a hole in the offerings of mainstream suppliers, very often by recruiting a small, select group of ‘teaching customers’ who could feed back vital information on tool performance from real designs. Engineers from these start-ups would often collaborate closely with the customers’ design teams responsible for benchmarking and working with their software.
In recent years, many of the nanometer effects with which SoC design teams must engage have become closely interconnected. Just ten years ago, a timing violation on a critical path could easily have been solved by the insertion of a buffer, or the movement of some of the gates to reduce the wire distance and with it delay. A point tool optimized for this analysis and solution could easily be inserted into a broader design flow. Analysis was often optimized for capacity rather than accuracy.
As designs moved to millions of gates, runtime overhead was often the primary issue. Parasitics could, to a large extent, be abstracted out except for paths that were extremely close to the timing margin. At 130nm, for example, the gap between metal interconnect lines was such that their coupling capacitances were overshadowed by ground and pin capacitance. Rather than model the inter-track coupling capacitance directly, it was generally easier and faster to add a small margin.
The number of timing runs was also quite limited. In general, it was sufficient to analyze a best-case, nominal and worst-case scenario for three of the key parameters: process, voltage and temperature. This would effectively encompass all the realistic operating points for the design on its target process. It was reasonable to assume that delay would be at its maximum at the highest temperature, lowest voltage and worst process conditions.
As process dimensions shrank, assumptions that previously held up well began to break down. The coupling capacitance between metal interconnect became much more significant at 65nm because the line pitch was much tighter and the traces themselves became taller in order to keep parasitic resistance under control. As a result, the lines began to behave more like the parallel plates of a capacitor.
At 45nm, the variation in metal thickness became a key concern, increasing the range over which designs needed to be simulated to provide best-case and worst-case delay values. Below 45nm, lithographically-induced variations in transistor, gate and local interconnect structures became significant, leading to the introduction of larger margins to accommodate the difference across the process variability range. Other, more subtle, effects of the shift to nanometer dimensions have led to an explosion in the effort needed to achieve timing signoff.
The overarching issue is the interaction between global and local effects. Since the beginning of the past decade, behavior under temperature changes has become more difficult to predict, a situation that has been given the name “temperature inversion”. The issue is caused by the use of lower supply voltages in order to provide greater energy efficiency – see Figure 1.
Fig. 1: Temperature inversion
Instead of running faster than at a ‘hot’ corner, the circuitry may actually run more slowly as the temperature falls once the supply drops below a certain voltage – and the effect depends on the threshold voltage of the devices that lie along the path being analyzed. The reason is that two competing effects combine to determine the delay through a logic gate. Carrier mobility falls as the temperature rises, slowing the transistor; but the threshold voltage also falls, increasing the drive current. At the higher supply voltages used traditionally, mobility controls the drain current of an active transistor. As voltages drop, the threshold voltage plays a much larger role in determining drain current. As a result, old assumptions break down, and a larger number of analyses is needed to check properly for variations.
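The crossover between these two effects can be seen in a simple alpha-power-law delay model. This is a minimal sketch, not a signoff model: the threshold voltage, temperature coefficients and exponents are assumed values chosen only to show the trend.

```python
def gate_delay(vdd, temp_c, vth0=0.45, k_vth=0.8e-3, mu_exp=-1.5,
               alpha=1.3, t0_c=25.0):
    """Relative gate delay ~ vdd / (mobility * (vdd - vth)^alpha).

    All constants are illustrative assumptions, not process data.
    """
    t_k = temp_c + 273.15
    t0_k = t0_c + 273.15
    mobility = (t_k / t0_k) ** mu_exp        # mobility falls as T rises
    vth = vth0 - k_vth * (temp_c - t0_c)     # threshold falls as T rises
    return vdd / (mobility * (vdd - vth) ** alpha)

# At a high supply, the hot corner is slower (classic behavior)...
assert gate_delay(1.2, 125) > gate_delay(1.2, -40)
# ...but at a near-threshold supply the cold corner is slower (inversion).
assert gate_delay(0.6, -40) > gate_delay(0.6, 125)
```

At the higher supply the mobility term dominates, so delay peaks hot; at the near-threshold supply the (vdd − vth) term dominates, so delay peaks cold – which is why both temperature extremes must now be analyzed at every operating voltage.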
In nanometer processes, variability is more localized than it was on older processes. Metal line widths have become small enough that even a small amount of variation has a significant impact on wire resistance. Given that metallization is a separate process from base-layer processing, engineers cannot assume that process variations will move in the same direction for both base and metal layers. Therefore, at 45nm, and to a larger extent at 28nm, multiple extraction corners were required for timing analysis and optimization.
Double patterning provides a further source of variability in sub-28nm processes. Because lithography under double patterning calls for two masks for the same layer, the masks must be precisely aligned so that the spacing between patterns is consistent across the die. Although foundries are working hard to minimize the effect, there will always be some phase shift between the masks, and it may not be possible to predict what that shift will be – see Figure 2. Timing views are required that reflect the impact of phase shifts in different directions for a given combination of temperature, voltage and other process variations.
Fig. 2: Mask shift occurs with double patterning
The focus on low-energy design adds a further layer of complexity to timing signoff. Designs that employ techniques such as dynamic voltage and frequency scaling in order to optimize their energy efficiency will need to be analyzed at multiple operating points to ensure that effects such as temperature inversion do not adversely affect the timing reliability of the SoC. These analyses must be performed in combination with the other sources of variability, leading to a combinatorial explosion. With just eight different operating modes, it is easy to reach the situation where more than 200 timing views need to be analyzed. Through careful selection and pruning of the combinations – removing those that are unlikely to provide significantly different results from other tests – it is possible to reduce the number. But the SoC implementation team is still left with a large number of timing views to generate – see Figure 3.
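The arithmetic behind that explosion is easy to reproduce. The sketch below enumerates modes against corners; the mode and corner names are hypothetical, not taken from any particular foundry kit, and the pruning rule is just one plausible example.

```python
from itertools import product

# Hypothetical operating modes and corners, for illustration only.
modes = ["func-hiV", "func-loV", "dvfs-mid", "sleep",
         "scan-shift", "scan-capture", "bist", "boot"]   # 8 operating modes
process = ["ss", "tt", "ff"]                              # slow/typical/fast
rc_corner = ["cworst", "ctyp", "cbest"]                   # extraction corners
temperature = ["-40C", "25C", "125C"]

views = list(product(modes, process, rc_corner, temperature))
print(len(views))   # 8 x 3 x 3 x 3 = 216 views before any pruning

# Pruning: drop combinations unlikely to add coverage, e.g. analyze
# test (scan) modes only at the worst-case extraction corner.
pruned = [v for v in views
          if not (v[0].startswith("scan") and v[2] != "cworst")]
print(len(pruned))  # 180 views remain
```

Even after a pruning pass like this, the view count stays well above what three-corner signoff ever required.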
Fig. 3: Trend of analysis views (MxC) at shrinking nodes
The problem is not confined to leading-edge processes. Increasingly, low-power design techniques are being applied to designs aimed at older processes. Although these processes have fewer sources of variability, as voltages are reduced to take advantage of power savings, effects such as temperature inversion become more apparent.
The time it takes to generate each timing view is only a small part of the problem. Up to 40 per cent of the chip implementation flow is now consumed by the time it takes to act on the results of the analyses – see Figure 4. Each timing view generates a set of violations that need to be correlated with the results from the other timing views. Consolidating the data takes time, engineering insight and, for many teams today, custom scripts to process the data. There is then the issue of implementing the changes needed to close timing.
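The custom scripts the text refers to typically merge per-view violation reports so each endpoint is fixed against its worst slack across all views, rather than view by view. A sketch, with an assumed report format and hypothetical endpoint names:

```python
def consolidate(reports):
    """Merge per-view violation reports.

    reports: {view_name: [(endpoint, slack_ns), ...]}  -- assumed format
    Returns {endpoint: (worst_slack, view)} for violating endpoints only.
    """
    worst = {}
    for view, results in reports.items():
        for endpoint, slack in results:
            if endpoint not in worst or slack < worst[endpoint][0]:
                worst[endpoint] = (slack, view)
    # Keep only endpoints with negative slack in at least one view.
    return {ep: ws for ep, ws in worst.items() if ws[0] < 0}

reports = {
    "ss_cworst_m40C": [("u_core/reg_12/D", -0.042), ("u_io/reg_3/D", 0.010)],
    "ff_cbest_125C":  [("u_core/reg_12/D", 0.005), ("u_io/reg_3/D", -0.018)],
}
fixes = consolidate(reports)
# Each endpoint violates in a different view, so both need ECO fixes.
```

Note that each endpoint here passes in one view and fails in another, which is exactly why results cannot be acted on one view at a time.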
Fig. 4: Aggregate runtime with increasing views
Today’s signoff timers are not physically aware. Any changes, such as buffer insertion, are left to the implementation environment as a post-processing step for engineering change order (ECO) generation. Often the placement of new cells is dramatically different from what is assumed by the optimization algorithms because available vacant space is hard to find in highly utilized designs.
The result is a significant mismatch between the interconnect parasitics assumed during the optimization steps and the actual placement and routing that result from the ECO. The changes may disturb paths that had previously met timing, causing new violations in the timing views of the subsequent iteration. What was a timing-clean view can have many violations after placement and rerouting.
One thing becomes clear from an analysis of the way in which timing signoff has evolved over the past decade: a simple technology update is not enough to solve the problems. Conventional wisdom holds that start-ups provide much of the innovation for technology-driven markets, and start-ups have traditionally used ‘teaching customers’ to help drive them towards market-ready solutions. However, such solutions do not always make it to market, because the need is no longer for narrowly defined point tools but for a tool infrastructure that cuts across the different pieces of the implementation flow.
A more accurate implementation engine would reduce some of the overhead of dealing with multiple timing views. But a genuine solution requires attention to multiple points in that flow, involving a more holistic approach.
There are numerous actors in the SoC ecosystem who can and do provide essential knowledge and feedback on issues that affect design and implementation. It is extremely difficult for a start-up with a more restricted set of partners to engage with all of them in order to derive the best solution. Although the core technology being delivered may have many strong points and provide better support for certain issues, the key today is the ability to bring all of the technology together to build a more cohesive solution. That involves input from foundries and IDMs, with their knowledge of the way variability issues affect timing. Library vendors have their part to play in understanding the issues caused by moves to smaller geometries and the impact of technologies such as double patterning. And there are the early-adopter customers who can provide real-world designs that exercise all parts of a design flow.
Then there is the role of the EDA tools supplier in bringing these inputs together and developing new ways of dealing with the influx of data. The supplier needs the scope to look at the flow in a holistic manner and understand which chokepoints limit design speed. A more accurate implementation engine for ECOs is one possible answer to the problem of timing signoff. But a more effective approach may be to look at the overarching requirements of signoff and to work out ways in which timing fixes and ECOs can be applied so that they are more closely integrated with the timing signoff process.
That requires a combination of new technology and attention to detail in the architecture of the overall flow that a player with decades of experience in implementation can bring.
As a result, the industry is moving towards a new development pipeline that involves a matrix of partnerships rather than individual links between design and tools-development groups. By bringing these different views together, EDA tools developers can react much more quickly to the needs of design and reflect the pace of innovation that is taking place in product design and process engineering.
Chi-Ping Hsu is Senior VP, Research and Development, Silicon Realization Group at Cadence Design Systems – www.cadence.com