Is FPGA power design ready for concurrent engineering?
Despite ready access to a variety of development tools, such as early power estimators and power analyzers specifically targeting FPGA-based projects, power designers still benefit from planning for a worst-case, rather than an optimal-case, power system early in the design process. Until the hardware design is completed and power can actually be measured, there is simply too much uncertainty in how the dynamic load will swing between a static, low-current condition and a full processing state.
Could adopting concurrent engineering (CE) practices provide a way for development teams that are using FPGA devices in their projects to more easily and quickly find and extract the most effective balance between processing performance, bill of material (BOM) cost, and energy efficiency in today’s designs? Examining how concurrent engineering impacts a team’s design efforts and how it can affect a development team’s ability to address power supplies from the beginning of the project alongside the FPGA and the rest of the system can help answer this question.
Concurrent engineering is a mechanism that enables design teams to more quickly discover and resolve disconnects in assumptions between the various disciplines that work together to produce the final design. It is highly unlikely that any development team could get all of the requirements for a complex system perfectly correct at the start of a design – as a result, it is more effective to be able to discover, identify, and abandon disconnects in assumptions and design decisions as early as possible and replace them with ones that guide the project closer to the desired outcome at the lowest possible cost.
Are the complexity and potential consequences of late-design-cycle, worst-case FPGA power system design sufficient to justify adopting concurrent engineering practices? To answer this, we need to understand the sources of design complexity and uncertainty that designers of FPGA power systems face, and how those sources affect the trade-offs they must make when designing the power supply.
Complexity and uncertainty
Every member of a design team is experiencing increases in complexity and uncertainty. Fortunately, improving levels of integration and abstraction mitigate some of that growth and help keep the overall complexity within a human designer's capacity to understand and work with. As with any discipline that adds its contribution at the tail end of a design, however, upstream design assumptions and decisions can create additional sources of complexity and uncertainty that earlier coordination and communication might otherwise have minimized.
The design of the power supply is one of these potential downstream disciplines in increasingly complex systems. For this exercise, let’s look at the sources of complexity and then uncertainty from the perspective of the power supply designer. The two key FPGA specifications that we will look at that affect the design of the power supply are voltage and current requirements.
FPGA voltage requirement trends are driving up complexity because they require a growing number of power rails. Instead of needing two power rails for core and I/O cells and possibly a third power rail for auxiliary functions, today’s high-end FPGAs can require more than a dozen externally-driven power rails.
Why has the number of power rails grown so dramatically? SRAM cells may require a slightly higher voltage than the internal logic gates to ensure reliable full-speed operation, while also using a lower voltage for standby mode. Industry standards can prevent different I/O cells from sharing the same power rails, increasing the number of rails needed, because they may lock the various I/O cells and their physical receiving and transmitting interfaces to different supplies with different supply-noise limits and voltage levels. For example, Ethernet may run at a different I/O voltage than an I2C bus; one is an on-board bus and the other is an external bus, but both can be implemented in the FPGA. Reducing jitter or improving noise margins for sensitive circuits, such as low-noise amplifiers, phase-locked loops, transceivers, and precision analog circuits, can also increase the rail count because those circuits cannot share a power rail with noisier components even when they operate at the same voltage.
Besides requiring a growing number of power rails, today's FPGAs operate at lower voltages than their predecessors, which is valuable for reducing power consumption and increasing integration, but also increases complexity because the power supply must maintain voltage tolerance requirements that keep getting tighter (see Figure 1). As an example, the published magnitude of the core voltage ripple tolerance for FPGAs built on the 28 nm technology node has more than halved compared with FPGAs manufactured at 130 nm. The error budget has shrunk from 5% to 3% of the rail voltage and is heading toward 2%. Maintaining these voltage tolerances is closely tied to understanding and addressing the FPGA's current requirements.
Figure 1. The average voltage ripple tolerance has more than halved over four technology nodes, representing a source of increasing complexity for power supply designers.
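To put those shrinking percentages in perspective, a quick calculation converts an error budget into an absolute ripple allowance. The core voltages paired with each budget below are hypothetical round numbers for illustration, not figures for any specific device:

```python
# Convert a percentage error budget into an absolute ripple allowance.
# The voltage/budget pairings below are illustrative assumptions only.
def ripple_tolerance_mv(v_core, budget_pct):
    """Peak allowable deviation from the nominal rail voltage, in mV."""
    return v_core * budget_pct / 100 * 1000

for v_core, budget in [(1.5, 5), (1.2, 3), (1.0, 2)]:
    print(f"{v_core:.1f} V rail, {budget}% budget -> "
          f"+/-{ripple_tolerance_mv(v_core, budget):.0f} mV")
```

Note how the allowance shrinks on both ends: lower core voltages and tighter percentage budgets compound, so the absolute window the regulator must hold narrows faster than either trend alone suggests.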
FPGA current characteristic trends are driving up complexity because the density and number of peripherals, functions, and IP blocks contained within the FPGA are growing alongside Moore's Law – approximately twice as many blocks fit in the same amount of silicon at each new technology node. While the voltages supplied to the FPGA are fixed, the operating current on each of those rails is not; it fluctuates depending on how the FPGA logic is implemented.
The current fluctuates dramatically when blocks of the internal logic gates or I/O cells transition between high and low utilization levels. As the FPGA transitions to a higher processing rate, the current draw will increase and the voltage will tend to drop. A good power supply design will prevent the voltage drop from exceeding the voltage transient threshold. Likewise, as the FPGA transitions to a lower processing rate, the current draw will drop and the voltage will tend to rise, and the power supply design will prevent it from exceeding the threshold. In short, a lot of the uncertainty that can materially affect the power supply design comes from how the FPGA designer implements the system on the FPGA itself.
This type of uncertainty specifically impacts FPGA systems, in part, because one of the key features of using an FPGA is that the designer can create arbitrarily-sized processing resources and an arbitrary quantity of redundant processing resources to solve their problem in less time and/or less power than on a software-programmable processor. So while a software-programmable processor has a bounded set of processing resources that can operate simultaneously, an FPGA provides the opportunity to create a specialized, optimized, and custom set of processing resources that require a custom power supply design.
Understanding and managing how the FPGA designer has implemented the transitions between high and low processing states on the FPGA early in the design process can significantly affect the power designer's available options for optimizing the power supply design and meeting the system power requirements. It is not a requirement, nor is it necessarily desirable, for each power rail in an FPGA to get its own dedicated power supply, as that could increase costs and consume more valuable board space than necessary. Instead, the power designer may use a distributed power network where a bulk regulator steps down the system voltage and distributes it to individual point-of-load regulators, which then supply each rail. Each regulator is designed to provide a constant output voltage despite variations (within a designed range) in input voltage and output load current.
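As a rough sketch of how such a power tree budget might be tallied, the following sums the bus current that each point-of-load (POL) regulator draws from a bulk-regulated intermediate bus. Every rail voltage, load current, and efficiency figure here is an invented example, not a value from any datasheet:

```python
# Distributed power network sketch: a bulk regulator produces a 5 V
# intermediate bus; POL regulators step it down to each rail.
# All rail values and efficiencies are illustrative assumptions.
V_BUS = 5.0  # intermediate bus voltage from the bulk regulator

rails = [
    # (name, output voltage, output current, assumed POL efficiency)
    ("core", 1.0, 20.0, 0.85),
    ("io",   1.8,  4.0, 0.88),
    ("aux",  2.5,  1.0, 0.90),
]

bus_current = 0.0
for name, v_out, i_out, eff in rails:
    p_in = v_out * i_out / eff   # input power this POL draws
    i_bus = p_in / V_BUS         # current it pulls from the bus
    bus_current += i_bus
    print(f"{name}: {v_out * i_out:.1f} W out, {i_bus:.2f} A from bus")

print(f"bulk regulator must supply ~{bus_current:.2f} A at {V_BUS} V")
```

Even this toy tally shows why the power tree has to be sized from the FPGA utilization plan: the low-voltage, high-current core rail dominates the bulk regulator's load.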
There are two fundamental types of regulators: linear and switching. A linear regulator is easier to implement, delivers a cleaner output with less noise and voltage ripple, costs less, and requires less board space than a switching regulator. However, its power conversion efficiency is much lower, especially as the difference between the input and output voltages grows. For example, using a linear regulator to generate 1V from a 5V source yields a conversion efficiency of only 20%, far worse than the roughly 85% efficiency of a comparable switching regulator.
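The 20% figure follows directly from how a linear regulator works: the pass element drops the full input-output differential at the load current, so its best-case efficiency is simply the output-to-input voltage ratio. A minimal sketch (the 85% switching number is the ballpark quoted above, not a computed value):

```python
# A linear regulator's best-case efficiency is Vout/Vin, because the
# pass element carries the full load current across (Vin - Vout).
def linear_efficiency(v_out, v_in):
    return v_out / v_in

eta_linear = linear_efficiency(1.0, 5.0)  # 1 V from 5 V -> 0.20
eta_switching = 0.85                      # typical buck converter ballpark
print(f"linear: {eta_linear:.0%}, switching: ~{eta_switching:.0%}")
```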
The conversion efficiency is the ratio of output power to input power; a lower efficiency means the regulator is burning power as heat rather than delivering it to the FPGA. This makes linear regulators less suitable than switching regulators for FPGA applications with high operating currents – which for some fast I/O rails on high-end FPGA-based systems can reach 80A. Additionally, the temperature rise from the wasted power will affect the amount of space required for heat sinks or airflow to maintain the performance of the system components. As a general rule, 1W dissipated on one square inch of copper produces a temperature rise of about 10ºC with no airflow.
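Those two observations combine into a quick back-of-the-envelope thermal check for a linear regulator. The 2A load current and 4 square inches of copper below are assumed example values, and the rule-of-thumb scaling is a rough estimate, not a thermal simulation:

```python
# Back-of-the-envelope dissipation and temperature-rise estimate for a
# linear regulator, using the ~10 C per W per square inch of copper
# rule of thumb (no airflow). Load and copper area are assumed examples.
def linear_loss_w(v_in, v_out, i_load):
    """Power burned in the pass element: (Vin - Vout) * Iload."""
    return (v_in - v_out) * i_load

def temp_rise_c(p_diss_w, copper_in2, c_per_w_in2=10.0):
    """Rough rule-of-thumb rise with no airflow, scaled by copper area."""
    return p_diss_w * c_per_w_in2 / copper_in2

p = linear_loss_w(v_in=5.0, v_out=1.0, i_load=2.0)  # 8.0 W wasted
rise = temp_rise_c(p, copper_in2=4.0)               # ~20 C estimate
print(f"{p:.1f} W dissipated, ~{rise:.0f} C rise over 4 in^2 of copper")
```

Even at a modest 2A, the linear option wastes 8W; at the tens of amps cited above for high-end rails, the heat alone rules it out.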
While a switching regulator is much more power efficient than a linear regulator, it comes with a noise penalty that means a larger voltage ripple – further compounding the power supply designer’s challenge of a shrinking tolerance threshold. Proper placement of the switching regulator components on board is critical to minimize the electrical noise, and its slightly larger components just add to this challenge.
So a little of the right knowledge about the power budget early enough in the design process can make a big difference in whether the power supply designer will have the right board location and amount of board space to use the more efficient switching regulator or make do with a less efficient linear regulator.
Much of an FPGA’s power consumption depends on the FPGA designer’s implementation choices that affect the system’s switching frequency, output loading, supply voltage, number of interconnects, percentage of interconnects switching, and the structure of the logic and interconnect blocks. These choices, in turn, impact the power supply designer’s options and system design trade-offs that can affect the final system performance.
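Many of those implementation choices feed the standard first-order CMOS dynamic power relation, P ≈ α·C·V²·f, where α is the toggle rate, C the switched capacitance, V the supply voltage, and f the clock frequency. A small sketch with purely illustrative numbers shows how doubling the toggle rate doubles the dynamic power the supply must deliver:

```python
# First-order CMOS dynamic power, P = alpha * C * V^2 * f, tying
# together the implementation choices listed above. All numbers are
# illustrative assumptions, not measurements of any particular FPGA.
def dynamic_power_w(alpha, c_farads, v_volts, f_hz):
    """alpha: average fraction of nodes toggling per clock cycle."""
    return alpha * c_farads * v_volts**2 * f_hz

p_base = dynamic_power_w(0.125, 5e-9, 1.0, 250e6)  # baseline activity
p_busy = dynamic_power_w(0.25,  5e-9, 1.0, 250e6)  # doubled toggle rate
print(f"baseline {p_base:.3f} W, doubled toggle rate {p_busy:.3f} W")
```

The quadratic dependence on V is also why the falling core voltages noted earlier buy so much power savings, even as rail counts and tolerance requirements grow.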
Fortunately, FPGA power supply designers have a variety of tools and techniques available to them for analyzing power considerations early in the design process. For example, most FPGA suppliers provide early power estimators and power analyzers to help designers establish power budgets. Designers can use software-based early power estimators – often literally spreadsheets – to collect values and assumptions about logic capacity and activity rates early in the design process to estimate where and how much power the system will use (see Figure 2).
Figure 2. The worksheet shown in this software-based early power estimator offers suggested components for each power rail based on the planned FPGA utilization (image courtesy of Altera).
Early power estimators enable power designers to enter estimates for utilization of the different subsystems on the FPGA. In the example tool, tabs at the bottom of the snapshot provide worksheets to capture the power consumption for each type of resource including logic, memory, various I/O, and hardware signal processing resources. These values flow through to the other worksheets in the tool.
However, it is not sufficient to just estimate power consumption – the power designer needs to design the power tree to support how the FPGA designer intends to use the FPGA. By calculating the estimated total static and dynamic power consumption for an FPGA design, the tool helps the power designer translate the power consumption requirements into an appropriate power tree while ensuring the design complies with the design team's system trade-off decisions and meets the current and voltage requirements. Some tools, including the example in Figure 2, may suggest power management devices that meet the design's needs. The estimators embody years of the FPGA supplier's and designers' experience and are useful until real implementation-based numbers become available.
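A toy version of such a worksheet conveys the idea: per-resource utilization assumptions roll up into a per-rail power budget. The resource names, rail names, and per-unit power coefficients below are invented placeholders standing in for the vendor-supplied values a real estimator provides:

```python
# Toy early-power-estimator worksheet: utilization assumptions per
# resource type roll up into a power budget per rail. All coefficients
# and names are invented placeholders, not vendor data.
budget = {}  # rail name -> estimated watts

resources = [
    # (rail, units used, assumed watts per unit)
    ("vcc_core", 40000, 5e-6),   # logic elements
    ("vcc_core",   200, 2e-3),   # DSP blocks
    ("vcc_io",      64, 15e-3),  # I/O pins
    ("vcc_aux",      4, 50e-3),  # PLLs
]

for rail, units, w_per_unit in resources:
    budget[rail] = budget.get(rail, 0.0) + units * w_per_unit

for rail, watts in budget.items():
    print(f"{rail}: {watts:.2f} W")
```

The real tools layer toggle rates, clock domains, and thermal models on top of this basic roll-up, but the workflow is the same: enter utilization assumptions, read out a per-rail budget to size the power tree against.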
As the FPGA design progresses, the power supply designer will start using power analyzer tools to more accurately capture the power consumption and power supply requirements based on detailed information, such as the netlist output from an FPGA design tool, about how the design is implemented in the FPGA.
The purpose of the power estimation and analysis tools is to help designers establish early guidelines for the power budget. Armed with early power estimates, the design team may be able to design and build boards in parallel with the FPGA design to save time, or have a near-final board design ready when the FPGA design is done, allowing the team to perform more effective testing and optimization.
The power supply designer still needs to measure and verify the actual FPGA power performance during system integration with hardware, because actual consumption is sensitive to the specific design and environmental operating conditions and may differ from the numbers the estimation and analysis tools produce.
It’s about negotiating
That final disclaimer about how the estimation and system integration power numbers can differ does not instill a lot of confidence. Even when using them, the power supply designer still faces significant uncertainty. This is where we come full circle back to whether FPGA power supply design is ready for concurrent engineering practices.
Just as aerospace and microprocessor teams learned how upstream design decisions affected downstream design requirements, understanding how the power supply's design requirements are shaped by the board and FPGA designers' earlier choices gives each team member a mechanism and an opportunity to start communicating and negotiating about how to optimize the whole system, rather than just a portion of it, before the costs have been baked into the design.
The primary value of practicing early coordination and communication among all of the development team members from the beginning of the project is to uncover disconnects in assumptions between the different domain experts – in this case, power designers, FPGA designers, and board designers, as well as designers of the other system components that reside on or affect the board, FPGA, and power system – as early as possible in the design process. At that point, the affected groups can discuss, argue, even scream at each other, and eventually negotiate a resolution at a lower cost, because they avoid expensive rework and late-design-cycle requirement changes.
Having the option to use more efficient switching regulators requires fairly accurate foreknowledge of the system's power requirements so that the appropriate amount of board space, in the right locations, can be allocated to the power supply regulators and components to meet the power requirements, including the voltage ripple tolerance. The consequence of a poor or inaccurate power forecast can be using less efficient regulators that meet the voltage ripple requirements but that "ripple" out additional requirements to the rest of the design: accommodating a larger power supply, dissipating more heat, or even having to operate the FPGA at a slower rate than needed.
An important value of the power estimation and analysis tools is to get everyone talking about power as early as possible. In contemporary designs, the FPGA can be the primary driver for performance and power consumption of the system – and as a result, can also be the primary driver for the power supply design. Coordinating with your power designer from the beginning of the design process provides the opportunity to talk about system trade-offs and how the FPGA will be used and then use the tools to get an earlier and more accurate estimate of power consumption.
Success is not about getting all of the system requirements right before you start, it is about discovering and abandoning bad positions as soon as possible and replacing them with ones that guide the project closer to the desired outcome at the lowest possible cost. The result will be that you will reap the benefits of earlier, easier, and more accurate power forecasts in subsequent projects.
When I worked on some aerospace integrated product team projects (our version of concurrent engineering) we used a variation of the 80-20 rule: 80% of a project’s cost is fixed in the first 20% of the design effort. After that, the best you could hope to do was shuffle around the remaining 20%. Effectively the decisions we made when we knew the least had the most impact on the project’s end cost.
The expression may not be completely accurate, but there is wisdom in it. More importantly, it served as a warning and a reminder to raise issues and collaborate with team members with different domain expertise as early as reasonable, before locking in the 80% that you would later regret when the project reached the final stages of development.
A description of concurrent engineering appeared in a 1988 report from the Institute for Defense Analysis saying that it is a systematic approach to the integrated, concurrent design of products and their related processes, including manufacture and support. This approach is intended to cause the developers, from the outset, to consider all elements of the product life cycle from conception through disposal, including quality, cost, schedule, and user requirements (Reference 1). Concurrent engineering is similar to and overlaps with other terms such as collaborative engineering, simultaneous engineering, and integrated product development.
Concurrent engineering in some form has been practiced on every project in history that involved two or more people working together, but modern concurrent engineering relies on information and communication technology to enable larger, multi-disciplinary development teams to work together and share new project information faster than ever before. Although many descriptions of concurrent engineering emphasize that doing design in parallel decreases product development time – be warned – the most important mechanism of concurrent engineering is that it provides a means to more quickly discover disconnects in assumptions between the various disciplines that work together to produce the final design. As the complexity and consequences of those disconnects manifesting in the production system increase, the benefit of discovering and resolving them earlier can outweigh the added cost of coordinating and communicating between the different teams.
Aerospace projects were among the earliest to adopt modern concurrent engineering practices in an attempt to discover disconnects between the teams responsible for prototyping, manufacturing, and repairing the product. Many of the earliest discoveries of disconnected assumptions were between upstream and downstream activities in the design cycle. For example, a decision made during prototyping might make manufacturing or maintenance more difficult and expensive, when a slightly different approach could have addressed the prototyping team's concern just as effectively with little or much less impact on the requirements for the downstream teams' tasks. In man-rated safety systems, the benefit of discovering and resolving disconnected assumptions was much higher than the added cost of coordinating and communicating between the different teams throughout the entire design process.
Years later, the development of new microprocessors began to benefit from early and concurrent or collaborative development effort between processor architects, the teams providing software development tools such as compilers, and other software developers. Processor architecture began to incorporate decisions and features that simplified the assumptions that developers of compilers had to make – resulting in faster, smaller, and more efficient generated code. The benefit of choosing different but equivalent architectural approaches enabled realizing better compiled code performance even as the complexity and cost of software systems was rapidly increasing.
In each of these cases, the extra effort to coordinate and communicate early and frequently with the other members of the system development team was offset by a larger benefit of discovering errors and disconnects in assumptions when they were much less expensive to negotiate and resolve.
Winner, R. I., J. P. Pennell, H. E. Bertrand, and M. M. G. Slusarezuk (1988). The Role of Concurrent Engineering in Weapons System Acquisition, Institute for Defense Analyses, Alexandria, VA, USA, IDA Report R-338.