Configurable VLIW core boosts energy efficiency for long battery life

September 13, 2017 //By Nick Flaherty
Bryan Donoghue, digital system lead at Cambridge Consultants, speaking at the recent NMI conference on high performance digital systems.
Researchers at Cambridge Consultants in the UK have developed a flexible VLIW core for highly power efficient systems that can be 'always on' for years from a small battery.

At 100MHz this can give 1GMAC/s of performance, and in many systems the algorithms parallelise well, so multiple cores can be used to reach several tens of GMAC/s with tens of milliwatts of power consumption.

As compilers aren’t very efficient with VLIW code, the core is programmed in assembler and the team has also built a set of tools to support the development.

“We have a toolset that helps us build these cores and have a big library of these, mix and match the modules and that squirts out the Verilog. We code in assembler rather than C or CUDA – but the competition is Verilog and it’s a lot easier to program in assembler.”

A graphical simulator called Sapphyre is configurable with the chosen modules, allowing developers to choose the data path. The simulator is bit and cycle accurate, which is important for achieving the required performance, and it also produces cycle-by-cycle vectors that are then used as the test vectors for the Verilog.
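The idea of reusing simulator output as hardware test vectors can be illustrated with a minimal Python sketch. It is not Sapphyre and does not reflect its file formats or data path: the `Vector` record, the 32-bit accumulator width and the single multiply-accumulate unit are all assumptions made for illustration. A bit-accurate model records the state after every cycle, and a trace captured from the hardware (or an RTL simulation) is then compared against it cycle by cycle.

```python
from dataclasses import dataclass

ACC_BITS = 32  # assumed accumulator width; the article does not give one


@dataclass(frozen=True)
class Vector:
    """One cycle of expected state (hypothetical record layout)."""
    cycle: int
    a: int
    b: int
    acc: int  # accumulator value after this cycle


def wrap_signed(value, bits=ACC_BITS):
    """Wrap an integer into the two's-complement range of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def simulate_mac(samples):
    """Bit-accurate model of one MAC unit: acc += a * b on every cycle."""
    acc = 0
    vectors = []
    for cycle, (a, b) in enumerate(samples):
        acc = wrap_signed(acc + a * b)
        vectors.append(Vector(cycle, a, b, acc))
    return vectors


def first_mismatch(golden, observed):
    """Return the first cycle where the traces differ, or None if they agree."""
    for g, o in zip(golden, observed):
        if g != o:
            return g.cycle
    if len(golden) != len(observed):
        return min(len(golden), len(observed))
    return None


golden = simulate_mac([(1, 2), (3, 4), (-5, 6)])
print([v.acc for v in golden])
print(first_mismatch(golden, golden))
```

Because every cycle is recorded, a failure points to the exact cycle where the hardware diverged from the model rather than only to a wrong final result.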

“We also have a real time debug monitor embedded in the silicon via the multiplexer – that helps developing code on the actual silicon and it provides visibility of all the data in the system,” he said. “You can take that data and feed it back into the simulator for a replay and that gives great visibility.”

A typical design using the core in 40nm runs at 96MHz and uses 116K gates. This provides 384MMAC/s at a peak power of 8mW and an average power of 1mW, in 0.25mm² of silicon. The core can be used to replace a CPU or DSP core in an ASIC to reduce power and area and boost performance.
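To put those numbers in context, the short Python sketch below derives two figures the article does not state directly: MACs per clock cycle and energy per MAC at peak power. The function name is hypothetical, and the calculation assumes the 8mW figure is the power drawn while sustaining the full 384MMAC/s.

```python
def derived_figures(clock_hz, macs_per_s, peak_power_w):
    """Return (MACs per clock cycle, joules per MAC at peak power)."""
    macs_per_cycle = macs_per_s / clock_hz
    energy_per_mac_j = peak_power_w / macs_per_s
    return macs_per_cycle, energy_per_mac_j


# Figures quoted for the 40nm design: 96MHz, 384MMAC/s, 8mW peak.
macs_per_cycle, energy_j = derived_figures(96e6, 384e6, 8e-3)
print(macs_per_cycle)               # 4.0 MACs per cycle
print(round(energy_j * 1e12, 1))    # 20.8 pJ per MAC
```

On those assumptions the design completes four MACs per cycle at roughly 21pJ each, or about 48GMAC/s per watt at peak.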

The core can also be used for machine learning in AI systems, he says. “The modules change to array-based processing modules for CNN layers but the same architecture works well and we are doing work in that space.”
