Untether AI is a processor company pioneering the approach of moving processing to where the data resides, rather than the other way around. This reduces data movement and thereby improves both operational efficiency and power efficiency. To this end it has developed a bus-free, near-memory computing architecture that is well suited to neural network inference.
With its first-generation runAI200 chip (see 'At-memory' inference engine raises NN performance) gaining traction, now was the right time for Untether to raise funds to expand the company and get ready for a next-generation chip.
Venture capitalists agreed and have put up $125 million in an oversubscribed round co-led by an affiliate of Tracker Capital Management LLC and previous investor Intel Capital. The round included money from new investor the Canada Pension Plan Investment Board and existing investor Radical Ventures. The round puts Untether.AI's funding to date above $150 million, Iyengar said.
Iyengar said the substantial amount of money would be used for two purposes: to expand the engagements for the runAI200 in multiple markets and to define and develop the next generation of hardware. "We have to build up our support to take advantage of every potential product interaction," Iyengar told eeNews Europe in a Zoom interview.
That might be the easier part of the task if the 16nm chip, with its 8TOPS/W efficiency, is doing as well as the company claims. It is the management team's job to decide where the sweet spot is for various applications. Should Untether.AI move to a 12nm manufacturing process to keep costs down, or go to 7nm to get a significant uplift in performance and in the capacity to hold neural networks? Or should the company go "all-in" and try to intersect the 3nm node when it arrives in 2022 or 2023?
As a further indication of where Untether stands today, the runAI200 is designed to hold complete neural networks and their coefficients on a single chip – or on four chips when considering the TsunAImi PCIe card. The chip contains 200Mbytes of SRAM with 260,000 processing elements dispersed among the SRAM. The design supports int8 and int16 data types and has a 720MHz clock frequency mode optimized for efficiency and a 960MHz mode optimized for performance.
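To put those figures in perspective, the following back-of-envelope sketch estimates how many coefficients could sit in on-chip SRAM. It is illustrative arithmetic only, not a description of Untether's tooling: it assumes 1Mbyte equals 10**6 bytes, that all of the SRAM is available for coefficients (in practice some would be needed for activations and other working data), and the function names are hypothetical.

```python
# Back-of-envelope capacity estimate from the figures quoted in the article.
# Assumptions (not from Untether): 1 Mbyte = 10**6 bytes, and every byte of
# SRAM is available for coefficients.

SRAM_BYTES_PER_CHIP = 200 * 10**6      # 200 Mbytes of SRAM per runAI200
PROCESSING_ELEMENTS = 260_000          # processing elements per chip
CHIPS_PER_CARD = 4                     # chips on the TsunAImi PCIe card
BYTES_PER_COEFF = {"int8": 1, "int16": 2}


def max_coefficients(dtype: str, chips: int = 1) -> int:
    """Upper bound on coefficients that fit in SRAM across `chips` chips."""
    return chips * SRAM_BYTES_PER_CHIP // BYTES_PER_COEFF[dtype]


def fits_on_chip(num_coefficients: int, dtype: str, chips: int = 1) -> bool:
    """True if a network with this many coefficients fits under the bound."""
    return num_coefficients <= max_coefficients(dtype, chips)


def sram_bytes_per_element() -> float:
    """Average SRAM per processing element, if shared out evenly."""
    return SRAM_BYTES_PER_CHIP / PROCESSING_ELEMENTS


if __name__ == "__main__":
    print(max_coefficients("int8"))                     # single chip, int8
    print(max_coefficients("int16", CHIPS_PER_CARD))    # four-chip card, int16
    print(fits_on_chip(25_000_000, "int8"))             # hypothetical network
    print(round(sram_bytes_per_element()))              # bytes per element
```

Under these assumptions a single chip has room for up to 200 million int8 coefficients, or 100 million at int16, and the average works out to roughly 770 bytes of SRAM per processing element.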