'At-memory' inference engine raises NN performance

October 29, 2020 //By Peter Clarke
AI chip startup Untether AI (Toronto, Ontario) has announced its runAI200 inference processor based on its near-memory architecture, which it calls "at-memory."

At the heart of the at-memory compute architecture is a memory bank: 385Kbytes of SRAM with a 2D array of 512 processing elements. With 511 banks per chip, each device offers 200Mbytes of memory and operates at up to 502TOPS in its "sport" mode.
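
The per-chip totals follow from the per-bank figures above, and a short calculation makes the relationship explicit. This is only a sketch: the constant and function names are chosen here for illustration, the use of decimal units is an assumption, and the per-element rate is a back-of-envelope division rather than a number published by Untether AI.

```python
# Figures quoted in the article for one runAI200 device.
BANKS_PER_CHIP = 511
SRAM_PER_BANK_KBYTES = 385
PES_PER_BANK = 512
PEAK_TOPS = 502  # "sport" mode


def chip_totals(banks, kbytes_per_bank, pes_per_bank):
    """Return (total SRAM in Kbytes, total processing elements)."""
    return banks * kbytes_per_bank, banks * pes_per_bank


total_kbytes, total_pes = chip_totals(
    BANKS_PER_CHIP, SRAM_PER_BANK_KBYTES, PES_PER_BANK
)

# Assumption: decimal units (1 Mbyte = 1000 Kbytes); the article does not say.
total_mbytes = total_kbytes / 1000

# Back-of-envelope only: peak operations per second divided evenly across
# all processing elements. Derived here, not a vendor figure.
ops_per_pe_per_second = PEAK_TOPS * 1e12 / total_pes

print(f"SRAM per chip: {total_kbytes} Kbytes (about {total_mbytes:.0f} Mbytes)")
print(f"Processing elements per chip: {total_pes}")
print(f"Peak ops per element per second: {ops_per_pe_per_second:.2e}")
```

The product comes to 196,735 Kbytes, or roughly 197 Mbytes, which is consistent with the quoted 200Mbytes after rounding, spread across 261,632 processing elements.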

Multiple function-specific buses move information in the north-south and east-west directions, but the emphasis of the architecture is on minimizing data movement. A zero-detect function allows processing elements to be switched off, which can save as much as 50 percent of power consumption.
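
The idea behind zero detection can be illustrated in software. The article does not describe how the hardware implements it, so the following is a minimal sketch under stated assumptions: a hypothetical function `zero_skip_dot` skips a multiply-accumulate whenever either operand is zero and counts how many operations were avoided.

```python
def zero_skip_dot(activations, weights):
    """Dot product that skips multiply-accumulates with a zero operand.

    Illustrative only: the skip condition (either operand is zero) is an
    assumption, not a description of the runAI200 hardware.
    Returns (result, number_of_skipped_operations).
    """
    if len(activations) != len(weights):
        raise ValueError("activations and weights must have the same length")

    accumulator = 0
    skipped = 0
    for a, w in zip(activations, weights):
        if a == 0 or w == 0:
            # The product would be zero, so no work is needed for this pair.
            skipped += 1
            continue
        accumulator += a * w
    return accumulator, skipped


# Sparse activations, as are common after a ReLU layer.
acts = [0, 3, 0, 0, 5, 0, 2, 0]
wts = [4, -1, 7, 2, 2, 9, 0, 1]

result, skipped = zero_skip_dot(acts, wts)
print(f"result={result}, skipped {skipped} of {len(acts)} operations")
```

In this example six of the eight multiply-accumulates are skipped and the result, 7, matches the ordinary dot product. How much power a real device saves depends on the sparsity of the data it processes.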

Beachler, a veteran of FPGA company Altera, commented that the resulting array architecture has similarities to an FPGA. There is also a custom 32-bit RISC processor on the chip that is tailored for AI loads.

At the PCIe card level this translates into over 80,000 frames per second of ResNet-50 v1.5 throughput at batch=1. Benchmarks show that this performance is three times that of the nearest rivals. For natural language processing, tsunAImi accelerator cards can process more than 12,000 queries per second (qps) of BERT-base, four times faster than any announced product.

Key to achieving such performance is the software development kit, known as imAIgine.

The imAIgine SDK provides push-button quantization, optimization, physical allocation, and multi-chip partitioning. It also provides an extensive visualization toolkit, cycle-accurate simulator, and a runtime API.
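
Quantization is the first step in that list, and a small example shows what such a step does in general. The article does not say which scheme imAIgine uses, so this sketch assumes generic symmetric per-tensor 8-bit quantization. The function names `quantize_int8` and `dequantize` are hypothetical and are not part of the imAIgine API.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to the range [-127, 127].

    Generic illustration only; not the imAIgine SDK's method.
    Returns (list_of_ints, scale).
    """
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized integers back to approximate float values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.27]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)

print("quantized:", q_weights)
print("scale:", scale)
print("restored:", restored)
```

Each restored value differs from the original by at most half a quantization step, which is the trade-off accepted in exchange for 8-bit storage and arithmetic.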

The tsunAImi accelerator card is sampling now and will be commercially available in 1Q2021.

Related links and articles:

www.untether.ai

News articles:

AI startup appoints FPGA, embedded veteran as CEO

Mixed-signal designers form near-memory AI startup

Server processor startup raises $240 million

Groq enters production with A0 tensor processor

Cerebras Wafer Scale Engine: An Introduction

Intel drops Nervana after buying Habana

