The company claims the HL-1000 is the highest-performance AI inference processor, although it should be borne in mind that different companies are aiming their AI processors at different applications and different computing environments.
Habana has demonstrated the chip on a PCIe card, claiming a throughput of 15,000 images/second on the ResNet50 network at a batch size of 10, with a latency of 1.3ms and a power consumption of 100 watts.
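To put the claimed figures in context, the following is a rough back-of-the-envelope sketch in Python that derives efficiency numbers from them. The function and key names are hypothetical and chosen for illustration. The in-flight estimate applies Little's law, which assumes steady-state operation and that the quoted latency is the time an image spends in the system, neither of which Habana has stated.

```python
# Figures as claimed by Habana for the HL-1000 PCIe card running ResNet50.
THROUGHPUT_IMG_PER_S = 15_000
BATCH_SIZE = 10
LATENCY_S = 1.3e-3
POWER_W = 100


def derived_metrics(throughput, batch_size, latency_s, power_w):
    """Derive efficiency figures from vendor-claimed headline numbers.

    Illustrative helper only; the names are not from Habana.
    """
    # Throughput delivered per watt of power consumed.
    images_per_s_per_watt = throughput / power_w
    # Energy spent on each image, in millijoules.
    energy_per_image_mj = power_w / throughput * 1e3
    # Little's law (assumes steady state): items in flight = rate x latency.
    images_in_flight = throughput * latency_s
    batches_in_flight = images_in_flight / batch_size
    return {
        "images_per_s_per_watt": images_per_s_per_watt,
        "energy_per_image_mj": energy_per_image_mj,
        "images_in_flight": images_in_flight,
        "batches_in_flight": batches_in_flight,
    }


if __name__ == "__main__":
    metrics = derived_metrics(
        THROUGHPUT_IMG_PER_S, BATCH_SIZE, LATENCY_S, POWER_W
    )
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

On the claimed numbers this works out to about 150 images/second per watt, roughly 6.7 millijoules per image and, under the stated assumptions, about 20 images (two batches of 10) being processed at any moment.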
The HL-1000 Goya is clearly intended for deployment in data centers, where Habana claims it will offer one to three orders of magnitude better performance than solutions commonly deployed today. This, of course, compares a neural processor with CPUs and GPUs rather than with other neural processors.
General architecture of HL-1000, Goya inference processor. Source: Habana Labs.
Fabless chip company Habana has not provided information about the process used to manufacture the HL-1000 Goya chip or a follow-on chip, the HL-2000.
The chip is described as a general-purpose AI processor able to support multiple neural networks and various applications, including image recognition, language translation and sentiment analysis.
The HL-1000 chip architecture is based on eight tensor processor cores (TPCs) that are programmable in C and C++ using an LLVM-based compiler. Each TPC has local memory and supports multiple general multiply-accumulate and matrix-multiply functions. Habana has also produced supporting development tools and libraries.
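For readers unfamiliar with the terms, the sketch below shows in plain Python how a matrix multiply reduces to repeated multiply-accumulate steps, the kind of operation the TPCs are said to support. It is a generic illustration only: Habana has not published the TPC programming interface here, and the function name matmul_mac is hypothetical.

```python
def matmul_mac(a, b):
    """Multiply matrices a (m x k) and b (k x n) given as lists of lists.

    Every output element is built up by repeated multiply-accumulate
    steps: acc = acc + a[i][p] * b[p][j].
    """
    m, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions do not match")
    n = len(b[0])

    result = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0
            for p in range(k):
                # One multiply-accumulate (MAC) operation.
                acc += a[i][p] * b[p][j]
            result[i][j] = acc
    return result


if __name__ == "__main__":
    print(matmul_mac([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

An m x k by k x n product needs m * n * k such steps, which is why inference hardware is built around executing many of them in parallel against nearby memory.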