Graphcore’s execs on machine learning, company building

Feature articles
By Peter Clarke



Toon said that Graphcore currently stands at 40 employees and that the $30 million raised in the recently announced Series A (see Graphcore gets big backing for machine learning) would be used to complete the first design and for some limited expansion. “We could have taken more but this is sufficient to get product out,” said Toon. “We will keep the engineering based here in Bristol but there is scope for some customer support and business development roles in Silicon Valley, Seattle and China,” he added.

Nigel Toon, CEO and co-founder of Graphcore.

Toon acknowledged that one other major technology company, besides Samsung and Robert Bosch, contributed to the Series A funding. He said that company has chosen not to go public on the investment.

With regard to the Intelligent Processor Unit (IPU) Knowles commented: “We will release our technology in the second half of 2017. It is a brand new, from-scratch design.”

Much of the team had previously worked with Knowles at Element 14 designing for wireline, and at Icera designing for wireless. Now the team is doing the same for machine learning.

What Graphcore has said about the IPU, on its website, is that it will include massively parallel, low-precision floating-point compute and a much higher compute density than other solutions. The IPU will hold the complete machine learning model inside the processor and have 100x the memory bandwidth of other solutions.
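Those low-precision and on-chip-memory claims are easy to connect. As a rough back-of-envelope sketch (the parameter count below is an assumed figure for illustration, not a Graphcore specification), halving the width of each stored value halves both the memory a model occupies and the bandwidth needed to move it:

import numpy as np

# Illustrative only: a model with 25 million parameters (an assumed
# figure, not a Graphcore specification).
n_params = 25_000_000

weights_fp32 = np.zeros(n_params, dtype=np.float32)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes / 1e6)  # 100.0 (MB at 32-bit precision)
print(weights_fp16.nbytes / 1e6)  # 50.0 (MB at 16-bit precision)

A model small enough to live entirely in on-chip memory avoids round trips to external DRAM, which is where the claimed bandwidth advantage would come from.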

This will be backed up by the IPU-Appliance, intended to increase the performance of both training and inference by between 10x and 100x compared with contemporary systems, and by the IPU-Accelerator, a PCIe card designed to plug into a conventional server to accelerate machine learning applications.

Knowles said: “It will be a very large chip. We have not taped out, and because it is a large chip we cannot really benefit from doing test circuits on shuttle runs. Fortunately, we [the team] have a long-standing relationship with TSMC and a very good track record of getting it right first time.”

Knowles said the design is aimed at a 16nm FinFET process from TSMC. When asked whether that would be the 16FF+ or 16FFC (near-threshold voltage) process variant offered by TSMC, Knowles said: “TSMC offers several versions of 16nm FinFET,” and indicated that a final decision on which one to use had not yet been taken.

There is clearly momentum building behind machine learning software, whether it is used to select an individual’s music choices in the cloud or to make decisions embedded in autonomous vehicles. And with that momentum has come a host of proprietary and open APIs, interface standards and languages, albeit ones that mainly target the running of neural networks in software on general-purpose processors, graphics processors or FPGAs. The list includes Google’s TensorFlow, Theano, Torch, Caffe, Microsoft’s Azure, DMTK and CNTK, Veles from Samsung, Amazon’s DSSTNE and many more besides. Are any going to be relevant to Graphcore?

Simon Knowles, CTO and co-founder of Graphcore.

Knowles commented: “We made a very early decision not to introduce a new interface. Our technology is programmed using standard frameworks. We will emphasize TensorFlow and MXNet initially but will be able to address others over time.”

TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. MXNet is another open source deep learning library, with support for a broad range of programming languages.
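As a concrete illustration of that dataflow style, a minimal sketch using the TensorFlow 1.x graph-mode API of the day (a generic example, not Graphcore code) first declares operations as nodes and only moves tensors when the graph is executed:

import tensorflow as tf  # TensorFlow 1.x graph-mode API

# Build the graph: nodes are operations, edges carry tensors.
a = tf.placeholder(tf.float32, shape=(2, 2), name="a")
b = tf.constant([[1.0, 0.0], [0.0, 1.0]], name="b")
c = tf.matmul(a, b, name="c")

# Nothing has run yet; the graph is only a description. Execution
# happens in a session, which is what lets a backend map the same
# description onto different hardware.
with tf.Session() as sess:
    print(sess.run(c, feed_dict={a: [[2.0, 3.0], [4.0, 5.0]]}))

It is this separation of description from execution that gives a hardware vendor a stable target to compile against.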

“Fortunately, the world has reached some sort of consensus to use frameworks rather than to invent new languages for machine learning. That means programming can be done in Python and C++ and entered into TensorFlow, making it much easier for us to architect hardware,” said Knowles.

Toon added: “Initially the same software can be run on the IPU as on other systems and it will just run faster and more energy efficiently on the IPU. Thereafter more ambitious applications can be attempted.”

Knowles added: “One of the details we can disclose is that we have a software interlayer called Poplar, which is itself a graph framework that we link to TensorFlow or other frameworks. Poplar does the same job for the IPU that CUDA does for a GPU. Poplar is closer to the metal while TensorFlow is closer to the software programmer’s point of view.”
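Graphcore has not published Poplar’s programming interface, so any code can only be a guess at the shape of the idea. The toy Python sketch below, in which every name is invented for illustration, shows only the layering Knowles describes: a framework-level graph of operations handed to a lower-level library that maps each operation onto device-specific kernels:

# Hypothetical sketch of the framework-to-backend layering Knowles
# describes; none of these names are real Poplar (or TensorFlow) APIs.

# A framework-level graph: operation names plus their tensor edges.
framework_graph = [
    ("matmul", {"inputs": ["x", "w"], "output": "h"}),
    ("relu",   {"inputs": ["h"],      "output": "y"}),
]

# The backend library's job, like CUDA's for a GPU, is to map each
# high-level operation onto a kernel tuned for the target device.
KERNEL_TABLE = {
    "matmul": "ipu_tiled_matmul",    # invented kernel names
    "relu":   "ipu_elementwise_relu",
}

def lower_to_device(graph):
    """Translate framework ops into a device-level program (toy version)."""
    return [(KERNEL_TABLE[op], attrs) for op, attrs in graph]

for kernel, attrs in lower_to_device(framework_graph):
    print(kernel, attrs)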

Toon added: “But we are going to make the libraries associated with Poplar totally open.”

However, with the machine learning landscape changing so rapidly, it remains unclear which of a host of startups will intersect with where the market will be in a few quarters’ time. Such startups include Nervana, acquired by Intel, Wave Computing Inc., KnuEdge Inc., BrainChip Inc. and TeraDeep Inc., amongst others.

Knowles agreed that the landscape is fast moving. “Five years ago it was the wild west, but graph frameworks have emerged as the clear choice of entry mechanism.” And while machine learning is much broader than just neural networks, even the term itself is evolving. “The success of neural networks is very recent indeed. If you use the term now the assumption is you mean convolutional neural networks. Previously it might have been ‘random forest’ networks as used in Microsoft Kinect and before that ‘support vector’ machines. The field is nascent and moving faster than ever before. So it would be folly to build a machine specific to one architecture.”

Knowles added that some neural network architectures are popular only because they can be run on GPUs, which are optimized primarily as graphics renderers and are therefore inefficient at running neural network software.

“Supervised learning, unsupervised learning, reinforcement learning; all three are very important and we will support all three. Then there is hardware for training, or learning, versus hardware for inference.” Knowles pointed out that some people say you need different hardware to support learning and inference, because the former typically works with much larger datasets and may need higher dynamic range. “Our view is that it is not necessary to split these functions. It is quite possible to do both on one piece of silicon.”
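The dynamic-range argument is simple to demonstrate. Gradients encountered during training can be far smaller than the values flowing through a network at inference time, and a narrow number format silently flushes them to zero; a toy NumPy illustration with an assumed gradient value:

import numpy as np

grad = 1e-8  # an assumed training-time gradient magnitude

print(np.float32(grad))  # 1e-08: representable in 32-bit float
print(np.float16(grad))  # 0.0: float16's smallest subnormal is about
                         # 6e-8, so the update is silently lost

Inference mostly handles values of moderate size, which is the case usually made for separate, narrower inference hardware; Knowles’ counter-argument is that one chip can cover both regimes.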

It is notable that Knowles’ former colleague and boss at both Element 14 and Icera, Stan Boland, has his own artificial intelligence startup, FiveAI Inc., also based in Bristol (see Icera team forms machine-learning ADAS startup). However, Boland’s experience has apparently persuaded him to take a different route from Toon and Knowles and to focus on software for autonomous automobiles. “Perhaps Stan is a prospector looking for gold in AI and we shall supply the picks and shovels,” said Toon. Indeed, there is the possibility that both companies could be successful, with FiveAI software running on Graphcore hardware at some point in 2017 and beyond.

Related links and articles:

www.graphcore.ai

www.tensorflow.org

www.mxnet.io

www.five.ai

News articles:

Graphcore gets big backing for machine learning

Intel to Acquire Deep Learning Nervana

AI Drives Startup to Map Deep Learning Computer

Stealthy Military Startup Launches Neural Processor

BrainChip appoints former Exar CEO to lead company

Xilinx invests in neural network startup

FiveAI has no time for hardware
