
Photonic processor could enable lightning-fast AI calculations with extreme energy efficiency

The deep neural network models that power today’s most demanding machine learning applications have become so large and complex that they are pushing the limits of traditional electronic computing hardware.

Photonic hardware, capable of performing machine learning calculations with light, offers a faster, more energy-efficient alternative. However, there are certain types of neural network calculations that a photonic device cannot perform, requiring the use of off-chip electronics or other techniques that hinder speed and efficiency.

Building on a decade of research, scientists at MIT and elsewhere have developed a new photonic chip that overcomes these obstacles. They demonstrated a fully integrated photonic processor capable of optically performing all key calculations of a deep neural network on-chip.

The optical device was able to perform key calculations in a machine learning classification task in less than half a nanosecond while achieving over 92% accuracy, performance comparable to traditional hardware.

The chip, made up of interconnected modules forming an optical neural network, is manufactured using commercial foundry processes, which could enable the technology to be scaled up and integrated into electronics.

In the long term, the photonic processor could lead to faster, more energy-efficient deep learning for computationally demanding applications like lidar, scientific research in astronomy and particle physics, or high-speed telecommunications.

“There are many cases where the performance of the model is not the only thing that matters, but also how quickly you can get an answer. Now that we have an end-to-end system that can run a neural network in optics, at the nanosecond scale, we can start to think at a higher level about applications and algorithms,” says Saumil Bandyopadhyay ’17, MEng ’18, PhD ’23, a visiting scientist in the Quantum Photonics and AI Group within the Research Laboratory of Electronics (RLE) and a postdoctoral researcher at NTT Research, Inc., who is lead author of a paper on the new chip.

Bandyopadhyay is joined on the paper by Alexander Sludds ’18, MEng ’19, PhD ’23; Nicholas Harris PhD ’17; Darius Bunandar PhD ’19; Stefan Krastanov, a former RLE research scientist who is now an assistant professor at the University of Massachusetts Amherst; Ryan Hamerly, a visiting scientist at RLE and senior scientist at NTT Research; Matthew Streshinsky, former head of silicon photonics at Nokia and now co-founder and CEO of Enosemi; Michael Hochberg, president of Periplous, LLC; and senior author Dirk Englund, professor in the Department of Electrical Engineering and Computer Science and principal investigator of the Quantum Photonics and Artificial Intelligence Group and of RLE. The research appears today in Nature Photonics.

Machine learning with light

Deep neural networks are composed of many interconnected layers of nodes, or neurons, that operate on input data to produce an output. A key operation in a deep neural network involves using linear algebra to perform matrix multiplication, which transforms data as it is passed from one layer to another.
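As a rough illustration, that linear step is just a weight matrix applied to the activations coming from the previous layer. The layer sizes below are arbitrary, chosen purely for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)        # input activations from the previous layer
W = rng.normal(size=(3, 4))   # weight matrix connecting 4 inputs to 3 neurons
b = np.zeros(3)               # bias terms

# Matrix multiplication: the core linear operation of each layer
y = W @ x + b
print(y.shape)                # (3,)
```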

But in addition to these linear operations, deep neural networks perform nonlinear operations that help the model learn more complex patterns. Nonlinear operations, like activation functions, give deep neural networks the power to solve complex problems.
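For instance, a common activation function such as ReLU is a simple elementwise nonlinearity; this is a generic example, not the specific nonlinearity the chip implements:

```python
import numpy as np

def relu(z):
    # A common nonlinear activation: negative values are clipped to zero
    return np.maximum(z, 0.0)

z = np.array([-1.5, 0.0, 2.0])
print(relu(z))   # [0. 0. 2.]
```

Stacking only linear layers would collapse into a single matrix multiplication; interleaving nonlinearities like this is what lets deep networks model complex patterns.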

In 2017, Englund’s group, along with researchers in the lab of Marin Soljačić, Cecil and Ida Green Professor of Physics, demonstrated an optical neural network on a single photonic chip capable of performing matrix multiplication with light.

But at the time, the device could not perform nonlinear operations on the chip. Optical data had to be converted into electrical signals and sent to a digital processor to perform nonlinear operations.

“Nonlinearity in optics is quite difficult because photons do not interact with each other very easily. That makes triggering optical nonlinearities very energy-intensive, so it becomes difficult to build a system that can do it in a scalable way,” says Bandyopadhyay.

They overcame this challenge by designing devices called nonlinear optical function units (NOFUs), which combine electronics and optics to implement nonlinear operations on the chip.

The researchers built an optical deep neural network on a photonic chip using three layers of devices that perform linear and nonlinear operations.

A fully integrated network

First, their system encodes the parameters of a deep neural network into light. Then an array of programmable beam splitters, demonstrated in the 2017 paper, performs matrix multiplication on those inputs.
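A mesh of programmable beam splitters can perform matrix multiplication because each 2x2 unit applies a small unitary transform to a pair of optical modes. The sketch below uses one common Mach-Zehnder interferometer parameterization, not necessarily the convention used on this chip:

```python
import numpy as np

def mzi(theta, phi):
    # Transfer matrix of a single Mach-Zehnder interferometer with an
    # internal phase shift theta and an external phase shift phi.
    # (One common textbook convention; illustrative only.)
    return np.array([
        [np.exp(1j * phi) * np.sin(theta / 2), np.cos(theta / 2)],
        [np.exp(1j * phi) * np.cos(theta / 2), -np.sin(theta / 2)],
    ])

U = mzi(np.pi / 3, np.pi / 4)
# The transfer matrix is unitary, so it preserves optical power:
print(np.allclose(U @ U.conj().T, np.eye(2)))   # True
```

Cascading many such units, each with its own programmable phases, builds up an arbitrary matrix acting on the light.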

The data is then transmitted to programmable NOFUs, which implement nonlinear functions by siphoning a small amount of light to photodiodes that convert optical signals into electrical current. This process, which eliminates the need for an external amplifier, consumes very little energy.
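A toy numerical sketch of that idea follows, with invented parameters and a stand-in response curve rather than the device's actual physics: a small fraction of the light is tapped off to a photodiode, and the resulting signal modulates the light that remains on-chip.

```python
import numpy as np

def nofu(optical_power, tap=0.1, phase_scale=1.0):
    # Toy model of a nonlinear optical function unit (NOFU).
    # All parameters here are invented for illustration.
    tapped = tap * optical_power              # light siphoned to the photodiode
    remaining = (1.0 - tap) * optical_power   # light that stays in the optical path
    phase = phase_scale * tapped              # photocurrent-driven phase shift
    return remaining * np.cos(phase) ** 2     # modulator transmission of the rest

powers = np.linspace(0.0, 5.0, 6)
print(nofu(powers))   # output is a nonlinear function of input power
```

The key point the sketch captures is that the output is no longer proportional to the input, which is exactly what a neural network's activation function requires.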

“We stay in the optical domain the whole time, until the end, when we want to read the response. This allows us to achieve ultra-low latency,” says Bandyopadhyay.

Achieving such low latency allowed them to efficiently train a deep neural network on the chip, a process known as in situ training that typically consumes enormous amounts of energy on digital hardware.

“This is particularly useful for systems where you are doing optical signal processing in one domain, like navigation or telecommunications, but also for systems where you want to learn in real time,” he explains.

The photonic system achieved over 96% accuracy during training tests and over 92% accuracy during inference, which is comparable to traditional hardware. Additionally, the chip performs key calculations in less than half a nanosecond.

“This work demonstrates that computing, at its essence the mapping of inputs to outputs, can be compiled onto new architectures of linear and nonlinear physics that enable a fundamentally different scaling law between computation and the effort required,” says Englund.

The entire circuit was manufactured using the same infrastructure and foundry processes that produce CMOS computer chips. This could allow the chip to be manufactured at scale, using proven techniques that introduce very few errors into the manufacturing process.

Scaling their device and integrating it with real-world electronic devices such as cameras or telecommunications systems will be a major focus of future work, says Bandyopadhyay. Additionally, researchers want to explore algorithms that can leverage the advantages of optics to train systems faster and with greater energy efficiency.

This research was supported, in part, by the National Science Foundation, the Air Force Office of Scientific Research, and NTT Research.