Catalogue Search | MBRL
17 result(s) for "Sludds, Alexander"
Large-Scale Optical Neural Networks Based on Photoelectric Multiplication
by Bernstein, Liane; Hamerly, Ryan; Englund, Dirk
in Accelerators, Artificial intelligence, Artificial neural networks
2019
Recent success in deep neural networks has generated strong interest in hardware accelerators to improve speed and energy consumption. This paper presents a new type of photonic accelerator based on coherent detection that is scalable to large (N ≳ 10⁶) networks and can be operated at high (gigahertz) speeds and very low (subattojoule) energies per multiply and accumulate (MAC), using the massive spatial multiplexing enabled by standard free-space optical components. In contrast to previous approaches, both weights and inputs are optically encoded so that the network can be reprogrammed and trained on the fly. Simulations of the network using models for digit and image classification reveal a “standard quantum limit” for optical neural networks, set by photodetector shot noise. This bound, which can be as low as 50 zJ/MAC, suggests that performance below the thermodynamic (Landauer) limit for digital irreversible computation is theoretically possible in this device. The proposed accelerator can implement both fully connected and convolutional networks. We also present a scheme for backpropagation and training that can be performed in the same hardware. This architecture will enable a new class of ultralow-energy processors for deep learning.
Journal Article
Freely scalable and reconfigurable optical hardware for deep learning
2021
As deep neural network (DNN) models grow ever-larger, they can achieve higher accuracy and solve more complex problems. This trend has been enabled by an increase in available compute power; however, efforts to continue to scale electronic processors are impeded by the costs of communication, thermal management, power delivery and clocking. To improve scalability, we propose a digital optical neural network (DONN) with intralayer optical interconnects and reconfigurable input values. The path-length-independence of optical energy consumption enables information locality between a transmitter and a large number of arbitrarily arranged receivers, which allows greater flexibility in architecture design to circumvent scaling limitations. In a proof-of-concept experiment, we demonstrate optical multicast in the classification of 500 MNIST images with a 3-layer, fully-connected network. We also analyze the energy consumption of the DONN and find that digital optical data transfer is beneficial over electronics when the spacing of computational units is on the order of
>10 μm.
Journal Article
Single-chip photonic deep neural network with forward-only training
by Bandyopadhyay, Saumil; Krastanov, Stefan; Harris, Nicholas
in 639/624/1075/1079, 639/624/1075/401, 639/624/399/1099
2024
As deep neural networks revolutionize machine learning, energy consumption and throughput are emerging as fundamental limitations of complementary metal–oxide–semiconductor (CMOS) electronics. This has motivated a search for new hardware architectures optimized for artificial intelligence, such as electronic systolic arrays, memristor crossbar arrays and optical accelerators. Optical systems can perform linear matrix operations at an exceptionally high rate and efficiency, motivating recent demonstrations of low-latency matrix accelerators and optoelectronic image classifiers. However, demonstrating coherent, ultralow-latency optical processing of deep neural networks has remained an outstanding challenge. Here we realize such a system in a scalable photonic integrated circuit that monolithically integrates multiple coherent optical processor units for matrix algebra and nonlinear activation functions into a single chip. We experimentally demonstrate this fully integrated coherent optical neural network architecture for a deep neural network with six neurons and three layers that optically computes both linear and nonlinear functions with a latency of 410 ps, unlocking new applications that require ultrafast, direct processing of optical signals. We implement backpropagation-free in situ training on this system, achieving 92.5% accuracy on a six-class vowel classification task, which is comparable to the accuracy obtained on a digital computer. This work lends experimental evidence to theoretical proposals for in situ training, enabling orders of magnitude improvements in the throughput of training data. Moreover, the fully integrated coherent optical neural network opens the path to inference at nanosecond latency and femtojoule per operation energy efficiency.
Researchers experimentally demonstrate a fully integrated coherent optical neural network. The system, with six neurons and three layers, operates with a latency of 410 ps.
Journal Article
Deep learning with coherent VCSEL neural networks
by Davis, Ronald; Reitzenstein, Stephan; Heermeier, Niels
in Artificial neural networks, Cognitive tasks, Computation
2023
Deep neural networks (DNNs) are reshaping the field of information processing. With the exponential growth of these DNNs challenging existing computing hardware, optical neural networks (ONNs) have recently emerged to process DNN tasks with high clock rates, parallelism and low-loss data transmission. However, existing challenges for ONNs are high energy consumption due to their low electro-optic conversion efficiency, low compute density due to large device footprints and channel crosstalk, and long latency due to the lack of inline nonlinearity. Here we experimentally demonstrate a spatial-temporal-multiplexed ONN system that simultaneously overcomes all these challenges. We exploit neuron encoding with volume-manufactured micrometre-scale vertical-cavity surface-emitting laser (VCSEL) arrays that exhibit efficient electro-optic conversion (<5 attojoules per symbol with a π-phase-shift voltage of Vπ = 4 mV) and compact footprint (<0.01 mm² per device). Homodyne photoelectric multiplication allows matrix operations at the quantum-noise limit and detection-based optical nonlinearity with instantaneous response. With three-dimensional neural connectivity, our system can reach an energy efficiency of 7 femtojoules per operation (OP) with a compute density of 6 teraOP mm⁻² s⁻¹, representing 100-fold and 20-fold improvements, respectively, over state-of-the-art digital processors. Near-term development could improve these metrics by two more orders of magnitude. Our optoelectronic processor opens new avenues to accelerate machine learning tasks from data centres to decentralized devices.
Energy consumption and compute density are challenges for computing systems. Here researchers show an optical computing architecture using micrometre-scale VCSEL transmitter arrays enabling 7 fJ energy per operation and a potential compute density of 6 tera-operations mm⁻² s⁻¹.
Journal Article
Delocalized Photonic Deep Learning on the Internet's Edge
2023
Machine learning has become ubiquitous in our daily lives, providing unprecedented improvements in image recognition, autonomous driving and conversational AI. To enable this improvement the size of machine learning models has grown exponentially, requiring new hardware that scales accordingly. CMOS electronics, the workhorse of computing for the last half century, has hit a fundamental barrier to further improvement, limited by the high energy and bandwidth cost of metallic interconnects. In this thesis I will demonstrate how we can build systems making use of the physics of photonics and electronics to enable computing systems on lightweight edge devices that were previously infeasible by orders of magnitude.
First, we consider a system where all metallic interconnects above the digital logic are replaced by optical fan-out. I propose a freely scalable digital optical neural network accelerator which replaces all non-local metallic wires in a digital systolic array with free-space optical interconnections enabled by fan-out and receiverless photodetectors.
For the primary contribution of my thesis I explore making use of photonics to enable faster edge computing. Advanced machine learning models are currently impossible to run on edge devices such as smart sensors and unmanned aerial vehicles owing to constraints on power, processing, and memory. I introduce an approach to machine learning inference based on delocalized analog processing across networks. In this approach, named Netcast, cloud-based “smart transceivers” stream weight data to edge devices, enabling ultraefficient photonic inference. I demonstrate image recognition at ultralow optical energy of 40 attojoules per multiply (<1 photon per multiply) at 98.8% (93%) classification accuracy. I reproduce this performance in a Boston-area field trial over 86 kilometers of deployed optical fiber, wavelength multiplexed over 3 terahertz of optical bandwidth.
My work allows milliwatt-class edge devices with minimal memory and processing to compute at teraFLOPS rates reserved for high-power (>100 watts) cloud computers.
Dissertation
Freely scalable and reconfigurable optical hardware for deep learning
2021
Abstract
As deep neural network (DNN) models grow ever-larger, they can achieve higher accuracy and solve more complex problems. This trend has been enabled by an increase in available compute power; however, efforts to continue to scale electronic processors are impeded by the costs of communication, thermal management, power delivery and clocking. To improve scalability, we propose a digital optical neural network (DONN) with intralayer optical interconnects and reconfigurable input values. The path-length-independence of optical energy consumption enables information locality between a transmitter and a large number of arbitrarily arranged receivers, which allows greater flexibility in architecture design to circumvent scaling limitations. In a proof-of-concept experiment, we demonstrate optical multicast in the classification of 500 MNIST images with a 3-layer, fully-connected network. We also analyze the energy consumption of the DONN and find that digital optical data transfer is beneficial over electronics when the spacing of computational units is on the order of
>10 μm.
Journal Article
Attojoule Scale Computation of Large Optical Neural Networks
2019
The ultra-high bandwidth and low energy cost of modern photonics offer many opportunities for improving both speed and energy efficiency in classical information processing. Recently a new architecture has been proposed which allows for substantial energy reductions in matrix-matrix products by utilizing balanced homodyne detection for computation and optical fan-out for data delivery. In this thesis I work towards the analysis and implementation of both analog and digital optical neural networks. For analog optical neural networks I discuss both the physical implementation of this system as well as an analysis of limits imposed on this system by shot noise, crosstalk, and electro-optic/opto-electronic information conversion. From these results, it is found that femtojoule-scale computation per multiply and accumulate operation is achievable in the near term with further energy gains foreseeable with emerging technology. This thesis also presents a system-scale throughput and energy analysis of digital optical neural networks, which can enable incredibly high data speeds (>10 GHz) with CMOS compatible voltages at weight transmitter power dissipation comparable to a modern CPU.
Dissertation
Single chip photonic deep neural network with accelerated training
by Bandyopadhyay, Saumil; Krastanov, Stefan; Harris, Nicholas
in Artificial intelligence, Artificial neural networks, C band
2022
As deep neural networks (DNNs) revolutionize machine learning, energy consumption and throughput are emerging as fundamental limitations of CMOS electronics. This has motivated a search for new hardware architectures optimized for artificial intelligence, such as electronic systolic arrays, memristor crossbar arrays, and optical accelerators. Optical systems can perform linear matrix operations at exceptionally high rate and efficiency, motivating recent demonstrations of low latency linear algebra and optical energy consumption below a photon per multiply-accumulate operation. However, demonstrating systems that co-integrate both linear and nonlinear processing units in a single chip remains a central challenge. Here we introduce such a system in a scalable photonic integrated circuit (PIC), enabled by several key advances: (i) high-bandwidth and low-power programmable nonlinear optical function units (NOFUs); (ii) coherent matrix multiplication units (CMXUs); and (iii) in situ training with optical acceleration. We experimentally demonstrate this fully-integrated coherent optical neural network (FICONN) architecture for a 3-layer DNN comprising 12 NOFUs and three CMXUs operating in the telecom C-band. Using in situ training on a vowel classification task, the FICONN achieves 92.7% accuracy on a test set, which is identical to the accuracy obtained on a digital computer with the same number of weights. This work lends experimental evidence to theoretical proposals for in situ training, unlocking orders of magnitude improvements in the throughput of training data. Moreover, the FICONN opens the path to inference at nanosecond latency and femtojoule per operation energy efficiency.
Deep Learning with Coherent VCSEL Neural Networks
by Davis, Ronald; Reitzenstein, Stephan; Heermeier, Niels
in Artificial neural networks, Cognitive tasks, Computing time
2022
Deep neural networks (DNNs) are reshaping the field of information processing. With their exponential growth challenging existing electronic hardware, optical neural networks (ONNs) are emerging to process DNN tasks in the optical domain with high clock rates, parallelism and low-loss data transmission. However, to explore the potential of ONNs, it is necessary to investigate the full-system performance incorporating the major DNN elements, including matrix algebra and nonlinear activation. Existing challenges to ONNs are high energy consumption due to low electro-optic (EO) conversion efficiency, low compute density due to large device footprint and channel crosstalk, and long latency due to the lack of inline nonlinearity. Here we experimentally demonstrate an ONN system that simultaneously overcomes all these challenges. We exploit neuron encoding with volume-manufactured micron-scale vertical-cavity surface-emitting laser (VCSEL) transmitter arrays that exhibit high EO conversion (<5 attojoule/symbol with Vπ = 4 mV), high operation bandwidth (up to 25 GS/s), and compact footprint (<0.01 mm² per device). Photoelectric multiplication allows low-energy matrix operations at the shot-noise quantum limit. Homodyne detection-based nonlinearity enables nonlinear activation with instantaneous response. The full-system energy efficiency and compute density reach 7 femtojoules per operation (fJ/OP) and 25 TeraOP/(mm² s), both representing a >100-fold improvement over state-of-the-art digital computers, with substantially several more orders of magnitude for future improvement. Beyond neural network inference, its feature of rapid weight updating is crucial for training deep learning models. Our technique opens an avenue to large-scale optoelectronic processors to accelerate machine learning tasks from data centers to decentralized edge devices.
Towards the Information-Theoretic Limit of Programmable Photonics
by Basani, Jasvith Raj; Hamerly, Ryan; Englund, Dirk
in Circuits, Information theory, Neural networks
2024
The scalability of many programmable photonic circuits is limited by the 2π tuning range needed for the constituent phase shifters. To address this problem, we introduce the concept of a phase-efficient circuit architecture, where the average phase shift is ≪ 2π. We derive a universal information-theoretic limit to the phase-shift efficiency of universal multiport interferometers, and propose a "3-MZI" architecture that approaches this limit to within a factor of 2×, approximately a 10× reduction in average phase shift over the prior art, where the average phase shift scales inversely with system size as O(1/N). For non-unitary circuits, we show that the 3-MZI saturates the theoretical bound for Gaussian-distributed target matrices. Using this architecture, we show optical neural network training with all phase shifters constrained to ≲ 0.2 radians without loss of accuracy.