Catalogue Search | MBRL
Explore the vast range of titles available.
5,498 result(s) for "neural network hardware"
Plasticity in memristive devices for spiking neural networks
2015
Memristive devices present a new device technology allowing for the realization of compact non-volatile memories. Some of them are already in the process of industrialization. Additionally, they exhibit complex multilevel and plastic behaviors, which make them good candidates for the implementation of artificial synapses in neuromorphic engineering. However, memristive effects rely on diverse physical mechanisms, and their plastic behaviors differ strongly from one technology to another. Here, we present measurements performed on different memristive devices and the opportunities that they provide. We show that they can be used to implement different learning rules whose properties emerge directly from device physics: real-time or accelerated operation, deterministic or stochastic behavior, long-term or short-term plasticity. We then discuss how such devices might be integrated into a complete architecture. These results highlight that there is no unique way to exploit memristive devices in neuromorphic systems. Understanding and embracing device physics is the key to their optimal use.
Journal Article
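The learning rules this abstract alludes to emerge from device physics and are not spelled out here, but a common software-level reference point is pair-based spike-timing-dependent plasticity (STDP). A minimal sketch, with all constants hypothetical rather than taken from the paper:

```python
import numpy as np

# Pair-based STDP sketch (constants are illustrative assumptions):
# a pre-before-post spike pair potentiates the synapse, post-before-pre depresses it.
A_PLUS, A_MINUS = 0.01, 0.012   # learning-rate amplitudes
TAU = 20e-3                     # plasticity time constant (s)
W_MIN, W_MAX = 0.0, 1.0         # conductance bounds of the device

def stdp_update(w, t_pre, t_post):
    """Return the new weight after one pre/post spike pair."""
    dt = t_post - t_pre
    if dt >= 0:   # pre before post -> long-term potentiation
        dw = A_PLUS * np.exp(-dt / TAU)
    else:         # post before pre -> long-term depression
        dw = -A_MINUS * np.exp(dt / TAU)
    return float(np.clip(w + dw, W_MIN, W_MAX))

w = 0.5
w = stdp_update(w, t_pre=0.000, t_post=0.005)   # potentiation
w = stdp_update(w, t_pre=0.010, t_post=0.004)   # depression
```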
Automated design of error-resilient and hardware-efficient deep neural networks
by Ascheid, Gerd; Vogel, Sebastian; Elsken, Thomas
in Algorithms; Artificial Intelligence; Artificial neural networks
2020
Applying deep neural networks (DNNs) in mobile and safety-critical systems, such as autonomous vehicles, demands reliable and efficient execution on hardware. The design of the neural architecture has a large influence on the achievable efficiency and bit error resilience of the network on hardware. Since there are numerous design choices for the architecture of DNNs, with partially opposing effects on the desired characteristics (such as small error rates at low latency), multi-objective optimization strategies are necessary. In this paper, we develop an evolutionary optimization technique for the automated design of hardware-optimized DNN architectures. For this purpose, we derive a set of inexpensively computable objective functions, which enable the fast evaluation of DNN architectures with respect to their hardware efficiency and error resilience. We observe a strong correlation between predicted error resilience and actual measurements obtained from fault injection simulations. Furthermore, we analyze two different quantization schemes for efficient DNN computation and find that one provides significantly higher error resilience than the other. Finally, a comparison of the architectures produced by our algorithm with the popular MobileNetV2 and NASNet-A models reveals up to seven times better bit error resilience in our models. We are the first to combine error resilience, efficiency, and performance optimization in a neural architecture search framework.
Journal Article
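The abstract does not list its objective functions, but the multi-objective selection it describes can be illustrated with a standard Pareto-dominance filter over per-architecture estimates. A sketch with invented candidates and objectives (error rate, latency, fault-injection error), not the paper's data:

```python
# Pareto-dominance filter over cheaply estimated objectives: keep only
# candidate architectures that no other candidate beats on every objective.
from typing import List, Tuple

Candidate = Tuple[str, float, float, float]  # (name, error, latency_ms, fault_error)

def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on all objectives and better on one."""
    return all(x <= y for x, y in zip(a[1:], b[1:])) and any(
        x < y for x, y in zip(a[1:], b[1:]))

def pareto_front(pop: List[Candidate]) -> List[Candidate]:
    return [c for c in pop if not any(dominates(o, c) for o in pop if o is not c)]

pop = [("net-A", 0.08, 12.0, 0.02), ("net-B", 0.10, 9.0, 0.01),
       ("net-C", 0.09, 13.0, 0.03)]   # net-C is dominated by net-A
print(pareto_front(pop))              # net-A and net-B survive
```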
The promise of training deep neural networks on CPUs: A survey
This survey presents a comprehensive analysis of the potential benefits and challenges of training deep neural networks (DNNs) on CPUs, summarizing existing research in the field. Five distinct DNN models are examined: Ternary Neural Networks (TNNs), Binary Neural Networks (BNNs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and a novel method called the Sub-Linear Deep Learning Engine (SLIDE), specifically designed for CPU-based network training. The survey emphasizes the advantages of using CPUs for DNN training, such as low cost, compact size, and broad applicability across various domains. It also collects concerns related to CPU acceleration, including the absence of a unified programming model and inefficiencies in DNN training due to increased floating-point operations. The survey explores algorithmic and hardware optimization strategies that tackle these issues, incorporating compressed network structures, innovative techniques like SLIDE, and the RISC-V instruction set. According to the survey, CPUs are likely to become a viable alternative for developers with limited resources. Through continued algorithm optimization and hardware enhancements, CPUs can provide more cost-efficient neural network training solutions, excelling in areas such as mobile servers and edge computing.
Journal Article
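SLIDE, as published elsewhere, achieves sub-linear cost by using locality-sensitive hashing to activate only a small subset of neurons per input. The sketch below illustrates that idea with a signed-random-projection hash; all sizes and the hashing scheme are assumptions for illustration, not details from this survey:

```python
import numpy as np

# SLIDE-style neuron sampling sketch: hash neurons by the sign pattern of
# random projections of their weight vectors, then compute only the neurons
# that share the input's hash bucket instead of the whole layer.
rng = np.random.default_rng(0)
D, N, BITS = 64, 1024, 8
W = rng.standard_normal((N, D))          # layer weights, one row per neuron
planes = rng.standard_normal((BITS, D))  # shared random hyperplanes

def lsh_bucket(v):
    return tuple((planes @ v > 0).astype(int))

buckets = {}
for j in range(N):
    buckets.setdefault(lsh_bucket(W[j]), []).append(j)

x = rng.standard_normal(D)
active = buckets.get(lsh_bucket(x), [])
activations = {j: float(W[j] @ x) for j in active}  # only active neurons computed
print(len(active), "of", N, "neurons computed")
```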
Charge-trap synaptic device with polycrystalline silicon channel for low power in-memory computing
2024
Processing-in-memory (PIM) is gaining tremendous research and commercial interest because of its potential to overcome the von Neumann bottleneck in current computing architectures. In this study, we implemented a PIM hardware architecture (circuit) based on charge-trap flash (CTF) as a synaptic device. The PIM circuit with a CT memory performed exceedingly well, reducing the inference energy in the synapse array. To evaluate image recognition accuracy, a Visual Geometry Group (VGG)-8 neural network was trained on the Canadian Institute for Advanced Research (CIFAR)-10 dataset for off-chip learning applications. In addition to the system accuracy for neuromorphic applications, the energy efficiency, computing efficiency, and latency were closely investigated in the presumed integrated PIM architecture. The simulations incorporated cycle-to-cycle device variations, synaptic array size, and technology node scaling, along with other hardware-level considerations.
Journal Article
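One way to picture the cycle-to-cycle variation study is to perturb quantized weights with per-cycle noise and compare the result against the ideal output. The level count and noise magnitude below are hypothetical, not the paper's measured values:

```python
import numpy as np

# Hardware-aware evaluation sketch: off-chip-trained weights are snapped to a
# symmetric grid of discrete conductance steps, then each programmed value is
# perturbed by Gaussian cycle-to-cycle noise before the matrix-vector product.
rng = np.random.default_rng(1)
STEPS, SIGMA = 8, 0.03        # quantization steps per sign; relative c2c noise

def program(w):
    """Quantize trained weights to device steps, then perturb for this cycle."""
    w_max = np.abs(w).max()
    q = np.round(w / w_max * STEPS) / STEPS * w_max
    return q * (1 + rng.normal(0.0, SIGMA, size=w.shape))

w = rng.standard_normal((128, 64)) * 0.1
x = rng.standard_normal(64)
y_ideal, y_hw = w @ x, program(w) @ x
print("relative output error:",
      np.linalg.norm(y_hw - y_ideal) / np.linalg.norm(y_ideal))
```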
Flying Free: A Research Overview of Deep Learning in Drone Navigation Autonomy
by Mckeever, Susan; Lee, Thomas; Courtney, Jane
in Accuracy; artificial intelligence; artificial neural networks
2021
With the rise of Deep Learning approaches in computer vision applications, significant strides have been made towards vehicular autonomy. Research activity in autonomous drone navigation has increased rapidly in the past five years, and drones are moving fast towards the ultimate goal of near-complete autonomy. However, while much work in the area focuses on specific tasks in drone navigation, the contribution to the overall goal of autonomy is often not assessed, and a comprehensive overview is needed. In this work, a taxonomy of drone navigation autonomy is established by mapping the definitions of vehicular autonomy levels, as defined by the Society of Automotive Engineers, to specific drone tasks in order to create a clear definition of autonomy when applied to drones. A top–down examination of research work in the area is conducted, focusing on drone navigation tasks, in order to understand the extent of research activity in each area. Autonomy levels are cross-checked against the drone navigation tasks addressed in each work to provide a framework for understanding the trajectory of current research. This work serves as a guide to research in drone autonomy, with a particular focus on Deep Learning-based solutions, indicating key works and areas of opportunity for future development of the field.
Journal Article
Study of Quantized Hardware Deep Neural Networks Based on Resistive Switching Devices, Conventional versus Convolutional Approaches
by Romero-Zaliz, Rocío; Jiménez-Molinos, Francisco; Roldán, Juan B.
in Algorithms; Arrays; Artificial intelligence
2021
A comprehensive analysis of two types of artificial neural networks (ANN) is performed to assess the influence of quantization on the synaptic weights. Conventional multilayer perceptron (MLP) and convolutional neural network (CNN) architectures have been considered, varying their features in the training and inference contexts: the number of levels in the quantization process, the number of hidden layers in the network topology, the number of neurons per hidden layer, the image databases, the number of convolutional layers, etc. A reference technology based on 1T1R structures with bipolar memristors including HfO2 dielectrics was employed, accounting for different multilevel schemes and the corresponding conductance quantization algorithms. The accuracy of the image recognition processes was studied in depth. Studies of this type are essential prior to the hardware implementation of neural networks. The obtained results support the use of CNNs for image domains, which is linked to the role played by convolutional layers in extracting image features and reducing data complexity. In this case, the number of synaptic weights can be reduced in comparison to MLPs.
Journal Article
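The quantization step under study can be illustrated by snapping trained weights onto a uniform grid of conductance levels and watching the representation error grow as levels are removed. A minimal sketch with illustrative sizes, not the paper's networks:

```python
import numpy as np

# Uniform weight quantization sketch: map each trained synaptic weight to the
# nearest of n admissible conductance levels spanning the weight range.
def quantize_weights(w, n_levels):
    lo, hi = w.min(), w.max()
    grid = np.linspace(lo, hi, n_levels)        # admissible conductance states
    idx = np.abs(w[..., None] - grid).argmin(-1)
    return grid[idx]

rng = np.random.default_rng(2)
w = rng.standard_normal((256, 128)) * 0.05      # illustrative trained weights
for n in (2, 4, 8, 16):
    err = np.abs(quantize_weights(w, n) - w).mean()
    print(f"{n:>2} levels -> mean abs weight error {err:.4f}")
```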
Low-temperature emergent neuromorphic networks with correlated oxide devices
by Zaluzhnyy, Ivan A.; Dynes, Robert C.; Goteti, Uday S.
in Applied Physical Sciences; Arrays; Axons
2021
Neuromorphic computing—which aims to mimic the collective and emergent behavior of the brain’s neurons, synapses, axons, and dendrites—offers an intriguing, potentially disruptive solution to society’s ever-growing computational needs. Although much progress has been made in designing circuit elements that mimic the behavior of neurons and synapses, challenges remain in designing networks of elements that feature a collective response behavior. We present simulations of networks of circuits and devices based on superconducting and Mott-insulating oxides that display a multiplicity of emergent states that depend on the spatial configuration of the network. Our proposed network designs are based on experimentally known ways of tuning the properties of these oxides using light ions. We show how neuronal and synaptic behavior can be achieved with arrays of superconducting Josephson junction loops, all within the same device. We also show how a multiplicity of synaptic states could be achieved by designing arrays of devices based on hydrogenated rare earth nickelates. Together, our results demonstrate a research platform that utilizes the collective macroscopic properties of quantum materials to mimic the emergent behavior found in biological systems.
Journal Article
Efficient parallel implementation of reservoir computing systems
by Skibinsky-Gitlin, Erik S.; Canals, Vincent; Roca, Miquel
in Artificial Intelligence; Computational Biology/Bioinformatics; Computational Science and Engineering
2020
Reservoir computing (RC) is a powerful machine learning methodology well suited for time-series processing. The hardware implementation of RC systems (HRC) may extend the utility of this neural approach to solve real-life problems for which software solutions are not satisfactory. Nevertheless, the implementation of massively parallel-connected reservoir networks is costly in terms of circuit area and power, mainly due to the synapse multipliers required, which increase gate count to prohibitive values. Most HRC systems in the literature solve this area problem by serializing the processing, thus losing the fault tolerance and low latency expected of fully parallel-connected HRCs. Therefore, the development of new methodologies to implement fully parallel HRC systems is of high interest for the many computational intelligence applications requiring quick responses. In this article, we propose a compact hardware implementation for Echo State Networks (a specific type of reservoir) that reduces the area cost by simplifying the synapses and using piecewise-linear activation functions for the neurons. The proposed design is synthesized on a Field-Programmable Gate Array and evaluated on different time-series prediction tasks. Without compromising overall accuracy, the proposed approach achieves significant savings in power and hardware when compared with recently published implementations. This technique paves the way for the low-power implementation of fully parallel reservoir networks containing thousands of neurons in a single integrated circuit.
Journal Article
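A conventional Echo State Network update with the hardware-friendly substitution the abstract describes, a piecewise-linear (clipped) activation in place of tanh, can be sketched as follows. Reservoir size, spectral radius, and the ridge readout are illustrative assumptions:

```python
import numpy as np

# Echo State Network sketch with a piecewise-linear neuron (clip instead of tanh).
rng = np.random.default_rng(3)
N_IN, N_RES = 1, 100
W_in = rng.uniform(-0.5, 0.5, (N_RES, N_IN))
W = rng.uniform(-0.5, 0.5, (N_RES, N_RES))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # scale spectral radius below 1

def step(x, u):
    """One reservoir update; clip() is the piecewise-linear activation."""
    return np.clip(W @ x + W_in @ u, -1.0, 1.0)

u_seq = np.sin(np.linspace(0, 8 * np.pi, 200))[:, None]   # toy time series
x, states = np.zeros(N_RES), []
for u in u_seq:
    x = step(x, u)
    states.append(x)
X = np.array(states)

# Ridge-regression readout predicting the next input sample
y = u_seq[1:, 0]
W_out = np.linalg.solve(X[:-1].T @ X[:-1] + 1e-6 * np.eye(N_RES), X[:-1].T @ y)
print("train MSE:", np.mean((X[:-1] @ W_out - y) ** 2))
```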
A Configurable Event-Driven Convolutional Node with Rate Saturation Mechanism for Modular ConvNet Systems Implementation
by Linares-Barranco, Alejandro; Linares-Barranco, Bernabé; Camuñas-Mesa, Luis A.
in Address Event Representation (AER); Brain architecture; convolutional neural networks
2018
Convolutional Neural Networks (ConvNets) are a particular type of neural network often used for applications like image recognition, video analysis, or natural language processing. They are inspired by the human brain, following a specific organization of the connectivity pattern between layers of neurons known as the receptive field. These networks have traditionally been implemented in software, but they become more computationally expensive as they scale up, limiting real-time processing of high-speed stimuli. Hardware implementations, on the other hand, are difficult to reuse across applications because of their reduced flexibility. In this paper, we propose a fully configurable event-driven convolutional node with a rate saturation mechanism that can be used to implement arbitrary ConvNets on FPGAs. This node includes a convolutional processing unit and a routing element, which allows large 2D arrays to be built in which any multilayer structure can be implemented. The rate saturation mechanism emulates the refractory behavior of biological neurons, guaranteeing a minimum separation in time between consecutive events. A 4-layer ConvNet with 22 convolutional nodes trained for poker card symbol recognition has been implemented on a Spartan6 FPGA. This network has been tested with a stimulus in which 40 poker cards were observed by a Dynamic Vision Sensor (DVS) within 1 s. Different slow-down factors were applied to characterize the behavior of the system for high-speed processing. For slow stimulus play-back, a 96% recognition rate is obtained with a power consumption of 0.85 mW. At maximum play-back speed, a traffic control mechanism downsamples the input stimulus, and a recognition rate above 63% is obtained even when less than 20% of the input events are processed, demonstrating the robustness of the network.
Journal Article
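The rate saturation mechanism can be pictured as a per-neuron refractory check on the event stream: events arriving too soon after a neuron's last output are dropped. The refractory period below is hypothetical, not the value used on the FPGA:

```python
# Refractory (rate-saturation) sketch: each output neuron ignores events that
# arrive within a refractory period of its last emitted event, enforcing a
# minimum time separation between consecutive output events.
REFRACTORY_US = 1000          # hypothetical minimum separation (microseconds)

last_spike = {}               # (row, col) -> timestamp of last output event

def emit(event, t_us):
    """Return True if the neuron may fire, enforcing the refractory period."""
    key = (event["row"], event["col"])
    if t_us - last_spike.get(key, -REFRACTORY_US) >= REFRACTORY_US:
        last_spike[key] = t_us
        return True
    return False              # event suppressed: output rate saturates

stream = [({"row": 3, "col": 7}, t) for t in (0, 400, 1200, 1500, 2300)]
print([emit(e, t) for e, t in stream])   # [True, False, True, False, True]
```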
An Analysis on the Architecture and the Size of Quantized Hardware Neural Networks Based on Memristors
by Jimenez-Molinos, Francisco; Romero-Zaliz, Rocio; Perez, Eduardo
in Accuracy; Algorithms; Arrays
2021
We have performed different simulation experiments on hardware neural networks (NN) to analyze the role of the number of synapses in network accuracy for different NN architectures and datasets. A technology based on 4-kbit 1T1R ReRAM arrays, employing resistive switching devices with HfO2 dielectrics, is taken as a reference. In our study, fully dense (FdNN) and convolutional neural networks (CNN) were considered, and the NN size, in terms of the number of synapses and of hidden-layer neurons, was varied. CNNs work better when the number of synapses available is limited. When quantized synaptic weights are included, we observed that NN accuracy decreases significantly as the number of synapses is reduced; in this respect, a trade-off between the number of synapses and NN accuracy has to be achieved. Consequently, the CNN architecture must be carefully designed; in particular, we noticed that different datasets need specific architectures, according to their complexity, to achieve good results. Given the number of variables that can be changed in the optimization of a NN hardware implementation, a specific solution has to be worked out in each case in terms of synaptic weight levels, NN architecture, etc.
Journal Article
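The synapse-count trade-off behind these results can be seen with back-of-envelope arithmetic: a convolutional layer shares its kernel weights across the image, while a dense layer connects every input to every neuron. The layer shapes below are illustrative, not the paper's networks:

```python
# Synapse-count comparison for one layer on a 28x28 single-channel image.
H = W = 28
dense_synapses = (H * W) * 128     # 784 inputs fully connected to 128 neurons
conv_synapses = 3 * 3 * 1 * 16     # 16 filters of 3x3 on 1 input channel
print(dense_synapses, "dense vs", conv_synapses, "conv synapses")
# -> 100352 dense vs 144 conv: why CNNs suit synapse-limited memristor arrays
```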