Catalogue Search | MBRL

A Survey of Convolutional Neural Networks on Edge with Reconfigurable Computing

by Véstias, Mário P. in Algorithms , Architecture , Artificial neural networks

2019

The convolutional neural network (CNN) is one of the most used deep learning models for image detection and classification, due to its high accuracy when compared to other machine learning algorithms. CNNs achieve better results at the cost of higher computing and memory requirements. Inference of convolutional neural networks is therefore usually done in centralized high-performance platforms. However, many applications based on CNNs are migrating to edge devices near the source of data because of the unreliability of the transmission channel used to exchange data with a central server, uncertain channel latency that many applications cannot tolerate, and security and data privacy concerns. While advantageous, deep learning on edge is quite challenging because edge devices are usually limited in terms of performance, cost, and energy. Reconfigurable computing is being considered for inference on edge due to its high performance and energy efficiency while keeping a high hardware flexibility that allows for the easy adaptation of the target computing platform to the CNN model. In this paper, we describe the features of the most common CNNs, the capabilities of reconfigurable computing for running CNNs, the state of the art of reconfigurable computing implementations proposed to run CNN models, as well as the trends and challenges for future edge reconfigurable platforms.

Journal Article

A Review of Synthetic-Aperture Radar Image Formation Algorithms and Implementations: A Computational Perspective

by Neto, Horácio , Monteiro, José , Cruz, Helena in Aircraft , Algorithms , Antennas

2022

Designing synthetic-aperture radar image formation systems can be challenging due to the numerous options of algorithms and devices that can be used. There are many SAR image formation algorithms, such as backprojection, matched-filter, polar format, Range–Doppler and chirp scaling algorithms. Each algorithm presents its own advantages and disadvantages in terms of efficiency and image quality; thus, we aim to introduce some of the most common SAR image formation algorithms and compare them based on these two aspects. Depending on the requirements of each individual system and implementation, there are many device options to choose from, for instance, FPGAs, GPUs, CPUs, many-core CPUs, and microcontrollers. We present a review of the state of the art of SAR imaging system implementations. We also compare such implementations in terms of power consumption, execution time, and image quality for the different algorithms used.

Journal Article

Intelligent Traffic Control Strategies for VLC-Connected Vehicles and Pedestrian Flow Management

by Vieira, Manuel Augusto , Galvão, Gonçalo , Louro, Paula in autonomous vehicles , Communication , Control systems

2025

Urban traffic congestion leads to daily delays, driven by outdated, rigid control systems. As vehicle numbers grow, fixed-phase signals struggle to adapt to real-time conditions. This work presents a decentralized Multi-Agent Reinforcement Learning (MARL) system to manage a traffic cell composed of five intersections, introducing the novel Strategic Anti-Blocking Phase Adjustment (SAPA) module, developed to enable dynamic phase time adjustments. The goal is to optimize arterial traffic flow by adapting strategies to different traffic generation patterns, simulating priority movements along circular or radial arterials, such as inbound or outbound city flows. The system aims to manage diverse scenarios within a cell, with the long-term goal of scaling to city-wide networks. A Visible Light Communication (VLC) infrastructure is integrated to support real-time data exchange between vehicles and infrastructure, capturing vehicle position, speed, and pedestrian presence at intersections. The system is evaluated through multiple performance metrics, showing promising results: reduced vehicle queues and waiting times, increased average speeds, and improved pedestrian safety and overall flow management. These outcomes demonstrate the system’s potential to deliver adaptive, intelligent traffic control for complex urban environments.

Journal Article

Fast and Accurate System for Onboard Target Recognition on Raw SAR Echo Data

by Flores, Paulo , Jacinto, Gustavo , Duarte, Rui Policarpo in Accuracy , Airborne/spaceborne computers , Algorithms

2025

Synthetic Aperture Radar (SAR) onboard satellites provides high-resolution Earth imaging independent of weather conditions. SAR data are acquired by an aircraft or satellite and sent to a ground station to be processed. However, for novel applications requiring real-time analysis and decisions, onboard processing is necessary to escape the limited downlink bandwidth and latency. One such application is real-time target recognition, which has emerged as a decisive operation in areas such as defense and surveillance. In recent years, deep learning models have improved the accuracy of target recognition algorithms. However, these are based on optical image processing and are computationally and memory intensive, which requires not only processing the SAR pulse data but also optimized models and architectures for efficient deployment in onboard computers. This paper presents a fast and accurate target recognition system that operates directly on raw SAR data using a neural network model. This network receives and processes SAR echo data for fast processing, alleviating the need for computationally expensive DSP image generation algorithms such as Backprojection and Range-Doppler. This allows the use of simpler and faster models while maintaining accuracy. The system was designed, optimized, and tested on low-cost embedded devices with low size, weight, and energy requirements (Khadas VIM3 and Raspberry Pi 5). Results demonstrate that the proposed solution achieves a target classification accuracy for the MSTAR dataset close to 100% in less than 1.5 ms and 5.5 W of power.

Journal Article

Intelligent Sports Weights

by Jacinto, Gustavo , Duarte, Olga dos Santos , Policarpo Duarte, Rui in Accuracy , Algorithms , Classification

2025

Weightlifting is a common fitness activity and can be practiced individually without supervision. However, performing regular weightlifting exercises without any form of feedback can lead to serious injuries. To counter this, this work proposes a different approach to automatic weightlifting supervision off-the-person. The proposed embedded system is coupled to the weights and evaluates in real time whether they follow the correct trajectory. The system is based on a low-power embedded System-on-a-Chip that classifies the correctness of physical exercises using a Convolutional Neural Network with data from the embedded IMU. It is a low-cost solution and can be adapted to the characteristics of specific exercises to fine-tune the performance of the athlete. Experimental results show real-time monitoring capability with an average accuracy close to 95%. To favor its use, the prototypes have been enclosed in a custom 3D case and validated in an operational environment. All research outputs, developments, and engineering models are publicly available.

Journal Article

Enhancing Airport Traffic Flow: Intelligent System Based on VLC, Rerouting Techniques, and Adaptive Reward Learning

by Vieira, Manuel Augusto , Louro, Paula , Fantoni, Alessandro in adaptive reward mechanisms , Airports , Artificial intelligence

2025

Airports are complex environments where efficient localization and intelligent traffic management are essential for ensuring smooth navigation and operational efficiency for both pedestrians and Autonomous Guided Vehicles (AGVs). This study presents an Artificial Intelligence (AI)-driven airport traffic management system that integrates Visible Light Communication (VLC), rerouting techniques, and adaptive reward mechanisms to optimize traffic flow, reduce congestion, and enhance safety. VLC-enabled luminaires serve as transmission points for location-specific guidance, forming a hybrid mesh network based on tetrachromatic LEDs with On-Off Keying (OOK) modulation and SiC optical receivers. AI agents, driven by Deep Reinforcement Learning (DRL), continuously analyze traffic conditions, apply adaptive rewards to improve decision-making, and dynamically reroute agents to balance traffic loads and avoid bottlenecks. Traffic states are encoded and processed through Q-learning algorithms, enabling intelligent phase activation and responsive control strategies. Simulation results confirm that the proposed system enables more balanced green time allocation, with reductions of up to 43% in vehicle-prioritized phases (e.g., Phase 1 at C1) to accommodate pedestrian flows. These adjustments lead to improved route planning, reduced halting times, and enhanced coordination between AGVs and pedestrian traffic across multiple intersections. Additionally, traffic flow responsiveness is preserved, with critical clearance phases maintaining stability or showing slight increases despite pedestrian prioritization. The system also enables accurate indoor localization without relying on a Global Positioning System (GPS), supporting seamless movement and operational optimization. By combining VLC, adaptive AI models, and rerouting strategies, the proposed approach contributes to safer, more efficient, and human-centered airport mobility.
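
The abstract mentions Q-learning without giving details. As a rough illustration only, the textbook tabular Q-learning update can be sketched as follows. The state and action names, the learning rate `alpha`, and the discount `gamma` are hypothetical placeholders, and the paper's DRL agents may differ substantially from this tabular form.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one textbook tabular Q-learning step and return the new Q-value.

    q maps (state, action) pairs to estimated returns; unseen pairs start at 0.0.
    alpha and gamma are illustrative values, not taken from the paper.
    """
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical traffic-signal actions, used only for illustration.
ACTIONS = ["extend_green", "switch_phase"]
q_table = defaultdict(float)
q_update(q_table, "queue_low", "extend_green", 1.0, "queue_high", ACTIONS)
```

In such a scheme the reward would encode quantities like queue length or waiting time; the adaptive reward mechanism described in the abstract is not modeled here.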

Journal Article

Decimal Multiplication in FPGA with a Novel Decimal Adder/Subtractor

by Neto, Horácio , Véstias, Mário in Algorithms , Arithmetic and logic units , decimal adder parallel multiplication

2021

Financial and commercial data are mostly represented in decimal format. To avoid errors introduced when converting some decimal fractions to binary, these data are processed with decimal arithmetic. Most processors only have hardwired binary arithmetic units, so decimal operations are executed with slow software-based decimal arithmetic functions. For the fast execution of decimal operations, dedicated hardware units have been proposed and designed in FPGA. Decimal multiplication is found in most decimal-based applications, so its optimized design is very important for fast execution. In this paper, two new parallel decimal multipliers in FPGA are proposed. These are based on a new decimal adder/subtractor also proposed in this paper. The new decimal multipliers improve on state-of-the-art parallel decimal multipliers. Compared to previous architectures, implementation results show that the proposed multipliers achieve 26% better area and 12% better performance. Also, the new decimal multipliers reduce the area and performance gap to binary multipliers and are smaller for 32-digit operands.
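
To make the idea of digit-wise decimal arithmetic concrete, here is a minimal software sketch of the classic binary-coded decimal (BCD) addition with the "+6" correction. It is not the novel adder/subtractor proposed in the paper, whose design the abstract does not describe; the function name and the digit-list representation are assumptions chosen for illustration.

```python
def bcd_add(a_digits, b_digits):
    """Add two decimal operands given as equal-length digit lists, least significant digit first.

    Each digit is treated as a 4-bit BCD code. Returns the sum digits, least
    significant first, with the final carry appended as the last element.
    """
    if len(a_digits) != len(b_digits):
        raise ValueError("operands must have the same number of digits")
    result = []
    carry = 0
    for a, b in zip(a_digits, b_digits):
        s = a + b + carry  # plain binary sum of two BCD digits, range 0..19
        if s > 9:
            # Adding 6 skips the six unused 4-bit codes (10..15); the low
            # four bits are the decimal digit and a carry goes to the next digit.
            s = (s + 6) & 0xF
            carry = 1
        else:
            carry = 0
        result.append(s)
    result.append(carry)
    return result


# 99 + 1 = 100, written least significant digit first.
print(bcd_add([9, 9], [1, 0]))
```

The motivation stated in the abstract can be seen directly in binary floating point, where `0.1 + 0.2 == 0.3` evaluates to `False`; digit-wise decimal arithmetic avoids that conversion error.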

Journal Article

Energy-Efficient and Real-Time Wearable for Wellbeing-Monitoring IoT System Based on SoC-FPGA

by Frutuoso, Maria , Neto, Horácio , Duarte, Rui in Algorithms , Biometrics , Blood

2023

Wearable devices used for personal monitoring applications have improved over the last decades. However, these devices are limited in terms of size, processing capability and power consumption. This paper proposes an efficient hardware/software embedded system for monitoring bio-signals in real time, including a heart rate calculator using PPG and an emotion classifier from EEG. The system is suitable for outpatient clinic applications requiring data transfers to external medical staff. The proposed solution offers an effective alternative to the traditional approach of processing bio-signals offline: a SoC-FPGA based system that is able to fully process the signals locally at the node. Two sub-systems were developed targeting a Zynq 7010 device and integrating custom hardware IP cores that accelerate the processing of the most complex tasks. The PPG sub-system implements an autocorrelation peak detection algorithm to calculate heart rate values. The EEG sub-system consists of a KNN emotion classifier of preprocessed EEG features. This work overcomes the processing limitations of microcontrollers and general-purpose units, presenting a scalable and autonomous wearable solution with high processing capability and real-time response.
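
As a rough software illustration of autocorrelation peak detection for heart rate, the sketch below picks the lag with the strongest autocorrelation inside a plausible heart-rate range and converts it to beats per minute. The function name, the sampling rate, and the 40 to 200 bpm search window are assumptions; the paper's hardware IP core is not described in the abstract and may work differently.

```python
import math


def heart_rate_bpm(signal, fs, min_bpm=40.0, max_bpm=200.0):
    """Estimate heart rate from a PPG-like signal sampled at fs Hz.

    Searches lags corresponding to min_bpm..max_bpm (assumed bounds) and
    returns the rate implied by the lag with the largest autocorrelation.
    """
    n = len(signal)
    min_lag = int(fs * 60.0 / max_bpm)           # shortest beat period in samples
    max_lag = min(int(fs * 60.0 / min_bpm), n - 1)
    if min_lag < 1 or max_lag < min_lag:
        raise ValueError("signal too short or sampling rate too low")
    mean = sum(signal) / n
    x = [v - mean for v in signal]               # remove the DC component
    best_lag, best_val = min_lag, -math.inf
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        r = sum(x[i] * x[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return 60.0 * fs / best_lag


# Synthetic 72 bpm test tone (1.2 Hz) sampled at 50 Hz for 10 seconds.
fs = 50.0
ppg = [math.sin(2.0 * math.pi * 1.2 * i / fs) for i in range(500)]
print(round(heart_rate_bpm(ppg, fs), 1))
```

The resolution of this estimate is limited by the integer lag, so a higher sampling rate or peak interpolation would be needed for finer results.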

Journal Article

Moving Deep Learning to the Edge

by Neto, Horácio C. , Duarte, Rui Policarpo , Véstias, Mário P. in artificial intelligence , deep learning , deep neural network

2020

Deep learning is now present in a wide range of services and applications, replacing and complementing other machine learning algorithms. Performing training and inference of deep neural networks using the cloud computing model is not viable for applications where low latency is required. Furthermore, the rapid proliferation of the Internet of Things will generate a large volume of data to be processed, which will soon overload the capacity of cloud servers. One solution is to process the data at the edge devices themselves, in order to alleviate cloud server workloads and improve latency. However, edge devices are less powerful than cloud servers, and many are subject to energy constraints. Hence, new resource and energy-oriented deep learning models are required, as well as new computing platforms. This paper reviews the main research directions for edge computing deep learning algorithms.

Journal Article

Smart Embedded System for Skin Cancer Classification

by Durães, Pedro F. , Véstias, Mário P. in Accuracy , Algorithms , Analysis

2023

The very good results achieved with recent algorithms for image classification based on deep learning have enabled new applications in many domains. The medical field is one that can greatly benefit from these algorithms, which can help medical professionals reach a diagnosis. In particular, portable devices for medical image classification are useful in scenarios where a full analysis system is not an option or is difficult to obtain. Algorithms based on deep learning models are computationally demanding; therefore, it is difficult to run them on low-cost devices with low energy consumption and high efficiency. In this paper, a low-cost system is proposed to classify skin cancer images. Two approaches were followed to achieve a fast and accurate system. At the algorithmic level, a cascade inference technique was considered, where two models were used for inference. At the architectural level, the deep learning processing unit from Vitis-AI was considered in order to design very efficient accelerators in FPGA. The dual model was trained and implemented for skin cancer detection in a ZYNQ UltraScale+ MPSoC ZCU104 evaluation kit with a ZU7EV device. The core was integrated in a full system-on-chip solution and tested with the HAM10000 dataset. It achieves a performance of 13.5 FPS with an accuracy of 87%, with only 33k LUTs, 80 DSPs, 70 BRAMs and 1 URAM.
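
A common way to build a two-model cascade is to run a small, fast model first and fall back to a larger model only when the first one is not confident. The sketch below assumes that confidence-threshold rule; the abstract only states that two models were used, so the rule, the threshold value, and all names here are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# A model is any callable mapping a sample to a sequence of class probabilities.
Model = Callable[[object], Sequence[float]]


def cascade_predict(
    sample: object,
    small_model: Model,
    large_model: Model,
    threshold: float = 0.9,
) -> Tuple[int, str]:
    """Return (predicted class index, name of the model that decided)."""
    probs = small_model(sample)
    label = max(range(len(probs)), key=lambda i: probs[i])
    if probs[label] >= threshold:
        # The small model is confident enough, so the large model is skipped.
        return label, "small"
    # Otherwise pay the cost of the larger, more accurate model.
    probs = large_model(sample)
    label = max(range(len(probs)), key=lambda i: probs[i])
    return label, "large"


# Stand-in models returning fixed probabilities, for illustration only.
confident_small = lambda s: [0.05, 0.95]
unsure_small = lambda s: [0.55, 0.45]
large = lambda s: [0.2, 0.8]

print(cascade_predict("img", confident_small, large))
print(cascade_predict("img", unsure_small, large))
```

The threshold trades speed for accuracy: a higher value sends more samples to the large model.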

Journal Article
