Catalogue Search | MBRL

Eight-Bit Vector SoftFloat Extension for the RISC-V Spike Simulator

by Marcelli, Andrea; Mastrandrea, Antonio; Menichelli, Francesco in Accuracy, Array processors, Artificial intelligence

2025

The recent demand for 8-bit floating-point (FP) formats is driven by their potential to accelerate domain-specific applications with intensive vector computations (e.g., machine learning, graphics, and data compression). This paper presents the design, implementation, and application of a software model of an 8-bit FP vector arithmetic operation set, compliant with the RISC-V vector instruction set architecture. The model has been developed as an extension of the SoftFloat library and integrated into the RISC-V reference instruction-level simulator Spike, providing the first open-source 8-bit SoftFloat extension for an instruction-set simulator. Based on the SoftFloat library templates for standard FP formats, the proposed extension implements the two widely used 8-bit formats E4M3 and E5M2 in both Open Compute Project (OCP) and IEEE 754 variants. In host-time micro-kernels, FP8 processes 2–4% more elements per second than FP32 (across vfadd/vfsub/vfmul) with ≈5% lower resident set size (RSS); E4M3 and E5M2 perform similarly. Enabling FP8 in Spike increases the stripped binary size by ~1.8% (mostly .text). The proposed extension was used to fully verify, and correct errors in, the vector FP unit design for the eProcessor European project, and it continues to be used to verify other 8-bit FP unit implementations.
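As a rough illustration of the two formats named in the abstract, the following sketch decodes a single byte under the OCP definitions of E5M2 (bias 15, IEEE-style infinities and NaNs) and E4M3 (bias 7, no infinities, a single NaN mantissa pattern). The function names are hypothetical, and this is not the SoftFloat extension's code; the IEEE 754 variants mentioned in the abstract are not covered.

```python
import math

def decode_e5m2(b: int) -> float:
    """Decode an OCP E5M2 byte: 1 sign, 5 exponent, 2 mantissa bits, bias 15."""
    sign = -1.0 if b & 0x80 else 1.0
    exp = (b >> 2) & 0x1F
    man = b & 0x03
    if exp == 0x1F:                       # all-ones exponent: infinity or NaN
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:                          # subnormal: no implicit leading 1
        return sign * (man / 4.0) * 2.0 ** (1 - 15)
    return sign * (1.0 + man / 4.0) * 2.0 ** (exp - 15)

def decode_e4m3(b: int) -> float:
    """Decode an OCP E4M3 byte: 1 sign, 4 exponent, 3 mantissa bits, bias 7."""
    sign = -1.0 if b & 0x80 else 1.0
    exp = (b >> 3) & 0x0F
    man = b & 0x07
    if exp == 0x0F and man == 0x07:       # the only NaN pattern; no infinities
        return math.nan
    if exp == 0:                          # subnormal: no implicit leading 1
        return sign * (man / 8.0) * 2.0 ** (1 - 7)
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)
```

The trade-off is visible in the largest finite values: E5M2 reaches 57344 with two mantissa bits, while E4M3 stops at 448 but carries three.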

Journal Article

Evaluation of Dynamic Triple Modular Redundancy in an Interleaved-Multi-Threading RISC-V Core

by Mastrandrea, Antonio; Ottavi, Marco; Menichelli, Francesco in Analysis, Architecture, Central processing units

2023

Functional safety is a key requirement in several application domains in which microprocessors are an essential part. A number of redundancy techniques have been developed with the common purpose of protecting circuits against single event upset (SEU) faults. In microprocessors, functional redundancy may be achieved through multi-core or simultaneous-multi-threading architectures, with techniques that are broadly classifiable as Double Modular Redundancy (DMR) and Triple Modular Redundancy (TMR), involving the duplication or triplication of architecture units, respectively. RISC-V plays an interesting role in this context for its inherent extendability and the availability of open-source microarchitecture designs. In this work, we present a novel way to exploit the advantages of both DMR and TMR techniques in an Interleaved-Multi-Threading (IMT) microprocessor architecture, leveraging its replicated threads for redundancy, and obtaining a system that can dynamically switch from DMR to TMR in the case of faults. We demonstrated the approach for a specific family of RISC-V cores, modifying the microarchitecture and proving its effectiveness with an extensive RTL fault-injection simulation campaign.
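The difference between the two redundancy schemes can be sketched in a few lines: two copies can only detect a disagreement, whereas three copies can outvote a single faulty one. The code below is a minimal software analogy with hypothetical function names, assuming a third copy is consulted only after a mismatch; it is not the microarchitectural mechanism of the paper.

```python
def dmr_mismatch(a: int, b: int) -> bool:
    """DMR: two copies reveal a fault (they differ) but not which copy is wrong."""
    return a != b

def tmr_vote(a: int, b: int, c: int) -> int:
    """TMR: bitwise majority of three copies masks a fault in any single copy."""
    return (a & b) | (a & c) | (b & c)

def dynamic_redundancy(copy_result):
    """Run two copies; only on a mismatch evaluate a third copy and vote.

    copy_result is a hypothetical callable mapping a copy index (0, 1, 2)
    to the value that redundant copy produced.
    """
    a, b = copy_result(0), copy_result(1)
    if not dmr_mismatch(a, b):
        return a, "DMR"
    return tmr_vote(a, b, copy_result(2)), "TMR"
```

For example, with copies producing 0b1010, 0b0010, and 0b1010, the first two disagree, the third is consulted, and the vote returns 0b1010.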

Journal Article

Fault-Tolerant Hardware Acceleration for High-Performance Edge-Computing Nodes

by Mastrandrea, Antonio; Barbirotta, Marcello; Angioli, Marco in Artificial intelligence, Cloud computing, Cost control

2023

High-performance embedded systems with powerful processors, specialized hardware accelerators, and advanced software techniques are all key technologies driving the growth of the IoT. By combining hardware and software techniques, it is possible to increase the overall reliability and safety of these systems by designing embedded architectures that can continue to function correctly in the event of a failure or malfunction. In this work, we fully investigate the integration of a configurable hardware vector acceleration unit in the fault-tolerant RISC-V Klessydra-fT03 soft core, introducing two different redundant vector co-processors coupled with the Interleaved-Multi-Threading paradigm on which the microprocessor is based. We then illustrate the pros and cons of both approaches, comparing their impacts on performance and hardware utilization with their vulnerability, presenting a quantitative large-fault-injection simulation analysis on typical vector computing benchmarks, and comparing and classifying the obtained results. The results demonstrate, under specific conditions, that it is possible to add a hardware co-processor to a fault-tolerant microprocessor, improving performance without degrading safety and reliability.

Journal Article

Customizable Vector Acceleration in Extreme-Edge Computing: A RISC-V Software/Hardware Architecture Study on VGG-16 Implementation

by Mastrandrea, Antonio; Menichelli, Francesco; Sordillo, Stefano in Algorithms, Artificial intelligence, Cloud computing

2021

Computing in the cloud-edge continuum, as opposed to cloud computing, relies on high performance processing on the extreme edge of the Internet of Things (IoT) hierarchy. Hardware acceleration is a mandatory solution to achieve the performance requirements, yet it can be tightly tied to particular computation kernels, even within the same application. Vector-oriented hardware acceleration has gained renewed interest to support artificial intelligence (AI) applications like convolutional networks or classification algorithms. We present a comprehensive investigation of the performance and power efficiency achievable by configurable vector acceleration subsystems, obtaining evidence of both the high potential of the proposed microarchitecture and the advantage of hardware customization in total transparency to the software program.

Journal Article

High-Level Side-Channel Attack Modeling and Simulation for Security-Critical Systems on Chips

by Trifiletti, A.; Menicocci, R.; Menichelli, F. in Algorithms, Case studies, Computer programs

2008

The design flow of a digital cryptographic device must take into account the evaluation of its security against attacks based on side-channel observation. The adoption of high-level countermeasures, as well as the verification of the feasibility of new attacks, presently requires time-consuming physical measurements on the prototype product or simulation at a low abstraction level. Starting from these assumptions, we developed an exploration approach centered on high-level simulation, in order to evaluate the actual implementation of a cryptographic algorithm, be it software- or hardware-based. The simulation is performed within a unified tool based on SystemC that can model, with cycle-accurate resolution, a software implementation running on a microprocessor-based architecture, a dedicated hardware implementation, or a mixed software-hardware implementation. Here we describe the tool and provide a large set of design explorations and characterizations based on actual implementations of the AES cryptographic algorithm, demonstrating how the large number of experiments made possible by the fast simulation engine can improve the understanding and identification of weaknesses in cryptographic algorithm implementations.

Journal Article

Klessydra-T: Designing Vector Coprocessors for Multi-Threaded Edge-Computing Cores

by Mastrandrea, Antonio; Menichelli, Francesco; Scotti, Giuseppe in Algorithms, Coprocessors, Edge computing

2021

Computation-intensive kernels, such as convolutions, matrix multiplication and Fourier transform, are fundamental to edge-computing AI, signal processing and cryptographic applications. Interleaved-Multi-Threading (IMT) processor cores are attractive for pursuing energy efficiency and low hardware cost in edge computing, yet they need hardware acceleration schemes to run heavy computational workloads. Following a vector approach to accelerate computations, this study explores possible alternatives for implementing vector coprocessing units in RISC-V cores, showing the synergy between IMT and data-level parallelism in the target workloads.

Paper

HD-CB: The First Exploration of Hyperdimensional Computing for Contextual Bandits Problems

by Barbirotta, Marcello; Rosato, Antonello; Angioli, Marco in Algorithms, Artificial intelligence, Cognitive tasks

2025

Hyperdimensional Computing (HDC), also known as Vector Symbolic Architectures, is a computing paradigm that combines the strengths of symbolic reasoning with the efficiency and scalability of distributed connectionist models in artificial intelligence. HDC has recently emerged as a promising alternative for performing learning tasks in resource-constrained environments thanks to its energy and computational efficiency, inherent parallelism, and resilience to noise and hardware faults. This work introduces the Hyperdimensional Contextual Bandits (HD-CB): the first exploration of HDC to model and automate sequential decision-making Contextual Bandits (CB) problems. The proposed approach maps environmental states in a high-dimensional space and represents each action with dedicated hypervectors (HVs). At each iteration, these HVs are used to select the optimal action for the given context and are updated based on the received reward, replacing computationally expensive ridge regression procedures required by traditional linear CB algorithms with simple, highly parallel vector operations. We propose four HD-CB variants, demonstrating their flexibility in implementing different exploration strategies, as well as techniques to reduce memory overhead and the number of hyperparameters. Extensive simulations on synthetic datasets and a real-world benchmark reveal that HD-CB consistently achieves competitive or superior performance compared to traditional linear CB algorithms, while offering faster convergence time, lower computational complexity, improved scalability, and high parallelism.

Paper

Efficient Implementation of LinearUCB through Algorithmic Improvements and Vector Computing Acceleration for Embedded Learning Systems

by Mastrandrea, Antonio; Barbirotta, Marcello; Angioli, Marco in Acceleration, Algorithms, Artificial intelligence

2025

As the Internet of Things expands, embedding Artificial Intelligence algorithms in resource-constrained devices has become increasingly important to enable real-time, autonomous decision-making without relying on centralized cloud servers. However, implementing and executing complex algorithms in embedded devices poses significant challenges due to limited computational power, memory, and energy resources. This paper presents algorithmic and hardware techniques to efficiently implement two LinearUCB Contextual Bandits algorithms on resource-constrained embedded devices. Algorithmic modifications based on the Sherman-Morrison-Woodbury formula streamline model complexity, while vector acceleration is harnessed to speed up matrix operations. We analyze the impact of each optimization individually and then combine them in a two-pronged strategy. The results show notable improvements in execution time and energy consumption, demonstrating the effectiveness of combining algorithmic and hardware optimizations to enhance learning models for edge computing environments with low-power and real-time requirements.
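To make the algorithmic idea concrete: standard LinearUCB accumulates a matrix A by adding the outer product x xᵀ for each observed context x, and it needs A⁻¹ at every step. The rank-one (Sherman-Morrison) case of the formula named in the abstract updates A⁻¹ directly, avoiding a full inversion. The sketch below is a plain-Python illustration assuming a symmetric A; the function name is hypothetical, and the paper's exact modifications are not specified in this record.

```python
def sherman_morrison_update(a_inv, x):
    """Return the inverse of (A + x x^T), given a_inv = A^{-1} with A symmetric.

    Uses (A + x x^T)^{-1} = A^{-1} - (A^{-1} x)(A^{-1} x)^T / (1 + x^T A^{-1} x),
    which costs O(n^2) instead of the O(n^3) of a fresh inversion.
    """
    n = len(x)
    # u = A^{-1} x; for symmetric A, x^T A^{-1} is simply u^T.
    u = [sum(a_inv[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(x[i] * u[i] for i in range(n))
    return [[a_inv[i][j] - u[i] * u[j] / denom for j in range(n)]
            for i in range(n)]
```

Starting from the identity matrix and applying the update once per observed context keeps A⁻¹ current using only multiply-accumulate loops, the kind of operation a vector unit is suited to.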

Paper

The microarchitecture of a multi-threaded RISC-V compliant processing core family for IoT end-nodes

by Mastrandrea, Antonio; Menichelli, Francesco; Abdallah Cheikh in Central processing units, Computer architecture, CPUs

2017

Internet-of-Things end-nodes demand low-power processing platforms characterized by heterogeneous dedicated units, controlled by a processor core running concurrent control threads. Such an architecture scheme fits one of the main target application domains of the RISC-V instruction set. We present an open-source processing core compliant with RISC-V on the software side and with the popular Pulpino processor platform on the hardware side, while supporting interleaved multi-threading for IoT applications. The latter feature is a novel contribution in this application domain. We report details about the microarchitecture design along with performance data.

Paper

A Comprehensive Review of Risk Factors for Venous Thromboembolism: From Epidemiology to Pathophysiology

by Pastori, Daniele; Menichelli, Danilo; Cormaci, Vito Maria in Anticoagulants, Autoimmune diseases, Blood clots

2023

Venous thromboembolism (VTE) is the third most common cause of death worldwide. The incidence of VTE varies by country: it is 1–2 per 1000 person-years in Western countries and lower in Eastern countries (<1 per 1000 person-years). Many risk factors have been identified in patients developing VTE, but the relative contribution of each risk factor to thrombotic risk, as well as the pathogenetic mechanisms, has not been fully described. Here, we provide a comprehensive review of the most common risk factors for VTE, including male sex, diabetes, obesity, smoking, Factor V Leiden, Prothrombin G20210A Gene Mutation, Plasminogen Activator Inhibitor-1, oral contraceptives and hormonal replacement, long-haul flight, residual venous thrombosis, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, trauma and fractures, pregnancy, immobilization, antiphospholipid syndrome, surgery and cancer. Regarding the latter, the incidence of VTE seems highest in pancreatic, liver and non-small cell lung cancer (>70 per 1000 person-years) and lowest in breast, melanoma and prostate cancer (<20 per 1000 person-years). In this comprehensive review, we summarize the prevalence of different risk factors for VTE and the potential molecular mechanisms and pathogenetic mediators leading to VTE.

Journal Article
