Catalogue Search | MBRL
Search Results Heading
Explore the vast range of titles available.
MBRLSearchResults
-
DisciplineDiscipline
-
Is Peer ReviewedIs Peer Reviewed
-
Item TypeItem Type
-
SubjectSubject
-
YearFrom:-To:
-
More FiltersMore FiltersSourceLanguage
Done
Filters
Reset
13,048
result(s) for
"Model compression"
Sort by:
A Wearable Assistive Device for Blind Pedestrians Using Real-Time Object Detection and Tactile Presentation
by
Sawada, Hideyuki
,
Chen, Yiwen
,
Shen, Junjie
in
assistance for visually impaired people
,
Blindness
,
model compression
2022
Nowadays, improving the traffic safety of visually impaired people is a topic of widespread concern. To help avoid the risks and hazards of road traffic in their daily life, we propose a wearable device using object detection techniques and a novel tactile display made from shape-memory alloy (SMA) actuators. After detecting obstacles in real-time, the tactile display attached to a user’s hands presents different tactile sensations to show the position of the obstacles. To implement the computation-consuming object detection algorithm in a low-memory mobile device, we introduced a slimming compression method to reduce 90% of the redundant structures of the neural network. We also designed a particular driving circuit board that can efficiently drive the SMA-based tactile displays. In addition, we also conducted several experiments to verify our wearable assistive device’s performance. The results of the experiments showed that the subject was able to recognize the left or right position of a stationary obstacle with 96% accuracy and also successfully avoided collisions with moving obstacles by using the wearable assistive device.
Journal Article
F-ALBERT: A Distilled Model from a Two-Time Distillation System for Reduced Computational Complexity in ALBERT Model
2023
Recently, language models based on the Transformer architecture have been predominantly used in AI natural language processing. These models, which have been proven to perform better with more parameters, have led to a significant increase in model size and computational load. ALBERT solves this problem by significantly reducing the number of parameters it retains by repeatedly reusing parameters. Although ALBERT significantly reduces the parameters it maintains, it requires a computational load similar to the original language model due to the reuse process. In this study, we develop a distillation system that decreases the number of times the ALBERT model reuses parameters and progressively reduces the parameters being reused. We propose a representation in this distillation system that can effectively distill the knowledge of the original model and develop a new architecture with reduced computation. Through this system, F-ALBERT, which had about half the computational load compared to the ALBERT model, restored about 98% of the performance of the original model on the GLUE benchmark.
Journal Article
A comprehensive survey on model compression and acceleration
by
Sarangapani Jagannathan
,
Choudhary Tejalal
,
Mishra Vipul
in
Accuracy
,
Artificial intelligence
,
Artificial neural networks
2020
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable improvement in computer vision, natural language processing, stock prediction, forecasting, and audio processing to name a few. The size of the trained DL model is large for these complex tasks, which makes it difficult to deploy on resource-constrained devices. For instance, size of the pre-trained VGG16 model trained on the ImageNet dataset is more than 500 MB. Resource-constrained devices such as mobile phones and internet of things devices have limited memory and less computation power. For real-time applications, the trained models should be deployed on resource-constrained devices. Popular convolutional neural network models have millions of parameters that leads to increase in the size of the trained model. Hence, it becomes essential to compress and accelerate these models before deploying on resource-constrained devices while making the least compromise with the model accuracy. It is a challenging task to retain the same accuracy after compressing the model. To address this challenge, in the last couple of years many researchers have suggested different techniques for model compression and acceleration. In this paper, we have presented a survey of various techniques suggested for compressing and accelerating the ML and DL models. We have also discussed the challenges of the existing techniques and have provided future research directions in the field.
Journal Article
Model Compression for Deep Neural Networks: A Survey
2023
Currently, with the rapid development of deep learning, deep neural networks (DNNs) have been widely applied in various computer vision tasks. However, in the pursuit of performance, advanced DNN models have become more complex, which has led to a large memory footprint and high computation demands. As a result, the models are difficult to apply in real time. To address these issues, model compression has become a focus of research. Furthermore, model compression techniques play an important role in deploying models on edge devices. This study analyzed various model compression methods to assist researchers in reducing device storage space, speeding up model inference, reducing model complexity and training costs, and improving model deployment. Hence, this paper summarized the state-of-the-art techniques for model compression, including model pruning, parameter quantization, low-rank decomposition, knowledge distillation, and lightweight model design. In addition, this paper discusses research challenges and directions for future work.
Journal Article
Learning Accurate Performance Predictors for Ultrafast Automated Model Compression
2023
In this paper, we propose an ultrafast automated model compression framework called SeerNet for flexible network deployment. Conventional non-differen-tiable methods discretely search the desirable compression policy based on the accuracy from exhaustively trained lightweight models, and existing differentiable methods optimize an extremely large supernet to obtain the required compressed model for deployment. They both cause heavy computational cost due to the complex compression policy search and evaluation process. On the contrary, we obtain the optimal efficient networks by directly optimizing the compression policy with an accurate performance predictor, where the ultrafast automated model compression for various computational cost constraint is achieved without complex compression policy search and evaluation. Specifically, we first train the performance predictor based on the accuracy from uncertain compression policies actively selected by efficient evolutionary search, so that informative supervision is provided to learn the accurate performance predictor with acceptable cost. Then we leverage the gradient that maximizes the predicted performance under the barrier complexity constraint for ultrafast acquisition of the desirable compression policy, where adaptive update stepsizes with momentum are employed to enhance optimality of the acquired pruning and quantization strategy. Compared with the state-of-the-art automated model compression methods, experimental results on image classification and object detection show that our method achieves competitive accuracy-complexity trade-offs with significant reduction of the search cost. Code is available at https://github.com/ZiweiWangTHU/SeerNet.
Journal Article
Knowledge Distillation: A Survey
by
Gou Jianping
,
Maybank, Stephen J
,
Yu Baosheng
in
Algorithms
,
Artificial neural networks
,
Computer science
2021
In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. The great success of deep learning is mainly due to its scalability to encode large-scale data and to maneuver billions of model parameters. However, it is a challenge to deploy these cumbersome deep models on devices with limited resources, e.g., mobile phones and embedded devices, not only because of the high computational complexity but also the large storage requirements. To this end, a variety of model compression and acceleration techniques have been developed. As a representative type of model compression and acceleration, knowledge distillation effectively learns a small student model from a large teacher model. It has received rapid increasing attention from the community. This paper provides a comprehensive survey of knowledge distillation from the perspectives of knowledge categories, training schemes, teacher–student architecture, distillation algorithms, performance comparison and applications. Furthermore, challenges in knowledge distillation are briefly reviewed and comments on future research are discussed and forwarded.
Journal Article
Distribution-Sensitive Information Retention for Accurate Binary Neural Network
by
Ding, Yifu
,
Qin, Haotong
,
Zhang, Xiangguo
in
Distillation
,
Image classification
,
Neural networks
2023
Model binarization is an effective method of compressing neural networks and accelerating their inference process, which enables state-of-the-art models to run on resource-limited devices. Recently, advanced binarization methods have been greatly improved by minimizing the quantization error directly in the forward process. However, a significant performance gap still exists between the 1-bit model and the 32-bit one. The empirical study shows that binarization causes a great loss of information in the forward and backward propagation which harms the performance of binary neural networks (BNNs). We present a novel distribution-sensitive information retention network (DIR-Net) that retains the information in the forward and backward propagation by improving internal propagation and introducing external representations. The DIR-Net mainly relies on three technical contributions: (1) Information Maximized Binarization (IMB): minimizing the information loss and the binarization error of weights/activations simultaneously by weight balance and standardization; (2) Distribution-sensitive Two-stage Estimator (DTE): retaining the information of gradients by distribution-sensitive soft approximation by jointly considering the updating capability and accurate gradient; (3) Representation-align Binarization-aware Distillation (RBD): retaining the representation information by distilling the representations between full-precision and binarized networks. The DIR-Net investigates both forward and backward processes of BNNs from the unified information perspective, thereby providing new insight into the mechanism of network binarization. The three techniques in our DIR-Net are versatile and effective and can be applied in various structures to improve BNNs. Comprehensive experiments on the image classification and objective detection tasks show that our DIR-Net consistently outperforms the state-of-the-art binarization approaches under mainstream and compact architectures, such as ResNet, VGG, EfficientNet, DARTS, and MobileNet. Additionally, we conduct our DIR-Net on real-world resource-limited devices which achieves 11.1× storage saving and 5.4× speedup.
Journal Article
Tiny Language Models for Automation and Control: Overview, Potential Applications, and Future Research Directions
by
El Makkaoui, Khalid
,
Alfarraj, Osama
,
Lamaakal, Ismail
in
Algorithms
,
Automation
,
Control systems
2025
Large Language Models (LLMs), like GPT and BERT, have significantly advanced Natural Language Processing (NLP), enabling high performance on complex tasks. However, their size and computational needs make LLMs unsuitable for deployment on resource-constrained devices, where efficiency, speed, and low power consumption are critical. Tiny Language Models (TLMs), also known as BabyLMs, offer compact alternatives by using advanced compression and optimization techniques to function effectively on devices such as smartphones, Internet of Things (IoT) systems, and embedded platforms. This paper provides a comprehensive survey of TLM architectures and methodologies, including key techniques such as knowledge distillation, quantization, and pruning. Additionally, it explores potential and emerging applications of TLMs in automation and control, covering areas such as edge computing, IoT, industrial automation, and healthcare. The survey discusses challenges unique to TLMs, such as trade-offs between model size and accuracy, limited generalization, and ethical considerations in deployment. Future research directions are also proposed, focusing on hybrid compression techniques, application-specific adaptations, and context-aware TLMs optimized for hardware-specific constraints. This paper aims to serve as a foundational resource for advancing TLMs capabilities across diverse real-world applications.
Journal Article
Boost Precision Agriculture with Unmanned Aerial Vehicle Remote Sensing and Edge Intelligence: A Survey
by
Jin, Yongjun
,
Wang, Lizhe
,
Liu, Renhua
in
Agricultural economics
,
Agricultural production
,
Agriculture
2021
In recent years unmanned aerial vehicles (UAVs) have emerged as a popular and cost-effective technology to capture high spatial and temporal resolution remote sensing (RS) images for a wide range of precision agriculture applications, which can help reduce costs and environmental impacts by providing detailed agricultural information to optimize field practices. Furthermore, deep learning (DL) has been successfully applied in agricultural applications such as weed detection, crop pest and disease detection, etc. as an intelligent tool. However, most DL-based methods place high computation, memory and network demands on resources. Cloud computing can increase processing efficiency with high scalability and low cost, but results in high latency and great pressure on the network bandwidth. The emerging of edge intelligence, although still in the early stages, provides a promising solution for artificial intelligence (AI) applications on intelligent edge devices at the edge of the network close to data sources. These devices are with built-in processors enabling onboard analytics or AI (e.g., UAVs and Internet of Things gateways). Therefore, in this paper, a comprehensive survey on the latest developments of precision agriculture with UAV RS and edge intelligence is conducted for the first time. The major insights observed are as follows: (a) in terms of UAV systems, small or light, fixed-wing or industrial rotor-wing UAVs are widely used in precision agriculture; (b) sensors on UAVs can provide multi-source datasets, and there are only a few public UAV dataset for intelligent precision agriculture, mainly from RGB sensors and a few from multispectral and hyperspectral sensors; (c) DL-based UAV RS methods can be categorized into classification, object detection and segmentation tasks, and convolutional neural network and recurrent neural network are the mostly common used network architectures; (d) cloud computing is a common solution to UAV RS data processing, while edge computing brings the computing close to data sources; (e) edge intelligence is the convergence of artificial intelligence and edge computing, in which model compression especially parameter pruning and quantization is the most important and widely used technique at present, and typical edge resources include central processing units, graphics processing units and field programmable gate arrays.
Journal Article
MambaLIC: State-Space Models for Efficient Remote Sensing Image Compression
2026
Remote sensing (RS) images, characterized by their large size and rich texture, require algorithms capable of effectively integrating both global and local features for compression. However, existing Learned Image Compression (LIC) approaches face distinct bottlenecks. While Transformer-based architectures typically suffer from heavy computational loads, standard State Space Models (SSMs) often incur prohibitive memory costs when processing high-resolution inputs. To address these limitations, we propose MambaLIC, a novel RS image compression network that integrates the efficient long-range modeling of SSMs with the local modeling ability of CNNs. In this paper, we introduce an innovative Remote Sensing State Space Model (RS-SSM) module, which combines visual SSM with dynamic convolution for remote sensing image compression. This integration facilitates effective interaction between local and global information, thereby enhancing the performance of RS image compression. Furthermore, we propose an SSM attention-based (SSA-based) spatial-channel context model for better entropy modeling. Compared to Transformer-CNN mixed architectures, MambaLIC reduces computational complexity by 63.9% and achieves superior rate-distortion (RD) performance. Consequently, compared to the latest SS2D-based method MambaIC, MambaLIC achieves substantial efficiency gains, saving 78.8% in memory usage. Experimental results demonstrate that MambaLIC achieves state-of-the-art (SOTA) performance, outperforming VVC (VTM-17.0) by 14.22%, 18.48%, and 17.47% in BD-rate on UC-Merced, LoveDA, and xView datasets, respectively.
Journal Article