Catalogue Search | MBRL
Explore the vast range of titles available.
3,038 result(s) for "convolutional neural network (CNN)"
A deep learning based fusion of RGB camera information and magnetic localization information for endoscopic capsule robots
by Konukoglu, Ender; Shabbir, Jahanzaib; Turan, Mehmet
in Algorithms, Artificial Intelligence, Cameras
2017
A reliable, real-time localization functionality is crucial for actively controlled capsule endoscopy robots, an emerging, minimally invasive diagnostic and therapeutic technology for the gastrointestinal (GI) tract. In this study, we extend the success of deep learning approaches from various research fields to the problem of sensor fusion for endoscopic capsule robots. We propose a multi-sensor fusion based localization approach which combines endoscopic camera information and magnetic sensor based localization information. Experiments on a real pig stomach dataset show that our method achieves sub-millimeter precision for both translational and rotational movements.
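As a reading aid, here is a minimal PyTorch sketch of late sensor fusion in the spirit described; the FusionNet name, layer sizes, and 6-DoF pose output are illustrative assumptions rather than the authors' architecture:

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Illustrative late-fusion pose regressor: CNN features from an RGB
    frame are concatenated with a magnetic-sensor pose estimate, then
    regressed to a 6-DoF pose. All sizes are assumptions."""
    def __init__(self):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(32 + 6, 64), nn.ReLU(),
            nn.Linear(64, 6),  # x, y, z translation + roll, pitch, yaw
        )

    def forward(self, rgb, magnetic_pose):
        feat = self.cnn(rgb)                            # (B, 32)
        return self.head(torch.cat([feat, magnetic_pose], dim=1))

model = FusionNet()
print(model(torch.randn(2, 3, 64, 64), torch.randn(2, 6)).shape)  # (2, 6)
```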
Journal Article
CellNet: A Lightweight Model towards Accurate LOC-Based High-Speed Cell Detection
2022
Label-free cell separation and sorting in a microfluidic system, an essential technique for modern cancer diagnosis, has made high-throughput single-cell analysis a reality. However, designing an efficient cell detection model is challenging. Traditional cell detection methods are subject to occlusion boundaries and weak textures, resulting in poor performance. Modern detection models based on convolutional neural networks (CNNs) have achieved promising results at the cost of a large number of both parameters and floating point operations (FLOPs). In this work, we present a lightweight yet powerful cell detection model named CellNet, which includes two efficient modules: CellConv blocks and the h-swish nonlinearity function. CellConv is proposed as an effective feature extractor and a substitute for computationally expensive convolutional layers, whereas the h-swish function is introduced to increase the nonlinearity of the compact model. To boost the prediction and localization ability of the detection model, we re-designed the model’s multi-task loss function. In comparison with other efficient object detection methods, our approach achieved a state-of-the-art 98.70% mean average precision (mAP) on our custom sea urchin embryos dataset with only 0.08 M parameters and 0.10 B FLOPs, reducing the size of the model by 39.5× and the computational cost by 4.6×. We deployed CellNet on different platforms to verify its efficiency. The inference speed on a graphics processing unit (GPU) was 500.0 fps, compared with 87.7 fps on a CPU. Additionally, CellNet is 769.5 times smaller and 420 fps faster than YOLOv3. Extensive experimental results demonstrate that CellNet achieves an excellent efficiency/accuracy trade-off on resource-constrained platforms.
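The h-swish nonlinearity mentioned here is a known activation (popularized by MobileNetV3) and is easy to sketch; the CellConv block itself is specific to the paper and not reproduced. A minimal PyTorch version:

```python
import torch
import torch.nn.functional as F

def h_swish(x: torch.Tensor) -> torch.Tensor:
    """Hard-swish: x * ReLU6(x + 3) / 6, a cheap piecewise approximation
    of the swish activation used in compact models such as MobileNetV3."""
    return x * F.relu6(x + 3.0) / 6.0

print(h_swish(torch.linspace(-4.0, 4.0, 5)))
```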
Journal Article
A CNN-Based Method of Vehicle Detection from Aerial Images Using Hard Example Mining
by Ryosuke Shibasaki; Hiroyuki Miyazaki; Yohei Koga
in aerial image, convolutional neural network (CNN), hard example mining
2018
Recently, deep learning techniques have come to play a practical role in vehicle detection. While much effort has been spent on applying deep learning to vehicle detection, the effective use of training data has not been thoroughly studied, although it has great potential for improving training results, especially when the training data are sparse. In this paper, we propose using hard example mining (HEM) in the training process of a convolutional neural network (CNN) for vehicle detection in aerial images. We apply HEM to stochastic gradient descent (SGD) to choose the most informative training data by calculating the loss values in each batch and employing the examples with the largest losses. We picked 100 out of both 500 and 1000 examples for training in one iteration, and we tested different ratios of positive to negative examples in the training data to evaluate how this balance would affect performance. In every case, our method outperformed plain SGD. The experimental results for images from New York showed improved performance over a CNN trained with plain SGD, with the F1 score of our method being 0.02 higher.
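A minimal sketch of online hard example mining on top of SGD, assuming a classification model with per-example losses (the paper's detector and exact batch configuration are not reproduced):

```python
import torch
import torch.nn as nn

def hem_step(model, optimizer, images, labels, keep=100):
    """One SGD step with online hard example mining: score each candidate
    in the batch by its individual loss, then update only on the `keep`
    highest-loss examples. Shapes and the classifier are illustrative."""
    criterion = nn.CrossEntropyLoss(reduction="none")   # per-example losses
    with torch.no_grad():
        losses = criterion(model(images), labels)       # (N,)
        hard = torch.topk(losses, k=min(keep, losses.numel())).indices
    optimizer.zero_grad()
    loss = criterion(model(images[hard]), labels[hard]).mean()
    loss.backward()
    optimizer.step()
    return loss.item()
```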
Journal Article
BenSignNet: Bengali Sign Language Alphabet Recognition Using Concatenated Segmentation and Convolutional Neural Network
by Miah, Abu Saleh Musa; Hasan, Md Al Mehedi; Shin, Jungpil
in 38-BdSL, Accuracy, Bengali sign language (BSL)
2022
Sign language recognition is one of the most challenging applications in machine learning and human-computer interaction. Many researchers have developed classification models for different sign languages such as English, Arabic, Japanese, and Bengali; however, no significant research has been done on generalization performance across different datasets. Most research work has achieved satisfactory performance on a small dataset, and these models may fail to replicate the same performance on different and larger datasets. In this context, this paper proposes a novel method for recognizing Bengali sign language (BSL) alphabets that overcomes the issue of generalization. The proposed method has been evaluated on three benchmark datasets: ‘38 BdSL’, ‘KU-BdSL’, and ‘Ishara-Lipi’. Three steps are followed to achieve the goal: segmentation, augmentation, and convolutional neural network (CNN)-based classification. Firstly, a concatenated segmentation approach combining YCbCr, HSV, and the watershed algorithm was designed to accurately identify gesture signs. Secondly, seven image augmentation techniques were selected to increase the training data size without changing the semantic meaning. Finally, a CNN-based model called BenSignNet was applied to extract features and perform classification. The model achieved accuracies of 94.00%, 99.60%, and 99.60% on the BdSL Alphabet, KU-BdSL, and Ishara-Lipi datasets, respectively. Experimental findings confirm that the proposed method achieves a higher recognition rate than conventional ones and generalizes across all datasets in the BSL domain.
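As an illustration of the color-space segmentation step, here is a hedged OpenCV sketch that intersects YCbCr and HSV skin masks; the threshold ranges are common heuristics rather than the paper's values, and the watershed refinement is omitted:

```python
import cv2
import numpy as np

def hand_mask(bgr: np.ndarray) -> np.ndarray:
    """Intersect YCbCr and HSV skin-color masks, then denoise with a
    morphological opening. Ranges are common heuristics, not the paper's."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.bitwise_and(
        cv2.inRange(ycrcb, (0, 135, 85), (255, 180, 135)),
        cv2.inRange(hsv, (0, 40, 60), (25, 255, 255)),
    )
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
```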
Journal Article
Forecasting Vertical Profiles of Ocean Currents from Surface Characteristics: A Multivariate Multi-Head Convolutional Neural Network–Long Short-Term Memory Approach
by Coniglione, Robert; Bernard, Landry; McKenna, Jason R.
in Algorithms, Artificial neural networks, chained multivariate multi-output regression
2023
While the study of ocean dynamics usually involves modeling deep ocean variables, monitoring and accurate forecasting of nearshore environments is also critical. However, sensor observations often contain artifacts like long stretches of missing data and noise, typically after an extreme event or accidental damage to the sensors. Such data artifacts, if not handled diligently prior to modeling, can significantly impact the reliability of any further predictive analysis. Therefore, we present a framework that integrates data reconstruction of key sea state variables with multi-step-ahead forecasting of current speed from the reconstructed time series for 19 depth levels simultaneously. Using multivariate chained regressions, the reconstruction algorithm rigorously tests an ensemble of tree-based models (fed only with surface characteristics) to impute gaps in the vertical profiles of the sea state variables down to 20 m deep. Subsequently, a deep encoder–decoder model, comprising multi-head convolutional networks, extracts high-level features from each depth level’s multivariate (reconstructed) input and feeds them to a deep long short-term memory network for 24 h ahead forecasts of current speed profiles. In this work, we utilized Viking buoy data and demonstrated that, with limited training data, we could explain an overall 80% of the variation in the current speed profiles across the forecast period and the depth levels.
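A minimal PyTorch sketch of a multi-head CNN feeding an LSTM for multi-step forecasting, under assumed sizes (19 depth heads, 4 input variables, a 24-step horizon); this is an illustrative reading of the architecture, not the authors' code:

```python
import torch
import torch.nn as nn

class MultiHeadCNNLSTM(nn.Module):
    """Sketch: one Conv1d head per depth level extracts features from that
    level's multivariate history; an LSTM decodes a multi-step forecast of
    current speed for all depth levels. All sizes are assumptions."""
    def __init__(self, n_heads=19, n_vars=4, horizon=24):
        super().__init__()
        self.heads = nn.ModuleList([
            nn.Sequential(nn.Conv1d(n_vars, 16, kernel_size=3, padding=1),
                          nn.ReLU(),
                          nn.AdaptiveAvgPool1d(1))
            for _ in range(n_heads)
        ])
        self.lstm = nn.LSTM(input_size=16 * n_heads, hidden_size=64,
                            batch_first=True)
        self.out = nn.Linear(64, n_heads)   # one current speed per depth
        self.horizon = horizon

    def forward(self, x):                   # x: (B, n_heads, n_vars, T)
        feats = [h(x[:, i]).squeeze(-1) for i, h in enumerate(self.heads)]
        z = torch.cat(feats, dim=1)         # (B, 16 * n_heads)
        seq = z.unsqueeze(1).repeat(1, self.horizon, 1)
        y, _ = self.lstm(seq)               # (B, horizon, 64)
        return self.out(y)                  # (B, horizon, n_heads)

m = MultiHeadCNNLSTM()
print(m(torch.randn(2, 19, 4, 48)).shape)   # torch.Size([2, 24, 19])
```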
Journal Article
Enhancing Plant Disease Detection: Incorporating Advanced CNN Architectures for Better Accuracy and Interpretability
by Castillo-Ossa, Luis F.; González-Briones, Alfonso; Florez, Sebastián López
in Accuracy, Artificial Intelligence, Artificial neural networks
2025
Convolutional Neural Networks (CNNs) have proven effective in automated plant disease diagnosis, significantly contributing to crop health monitoring. However, their limited interpretability hinders practical deployment in real-world agricultural settings. To address this, we explore advanced CNN architectures, namely ResNet-50 and EfficientNet, augmented with attention mechanisms. These models enhance accuracy by optimizing depth, width, and resolution, while attention layers improve transparency by focusing on disease-relevant regions. Experiments using the PlantVillage dataset show that basic CNNs achieve 46.69% accuracy, while ResNet-50 and EfficientNet attain 63.79% and 98.27%, respectively. On a 39-class extended dataset, our proposed EfficientNet-B0 with attention (EfficientNetB0-Attn), integrating an attention module at layer 262, achieves 99.39% accuracy. This approach significantly enhances interpretability without compromising performance. The attention module generates weights via backpropagation, allowing the model to emphasize disease-relevant image regions, thereby enhancing both accuracy and interpretability.
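A hedged sketch of a squeeze-and-excitation-style channel attention module of the kind such models append to CNN features; the reduction ratio and feature shape are assumptions, not the paper's layer-262 design:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention: learned weights re-scale feature
    channels so the most relevant responses dominate. Generic module,
    not the paper's exact design."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))        # global average pool -> (B, C)
        return x * w.unsqueeze(-1).unsqueeze(-1)

feat = torch.randn(2, 1280, 7, 7)              # e.g. EfficientNet-B0 top features
print(ChannelAttention(1280)(feat).shape)      # torch.Size([2, 1280, 7, 7])
```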
Journal Article
Real Time Multipurpose Smart Waste Classification Model for Efficient Recycling in Smart Cities Using Multilayer Convolutional Neural Network and Perceptron
by Ali, Tariq; Shoaib, Muhammad; Irfan, Muhammad
in Accuracy, Artificial intelligence, Classification
2021
Urbanization has been a major concern for both developed and developing countries in recent years, as people move with their families to urban areas for better education and a modern lifestyle. Due to rapid urbanization, cities face huge challenges, one of which is waste management, since the volume of waste is directly proportional to the population of the city. Municipalities and city administrations use traditional waste classification techniques, which are manual, very slow, inefficient, and costly. Therefore, automatic waste classification and management is essential for urbanizing cities to enable better recycling of waste. Better recycling reduces the amount of waste sent to landfills by reducing the need to collect new raw material. In this paper, a real-time smart waste classification model is presented that uses a hybrid approach to classify waste into various classes. Two machine learning models, a multilayer perceptron and a multilayer convolutional neural network (ML-CNN), are implemented. The multilayer perceptron provides binary classification, i.e., metal or non-metal waste, and the CNN identifies the class of non-metal waste. A camera placed in front of the waste conveyor belt takes a picture of the waste and classifies it. Upon successful classification, an automatic hand hammer pushes the waste into the assigned labeled bucket. Experiments were carried out in a real-time environment with image segmentation. The training, testing, and validation accuracy of the proposed model was 0.99 under different training batches with different input features.
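A minimal sketch of the described two-stage decision, assuming an MLP for the metal/non-metal split and a CNN for the non-metal class; all feature sizes, class counts, and both (untrained) models here are illustrative:

```python
import torch
import torch.nn as nn

# Hypothetical two-stage classifier mirroring the described pipeline:
# an MLP decides metal vs. non-metal, and only non-metal items get a
# fine-grained CNN class. All sizes and classes are assumptions.
mlp = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
cnn = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(16 * 32 * 32, 5),   # e.g. paper, glass, plastic...
)

def classify(sensor_features, image):
    is_metal = mlp(sensor_features).argmax(dim=1)      # stage 1: binary
    waste_class = cnn(image).argmax(dim=1)             # stage 2: non-metal type
    return torch.where(is_metal.bool(),
                       torch.full_like(waste_class, -1),  # -1 = "metal" bucket
                       waste_class)

print(classify(torch.randn(2, 128), torch.randn(2, 3, 64, 64)))
```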
Journal Article
Universal 3D‐Printing of Suspended Metal Oxide Nanowire Arrays on MEMS for AI‐Optimized Combinatorial Gas Fingerprinting
by Yang, Jihyuk; Huan, Xiao; Cheng, Xing
in 3-D printers, 3D printings, convolutional neural network (CNN)‐based prediction
2025
Additive manufacturing technology has the potential to provide great versatility in the design and fabrication of sensing devices, but this requires further technological improvement in precision and material diversity. Here, a universal meniscus‐guided 3D printing method is reported that can fabricate freestanding metal oxide semiconducting nanowires with programmed compositions and compatible substrate options at the single‐entity level. By studying the printing process and ink compositions, polycrystalline metal oxide (MOX) nanowires with controlled shapes and tunable diameters down to 180 nm are achieved. The method enables high‐precision, mask‐free printing of an MOX nanowire arch array on a 1.5 µm thick suspended membrane, paving the way for an integrated micro‐electromechanical systems (MEMS) chemiresistive gas sensor. The diversity of 3D printable materials demonstrated in this study covers 24 types of MOX nanowire combinations, including TiO2, ZnO, SnO2, In2O3, WO3, and CeO2, doped with the noble metals Au, Ag, Pd, and Pt. Their sensing performances for CH4, NH3, CH3CH2OH, CO, and H2S gases are quantitatively investigated, while artificial intelligence (AI)‐driven analysis of multi‐sensor responses achieves 98% gas classification accuracy via sliding-time-window-based convolutional neural networks (CNNs). The ability to 3D print semiconducting materials opens the possibility of freely designing and realizing new‐concept electronic devices beyond the restrictions of the traditional top‐down manufacturing process.
A universal meniscus‐guided 3D printing method enables direct fabrication of suspended metal oxide nanowires on MEMS chips. It achieves 180 nm resolution, compositional tunability, and integration on 1.5 µm thick membranes. Twenty‐four types of nanowires are printed to investigate gas responses, from which complementary materials enabled time window‐based CNN analysis with 98% classification accuracy. This strategy bridges 3D nanoprinting and practical electronic integration.
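To illustrate the sliding-time-window preprocessing such a CNN consumes, here is a short NumPy sketch; the sensor count, window length, and step size are assumed values:

```python
import numpy as np

def sliding_windows(signals: np.ndarray, window: int, step: int) -> np.ndarray:
    """Cut a multi-sensor response record (n_sensors, T) into overlapping
    windows shaped (n_windows, n_sensors, window), the form a 1D CNN
    classifier consumes. Window and step sizes are illustrative."""
    n_sensors, T = signals.shape
    starts = range(0, T - window + 1, step)
    return np.stack([signals[:, s:s + window] for s in starts])

record = np.random.rand(24, 600)      # e.g. 24 nanowire sensors, 600 samples
windows = sliding_windows(record, window=100, step=20)
print(windows.shape)                   # (26, 24, 100)
```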
Journal Article
Automatic Ceiling Damage Detection in Large-Span Structures Based on Computer Vision and Deep Learning
by Lichen Wang; Ken’ichi Kawaguchi; Jianzhuang Xiao
in Accuracy, Algorithms, ceiling damage detection, large-span structure, convolutional neural networks (CNN), object detection, deep learning
2022
To alleviate the workload of prevailing expert-based onsite inspection, a vision-based method using state-of-the-art deep learning architectures is proposed to automatically detect ceiling damage in large-span structures. The dataset consists of 914 images collected by the Kawaguchi Lab since 1995, with over 7000 learnable damages in the ceilings, categorized into four typical damage forms (peelings, cracks, distortions, and fall-offs). Twelve detection models are established, trained, and compared by variable hyperparameter analysis. The best performing model reaches a mean average precision (mAP) of 75.28%, which is considerably high for object detection. A comparative study indicates that the model is generally robust to the challenges in ceiling damage detection, including partial occlusion by visual obstructions, extremely varied aspect ratios, small object detection, and multi-object detection. Another comparative study of F1 score performance, which combines precision and recall into a single metric, shows that the model outperforms the CNN (convolutional neural networks) model using the Saliency-MAP method from our previous research to a remarkable extent: in the case of a large area ratio of non-ceiling region, the F1 scores of these two models are 0.83 and 0.28, respectively. The findings of this study push automatic ceiling damage detection in large-span structures one step further.
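For reference, the F1 score is the harmonic mean of precision and recall; a one-function sketch (the 0.80/0.86 inputs are illustrative values that happen to reproduce the reported 0.83):

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall, collapsing the
    two into a single metric that punishes imbalance between them."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(0.80, 0.86), 2))  # 0.83
```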
Journal Article
An Emotion and Attention Recognition System to Classify the Level of Engagement to a Video Conversation by Participants in Real Time Using Machine Learning Models and Utilizing a Neural Accelerator Chip
by Jay Rajasekera; Dilki Dandeniya Arachchi; Janith Kodithuwakku
in Algorithms, Application programming interface, Artificial intelligence
2022
It is not an easy task for organizers to observe the engagement level of a video meeting audience. This research was conducted to build an intelligent system that enhances the experience of video conversations, such as virtual meetings and online classrooms, using convolutional neural network (CNN)- and support vector machine (SVM)-based machine learning models to classify the emotional states and attention levels of the participants in a video conversation. The application visualizes their attention and emotion analytics in a meaningful manner. The proposed system provides an artificial intelligence (AI)-powered analytics system with optimized machine learning models to monitor the audience and prepare insightful reports on the basis of participants’ facial features throughout the video conversation. One of the main objectives of this research is to utilize a neural accelerator chip to enhance emotion and attention detection tasks. A custom CNN developed by Gyrfalcon Technology Inc (GTI), named GnetDet, was used in this system to run the trained model on the GTI Lightspeeur 2803 neural accelerator chip.
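A hedged sketch of the second stage of such a CNN + SVM pipeline using scikit-learn, with random stand-ins for CNN face embeddings; the feature size and engagement labels are assumptions, not the paper's setup:

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative second stage: an SVM is fit on feature vectors (here
# random stand-ins for CNN face embeddings) to predict an engagement
# label. Embedding size and label scheme are assumptions.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 128))        # 200 faces, 128-d embeddings
labels = rng.integers(0, 2, size=200)         # 0 = disengaged, 1 = engaged

svm = SVC(kernel="rbf").fit(features, labels)
print(svm.predict(features[:5]))
```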
Journal Article