5,845 result(s) for "Modal data"
A Comprehensive Review on Synergy of Multi-Modal Data and AI Technologies in Medical Diagnosis
Disease diagnosis is a critical and demanding task in medicine. Artificial intelligence (AI) techniques, spanning machine learning, deep learning, and large-model paradigms, stand poised to help physicians render more evidence-based decisions, presenting a pioneering solution for clinical practice. Combining diverse medical data modalities (e.g., image, text, speech, genetic data, physiological signals) is essential for comprehensive disease analysis, a topic of growing interest among both researchers and clinicians. There is therefore a pressing need to synthesize the latest advances in multi-modal data and AI technologies for medical diagnosis. In this paper, we narrow our focus to five disorders (Alzheimer’s disease, breast cancer, depression, heart disease, epilepsy), reviewing advanced work on their diagnosis and treatment through the lens of artificial intelligence. Our survey not only details diagnostic methodologies across modalities but also covers commonly used public datasets, the intricacies of feature engineering, prevalent classification models, and anticipated challenges for future work. In essence, this research aims to advance diagnostic methodologies and furnish valuable insights for clinical decision making.
Damage identification of structural systems by modal strain energy and an optimization-based iterative regularization method
Sensitivity-based methods using modal data are effective and reliable tools for damage localization and quantification. However, they may fail to obtain reasonable and accurate results owing to the low damage detectability of sensitivity functions and the ill-posedness caused by noisy modal data. To address these major challenges, this article proposes a new method for locating and quantifying damage by developing a new sensitivity function of modal strain energy and solving the resulting ill-posed inverse problem via an optimization-based iterative regularization method called Iteratively Reweighted Norm-Basis Pursuit Denoising (IRN-BPD). A stopping condition based on the residual of the solution and an improved generalized cross-validation function are proposed to terminate the iterative algorithm of IRN-BPD and determine an optimal regularization value. The major contributions of this article are deriving the sensitivity formulation from the first-order necessary condition of the optimization problem and proposing a new regularized solution. The main advantages of these methods are increased damage detectability, an optimally determined regularization value, and an accurate solution. A simple mass–spring system and a full-scale bridge structure are considered in numerical studies to verify the accuracy and effectiveness of the proposed methods. Results demonstrate that the methods presented in this article succeed in locating and quantifying damage under incomplete, noisy modal data.
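The abstract does not spell out the formulation, but a generic sketch of the sensitivity-based setup it builds on is given below; the notation is an illustrative assumption, not taken from the paper.

```latex
% Linearized sensitivity relation between damage parameters and modal data
% (generic form; symbols are illustrative assumptions):
%   \delta d     : change in modal quantities (e.g., modal strain energy terms)
%   S            : sensitivity matrix of those quantities w.r.t. damage parameters
%   \delta\alpha : elemental stiffness-reduction (damage) parameters
\delta d \approx S\,\delta\alpha
% A BPDN-type regularized solution, as used by IRN-BPD, promotes sparse damage:
\min_{\delta\alpha}\ \lVert \delta\alpha \rVert_1
\quad \text{subject to} \quad
\lVert S\,\delta\alpha - \delta d \rVert_2 \le \sigma
```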
On model-based damage detection by an enhanced sensitivity function of modal flexibility and LSMR-Tikhonov method under incomplete noisy modal data
Sensitivity-based methods within the model-updating strategy remain influential and reliable for structural damage detection. The major issue is to employ a well-established sensitivity function that is directly relevant to damage. Under noisy modal data, the sensitivity-based model-updating strategy is a well-known ill-posed problem. The main aim of this article is to locate and quantify damage using incomplete, noisy modal parameters by improving a sensitivity function of modal flexibility and proposing a new iterative regularization method for solving the ill-posed problem. The main contribution of the enhanced sensitivity formulation is to develop the derivative of the eigenvalue and establish a sensitivity function more relevant to damage. The new regularization method combines an iterative approach called least squares minimal residual (LSMR) with the well-known Tikhonov regularization technique. The key novel element of the proposed solution method is that the optimal regularization parameter is chosen during the iterative process rather than being required a priori. Numerical simulations are used to validate the accuracy and efficiency of the improved and proposed methods. Results demonstrate that the enhanced sensitivity function of the modal flexibility is more sensitive to damage than the basic formulation. Moreover, the proposed solution method is robust in solving the ill-posed problem for damage localization and quantification under both noise-free and noisy modal data.
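For contrast with the BPDN sketch above, the Tikhonov side of the proposed LSMR-Tikhonov combination solves the same linearized system with an l2 penalty, with the regularization parameter chosen during the LSMR iterations rather than fixed a priori. The generic form below uses assumed notation, not the paper's.

```latex
% Tikhonov-regularized least squares for the linearized system S \delta\alpha = \delta d
% (generic form; the paper selects \lambda adaptively within the LSMR iterations):
\min_{\delta\alpha}\ \lVert S\,\delta\alpha - \delta d \rVert_2^2
+ \lambda^2 \lVert \delta\alpha \rVert_2^2
```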
Self-supervised opinion summarization with multi-modal knowledge graph
Multi-modal opinion summarization aims to automatically generate summaries of products or businesses from multi-modal reviews containing text, images, and tables, providing clear references for other customers. To create faithful summaries, multi-modal structural knowledge should be well utilized, which most existing work on multi-modal opinion summarization neglects. We therefore propose an opinion summarization framework based on multi-modal knowledge graphs (MKGOpinSum) that exploits the structural knowledge in multi-modal data. To construct a multi-modal knowledge graph, we first build a textual knowledge graph from review text and then enrich it by linking detected image objects to their corresponding entities. Our method obtains each modality's representation from its own encoder and generates the summary with the text decoder. To address the heterogeneity of multi-modal data, we adopt a staged multi-modal training pipeline: we first pretrain the text encoder and decoder on text-only data; we then pretrain the table and MKG modalities, respectively, using the text decoder as a pivot; finally, we train the entire encoder–decoder architecture, fusing the representations of all modalities to generate the summary text. Experiments on the Amazon and Yelp datasets show that the framework performs satisfactorily compared with ten baselines.
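A compact sketch of the three-stage schedule just described, with tiny dense stand-ins for the encoders and the shared text decoder; every module, shape, and loss here is an illustrative assumption, not the paper's code.

```python
import tensorflow as tf

def make_encoder(dim=32):
    return tf.keras.Sequential([tf.keras.layers.Dense(dim, activation="relu")])

text_enc, table_enc, mkg_enc = make_encoder(), make_encoder(), make_encoder()
decoder = tf.keras.Sequential([tf.keras.layers.Dense(10)])  # stand-in text decoder

def train_stage(encoder, x, y, freeze_decoder=False, epochs=1):
    """Train one modality's encoder through the shared text decoder."""
    decoder.trainable = not freeze_decoder
    model = tf.keras.Sequential([encoder, decoder])
    model.compile(optimizer="adam",
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
    model.fit(x, y, epochs=epochs, verbose=0)

x = tf.random.normal((64, 16))
y = tf.random.uniform((64,), maxval=10, dtype=tf.int32)
train_stage(text_enc, x, y)                        # 1) pretrain text encoder + decoder
train_stage(table_enc, x, y, freeze_decoder=True)  # 2) table modality, decoder as pivot
train_stage(mkg_enc, x, y, freeze_decoder=True)    # 3) MKG modality, decoder as pivot
# 4) the full model would then fuse all encoder outputs and fine-tune end to end
```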
Damage detection in anisotropic-laminated composite beams based on incomplete modal data and teaching–learning-based optimization
This study presents an efficient approach for detecting damage in laminated composite beams with arbitrary lay-up. The approach uses finite element model updating based on limited vibration data and a metaheuristic optimization algorithm. To this end, a thirteen-degree-of-freedom (DOF) beam finite element (FE) model is employed for numerical simulation of the actual structure, and the Guyan condensation method is used for model-order reduction to simulate the limited number of sensor data. The damage detection problem is posed as an unconstrained optimization problem whose objective function is a weighted linear combination of the root-mean-square error in the natural frequencies and the error in the correlation between mode shapes, represented by the Modal Assurance Criterion (MAC). Teaching–Learning-Based Optimization (TLBO) is used as the metaheuristic optimization tool. The proposed method is verified on four examples. A parametric study on anisotropic-laminated composite beams with cantilevered and clamped end conditions under three assumed damage scenarios demonstrates the efficacy of the proposed method. The results indicate that the method can identify single and multiple damage cases in anisotropic-laminated composite beams with adequate precision and outperforms the other algorithms in accuracy and computational cost.
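The Modal Assurance Criterion named in the abstract has a standard closed form, and the weighted objective can be sketched accordingly; the exact weights and error terms the paper uses are assumptions here.

```latex
% Standard Modal Assurance Criterion between analytical and experimental mode shapes:
\mathrm{MAC}(\phi_i^{a}, \phi_i^{e})
= \frac{\lvert (\phi_i^{a})^{T} \phi_i^{e} \rvert^{2}}
       {\bigl((\phi_i^{a})^{T}\phi_i^{a}\bigr)\,\bigl((\phi_i^{e})^{T}\phi_i^{e}\bigr)}
% One plausible form of the weighted objective (weights w_1, w_2 assumed):
J(\mathbf{x}) = w_1 \sqrt{\frac{1}{m}\sum_{i=1}^{m}
\Bigl(\frac{f_i(\mathbf{x}) - \tilde{f}_i}{\tilde{f}_i}\Bigr)^{2}}
+ w_2 \sum_{i=1}^{m}\bigl(1 - \mathrm{MAC}_i(\mathbf{x})\bigr)
```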
Remote Sensing LiDAR and Hyperspectral Classification with Multi-Scale Graph Encoder–Decoder Network
The rapid development of sensor technology has made multi-modal remote sensing data valuable for land cover classification due to its diverse and complementary information. Many feature extraction methods for multi-modal data combining light detection and ranging (LiDAR) and hyperspectral imaging (HSI) have recognized the importance of incorporating multiple spatial scales. However, effectively capturing both long-range global correlations and short-range local features simultaneously across scales remains a challenge, particularly in large-scale, complex ground scenes. To address this limitation, we propose a multi-scale graph encoder–decoder network (MGEN) for multi-modal data classification. MGEN adopts a graph model that maintains global sample correlations to fuse multi-scale features, enabling simultaneous extraction of local and global information. The graph encoder maps multi-modal data at different scales to the graph space and performs feature extraction there; the graph decoder maps the multi-scale features back to the original data space and completes multi-scale feature fusion and classification. Experimental results on three HSI-LiDAR datasets demonstrate that the proposed MGEN achieves high classification accuracy and outperforms state-of-the-art methods.
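As a minimal illustration of the graph-space feature extraction the abstract describes, the sketch below builds a kNN graph over fused HSI+LiDAR sample features and applies one normalized graph propagation step; the shapes, the value of k, and the single-layer design are assumptions, not MGEN's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((100, 24))        # 100 samples, fused HSI+LiDAR features

# Build a symmetric kNN adjacency over feature distances (k = 8 assumed).
d2 = ((H[:, None, :] - H[None, :, :]) ** 2).sum(-1)
nn = np.argsort(d2, axis=1)[:, 1:9]       # skip column 0 (each sample's self-distance)
A = np.zeros((100, 100))
rows = np.repeat(np.arange(100), 8)
A[rows, nn.ravel()] = 1.0
A = np.maximum(A, A.T)

# One GCN-style propagation step: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W).
A_hat = A + np.eye(100)
d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(1))
A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
W = rng.standard_normal((24, 16)) / np.sqrt(24)
H_out = np.maximum(A_norm @ H @ W, 0.0)   # graph-space features, shape (100, 16)
```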
A Docker-based federated learning framework design and deployment for multi-modal data stream classification
In the high-performance computing (HPC) domain, federated learning has gained immense popularity, especially in emotional and physical health analytics and experimental facilities. Federated learning is one of the most promising distributed machine learning frameworks because it supports data privacy and security: clients share their local models rather than their data. In federated learning, many clients train their machine learning/deep learning models locally before these are aggregated into a global model at the global server. However, an FL framework is difficult to build and deploy across multiple distributed clients due to their heterogeneous nature. We developed Docker-enabled federated learning (DFL), utilizing client-agnostic technologies such as Docker containers to simplify the deployment of FL frameworks for data stream processing on heterogeneous clients. In the DFL, the clients and global server are written using TensorFlow and communicate over the lightweight Message Queuing Telemetry Transport (MQTT) protocol in the IoT environment. The DFL's effectiveness, efficiency, and scalability are evaluated in a test-case scenario in which real-time emotion state classification is performed on distributed multi-modal physiological data streams under various practical configurations.
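The abstract does not include code; below is a minimal sketch of the FedAvg-style aggregation a global server in such a framework typically performs. The function name, the use of plain NumPy arrays, and the dataset-size weighting are assumptions rather than the DFL's actual implementation.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Aggregate per-client model weights into a global model, weighting each
    client by its local dataset size (classic FedAvg-style averaging)."""
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    return [
        sum((n / total) * w[layer] for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]

# Example: two clients, each holding two weight arrays (e.g., from model.get_weights()).
c1 = [np.ones((3, 3)), np.zeros(3)]
c2 = [np.zeros((3, 3)), np.ones(3)]
global_weights = fedavg([c1, c2], client_sizes=[100, 300])
# In the DFL setting described above, clients would serialize such arrays and
# publish them to the global server over MQTT; the transport layer is omitted here.
```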
AL-MobileNet: a novel model for 2D gesture recognition in intelligent cockpit based on multi-modal data
As the degree of automotive intelligence increases, gesture recognition is gaining more attention in human-vehicle interaction. However, existing gesture recognition methods are computationally intensive and perform poorly in multi-modal sensor scenarios. This paper proposes a novel network structure, AL-MobileNet (MobileNet with Attention and Lightweight Modules), which can quickly and accurately estimate 2D gestures in RGB and infrared (IR) images. The innovations of this paper are as follows: Firstly, to enhance multi-modal data, we created a synthetic IR dataset based on real 2D gestures and employed a coarse-to-fine training approach. Secondly, to speed up the model's computation on edge devices, we introduced a new lightweight computational module called the Split Channel Attention Block (SCAB). Thirdly, to ensure the model maintains accuracy in large datasets, we incorporated auxiliary networks and Angle-Weighted Loss (AWL) into the backbone network. Experiments show that AL-MobileNet requires only 0.4 GFLOPs of computational power and 1.2 million parameters. This makes it 1.5 times faster than MobileNet and allows for quick execution on edge devices. AL-MobileNet achieved a running speed of up to 28 FPS on the Ambarella CV28. On both general datasets and our dataset, our algorithm achieved an average PCK0.2 score of 0.95. This indicates that the algorithm can quickly generate accurate 2D gestures. The demonstration of the algorithm can be reviewed in gesturebaolong.
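PCK0.2, the metric quoted above, has a simple standard definition; the sketch below is one common variant that normalizes by a per-sample reference size. The paper's exact normalization reference is not stated, so the hand bounding-box choice here is an assumption.

```python
import numpy as np

def pck(pred, gt, ref_size, alpha=0.2):
    """Percentage of Correct Keypoints: a keypoint counts as correct when its
    distance to ground truth is at most alpha * ref_size (PCK0.2 uses alpha=0.2)."""
    dists = np.linalg.norm(pred - gt, axis=-1)        # (N, K) per-keypoint errors
    return float(np.mean(dists <= alpha * ref_size[:, None]))

# Example: 2 hands, 21 keypoints each, normalized by hand bounding-box size.
rng = np.random.default_rng(0)
gt = rng.uniform(0, 100, size=(2, 21, 2))
pred = gt + rng.normal(0, 2, size=(2, 21, 2))
print(pck(pred, gt, ref_size=np.array([80.0, 60.0])))
```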
Transformer-Based Multi-Modal Fusion for Martian Impact Crater Classification
Impact craters, as key geomorphic features on Mars, provide important insights into surface processes and geological evolution. However, automatic classification of crater morphologies remains challenging due to substantial variations in size, degradation degree, and data quality across different types of Martian craters. This study proposes a multi-modal framework for Martian crater classification by integrating infrared imagery, an optical map, and digital elevation model (DEM) data. Specifically, daytime infrared imagery from THEMIS, a color map from the Tianwen-1 MoRIC instrument, and topographic data derived from combined MOLA–HRSC observations are used to capture complementary thermal, morphological, and elevation-related characteristics. A transformer-based feature extraction and cross-modal fusion strategy is adopted, where infrared imagery guides the interaction among multi-source features. Experiments on a carefully constructed dataset covering four crater categories, i.e., standard craters, layered ejecta craters, degraded craters, and secondary craters, demonstrate that the proposed approach achieves an overall precision of 0.848 and a recall of 0.851, outperforming single-modality baselines. Layered ejecta craters exhibit the highest classification performance, benefiting from their distinctive ejecta morphologies, whereas secondary craters remain more difficult to classify due to their small spatial scales. The results highlight the value of multi-modal data for Martian crater morphology classification.
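As a minimal, self-contained illustration of "infrared imagery guides the interaction", the sketch below computes one cross-attention step in which IR tokens form the queries and another modality supplies the keys and values; the dimensions and random projections are assumptions, not the paper's design.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def ir_guided_cross_attention(ir_tokens, mod_tokens, d=32, seed=0):
    """One cross-attention step: queries come from IR features, keys/values from
    the other modality (optical map or DEM), so IR 'guides' the feature mixing."""
    rng = np.random.default_rng(seed)
    Wq = rng.standard_normal((ir_tokens.shape[-1], d)) / np.sqrt(d)
    Wk = rng.standard_normal((mod_tokens.shape[-1], d)) / np.sqrt(d)
    Wv = rng.standard_normal((mod_tokens.shape[-1], d)) / np.sqrt(d)
    Q, K, V = ir_tokens @ Wq, mod_tokens @ Wk, mod_tokens @ Wv
    return softmax(Q @ K.T / np.sqrt(d)) @ V          # (n_ir_tokens, d)

ir = np.random.default_rng(1).standard_normal((16, 64))   # IR patch tokens
dem = np.random.default_rng(2).standard_normal((16, 64))  # DEM patch tokens
fused = ir_guided_cross_attention(ir, dem)
```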
A novel deep learning framework for accurate melanoma diagnosis integrating imaging and genomic data for improved patient outcomes
Background: Melanoma is one of the most malignant forms of skin cancer, with a high mortality rate in the advanced stages, so early and accurate detection plays an important role in improving patients' prognosis. Biopsy is the traditional method for melanoma diagnosis, but it lacks reliability; it is therefore important to apply new methods to diagnose melanoma effectively. Aim: This study presents a new approach to classifying melanoma using deep neural networks (DNNs) that combine multi-modal imaging and genomic data, which could potentially provide more reliable diagnosis than current medical methods. Method: We built a dataset of dermoscopic images, histopathological slides, and genomic profiles, and developed a custom framework composed of two widely established types of neural networks: Convolutional Neural Networks (CNNs) for analysing image data and Graph Neural Networks (GNNs) for analysing genomic data. We trained and evaluated the proposed framework on this dataset. Results: The developed multi-modal DNN achieved higher accuracy than traditional medical approaches, with a mean accuracy of 92.5% and an area under the receiver operating characteristic curve of 0.96, suggesting that the multi-modal DNN approach can detect critical morphologic and molecular features of melanoma beyond the limitations of traditional AI and machine learning approaches. Combining cutting-edge AI methods may give access to a broader range of diagnostic data, helping dermatologists make more accurate decisions and refine treatment strategies. However, the framework will have to be validated at a larger scale, and more clinical trials are needed to establish whether this novel diagnostic approach is more effective and feasible.
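A minimal two-branch fusion sketch in the spirit of the framework described above: a small CNN branch for the image input and a dense stand-in for the genomic branch (the paper uses a Graph Neural Network there, which is not reproduced here). All shapes and layer sizes are assumptions.

```python
import tensorflow as tf

# Image branch: stand-in CNN over dermoscopic/histopathology patches.
img_in = tf.keras.Input(shape=(64, 64, 3))
x = tf.keras.layers.Conv2D(16, 3, activation="relu")(img_in)
x = tf.keras.layers.MaxPooling2D()(x)
x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)

# Genomic branch: dense stand-in where the paper uses a GNN over genomic profiles.
gen_in = tf.keras.Input(shape=(128,))
g = tf.keras.layers.Dense(32, activation="relu")(gen_in)

# Late fusion: concatenate the two embeddings and classify.
fused = tf.keras.layers.Concatenate()([x, g])
out = tf.keras.layers.Dense(1, activation="sigmoid")(fused)  # melanoma probability

model = tf.keras.Model(inputs=[img_in, gen_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
```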