Catalogue Search | MBRL

Leveraging 3D Molecular Spatial Visual Information and Multi‐Perspective Representations for Drug Discovery

by Tan, Feng; Zhang, Zimai; Qi, Yujie in Artificial intelligence; Computational Biology - methods; Deep Learning

2026

Drug discovery remains a costly and time‐intensive process, where accurate identification of drug associations is critical for therapeutic development. Existing computational approaches predominantly rely on sequence‐derived or 2D molecular representations, often overlooking the intrinsic 3D complexity of small molecules. Here, a deep learning framework is presented that directly learns from 3D molecular spatial visual information, capturing geometric, topological, and stereochemical features from spatial renderings. By integrating this spatial information with traditional molecular descriptors, unified multi‐perspective representations are constructed that better reflect molecular structure and function. Across benchmark tasks involving drug–microRNA, drug–drug, and drug–protein interaction prediction, this model consistently outperforms conventional fingerprint‐based baselines. Interpretability analyses show that the model attends to biologically relevant substructures, highlighting the value of 3D molecular spatial visual information in molecular recognition. These findings demonstrate the potential of spatially informed learning to enhance predictive performance and provide mechanistic insights in computational drug discovery. A deep learning framework called MolVisGNN is proposed to fuse 3D molecular visual information of drugs with multi‐source features. The results indicate the importance of 3D molecular visual information and the competitiveness of the model in drug discovery, and they offer a reference for representing small‐molecule drugs more comprehensively in future deep learning work.

Journal Article

A High-Resolution Remote Sensing Road Extraction Method Based on the Coupling of Global Spatial Features and Fourier Domain Features

by Xing, Xiaoyu; Wu, Yanlan; Wu, Yongchuang in Accuracy; Artificial intelligence; China

2024

Deep learning is an important approach to road extraction from remote sensing imagery. However, in complex remote sensing images, different road information often exhibits varying frequency distributions and texture characteristics, and it is usually difficult to express the comprehensive characteristics of roads effectively from a single spatial domain perspective. To address these issues, this article proposes a road extraction method that couples global spatial learning with Fourier frequency domain learning. The method first uses a transformer to capture global road features and then applies a Fourier transform to separate and enhance high-frequency and low-frequency information. Finally, it integrates spatial and frequency domain features to express road characteristics comprehensively and overcome the effects of intra-class differences and occlusions. Experimental results on the HF, MS, and DeepGlobe road datasets show that our method expresses road features more comprehensively than other deep learning models (e.g., Unet, D-Linknet, DeepLab-v3, DCSwin, SGCN) and extracts road boundaries more accurately and coherently. The extracted results also achieved IoU scores of 72.54%, 55.35%, and 71.87%.
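The frequency-separation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a single 1D row of pixel values instead of a 2D image, uses a naive DFT from the Python standard library, and the function names and the `cutoff` parameter are hypothetical choices made for illustration.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform, O(n^2); adequate for a short demo."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft() above."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_frequencies(signal, cutoff):
    """Split a real signal into low- and high-frequency parts.

    `cutoff` is an assumed parameter: bins whose frequency index is at most
    `cutoff` are treated as low frequency, all others as high frequency.
    """
    n = len(signal)
    spectrum = dft(signal)
    low_spec, high_spec = [], []
    for k, coeff in enumerate(spectrum):
        freq = min(k, n - k)  # symmetric index, so both parts stay real-valued
        if freq <= cutoff:
            low_spec.append(coeff)
            high_spec.append(0)
        else:
            low_spec.append(0)
            high_spec.append(coeff)
    low = [v.real for v in idft(low_spec)]
    high = [v.real for v in idft(high_spec)]
    return low, high


# A flat row of pixels with one sharp spike (for example, an edge).
row = [1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0]
low, high = split_frequencies(row, cutoff=1)
```

Because the two masks partition the spectrum, the low and high parts add back up to the original row, so each band can be processed or enhanced separately before the two are recombined.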

Journal Article

Enhancing Driver Monitoring Systems Based on Novel Multi-Task Fusion Algorithm

by Dada, Ibidapo Dare; Abayomi-Alli, Adebayo A.; Raudonis, Vidas in Accident prevention; Accidents, Traffic - prevention & control; Accuracy

2025

Distracted driving continues to be a major contributor to road accidents, which has driven growing research interest in advanced driver monitoring systems for enhanced safety. This paper seeks to improve the overall performance and effectiveness of such systems by highlighting the importance of recognizing the driver’s activity. It introduces a novel methodology for assessing driver attention from multi-perspective videos that capture the driver’s full body, hands, and face, focusing on three driver tasks: distracted actions, gaze direction, and hands-on-wheel monitoring. The experimental evaluation was conducted in two phases: first, assessing driver distracted activities, gaze direction, and hands-on-wheel status using a CNN-based model and videos from three cameras placed inside the vehicle; and second, evaluating the multi-task fusion algorithm through the aggregated danger score, which is introduced in this paper as a representation of the driver’s attentiveness. The proposed methodology was built and evaluated using the DMD dataset; additionally, model robustness was tested on the AUC_V2 and SAMDD driver distraction datasets. The proposed algorithm effectively combines multi-task information from different perspectives and evaluates the attention level of the driver.

Journal Article

MPASL: multi-perspective learning knowledge graph attention network for synthetic lethality prediction in human cancer

by Yan, Chaokun; Liang, Wenjuan; Zhang, Ge in Antineoplastic drugs; attention mechanism; Cancer therapies

2024

Synthetic lethality (SL) is widely used to discover anti-cancer drug targets. However, identifying SL interactions through wet-lab experiments is costly and inefficient. Hence, the development of efficient and high-accuracy computational methods for SL interaction prediction is of great significance. In this study, we propose MPASL, a multi-perspective learning knowledge graph attention network to enhance synthetic lethality prediction. MPASL utilizes knowledge graph hierarchy propagation to explore multi-source neighbor nodes related to genes. The knowledge graph ripple propagation expands gene representations through existing gene SL preference sets. MPASL can learn gene representations from both the gene-entity perspective and the entity-entity perspective. Specifically, based on the aggregation method, we learn gene-oriented entity embeddings. Then, the gene representations are refined by comparing the various layer-wise neighborhood features of entities using the discrepancy contrastive technique. Finally, the learned gene representation is applied in SL prediction. Experimental results demonstrate that MPASL outperforms several state-of-the-art methods. Additionally, case studies have validated the effectiveness of MPASL in identifying SL interactions between genes.

Journal Article

PWFNet: Pyramidal Wavelet–Frequency Attention Network for Road Extraction

by Zhao, Xiaolin; Yang, Xue; Xu, Dinglin in Accuracy; Architecture; Attention

2025

Road extraction from remote sensing imagery plays a critical role in applications such as autonomous driving, urban planning, and infrastructure development. Although deep learning methods have achieved notable progress, current approaches still struggle with complex backgrounds, varying road widths, and strong texture interference, often leading to fragmented road predictions or the misclassification of background regions. Given that roads typically exhibit smooth low-frequency characteristics while background clutter tends to manifest in mid- and high-frequency ranges, incorporating frequency-domain information can enhance the model’s structural perception and discrimination capabilities. To address these challenges, we propose a novel frequency-aware road extraction network, termed PWFNet, which combines frequency-domain modeling with multi-scale feature enhancement. PWFNet comprises two key modules. First, the Pyramidal Wavelet Convolution (PWC) module employs multi-scale wavelet decomposition fused with localized convolution to accurately capture road structures across various spatial resolutions. Second, the Frequency-aware Adjustment Module (FAM) partitions the Fourier spectrum into multiple frequency bands and incorporates a spatial attention mechanism to strengthen low-frequency road responses while suppressing mid- and high-frequency background noise. By integrating complementary modeling from both spatial and frequency domains, PWFNet significantly improves road continuity, edge clarity, and robustness under complex conditions. Experiments on the DeepGlobe and CHN6-CUG road datasets demonstrate that PWFNet achieves IoU improvements of 3.8% and 1.25% over the best-performing baseline methods, respectively. In addition, we conducted cross-region transfer experiments by directly applying the trained model to remote sensing images from different geographic regions and at varying resolutions to assess its generalization capability. 
The results demonstrate that PWFNet maintains the continuity of main and branch roads and preserves edge details in these transfer scenarios, effectively reducing false positives and missed detections. This further validates its practicality and robustness in diverse real-world environments.

Journal Article

Multi-perspective dynamic consistency learning for semi-supervised medical image segmentation

by Zhu, Yongfa; Wang, Xue; Liu, Taihui in 631/114/2397; 631/1647/48; 639/166/985

2025

Semi-supervised learning (SSL) is an effective method for medical image segmentation as it alleviates the dependence on clinical pixel-level annotations. Among SSL methods, pseudo-labeling and consistency regularization are the dominant paradigm. However, current consistency regularization methods based on shared encoder structures are prone to trapping the model in cognitive bias, which impairs segmentation performance. Furthermore, traditional fixed-threshold pseudo-label selection methods make little use of low-confidence pixels, leaving the model’s initial segmentation capability insufficient, especially for confusing regions. To this end, we propose a multi-perspective dynamic consistency (MPDC) framework to mitigate model cognitive bias and to fully utilize the low-confidence pixels. Specifically, we propose a novel multi-perspective collaborative learning strategy that encourages the sub-branch networks to learn discriminative features from multiple perspectives, thus avoiding the problem of model cognitive bias and enhancing boundary perception. In addition, we employ a dynamic decoupling consistency scheme to fully utilize low-confidence pixels. By dynamically adjusting the threshold, more pseudo-labels are involved in the early stages of training. Extensive experiments on several challenging medical image segmentation datasets show that our method achieves state-of-the-art performance, especially on boundaries, with significant improvements.
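The idea of a dynamically adjusted pseudo-label threshold can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the schedule, so the linear ramp, the `start` and `end` values, and both function names are hypothetical, and the decoupling part of the scheme is not modelled.

```python
def dynamic_threshold(step, total_steps, start=0.5, end=0.95):
    """Confidence threshold that rises linearly from `start` to `end`.

    The linear ramp and its endpoints are assumptions for illustration;
    the abstract only says the threshold is adjusted dynamically.
    Assumes total_steps > 0.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * progress


def select_pseudo_labels(confidences, step, total_steps):
    """Return indices of pixels whose confidence meets the current threshold."""
    tau = dynamic_threshold(step, total_steps)
    return [i for i, c in enumerate(confidences) if c >= tau]


# Per-pixel confidence of the predicted class for five example pixels.
confidences = [0.55, 0.70, 0.92, 0.97, 0.40]
early = select_pseudo_labels(confidences, step=0, total_steps=100)
late = select_pseudo_labels(confidences, step=100, total_steps=100)
```

With a low threshold at the start of training, four of the five pixels contribute pseudo-labels; by the end, only the most confident pixel passes, which matches the abstract's point that more pseudo-labels take part early on.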

Journal Article

Multi-Perspective Anomaly Detection

by Madan, Manav; Schmid-Schirling, Tobias; Valada, Abhinav in Algorithms; anomaly detection; data fusion

2021

Anomaly detection is a critical problem in the manufacturing industry. In many applications, images of objects to be analyzed are captured from multiple perspectives, which can be exploited to improve the robustness of anomaly detection. In this work, we build upon the deep support vector data description algorithm and address multi-perspective anomaly detection using three different fusion techniques, i.e., early fusion, late fusion, and late fusion with multiple decoders. We employ different augmentation techniques with a denoising process to deal with scarce one-class data, which further improves the performance (ROC AUC = 80%). Furthermore, we introduce the dices dataset, which consists of over 2000 grayscale images of falling dice from multiple perspectives, with 5% of the images containing rare anomalies (e.g., drill holes, sawing, or scratches). We evaluate our approach on the new dices dataset using images from two different perspectives and also benchmark on the standard MNIST dataset. Extensive experiments demonstrate that our proposed multi-perspective approach exceeds state-of-the-art single-perspective anomaly detection on both the MNIST and dices datasets. To the best of our knowledge, this is the first work that focuses on addressing multi-perspective anomaly detection in images by jointly using different perspectives together with one single objective function for anomaly detection.

Journal Article

Multi-label mental health classification in social media posts with multi-perspective prompt ensemble and auxiliary self-supervision

by Lin, Ching-Sheng; Liu, Feng-Chi; Ye, Qing-Yuan in 4014/477; 631/477; 639/705

2025

Anxiety and depression have become major global health concerns. With the rapid rise of social media, people increasingly share emotions and personal struggles through posts, which often convey multiple mental states simultaneously. To address this multi-label classification challenge in mental health texts, this study proposes a multi-task framework with two main modules, a multi-perspective prompt design module and a perturbation-based self-supervised learning module, based on a pre-trained language model backbone. Prompts from sociological, psychological, and educational perspectives are used to enhance semantic understanding. To improve model robustness, we formulate self-supervised auxiliary tasks where the model predicts whether a sentence has undergone insertion, swap, or deletion. Experiments on the MultiWD dataset, covering six wellness dimensions, show that our method outperforms all baselines. Furthermore, ablation studies explore the impact of different training configurations and confirm the critical contributions of both proposed modules.
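The perturbation-based auxiliary task can be illustrated by a small sketch that builds labelled training pairs. It rests on assumptions not stated in the abstract: an extra "none" class for unperturbed sentences, whitespace tokenization, an `[UNK]` filler token for insertions, and adjacent-token swaps. All names are hypothetical.

```python
import random

# Label set for the auxiliary task; the "none" class is an assumption.
PERTURBATIONS = ("none", "insertion", "swap", "deletion")


def perturb(tokens, kind, rng, filler="[UNK]"):
    """Return a perturbed copy of `tokens` (needs at least two tokens)."""
    tokens = list(tokens)
    if kind == "insertion":
        # Insert a filler token at a random position, including the ends.
        tokens.insert(rng.randrange(len(tokens) + 1), filler)
    elif kind == "swap":
        # Exchange one randomly chosen pair of adjacent tokens.
        i = rng.randrange(len(tokens) - 1)
        tokens[i], tokens[i + 1] = tokens[i + 1], tokens[i]
    elif kind == "deletion":
        # Drop one randomly chosen token.
        del tokens[rng.randrange(len(tokens))]
    return tokens


def make_example(sentence, rng):
    """Build one (perturbed sentence, label index) pair for the auxiliary task."""
    kind = rng.choice(PERTURBATIONS)
    perturbed = perturb(sentence.split(), kind, rng)
    return " ".join(perturbed), PERTURBATIONS.index(kind)


rng = random.Random(0)  # fixed seed so the demo is repeatable
post = "i feel tired and anxious today"
text, label = make_example(post, rng)
```

A classifier head trained on such pairs has to notice small structural changes in a sentence, which is one plausible way an auxiliary task of this kind could encourage robustness.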

Journal Article

MPCTrans: Multi-Perspective Cue-Aware Joint Relationship Representation for 3D Hand Pose Estimation via Swin Transformer

by Lin, Mingyu; Liu, Tingting; Rao, Ning in 3D hand pose estimation; Accuracy; Cameras

2024

The objective of 3D hand pose estimation (HPE) based on depth images is to accurately locate and predict keypoints of the hand. However, this task remains challenging because of the variations in hand appearance from different viewpoints and severe occlusions. To effectively address these challenges, this study introduces a novel approach, called the multi-perspective cue-aware joint relationship representation for 3D HPE via the Swin Transformer (MPCTrans, for short). This approach is designed to learn multi-perspective cues and essential information from hand depth images. To achieve this goal, three novel modules are proposed to utilize features from multiple virtual views of the hand, namely, the adaptive virtual multi-viewpoint (AVM), hierarchy feature estimation (HFE), and virtual viewpoint evaluation (VVE) modules. The AVM module adaptively adjusts the angles of the virtual viewpoint and learns the ideal virtual viewpoint to generate informative multiple virtual views. The HFE module estimates hand keypoints through hierarchical feature extraction. The VVE module evaluates virtual viewpoints by using chained high-level functions from the HFE module. The Swin Transformer is used as the backbone to extract the long-range semantic joint relationships in hand depth images. Extensive experiments demonstrate that the MPCTrans model achieves state-of-the-art performance on four challenging benchmark datasets.

Journal Article

Multi-perspective convolutional neural networks for citywide crowd flow prediction

by Kong, Weiyang; Zhang, Sen; Liu, Yubao in Artificial Intelligence; Artificial neural networks; Computer Science

2023

Crowd flow prediction is an important problem in urban computing with many applications, such as public security. Inspired by the success of deep learning, various deep learning models have been proposed to solve this problem. Although existing methods have achieved good prediction performance, they cannot effectively capture the richer spatial-temporal correlations that are important for crowd flow prediction. To address this limitation, we propose the MPCNN, a novel model built on 2D convolutional neural networks (CNNs) that captures richer spatial-temporal correlations from multiple perspectives. In particular, the MPCNN includes three perspective CNNs: the front CNN, the side CNN, and the top CNN. A fusion layer then combines the results of the three CNNs. In addition, the MPCNN uses external factors to enhance prediction performance. Based on four real-world datasets, we performed a series of experiments to compare the proposed method with existing methods, and the experimental results demonstrate the effectiveness and efficiency of the proposed method.
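One way to picture the front, side, and top perspectives is to treat a sequence of crowd-flow grids as a volume indexed by time, row, and column, and to slice it along each axis so that every slice is a 2D input for a 2D CNN. The sketch below covers only this slicing step. The mapping of "front", "side", and "top" to particular axes is an assumption, since the abstract does not define it, and the function names are hypothetical.

```python
def shape(volume):
    """Return (T, H, W) for a nested-list volume indexed as volume[t][h][w]."""
    return len(volume), len(volume[0]), len(volume[0][0])


def front_views(volume):
    """Assumed 'front' perspective: one H x W grid per time step."""
    return [[row[:] for row in frame] for frame in volume]


def side_views(volume):
    """Assumed 'side' perspective: one T x H slice per grid column w."""
    T, H, W = shape(volume)
    return [[[volume[t][h][w] for h in range(H)] for t in range(T)] for w in range(W)]


def top_views(volume):
    """Assumed 'top' perspective: one T x W slice per grid row h."""
    T, H, W = shape(volume)
    return [[[volume[t][h][w] for w in range(W)] for t in range(T)] for h in range(H)]


# Toy volume: 2 time steps over a 3 x 4 grid; each value encodes its own index.
volume = [[[t * 100 + h * 10 + w for w in range(4)] for h in range(3)] for t in range(2)]
front = front_views(volume)
side = side_views(volume)
top = top_views(volume)
```

Under this reading, the front slices show purely spatial patterns, while the side and top slices mix time with one spatial axis, which is how 2D convolutions could pick up spatial-temporal correlations.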

Journal Article
