Catalogue Search | MBRL
26 result(s) for "Computational inference of protein conformations and interactions"
Predicting intercellular communication based on metabolite-related ligand-receptor interactions with MRCLinkdb
by Zhang, Yang; Zhan, Meixiao; Sun, Taoping
in Algorithms, Animals, Biomedical and Life Sciences
2024
Background
Metabolite-associated cell communications play critical roles in maintaining human biological function. However, most existing tools and resources focus only on ligand-receptor interaction pairs where both partners are proteinaceous, neglecting other non-protein molecules. To address this gap, we introduce the MRCLinkdb database and algorithm, which aggregates and organizes data related to non-protein L-R interactions in cell-cell communication, providing a valuable resource for predicting intercellular communication based on metabolite-related ligand-receptor interactions.
Results
Here, we manually curated the metabolite-ligand-receptor (ML-R) interactions from the literature and known databases, ultimately collecting over 790 human and 670 mouse ML-R interactions. Additionally, we compiled information on over 1900 enzymes and 260 transporter entries associated with these metabolites. We developed Metabolite-Receptor based Cell Link Database (MRCLinkdb) to store these ML-R interaction data. Meanwhile, the platform also offers extensive information for presenting ML-R interactions, including fundamental metabolite information and the overall expression landscape of metabolite-associated gene sets (such as receptors, enzymes, and transporter proteins) based on single-cell transcriptomics sequencing (covering 35 human and 26 mouse tissues, 52 human and 44 mouse cell types) and bulk RNA-seq/microarray data (encompassing 62 human and 39 mouse tissues). Furthermore, MRCLinkdb introduces a web server dedicated to the analysis of intercellular communication based on ML-R interactions. MRCLinkdb is freely available at https://www.cellknowledge.com.cn/mrclinkdb/.
Conclusions
In addition to supplementing ligand-receptor databases, MRCLinkdb may provide new perspectives for decoding the intercellular communication and advancing related prediction tools based on ML-R interactions.
Journal Article
Prediction of protein–protein interaction based on interaction-specific learning and hierarchical information
by Jiang, Jing; Wang, Peng; Cao, Xiaofeng
in Biomedical and Life Sciences, Computational Biology - methods, Computational inference of protein conformations and interactions
2025
Background
Prediction of protein–protein interactions (PPIs) is fundamental for identifying drug targets and understanding cellular processes. The rapid growth of PPI studies necessitates the development of efficient and accurate tools for automated prediction of PPIs. In recent years, several robust deep learning models have been developed for PPI prediction and have found widespread application in proteomics research. Despite these advancements, current computational tools still face limitations in modeling both the pairwise interactions and the hierarchical relationships between proteins.
Results
We present HI-PPI, a novel deep learning method that integrates hierarchical representation of PPI network and interaction-specific learning for protein–protein interaction prediction. HI-PPI extracts the hierarchical information by embedding structural and relational information into hyperbolic space. A gated interaction network is then employed to extract pairwise features for interaction prediction. Experiments on multiple benchmark datasets demonstrate that HI-PPI outperforms the state-of-the-art methods; HI-PPI improves Micro-F1 scores by 2.62%–7.09% over the second-best method. Moreover, HI-PPI offers explicit interpretability of the hierarchical organization within the PPI network. The distance between the origin and the hyperbolic embedding computed by HI-PPI naturally reflects the hierarchical level of proteins.
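The abstract's statement that distance from the origin reflects hierarchical level can be illustrated with a small sketch. The record does not say which model of hyperbolic space HI-PPI uses, so the unit Poincaré ball (curvature -1) is assumed here, along with the common convention that points nearer the origin sit higher in the hierarchy. The protein names and coordinates are hypothetical.

```python
import math


def poincare_distance_from_origin(point):
    """Hyperbolic distance from the origin to `point` in the unit Poincare ball.

    Assumes curvature -1, where d(0, x) = 2 * artanh(||x||).
    """
    norm = math.sqrt(sum(c * c for c in point))
    if norm >= 1.0:
        raise ValueError("point must lie strictly inside the unit ball")
    return 2.0 * math.atanh(norm)


# Hypothetical 2-D embeddings, for illustration only (not HI-PPI output).
embeddings = {
    "hub_protein": (0.05, 0.02),
    "mid_protein": (0.40, 0.30),
    "leaf_protein": (0.70, 0.65),
}

depths = {name: poincare_distance_from_origin(v) for name, v in embeddings.items()}

# Sort from closest to the origin (assumed top of the hierarchy) outward.
ranked = sorted(depths, key=depths.get)
```

Because the distance grows without bound as a point approaches the boundary of the ball, small differences in Euclidean norm near the boundary correspond to large differences in hyperbolic depth.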
Conclusions
Overall, the proposed HI-PPI effectively addresses the limitations of existing PPI prediction methods. By leveraging the hierarchical structure of PPI network, HI-PPI significantly enhances the accuracy and robustness of PPI predictions.
Journal Article
Negative sampling strategies impact the prediction of scale-free biomolecular network interactions with machine learning
2025
Background
Understanding protein-molecular interaction is crucial for unraveling the mechanisms underlying diverse biological processes. Machine learning (ML) techniques have been extensively employed in predicting these interactions and have garnered substantial research focus. Previous studies have predominantly centered on improving model performance through novel and efficient ML approaches, often resulting in overoptimistic predictive estimates. However, these advancements frequently neglect the inherent biases stemming from network properties, particularly in biological contexts.
Results
In this study, we examined the biases inherent in ML models during the learning and prediction of protein-molecular interactions, particularly those arising from the scale-free property of biological networks, a characteristic wherein a few nodes have many connections while most have very few. Our comprehensive analysis across diverse tasks, datasets, and ML methods provides compelling evidence of these biases. We discovered that the training and evaluation of ML models are profoundly influenced by network topology, potentially distorting model performance assessments. To mitigate this issue, we propose the degree distribution balanced (DDB) sampling strategy, a straightforward yet potent approach that alleviates biases stemming from network properties. This method further underscores the limitations of certain ML models in learning protein-molecular interactions solely from intrinsic molecular features.
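The abstract names DDB sampling but does not describe the procedure. The sketch below shows one plausible reading, stated as an assumption: draw negative pairs so that every node appears in the negative set exactly as often as it appears in the positive set, which prevents a model from separating classes by node degree alone. The function name, the rejection-by-reshuffling approach, and the toy network are all hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def degree_balanced_negatives(positives, seed=0, max_tries=10000):
    """Sample negatives whose per-node degrees match the positive set.

    Assumption: "balanced" means each protein and each molecule occurs in the
    negatives as many times as in the positives. Whole-set rejection sampling
    is used for clarity; it is not efficient for large networks.
    """
    rng = random.Random(seed)
    positive_set = set(positives)
    left_stubs = [protein for protein, _ in positives]
    right_stubs = [molecule for _, molecule in positives]
    for _ in range(max_tries):
        rng.shuffle(right_stubs)
        candidate = list(zip(left_stubs, right_stubs))
        no_duplicates = len(set(candidate)) == len(candidate)
        if no_duplicates and positive_set.isdisjoint(candidate):
            return candidate
    raise RuntimeError("no degree-matched negative set found")


# Toy bipartite network (hypothetical identifiers).
positives = [
    ("P1", "M1"), ("P1", "M2"),
    ("P2", "M3"), ("P2", "M4"),
    ("P3", "M1"), ("P4", "M3"),
]
negatives = degree_balanced_negatives(positives)

protein_degree_pos = Counter(p for p, _ in positives)
protein_degree_neg = Counter(p for p, _ in negatives)
molecule_degree_pos = Counter(m for _, m in positives)
molecule_degree_neg = Counter(m for _, m in negatives)
```

With uniform random negative sampling, high-degree nodes are over-represented among positives relative to negatives; matching the degree counts removes that shortcut from the evaluation.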
Conclusions
Our findings present a novel perspective for assessing the performance of ML models in inferring protein-molecular interactions with greater fairness. By addressing biases introduced by network properties, the DDB sampling approach provides a more balanced and precise assessment of model capabilities. These insights hold the potential to bolster the reliability of ML models in bioinformatics, fostering a more stringent evaluation framework for predicting protein-molecular interactions.
Journal Article
Bridging chemical structure and conceptual knowledge enables accurate prediction of compound-protein interaction
by Jiang, Jing; Zeng, Li; Ma, Tengfei
in Artificial intelligence, Biological activity, Biomedical and Life Sciences
2024
Background
Accurate prediction of compound-protein interaction (CPI) plays a crucial role in drug discovery. Existing data-driven methods aim to learn from the chemical structures of compounds and proteins yet ignore conceptual knowledge, that is, the interrelationships among the fundamental elements in the biomedical knowledge graph (KG). Knowledge graphs provide a comprehensive view of entities and relationships beyond individual compounds and proteins. They encompass a wealth of information like pathways, diseases, and biological processes, offering a richer context for CPI prediction. This contextual information can be used to identify indirect interactions, infer potential relationships, and improve prediction accuracy. In real-world applications, the prevalence of knowledge-missing compounds and proteins is a critical barrier for injecting knowledge into data-driven models.
Results
Here, we propose BEACON, a data and knowledge dual-driven framework that bridges chemical structure and conceptual knowledge for CPI prediction. The proposed BEACON learns the consistent representations by maximizing the mutual information between chemical structure and conceptual knowledge and predicts the missing representations by minimizing their conditional entropy. BEACON achieves state-of-the-art performance on multiple datasets compared to competing methods, notably with 5.1% and 6.6% performance gain on the BIOSNAP and DrugBank datasets, respectively. Moreover, BEACON is the only approach capable of effectively predicting knowledge representations for knowledge-lacking compounds and proteins.
Conclusions
Overall, our work provides a general approach for directly injecting conceptual knowledge to enhance the performance of CPI prediction.
Journal Article
An interpretable geometric graph neural network for enhancing the generalizability of drug–target interaction prediction
by Cui, Feifei; Xiong, An; Xia, Yan
in Biomedical and Life Sciences, Computational inference of protein conformations and interactions, Cross-attention
2025
Background
Accurate prediction of drug–target interactions (DTIs) is essential for advancing drug discovery. Although numerous computational methods have been proposed, many exhibit limited generalization, particularly when dealing with unseen drugs or targets.
Results
To address this challenge, we introduce GPS-DTI, a deep learning framework designed to capture both local and global features of drugs and proteins, thereby enhancing predictive robustness. Specifically, GPS-DTI employs a graph isomorphism network with edge features (GINE)–based graph neural network, combined with a multi-head attention mechanism (MHAM), to effectively model the structural characteristics of drug molecules. For proteins, representations are derived from the pre-trained Evolutionary Scale Modeling (ESM-2) model and further refined through convolutional neural networks (CNNs), yielding rich feature embeddings. A cross-attention module integrates drug and protein features, uncovering biologically meaningful interactions and improving model interpretability.
Conclusions
Comprehensive benchmarking across in-domain and cross-domain DTI prediction tasks demonstrates that GPS-DTI outperforms existing methods, underscoring its strong generalization capability. Notably, the model achieves state-of-the-art performance on drug–target affinity (DTA) tasks and shows robust adaptability when evaluated on an independent Coronavirus Disease 2019 (COVID-19)–related test set. Furthermore, visualization of cross-attention maps offers interpretable insights into key molecular interactions, highlighting the potential of GPS-DTI in real-world drug discovery applications.
Journal Article
Accurate prediction of drug-protein interactions by maintaining the original topological relationships among embeddings
2025
Background
Learning-based methods have recently demonstrated strong potential in predicting drug-protein interactions (DPIs). However, existing approaches often fail to achieve accurate predictions on real-world imbalanced datasets while maintaining high generalizability and scalability, limiting their practical applicability.
Results
This study proposes a highly generalized model, GLDPI, aimed at improving prediction accuracy in imbalanced scenarios by preserving the topological relationships among initial molecular representations in the embedding space. Specifically, GLDPI employs dedicated encoders to transform one-dimensional sequence information of drugs and proteins into embedding representations and efficiently calculates the likelihood of DPIs using cosine similarity. Additionally, we introduce a prior loss function based on the guilt-by-association principle to ensure that the topology of the embedding space aligns with the structure of the initial drug-protein network. This design enables GLDPI to effectively capture network relationships and key features of molecular interactions, thereby significantly enhancing predictive performance.
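The cosine-similarity scoring step mentioned above can be sketched as follows. This is only an illustration of scoring every drug-protein pair from precomputed embeddings; the encoders, the prior loss, and the way GLDPI maps a similarity in [-1, 1] to a likelihood are not given in the record and are not modelled here. All names and vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def score_pairs(drug_embeddings, protein_embeddings):
    """Score every (drug, protein) pair by cosine similarity of embeddings."""
    return {
        (drug, protein): cosine_similarity(d_vec, p_vec)
        for drug, d_vec in drug_embeddings.items()
        for protein, p_vec in protein_embeddings.items()
    }


# Hypothetical embeddings, for illustration only.
drug_embeddings = {"drug_a": (1.0, 0.0), "drug_b": (0.0, 1.0)}
protein_embeddings = {"prot_x": (2.0, 0.0), "prot_y": (1.0, 1.0)}

scores = score_pairs(drug_embeddings, protein_embeddings)
```

Scoring by a similarity of two independently computed embeddings means each drug and each protein is encoded once, and all pairwise scores follow from inexpensive vector operations, which is consistent with the scalability the abstract emphasizes.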
Conclusions
Experimental results highlight GLDPI’s superior performance on multiple highly imbalanced benchmark datasets, achieving over a 100% improvement in the AUPR metric compared to state-of-the-art methods. Additionally, GLDPI demonstrates exceptional generalization capabilities in cold-start experiments, excelling in predicting novel drug-protein interactions. Furthermore, the model exhibits remarkable scalability, efficiently inferring approximately 1.2 × 10^10 drug-protein pairs in less than 10 h.
Journal Article
MESM: integrating multi-source data for high-accuracy protein-protein interactions prediction through multimodal language models
by Chang, Shan; Chu, Jinming; Shen, Liyan
in Amino acid sequence, Amino acids, Artificial neural networks
2025
Background
Protein-protein interactions (PPIs) play a critical role in essential biological processes such as signal transduction, enzyme activity regulation, cytoskeletal structure, immune responses, and gene regulation. However, current methods mainly focus on extracting features from protein sequences and using graph neural network (GNN) to acquire interaction information from the PPI network graph. This limits the model’s ability to learn richer and more effective interaction information, thereby affecting prediction performance.
Results
In this study, we propose a novel deep learning method, MESM, for effectively predicting PPI. The datasets used for the PPI prediction task were primarily constructed from the STRING database, including two Homo sapiens PPI datasets, SHS27k and SHS148k, and two Saccharomyces cerevisiae PPI datasets, SYS30k and SYS60k. MESM consists of three key modules, as follows: First, MESM extracts multimodal representations from protein sequence information, protein structure information, and point cloud features through Sequence Variational Autoencoder (SVAE), Variational Graph Autoencoder (VGAE), and PointNet Autoencoder (PAE). Then, Fusion Autoencoder (FAE) is used to integrate these multimodal features, generating rich and balanced protein representations. Next, MESM leverages GraphGPS to learn structural information from the PPI network graph structure and combines Graph Attention Network (GAT) to further capture protein interaction information. Finally, MESM uses Graph Convolutional Network (GCN) and SubgraphGCN to extract global and local features from the perspective of the overall graph and subgraphs. Moreover, we build seven independent graphs from the overall PPI network graph to specifically learn the features of each PPI type, thereby enhancing the model’s learning ability for different types of interactions.
Conclusions
Compared to the state-of-the-art methods, MESM achieved improvements of 8.77%, 4.98%, 7.48%, and 6.08% on SHS27k, SHS148k, SYS30k, and SYS60k, respectively. The experimental results demonstrate that MESM exhibits significant improvements in PPI prediction performance.
Journal Article
CRBPSA: CircRNA-RBP interaction sites identification using sequence structural attention model
2024
Background
Due to the ability of circRNA to bind with corresponding RBPs and play a critical role in gene regulation and disease prevention, numerous identification algorithms have been developed. Nevertheless, most of the current mainstream methods primarily capture one-dimensional sequence features through various descriptors, while neglecting the effective extraction of secondary structure features. Moreover, as the number of introduced descriptors increases, the issues of sparsity and ineffective representation also rise, causing a significant burden on computational models and leaving room for improvement in predictive performance.
Results
Based on this, we focused on capturing the features of secondary structure in sequences and developed a new architecture called CRBPSA, which is based on a sequence-structure attention mechanism. Firstly, a base-pairing matrix is generated by calculating the matching probability between each base, with a Gaussian function introduced as a weight to construct the secondary structure. Then, a Structure_Transformer is employed to extract base-pairing information and spatial positional dependencies, enabling the identification of binding sites through deeper feature extraction. Experimental results using the same set of hyperparameters on 37 circRNA datasets, totaling 671,952 samples, show that the CRBPSA algorithm achieves an average AUC of 99.93%, surpassing all existing prediction methods.
Conclusions
CRBPSA is a lightweight and efficient prediction tool for circRNA-RBP, which can capture structural features of sequences with minimal computational resources and accurately predict protein-binding sites. This tool facilitates a deeper understanding of the biological processes and mechanisms underlying circRNA and protein interactions.
Journal Article
PharmRL: pharmacophore elucidation with deep geometric reinforcement learning
2024
Background
Molecular interactions between proteins and their ligands are important for drug design. A pharmacophore consists of favorable molecular interactions in a protein binding site and can be utilized for virtual screening. Pharmacophores are easiest to identify from co-crystal structures of a bound protein-ligand complex. However, designing a pharmacophore in the absence of a ligand is a much harder task.
Results
In this work, we develop a deep learning method that can identify pharmacophores in the absence of a ligand. Specifically, we train a CNN model to identify potential favorable interactions in the binding site, and develop a deep geometric Q-learning algorithm that attempts to select an optimal subset of these interaction points to form a pharmacophore. With this algorithm, we show better prospective virtual screening performance, in terms of F1 scores, on the DUD-E dataset than random selection of ligand-identified features from co-crystal structures. We also conduct experiments on the LIT-PCBA dataset and show that it provides efficient solutions for identifying active molecules. Finally, we test our method by screening the COVID moonshot dataset and show that it would be effective in identifying prospective lead molecules even in the absence of fragment screening experiments.
Conclusions
PharmRL addresses the need for automated methods in pharmacophore design, particularly in cases where a cognate ligand is unavailable. Experimental results demonstrate that PharmRL generates functional pharmacophores. Additionally, we provide a Google Colab notebook to facilitate the use of this method.
Journal Article
PBertKla: a protein large language model for predicting human lysine lactylation sites
2025
Background
Lactylation is a newly discovered type of post-translational modification, primarily occurring on lysine (K) residues of both histones and non-histones to exert diverse effects on target proteins. Research has shown that lysine lactylation (Kla) modification is ubiquitous in different cells and participates in the determination of cell function and fate, as well as in the initiation and progression of various diseases. Precise identification of Kla sites is fundamental for elucidating their biological functions and uncovering their application potential.
Results
Here, we proposed a novel human Kla site predictor (named PBertKla) through curating a reliable benchmark dataset with proper sample length and sequence identity threshold to train a protein large language model with optimal hyperparameters. Extensive experimental results consistently demonstrated that our model possessed robust human Kla site prediction ability, achieving an AUC (area under receiver operating characteristic curve) value of over 0.880 on the independent validation data. Feature visualization analysis further validated the effectiveness of the model in feature learning and representation from Kla sequences. Moreover, we benchmarked PBertKla against other cutting-edge models on an independent testing dataset from different sources, highlighting its superiority and transferability.
Conclusions
All results indicated that PBertKla excelled as an automatic predictor of human Kla sites, and it would advance the investigation of lactylation modifications and their significance in health and disease.
Journal Article