Catalogue Search | MBRL
596 result(s) for "Gao, Xinyue"
Hierarchical clustering-based coarse-to-fine classification framework for microbial protein function prediction
by Liu, Honglei; Zhu, Congmin; Chen, Shengyang
in Algorithms, Automatic classification, Bacterial proteins
2025
Background
Accurate prediction of microbial protein functions is essential for understanding microbial physiology, discovering novel probiotics, and driving biotechnological innovation. However, protein function prediction remains challenging due to the hierarchical and class-imbalanced nature of functional labels, particularly in large-scale annotations such as Enzyme Commission (EC) numbers and Gene Ontology (GO) terms. Most existing deep learning approaches fail to adequately address the long-tail distribution problem.
Methods
We propose a Hierarchical Cascaded Context Network (HCCN) that explicitly models functional hierarchies and emphasizes prediction of low-frequency (long-tail) labels. For EC classification, we design a coarse-to-fine network that captures parent–child dependencies among hierarchical labels. For GO prediction, we construct a semantically grounded hierarchical structure using ontology embedding and clustering, and develop an attention-based multi-level cascade predictor to exploit structured dependencies across Biological Process (BPO), Molecular Function (MFO), and Cellular Component (CCO). To mitigate label imbalance, we introduce a dynamic resampling strategy and a hierarchical loss weighting mechanism, which enforce inter-level regularization and enhance sensitivity to rare functions.
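The parent-child structure of EC labels and the idea of up-weighting rare labels can be illustrated with a minimal sketch. This is not the HCCN implementation: the function names `ec_levels` and `level_weights`, the plain inverse-frequency weighting, and the four training labels are all assumptions made for illustration.

```python
from collections import Counter

def ec_levels(ec_number):
    """Return the coarse-to-fine prefixes of an EC number such as '3.4.21.4'."""
    parts = ec_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]

def level_weights(ec_numbers):
    """Hypothetical inverse-frequency weight for every label at every level."""
    counts = Counter(label for ec in ec_numbers for label in ec_levels(ec))
    total = len(ec_numbers)
    return {label: total / n for label, n in counts.items()}

# Illustrative training labels (not from the paper's dataset).
train = ["3.4.21.4", "3.4.21.5", "3.4.24.3", "1.1.1.1"]
weights = level_weights(train)

print(ec_levels("3.4.21.4"))  # ['3', '3.4', '3.4.21', '3.4.21.4']
print(weights["3"] < weights["3.4.21"] < weights["1.1.1.1"])  # True
```

Coarse parents such as `3` are shared by many sequences and receive small weights, while rare leaf labels receive large ones, which is the general effect a long-tail weighting scheme aims for.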
Results
Experimental results show that HCCN consistently outperforms traditional sequence-alignment methods (e.g., DIAMOND, BLAST) and baseline neural networks (MLP and DeepGOPlus) across all major functional categories. On the full test set, HCCN achieves AUPR gains of up to 5.5% (EC), 6.5% (BPO), 4.9% (MFO), and 5.3% (CCO) over the best baseline. For low-frequency labels, HCCN demonstrates strong few-shot generalization, with improvements of +11.2% (EC low), +6.7% (BPO low), +9.2% (MFO low), and +4.6% (CCO low) in mAUPR.
Conclusions
The proposed HCCN framework provides an effective solution to hierarchical and imbalanced protein function prediction, significantly improving performance on long-tail functional labels. Code and data are publicly available at: https://github.com/YangLab-BUPT/HCCN.
Journal Article
DSTANet: A Lightweight and High-Precision Network for Fine-Grained and Early Identification of Maize Leaf Diseases in Field Environments
2025
Early and accurate identification of maize diseases is crucial for ensuring sustainable agricultural development. However, existing maize disease identification models face challenges including high inter-class similarity, intra-class variability, and limited capability in identifying early-stage symptoms. To address these limitations, we proposed DSTANet (decomposed spatial token aggregation network), a lightweight and high-performance model for maize leaf disease identification. In this study, we constructed a comprehensive maize leaf image dataset comprising six common disease types and healthy samples, with early and late stages of northern leaf blight and eyespot specifically differentiated. DSTANet employed MobileViT as the backbone architecture, combining the advantages of CNNs for local feature extraction with transformers for global feature modeling. To enhance lesion localization and mitigate interference from complex field backgrounds, DSFM (decomposed spatial fusion module) was introduced. Additionally, the MSTA (multi-scale token aggregator) was designed to leverage hidden-layer feature channels more effectively, improving information flow and preventing gradient vanishing. Experimental results showed that DSTANet achieved an accuracy of 96.11%, precision of 96.17%, recall of 96.11%, and F1-score of 96.14%. With only 1.9M parameters, 0.6 GFLOPs (floating point operations), and an inference speed of 170 images per second, the model meets real-time deployment requirements on edge devices. This study provided a novel and practical approach for fine-grained and early-stage maize disease identification, offering technical support for smart agriculture and precision crop management.
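As a quick consistency check on the reported metrics, the sketch below computes the harmonic mean of the stated precision and recall and converts the stated throughput into a per-image latency. It assumes the F1-score is the harmonic mean of the aggregate precision and recall; the paper may instead average per-class F1 values, so this only shows that the figures agree with each other.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(96.17, 96.11)      # percentages quoted in the abstract
latency_ms = 1000 / 170          # 170 images per second

print(round(f1, 2))              # 96.14
print(round(latency_ms, 2))      # 5.88
```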
Journal Article
Exploring Molecular Heteroencoders with Latent Space Arithmetic: Atomic Descriptors and Molecular Operators
by Baimacheva, Natalia; Aires-de-Sousa, Joao; Gao, Xinyue
in Algorithms, atomic descriptors, Fluorine compounds
2024
A variational heteroencoder based on recurrent neural networks, trained with SMILES linear notations of molecular structures, was used to derive the following atomic descriptors: delta latent space vectors (DLSVs) obtained from the original SMILES of the whole molecule and the SMILES of the same molecule with the target atom replaced. Different replacements were explored, namely, changing the atomic element, replacement with a character of the model vocabulary not used in the training set, or the removal of the target atom from the SMILES. Unsupervised mapping of the DLSV descriptors with t-distributed stochastic neighbor embedding (t-SNE) revealed a remarkable clustering according to the atomic element, hybridization, atomic type, and aromaticity. Atomic DLSV descriptors were used to train machine learning (ML) models to predict ¹⁹F NMR chemical shifts. An R² of up to 0.89 and mean absolute errors of up to 5.5 ppm were obtained for an independent test set of 1046 molecules with random forests or a gradient-boosting regressor. Intermediate representations from a Transformer model yielded comparable results. Furthermore, DLSVs were applied as molecular operators in the latent space: the DLSV of a halogenation (H→F substitution) was added to the LSVs of 4135 new molecules with no fluorine atom and decoded into SMILES, yielding 99% of valid SMILES, with 75% of the SMILES incorporating fluorine and 56% of the structures incorporating fluorine with no other structural change.
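The latent space arithmetic described above can be sketched with a toy stand-in for the encoder. Everything here is an assumption for illustration: `toy_encode` simply counts characters and is not the heteroencoder, the direction of the subtraction is a guess (the abstract does not state it), and the decoding step back to SMILES is omitted.

```python
def toy_encode(smiles, vocab="CNOF"):
    """Toy stand-in for the heteroencoder: one count per vocabulary character."""
    return [float(smiles.count(ch)) for ch in vocab]

def subtract(a, b):
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

# Atomic descriptor: LSV of the molecule minus LSV with the target atom replaced.
original = "CC(F)C"   # target atom: F
replaced = "CC(O)C"   # same molecule, F swapped for O
dlsv = subtract(toy_encode(original), toy_encode(replaced))

# Molecular operator: a fluorination delta added to the LSV of a new molecule.
fluorination = subtract(toy_encode("CCF"), toy_encode("CC"))
shifted = add(toy_encode("CCCO"), fluorination)

print(dlsv)     # [0.0, 0.0, -1.0, 1.0]
print(shifted)  # [3.0, 0.0, 1.0, 1.0]
```

In the real workflow the shifted vector would be passed to the model's decoder, and the output SMILES checked for validity and for the presence of fluorine.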
Journal Article
Effects of Extreme Climatic Events on the Autumn Phenology in Northern China Are Related to Vegetation Types and Background Climates
2024
The increased intensity and frequency of extreme climate events (ECEs) have significantly impacted vegetation phenology, further profoundly affecting the structure and functioning of terrestrial ecosystems. However, the mechanisms by which ECEs affect the end of the growing season (EOS), a crucial phenological phase, remain unclear. In this study, we first evaluated the temporal variations in the EOS anomalies in Northern China (NC) based on the Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI) from 2001 to 2018. We then used event coincidence analysis (ECA) to assess the susceptibility of EOS to four ECEs (i.e., extreme heat, extreme cold, extreme wet and extreme dry events). Finally, we examined the dependence of the response of EOS to ECEs on background climate conditions. Our results indicated a slight decrease in the proportion of areas experiencing extreme heat and dry events (1.10% and 0.66% per year, respectively) and a slight increase in the proportion of areas experiencing extreme wet events (0.77% per year) during the preseason period. Additionally, EOS exhibited a delaying trend at a rate of 0.25 days/a during the study period. The susceptibility of EOS to ECEs was closely related to local hydrothermal conditions, with higher susceptibility to extreme dry and extreme hot events in drier and warmer areas and higher susceptibility to extreme cold and extreme wet events in wetter regions. Grasslands, in contrast to forests, were more sensitive to extreme dry, hot and cold events due to their weaker resistance to water deficits and cold stress. This study sheds light on how phenology responds to ECEs across various ecosystems and hydrothermal conditions. Our results could also provide a valuable guide for ecosystem management in arid regions.
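Event coincidence analysis can be illustrated with a minimal sketch of a precursor coincidence rate: the fraction of EOS anomaly events that have at least one extreme event in the same or a preceding time step. The function name, the `window` parameter, and the event years are hypothetical, and the study's actual ECA settings (lags, windows, significance testing) are not given in the abstract.

```python
def precursor_coincidence_rate(response_times, extreme_times, window=0):
    """Fraction of response events with an extreme event 0..window steps earlier."""
    if not response_times:
        return 0.0
    extremes = set(extreme_times)
    hits = sum(
        any((t - lag) in extremes for lag in range(window + 1))
        for t in response_times
    )
    return hits / len(response_times)

# Illustrative event years (not from the study).
eos_anomaly_years = [2003, 2007, 2010, 2015]
extreme_dry_years = [2003, 2006, 2010, 2012]

same_year = precursor_coincidence_rate(eos_anomaly_years, extreme_dry_years)
with_lag = precursor_coincidence_rate(eos_anomaly_years, extreme_dry_years, window=1)

print(same_year)  # 0.5
print(with_lag)   # 0.75
```

A higher rate for one pixel or vegetation type than another would correspond to what the abstract calls higher susceptibility of EOS to that type of extreme event.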
Journal Article
Multi-granularity sentiment analysis and learning outcome prediction for Chinese educational texts based on transformer architecture
2025
With the increasing adoption of intelligent tutoring systems, accurately interpreting students’ emotional states in educational contexts is crucial for providing personalized learning support. In computer science, natural language processing (NLP) techniques offer promising solutions for sentiment analysis and academic performance prediction. In the field of Chinese language education, students’ emotional states significantly influence their learning outcomes. However, traditional sentiment analysis methods exhibit limited adaptability to educational texts, failing to capture multi-granularity emotional expressions effectively. To bridge this gap, this study proposes a Transformer-based multi-granularity sentiment analysis framework tailored specifically for Chinese educational texts, integrating sentiment classification with learning outcome prediction. Our approach operates across three distinct levels, sentence, paragraph, and full text, to extract nuanced emotional features comprehensively. Furthermore, we develop a predictive model that integrates these sentiment features with learning behavior data to estimate students’ academic performance accurately. Experimental results demonstrate that our framework consistently outperforms traditional and deep learning baseline models in sentiment classification and learning outcome prediction tasks. These findings highlight the substantial potential of NLP techniques to enhance adaptive learning strategies and optimize personalized learning experiences.
Journal Article
A spatial-frequency patching metasurface enabling super-capacity perfect vector vortex beams
2024
Optical vortices, which offer an infinite number of orthogonal orbital angular momentum channels, have demonstrated remarkable potential in optical multiplexing and associated applications. However, conventional vortex beams based on global phase modulation usually possess a single topological charge (TC) and a uniform radial distance with a donut-shaped intensity, leaving unlimited spatial intensity information unexplored. Here, to break the spatial capacity limitation, we introduce an entirely new concept of a spatial-frequency patching metasurface by patching the field distribution piece-by-piece in the spatial-frequency domain, thereby breaking the symmetry of the beam morphology and allowing for local manipulation of spatial intensity and TC distributions. Moreover, by superimposing two orthogonal circularly polarized perfect vortex beams (VBs), our approach offers a super-capacity with at least 13 channels across a 3D parametric space, including morphology, polarization azimuth and ellipticity angle, namely super-capacity perfect vector vortex beams (SC-PVVBs). Furthermore, we have designed an optimized Dammann grating to facilitate an array of SC-PVVBs, thereby unleashing the full potential across 13 channels/bits for multi-dimensional complex information communications. Our findings promise dense data transmission in an ultra-secure manner using VBs, opening up new avenues in super-capacity optical information technology in an integrated metasurface platform.
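The orthogonality of conventional orbital angular momentum channels mentioned at the start of the abstract can be checked numerically. The sketch below covers only the textbook azimuthal phase factor exp(i·l·θ) with integer topological charge l; it does not model the patched beams or the metasurface, and the function name and sampling density are assumptions.

```python
import cmath
import math

def oam_overlap(l1, l2, samples=360):
    """Normalized azimuthal overlap of exp(i*l1*theta) and exp(i*l2*theta)."""
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        total += cmath.exp(1j * l1 * theta).conjugate() * cmath.exp(1j * l2 * theta)
    return total / samples

print(abs(oam_overlap(3, 3)))          # 1.0 up to rounding: a mode overlaps itself
print(abs(oam_overlap(1, 2)) < 1e-9)   # True: different charges are orthogonal
```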
Journal Article
Discontinuous orbital angular momentum metasurface holography
2025
Orbital angular momentum (OAM) multiplexing holography has emerged as a pivotal technology for high-capacity optical communication, encryption and display, but it requires multiple inputs for decoding and its security remains constrained due to the rotational symmetry of topological charge (TC) distribution in conventional OAM modes. Here, we introduce a general paradigm of OAM multiplexing holography that enables multi-channel holographic encoding using a single incident light. Our methodology leverages a discontinuous OAM with a spatially varying TC across the azimuth, which breaks the rotational symmetry and imposes angular selectivity for information retrieval. Notably, by rationally designing the TC distribution, the discontinuous OAM exhibits self-orthogonality at different rotation angles, laying the foundation for multiplexed holography. A modified weighted Gerchberg-Saxton algorithm is developed to calculate the holographic phase profile, which can then be encoded onto a pure geometry-phase metasurface. By further integrating different pairs of discontinuous OAMs, we successfully expand the channel capacity for holographic multiplexing, significantly advancing high-security and high-capacity optical information encryption. Our work establishes discontinuous OAM as a versatile platform for secure optical communications, high-density data storage, and dynamic holographic displays, bridging the gap between structured light manipulation and cryptographic robustness.
Gao et al. realized a discontinuous orbital angular momentum metasurface holography, which enhances the channel capacity for holographic multiplexing and makes significant strides in high-security optical information encryption.
Journal Article
Evaluation of chirality descriptors derived from SMILES heteroencoders
by Baimacheva, Natalia; Aires-de-Sousa, Joao; Gao, Xinyue
in Algorithms, Arithmetic, Artificial intelligence
2025
Molecular representations of chirality, derived from latent space vectors (LSVs) of SMILES heteroencoders, were explored to train machine learning models to predict chiral properties, and were compared to conventional circular fingerprints. Latent space arithmetic was applied to enhance the representation of chirality, by calculating differences between the original descriptor of a molecule and the descriptor of its enantiomer, or the difference between the original descriptor and the descriptor obtained with the stereochemistry-depleted SMILES string. Machine learning was performed with the Random Forest algorithm applied to a dataset of 3858 molecules extracted from the literature (1929 pairs of enantiomers) to predict the elution order observed on the Chiralpak® AD-H column, as well as intrinsic structural chirality labels (R/S or canonical SMILES @/@@). The descriptors derived from the heteroencoders achieved an accuracy of up to 0.75 in the prediction of the elution order, and the fingerprints were superior (0.82). A better predictive ability was observed with the difference LSV descriptors than with the original descriptors.
Scientific contribution
Our work proposes latent space arithmetic to obtain descriptors of molecular chirality from SMILES heteroencoders. We used this molecular representation to build quantitative structure-enantioselectivity relationships for the prediction of the elution order of enantiomers in chiral chromatography and compared the results with those of circular fingerprints. We showed that delta descriptors of opposite enantiomers enhance the ability of latent space vectors to encode chirality.
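The two comparison strings used for the difference descriptors (the enantiomer and the stereochemistry-depleted SMILES) can be sketched with plain string operations. This is a naive illustration, not the authors' pipeline: it assumes all stereocentres are tetrahedral and written with `@`/`@@`, ignores extended chirality classes, and a cheminformatics toolkit would be the more robust choice in practice. The function names are hypothetical.

```python
def mirror_smiles(smiles):
    """Swap @ and @@ so every tetrahedral centre in the string is inverted."""
    return "@".join(part.replace("@", "@@") for part in smiles.split("@@"))

def strip_stereo(smiles):
    """Remove tetrahedral (@) and double-bond (/ and \\) stereo marks."""
    for mark in ("@", "/", "\\"):
        smiles = smiles.replace(mark, "")
    return smiles

l_alanine = "N[C@@H](C)C(=O)O"
print(mirror_smiles(l_alanine))  # N[C@H](C)C(=O)O
print(strip_stereo(l_alanine))   # N[CH](C)C(=O)O
```

Encoding the original string and one of these variants, then subtracting the two latent vectors, gives the kind of delta descriptor the abstract describes.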
Journal Article
Aluminum-Based Fuels as Energy Carriers for Controllable Power and Hydrogen Generation—A Review
2023
Metallic aluminum is widely used in propellants, energy-containing materials, and batteries due to its high energy density. In addition to burning in the air, aluminum can react with water to generate hydrogen. Aluminum is carbon-free and the solid-phase products can be recycled easily after the reaction. Micron aluminum powder is stable in the air and enables global trade. Aluminum metal is considered to be a viable recyclable carrier for clean energy. Based on the reaction characteristics of aluminum fuel in air and water, this work summarizes the energy conversion system of aluminum fuel, the combustion characteristics of aluminum, and the recycling of aluminum. The conversion path and application direction of electric energy and chemistry in the aluminum energy conversion system are described. The reaction properties of aluminum in the air are described, as well as the mode of activation and the effects of the aluminum-water reaction. In situ hydrogen production is achievable through the aluminum-water reaction. The development of low-carbon and energy-saving electrolytic aluminum technology is introduced. The work also analyzes the current difficulties and development directions for the large-scale application of aluminum fuel energy storage technology. The development of energy storage technology based on aluminum is conducive to transforming the energy structure.
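The hydrogen yield of the aluminum-water reaction follows from stoichiometry. The sketch below assumes the idealized reaction 2 Al + 6 H2O → 2 Al(OH)3 + 3 H2 with complete conversion; the review itself may discuss other products and real yields, which depend on activation and reaction conditions.

```python
M_AL = 26.98      # molar mass of aluminum, g/mol
M_H2 = 2.016      # molar mass of hydrogen gas, g/mol
H2_PER_AL = 1.5   # mol H2 per mol Al, from 2 Al + 6 H2O -> 2 Al(OH)3 + 3 H2

def hydrogen_yield_g(al_mass_g):
    """Ideal mass of H2 (grams) released by fully reacting the given mass of Al."""
    return al_mass_g / M_AL * H2_PER_AL * M_H2

yield_per_kg = hydrogen_yield_g(1000.0)
print(round(yield_per_kg, 1))  # about 112 g of H2 per kg of Al
```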
Journal Article
Sustainability Effects of Free Trade Zones: Evidence from Water Pollution in China
2025
Under the collaborative framework of sustainable development and environmental pollution control in China, there is an urgent need to break the governance dilemma of traditional environmental regulations and explore innovative paths for sustainability. This paper empirically tests the direct impact, spatial spillover effects, and mechanisms of free trade zones (FTZs) in China in reducing water pollution. Using a spatial Durbin model (SDM) combined with the staggered difference-in-differences (STA-DID) method on a dataset of 266 Chinese cities encompassing eastern, central, and western regions with diverse economic and environmental baselines from 2003 to 2023, the study finds that FTZs significantly reduce local water pollution by 9.17 million tons of untreated sewage discharge (β = −916.6, p < 0.01), with a spatial spillover effect that decreases pollution in surrounding cities by 12.33 million tons (β = −1232.9, p < 0.01). Upgrading industrial structure, accelerating technological innovation, and strengthening government environmental governance constitute the core mediating channels. This study provides theoretical support for institutional innovation in environmental governance and empirical evidence to address the trade-off between economic growth and environmental protection in China, contributing to the understanding of how context-specific institutional innovations can advance regional sustainability, aligning with the United Nations Sustainable Development Goals (SDGs).
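The reported coefficients and the tonnage figures line up if the dependent variable is measured in units of 10,000 tons. That unit is an assumption (the abstract does not state it); the sketch below only shows the arithmetic under that assumption.

```python
UNIT_TONS = 10_000  # assumed unit of the dependent variable (not stated in the abstract)

def coefficient_to_million_tons(beta):
    """Convert a regression coefficient into an absolute change in million tons."""
    return abs(beta) * UNIT_TONS / 1_000_000

direct = coefficient_to_million_tons(-916.6)
spillover = coefficient_to_million_tons(-1232.9)

print(round(direct, 2))     # 9.17
print(round(spillover, 2))  # 12.33
```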
Journal Article