Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
26 result(s) for "Deng, Yangdong"
Multiplexed nanomaterial-assisted laser desorption/ionization for pan-cancer diagnosis and classification
2022
As cancer is increasingly considered a metabolic disorder, it is postulated that serum metabolite profiling can be a viable approach for detecting the presence of cancer. By multiplexing mass spectrometry fingerprints from two independent nanostructured matrixes through machine learning for highly sensitive detection and high throughput analysis, we report a laser desorption/ionization (LDI) mass spectrometry-based liquid biopsy for pan-cancer screening and classification. The Multiplexed Nanomaterial-Assisted LDI for Cancer Identification (MNALCI) is applied in 1,183 individuals that include 233 healthy controls and 950 patients with liver, lung, pancreatic, colorectal, gastric, and thyroid cancers from two independent cohorts. MNALCI demonstrates 93% sensitivity at 91% specificity for distinguishing cancers from healthy controls in the internal validation cohort, and 84% sensitivity at 84% specificity in the external validation cohort, with up to eight metabolite biomarkers identified. In addition, across those six different cancers, the overall accuracy for identifying the tumor tissue of origin is 92% in the internal validation cohort and 85% in the external validation cohort. The excellent accuracy and minimum sample consumption make the high throughput assay a promising solution for non-invasive cancer diagnosis.
As cancer is increasingly considered a metabolic disorder, it is postulated that serum metabolite profiling can be a viable approach for detecting the presence of cancer. Here, the authors report a machine learning model using mass spectrometry-based liquid biopsy data for pan-cancer screening and classification.
Journal Article
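To make the multiplexing idea in the abstract above concrete, here is a minimal sketch of combining mass-spectrometry fingerprints from two nanostructured matrixes into one feature set for a machine-learning classifier. The data shapes, the synthetic inputs, and the choice of a random forest are illustrative assumptions, not the MNALCI pipeline itself.

```python
# Sketch: concatenate two fingerprint matrices and train a cancer-vs-healthy classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_samples = 600
fp_matrix_a = rng.normal(size=(n_samples, 300))   # fingerprint from matrix A (hypothetical 300 m/z bins)
fp_matrix_b = rng.normal(size=(n_samples, 300))   # fingerprint from matrix B
labels = rng.integers(0, 2, size=n_samples)       # 0 = healthy control, 1 = cancer (synthetic)

# "Multiplexing" is modelled here as simple feature concatenation.
features = np.hstack([fp_matrix_a, fp_matrix_b])

x_tr, x_te, y_tr, y_te = train_test_split(features, labels, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(x_tr, y_tr)
print("AUC on held-out split:", roc_auc_score(y_te, clf.predict_proba(x_te)[:, 1]))
```

On real data the same structure extends to a second, multi-class model for tissue-of-origin prediction.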
A spatiotemporal deep neural network for fine-grained multi-horizon wind prediction
2023
The prediction of wind in terms of both wind speed and direction, which has a crucial impact on many real-world applications like aviation and wind power generation, is extremely challenging due to the high stochasticity and complicated correlations in weather data. Existing methods typically focus on a subset of influential factors and thus lack a systematic treatment of the problem. In addition, fine-grained forecasting is essential for efficient industry operations, but has received less attention in the literature. In this work, we propose a novel data-driven model, multi-horizon spatiotemporal network (MHSTN), for accurate and efficient fine-grained wind prediction. MHSTN integrates multiple deep neural networks targeting different factors in a sequence-to-sequence (Seq2Seq) backbone to effectively extract features from various data sources and produce multi-horizon predictions for all sites within a given region. MHSTN is composed of four major modules. First, a temporal module fuses coarse-grained forecasts derived by numerical weather prediction (NWP) and historical on-site observation data at stations so as to leverage both global and local atmospheric information. Second, a spatial module exploits spatial correlation by modeling the joint representation of all stations. Third, an ensemble module weighs the above two modules for final predictions. Furthermore, a covariate selection module automatically chooses influential meteorological variables as initial input. MHSTN is already integrated into the scheduling platform of one of the busiest international airports in China. The evaluation results demonstrate that our model outperforms competitors by a significant margin.
Journal Article
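The temporal-fusion step described in the abstract above can be sketched as follows: a recurrent encoder summarizes station history and its summary is concatenated with the coarse NWP forecast at each future step. Layer sizes, horizon length, and variable counts are assumptions for illustration only, not the paper's configuration.

```python
# Sketch of fusing station observations with NWP forecasts for multi-horizon wind prediction.
import torch
import torch.nn as nn

class TemporalFusion(nn.Module):
    def __init__(self, n_obs_vars=4, n_nwp_vars=4, hidden=64, horizon=24):
        super().__init__()
        self.encoder = nn.GRU(n_obs_vars, hidden, batch_first=True)  # encodes on-site history
        self.horizon = horizon
        self.head = nn.Sequential(                                   # fuses history summary with NWP forecast
            nn.Linear(hidden + n_nwp_vars, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),                                    # wind speed and direction per step
        )

    def forward(self, obs_history, nwp_forecast):
        # obs_history: (batch, past_steps, n_obs_vars); nwp_forecast: (batch, horizon, n_nwp_vars)
        _, h = self.encoder(obs_history)
        summary = h[-1].unsqueeze(1).expand(-1, self.horizon, -1)
        return self.head(torch.cat([summary, nwp_forecast], dim=-1))

model = TemporalFusion()
pred = model(torch.randn(8, 48, 4), torch.randn(8, 24, 4))
print(pred.shape)  # torch.Size([8, 24, 2])
```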
Removing Backdoors in Pre-trained Models by Regularized Continual Pre-training
by
Cui, Ganqu
,
Qin, Yujia
,
Deng, Yangdong
in
Agnosticism
,
Classification
,
Computational linguistics
2023
Recent research has revealed that pre-trained models (PTMs) are vulnerable to backdoor attacks before the fine-tuning stage. The attackers can implant transferable backdoors in PTMs and control model outputs on any downstream task, which poses severe security threats to all downstream applications. Existing backdoor-removal defenses focus on task-specific classification models and are not suitable for defending PTMs against backdoor attacks. To this end, we propose the first backdoor removal method for PTMs. Based on the phenomenon in backdoored PTMs, we design a simple and effective backdoor eraser, which continually pre-trains the backdoored PTMs with a regularization term in an end-to-end manner. The regularization term removes backdoor functionalities from PTMs while the continual pre-training maintains their normal functionalities. We conduct extensive experiments on pre-trained models across different modalities and architectures. The experimental results show that our method can effectively remove backdoors inside PTMs and preserve benign functionalities of PTMs with a small amount of downstream-task-irrelevant auxiliary data, e.g., unlabeled plain text. The average attack success rate on three downstream datasets is reduced from 99.88% to 8.10% after our defense on the backdoored BERT. The codes are publicly available.
Journal Article
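The abstract describes continual pre-training with a regularization term. The toy sketch below shows only the general shape of such an objective: a language-modeling loss on auxiliary data plus a penalty term. The tiny model, the loss on raw tokens, and the L2 penalty on hidden states are stand-in assumptions, not the paper's actual regularizer.

```python
# Sketch: one "regularized continual pre-training" step on unlabeled auxiliary data.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyLM(nn.Module):
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.proj = nn.Linear(dim, vocab)
    def forward(self, tokens):
        hidden = self.emb(tokens)
        return self.proj(hidden), hidden

def regularized_step(model, tokens, targets, reg_weight=0.1):
    logits, hidden = model(tokens)
    lm_loss = F.cross_entropy(logits.view(-1, logits.size(-1)), targets.view(-1))  # pre-training loss
    reg_term = hidden.pow(2).mean()  # stand-in regularizer (assumption, not the paper's term)
    return lm_loss + reg_weight * reg_term

model = TinyLM()
tokens = torch.randint(0, 1000, (4, 16))
loss = regularized_step(model, tokens, tokens)
loss.backward()
print(float(loss))
```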
Catching Critical Transition in Engineered Systems
by
Huang, Jin
,
Meng, Tianchuang
,
Deng, Yangdong
in
Datasets
,
Dynamical systems
,
Early warning systems
2021
A variety of engineered systems can encounter critical transitions, where the system suddenly shifts from one stable state to another at a critical threshold. Such transitions have raised serious concerns because of their potentially disastrous impacts. We validate an often taken-for-granted hypothesis that the failure of engineered systems can be attributed to the respective critical transitions, and show how early warning signals are closely associated with those transitions. We demonstrate that it is feasible to use early warning signals to predict system failures. Our findings open a new path to forecasting failures of engineered systems with a generic method and provide supporting evidence for the universal existence of critical transitions in dynamical systems at multiple scales.
Journal Article
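Early warning signals of the kind the abstract refers to are commonly computed as rising variance and lag-1 autocorrelation in a sliding window of the system's measurements. The sketch below illustrates that generic computation; the window length and the synthetic signal are assumptions, not the paper's data.

```python
# Sketch: rolling variance and lag-1 autocorrelation as early warning indicators.
import numpy as np

def early_warning_signals(series, window=100):
    """Return rolling variance and lag-1 autocorrelation over `series`."""
    variances, autocorrs = [], []
    for start in range(len(series) - window):
        w = series[start:start + window]
        variances.append(np.var(w))
        autocorrs.append(np.corrcoef(w[:-1], w[1:])[0, 1])
    return np.array(variances), np.array(autocorrs)

# Toy signal whose fluctuations grow as a critical threshold is approached.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 2000)
signal = rng.normal(scale=0.1 + 0.9 * t, size=t.size)
var, ac1 = early_warning_signals(signal)
print(var[0], var[-1])  # variance rises toward the transition
```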
A High-Throughput, High-Accuracy System-Level Simulation Framework for System on Chips
by
Wang, Dawei
,
Wang, Xu
,
Deng, Yangdong
in
Circuit design
,
Computer simulation
,
Computer-generated environments
2011
The design of today's Systems-on-Chip (SoCs) is extremely challenging because it involves complicated design tradeoffs and heterogeneous design expertise. To explore the large solution space, system architects have to rely on system-level simulators to identify an optimized SoC architecture. In this paper, we propose a system-level simulation framework, System Performance Simulation Implementation Mechanism, or SPSIM. Based on SystemC TLM2.0, the framework consists of an executable SoC model, a simulation tool chain, and a modeling methodology. Compared with the large body of existing research in this area, this work aims to deliver a high simulation throughput while guaranteeing high accuracy on real industrial applications. Integrating the leading TLM techniques, our simulator attains a simulation speed within a factor of 35 of native hardware execution on a set of real-world applications. SPSIM incorporates effective timing models, which achieve high accuracy after hardware-based calibration. Experimental results on a set of mobile applications show that the difference between simulated and measured timing performance is within 10%, an accuracy that in the past could only be attained by cycle-accurate models.
Journal Article
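The core of the transaction-level timing approach mentioned above is that components exchange transactions annotated with calibrated latencies, and simulated time is accumulated per transaction rather than per cycle. The rough sketch below only illustrates that accumulation and the calibration check against a measurement; all latency numbers are made-up placeholders, not SPSIM's timing models.

```python
# Sketch: accumulating annotated per-transaction latencies and checking timing error.
from dataclasses import dataclass

@dataclass
class Transaction:
    kind: str        # e.g. "read" or "write"
    size_bytes: int

# Hypothetical calibrated latencies in nanoseconds per transaction kind.
LATENCY_NS = {"read": 120, "write": 80}

def simulate(transactions):
    """Estimate total execution time by summing annotated latencies."""
    total_ns = 0
    for tx in transactions:
        total_ns += LATENCY_NS[tx.kind] + tx.size_bytes // 64  # fixed + payload-dependent part
    return total_ns

workload = [Transaction("read", 256), Transaction("write", 128), Transaction("read", 64)]
simulated = simulate(workload)
measured = 340  # hypothetical hardware measurement for the same workload
print(simulated, abs(simulated - measured) / measured)  # relative timing error
```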
A Spatiotemporal Deep Neural Network for Fine-Grained Multi-Horizon Wind Prediction
by
Deng, Yangdong
,
Huang, Fanling
in
Artificial neural networks
,
Mathematical models
,
Meteorological data
2023
The prediction of wind in terms of both wind speed and direction, which has a crucial impact on many real-world applications like aviation and wind power generation, is extremely challenging due to the high stochasticity and complicated correlations in weather data. Existing methods typically focus on a subset of influential factors and thus lack a systematic treatment of the problem. In addition, fine-grained forecasting is essential for efficient industry operations, but has received less attention in the literature. In this work, we propose a novel data-driven model, Multi-Horizon SpatioTemporal Network (MHSTN), for accurate and efficient fine-grained wind prediction. MHSTN integrates multiple deep neural networks targeting different factors in a sequence-to-sequence (Seq2Seq) backbone to effectively extract features from various data sources and produce multi-horizon predictions for all sites within a given region. MHSTN is composed of four major modules. First, a temporal module fuses coarse-grained forecasts derived by Numerical Weather Prediction (NWP) and historical on-site observation data at stations so as to leverage both global and local atmospheric information. Second, a spatial module exploits spatial correlation by modeling the joint representation of all stations. Third, an ensemble module weighs the above two modules for final predictions. Furthermore, a covariate selection module automatically chooses influential meteorological variables as initial input. MHSTN is already integrated into the scheduling platform of one of the busiest international airports in China. The evaluation results demonstrate that our model outperforms competitors by a significant margin.
TCGAN: Convolutional Generative Adversarial Network for Time Series Classification and Clustering
2023
Recent works have demonstrated the superiority of supervised Convolutional Neural Networks (CNNs) in learning hierarchical representations from time series data for successful classification. These methods require sufficiently large labeled datasets for stable learning; however, acquiring high-quality labeled time series data can be costly and potentially infeasible. Generative Adversarial Networks (GANs) have achieved great success in enhancing unsupervised and semi-supervised learning. Nonetheless, to the best of our knowledge, it remains unclear how effectively GANs can serve as a general-purpose solution to learn representations for time series recognition, i.e., classification and clustering. The above considerations inspire us to introduce a Time-series Convolutional GAN (TCGAN). TCGAN learns by playing an adversarial game between two one-dimensional CNNs (i.e., a generator and a discriminator) in the absence of label information. Parts of the trained TCGAN are then reused to construct a representation encoder to empower linear recognition methods. We conducted comprehensive experiments on synthetic and real-world datasets. The results demonstrate that TCGAN is faster and more accurate than existing time-series GANs. The learned representations enable simple classification and clustering methods to achieve superior and stable performance. Furthermore, TCGAN retains high efficacy in scenarios with few labeled and imbalanced labeled data. Our work provides a promising path to effectively utilizing abundant unlabeled time series data.
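The setup described above, a generator and a discriminator that are both one-dimensional CNNs, with the discriminator's convolutional trunk later reused as a representation encoder, can be sketched as below. Channel sizes and series length are assumptions for illustration, not TCGAN's published architecture.

```python
# Sketch: 1D-convolutional GAN for time series; the discriminator trunk doubles as an encoder.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, latent_dim=100, series_len=128):
        super().__init__()
        self.series_len = series_len
        self.fc = nn.Linear(latent_dim, 16 * (series_len // 4))
        self.net = nn.Sequential(
            nn.ConvTranspose1d(16, 8, kernel_size=4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose1d(8, 1, kernel_size=4, stride=2, padding=1),
        )
    def forward(self, z):
        x = self.fc(z).view(z.size(0), 16, self.series_len // 4)
        return self.net(x)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(   # reusable as a representation encoder after training
            nn.Conv1d(1, 8, kernel_size=4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv1d(8, 16, kernel_size=4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.head = nn.Linear(16, 1)
    def forward(self, x):
        return self.head(self.encoder(x))

g, d = Generator(), Discriminator()
fake = g(torch.randn(4, 100))
print(fake.shape, d(fake).shape)  # torch.Size([4, 1, 128]) torch.Size([4, 1])
```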
A Multi-Level Framework for Accelerating Training Transformer Models
2024
The fast-growing capabilities of large-scale deep learning models, such as BERT, GPT and ViT, are revolutionizing the landscape of NLP, CV and many other domains. Training such models, however, poses an unprecedented demand for computing power, which incurs exponentially increasing energy costs and carbon dioxide emissions. It is thus critical to develop efficient training solutions to reduce training costs. Motivated by a set of key observations of inter- and intra-layer similarities among feature maps and attentions that can be identified from typical training processes, we propose a multi-level framework for training acceleration. Specifically, the framework is based on three basic operators, Coalescing, De-coalescing and Interpolation, which can be orchestrated to build a multi-level training framework. The framework consists of a V-cycle training process, which progressively down- and up-scales the model size and projects the parameters between adjacent levels of models via coalescing and de-coalescing. The key idea is that a smaller model can be trained for fast convergence, and its trained parameters provide high-quality intermediate solutions for the next-level, larger network. The interpolation operator is designed to break the symmetry of neurons incurred by de-coalescing, for better convergence performance. Our experiments on transformer-based language models (e.g. BERT, GPT) as well as a vision model (e.g. DeiT) show that the proposed framework reduces the computational cost by about 20% when training BERT/GPT-Base models and by up to 51.6% when training the BERT-Large model, while preserving performance.
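The three operators named above can be illustrated on a single weight matrix: coalescing merges adjacent output neurons, de-coalescing lifts a small layer back to the larger width, and interpolation blends the lifted weights with the original ones to break neuron symmetry. Averaging and duplicating pairs of neurons, and the blending factor, are simplified stand-ins for the paper's projection operators.

```python
# Sketch: Coalescing / De-coalescing / Interpolation on one weight matrix.
import numpy as np

def coalesce(weight):
    """Merge pairs of adjacent output neurons by averaging their weights."""
    return 0.5 * (weight[0::2] + weight[1::2])

def de_coalesce(small_weight):
    """Project a small layer back to the larger width by duplicating neurons."""
    return np.repeat(small_weight, 2, axis=0)

def interpolate(original, projected, alpha=0.5):
    """Blend projected weights with the originals to break neuron symmetry."""
    return alpha * projected + (1 - alpha) * original

rng = np.random.default_rng(0)
w_large = rng.normal(size=(8, 4))      # hypothetical layer: 8 output neurons, 4 inputs
w_small = coalesce(w_large)            # train the smaller model starting from this
w_back = de_coalesce(w_small)          # lift the (trained) small weights back up
w_init = interpolate(w_large, w_back)  # initialization for the next-level, larger model
print(w_small.shape, w_init.shape)     # (4, 4) (8, 4)
```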
InstCache: A Predictive Cache for LLM Serving
by
Zou, Longwei
,
Liu, Tingfeng
,
Deng, Yangdong
in
Algorithms
,
Energy consumption
,
Large language models
2024
Large language models are revolutionizing every aspect of human life. However, this unprecedented power comes at the cost of significant computing intensity, implying long latency and a large energy footprint. Key-Value Cache and Semantic Cache have been proposed as solutions to the above problem, but both suffer from limited scalability due to the significant memory cost of each token or instruction embedding. Motivated by the observation that most instructions are short, repetitive and predictable by LLMs, we propose to predict user instructions with an instruction-aligned LLM and store them in a predictive cache, termed InstCache. We introduce an instruction pre-population algorithm based on the negative log likelihood of instructions, determining the cache size with regard to the hit rate. The proposed InstCache is efficiently implemented as a hash table with minimal lookup latency for deployment. Experimental results show that InstCache can achieve up to a 51.34% hit rate on the LMSys dataset, which corresponds to a 2x speedup, at a memory cost of only 4.5 GB.
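The hash-table idea in the abstract above can be sketched as a cache that is pre-populated with predicted instructions whose negative log-likelihood falls under a threshold, and then answers exact-match lookups in constant time. The scoring function, threshold, and candidate list below are toy placeholders, not the paper's pre-population algorithm.

```python
# Sketch: a predictive instruction cache backed by a plain hash table.
import math

class InstCache:
    def __init__(self, nll_threshold=2.0):
        self.nll_threshold = nll_threshold
        self.table = {}                                  # instruction text -> cached response

    def prepopulate(self, candidates, nll_fn, answer_fn):
        """Insert predicted instructions whose NLL is under the threshold."""
        for instruction in candidates:
            if nll_fn(instruction) < self.nll_threshold:
                self.table[instruction] = answer_fn(instruction)

    def lookup(self, instruction):
        return self.table.get(instruction)               # O(1) hash-table hit or miss

# Toy stand-ins for an instruction-aligned LLM's scoring and answering.
toy_nll = lambda s: math.log(1 + len(s.split()))
toy_answer = lambda s: f"cached response to: {s}"

cache = InstCache(nll_threshold=2.0)
cache.prepopulate(["what is the capital of france", "tell me a joke"], toy_nll, toy_answer)
print(cache.lookup("tell me a joke"))
```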