Catalogue Search | MBRL
149 result(s) for "Yoon, Hong-Jun"
Deep active learning for classifying cancer pathology reports
2021
Background
Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model.
Results
We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples each iteration of active learning), there was no clear winner between the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples each iteration of active learning), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We found that compared to random sampling, active learning strongly helps performance on rare classes by focusing on underrepresented classes.
Conclusions
Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the amount of labelled data to achieve the same performance as a dataset constructed using random sampling.
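The marginal and ratio uncertainty-sampling strategies that the abstract reports as strongest on the small dataset can be sketched as follows. This is an illustrative assumption, not the study's code: given per-sample class probabilities from the trained classifier, it selects the samples the model is least certain about for annotation.

```python
# Margin and ratio uncertainty sampling for active learning (sketch).

def margin_score(probs):
    """Difference between the top two class probabilities (smaller = more uncertain)."""
    top2 = sorted(probs, reverse=True)[:2]
    return top2[0] - top2[1]

def ratio_score(probs):
    """Ratio of the top two class probabilities (closer to 1 = more uncertain)."""
    top2 = sorted(probs, reverse=True)[:2]
    return top2[0] / top2[1]

def select_batch(unlabelled_probs, k, score=margin_score):
    """Return indices of the k most uncertain samples to send to annotators."""
    ranked = sorted(range(len(unlabelled_probs)),
                    key=lambda i: score(unlabelled_probs[i]))
    return ranked[:k]
```

In each active learning iteration, the selected indices would be labelled by annotators and added to the training set before the model is retrained.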
Journal Article
Improving Text Classification with Large Language Model-Based Data Augmentation
by Zhao, Huanhuan; Ruggles, Thomas A.; Singh, Debjani
in Artificial intelligence; Chatbots; ChatGPT
2024
Large Language Models (LLMs) such as ChatGPT possess advanced capabilities in understanding and generating text. These capabilities enable ChatGPT to create text based on specific instructions, which can serve as augmented data for text classification tasks. Previous studies have approached data augmentation (DA) by either rewriting the existing dataset with ChatGPT or generating entirely new data from scratch. However, it is unclear which method is better without comparing their effectiveness. This study investigates the application of both methods to two datasets: a general-topic dataset (Reuters news data) and a domain-specific dataset (Mitigation dataset). Our findings indicate that: 1. ChatGPT-generated new data consistently enhanced the model's classification results for both datasets. 2. Generating new data generally outperforms rewriting existing data, though crafting the prompts carefully is crucial to extract the most valuable information from ChatGPT, particularly for domain-specific data. 3. The augmentation data size affects the effectiveness of DA; however, we observed a plateau after incorporating 10 samples. 4. Combining rewritten samples with newly generated samples can potentially further improve the model's performance.
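The two augmentation modes the study compares, rewriting existing samples versus generating new ones from scratch, can be sketched as below. The prompt wording and the `call_llm` hook are hypothetical assumptions for illustration, not the paper's implementation.

```python
# Sketch of LLM-based data augmentation: "rewrite" vs. "generate" modes.
# `call_llm` is a hypothetical hook: for "rewrite" it returns one rewritten
# text; for "generate" it returns a list of new texts.

def rewrite_prompt(text, label):
    return (f"Rewrite the following '{label}' document, "
            f"preserving its meaning and label:\n{text}")

def generate_prompt(label, n):
    return f"Write {n} new, diverse documents belonging to the category '{label}'."

def augment(dataset, mode, call_llm, n_new=10):
    """Return augmented (text, label) pairs using the chosen mode."""
    out = []
    if mode == "rewrite":
        for text, label in dataset:
            out.append((call_llm(rewrite_prompt(text, label)), label))
    elif mode == "generate":
        labels = {label for _, label in dataset}
        for label in sorted(labels):
            for text in call_llm(generate_prompt(label, n_new)):
                out.append((text, label))
    return out
```

The augmented pairs would then be appended to the original training set; per the abstract, gains plateau after roughly 10 added samples per mode.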
Journal Article
Development of message passing-based graph convolutional networks for classifying cancer pathology reports
by Tourassi, Georgia D.; Stroup, Antoinette; Coyle, Linda
in Algorithms; Analysis; Artificial intelligence
2024
Background
Applying graph convolutional networks (GCN) to the classification of free-form natural language texts leveraged by graph-of-words features (TextGCN) was studied and confirmed to be an effective means of describing complex natural language texts. However, text classification models based on the TextGCN possess weaknesses in terms of memory consumption and model dissemination and distribution. In this paper, we present a fast message passing network (FastMPN), implementing a GCN with a message passing architecture that provides versatility and flexibility by allowing trainable node embeddings and edge weights, helping the GCN model find a better solution. We applied the FastMPN model to the task of clinical information extraction from cancer pathology reports, extracting the following six properties: main site, subsite, laterality, histology, behavior, and grade.
Results
We evaluated the clinical task performance of the FastMPN models in terms of micro- and macro-averaged F1 scores. A comparison was performed with the multi-task convolutional neural network (MT-CNN) model. Results show that the FastMPN model is equivalent to or better than the MT-CNN.
Conclusions
Our implementation revealed that our FastMPN model, which is based on the PyTorch platform, can train a large corpus (667,290 training samples) with 202,373 unique words in less than 3 minutes per epoch using one NVIDIA V100 hardware accelerator. Our experiments demonstrated that using this implementation, the clinical task performance scores of information extraction related to tumors from cancer pathology reports were highly competitive.
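The core message passing idea the abstract describes can be sketched as one update step on a graph-of-words: each node aggregates its neighbours' embeddings, scaled by per-edge weights (which in the paper are trainable). This is a minimal assumed illustration, not the FastMPN source.

```python
# One round of weighted message passing on a graph-of-words (sketch).

def message_pass(embeddings, edges):
    """Update node embeddings by one message passing step.

    embeddings: {node: [float, ...]} node embedding vectors, all same length
    edges: {(src, dst): weight} directed, weighted edges
    Returns: new embeddings = old embedding + weighted sum of incoming messages.
    """
    dim = len(next(iter(embeddings.values())))
    updated = {n: list(v) for n, v in embeddings.items()}
    for (src, dst), w in edges.items():
        for i in range(dim):
            updated[dst][i] += w * embeddings[src][i]
    return updated
```

Stacking several such rounds, then pooling node states into a document vector for classification, is the general pattern of message passing text models.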
Journal Article
Scalable deep text comprehension for Cancer surveillance on high-performance computing
by Srivastava, Kshitij; Watson, Thomas P.; Blair Christian, J.
in Algorithms; Bioinformatics; Biomedical and Life Sciences
2018
Background
Deep Learning (DL) has advanced the state-of-the-art capabilities in bioinformatics applications, resulting in a trend toward increasingly sophisticated and computationally demanding models trained on ever-larger data sets. This vastly increased computational demand challenges the feasibility of conducting cutting-edge research. One solution is to distribute the vast computational workload across multiple computing cluster nodes with data parallelism algorithms. In this study, we used a High-Performance Computing environment and implemented the Downpour Stochastic Gradient Descent algorithm for data parallelism to train a Convolutional Neural Network (CNN) for the natural language processing task of information extraction from a massive dataset of cancer pathology reports. We evaluated the scalability improvements using data parallelism training and the Titan supercomputer at the Oak Ridge Leadership Computing Facility. To evaluate scalability, we used different numbers of worker nodes and performed a set of experiments comparing the effects of different training batch sizes and optimizer functions.
Results
We found that Adadelta consistently converged at a lower validation loss, though it required over twice as many training epochs as the fastest-converging optimizer, RMSProp. The Adam optimizer consistently achieved a close second-place minimum validation loss significantly faster; with batch sizes of 16 and 32, the network converged in only 4.5 training epochs.
Conclusions
We demonstrated that the networked training process is scalable across multiple compute nodes communicating with message passing interface while achieving higher classification accuracy compared to a traditional machine learning algorithm.
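The Downpour SGD scheme the study implements can be sketched schematically: each worker holds a data shard, pulls the current parameters, computes a local gradient, and pushes its update to a shared parameter server. The sequential loop below is an assumption-level illustration of one round; the real algorithm applies worker updates asynchronously against possibly stale parameters.

```python
# Schematic Downpour SGD round: workers push per-shard gradients to a
# parameter server, which applies each update as it arrives (sketch).

def downpour_step(params, shards, grad_fn, lr=0.1):
    """One round over all workers.

    params: shared parameter vector (list of floats), mutated in place
    shards: one data shard per worker
    grad_fn(local_params, shard): gradient of the loss on that shard
    """
    for shard in shards:
        local = list(params)            # worker pulls current (possibly stale) params
        g = grad_fn(local, shard)       # worker computes its local gradient
        for i in range(len(params)):    # server applies the pushed update
            params[i] -= lr * g[i]
    return params
```

Because workers read stale parameters and updates land out of order, convergence behaviour depends strongly on batch size and optimizer, which is what the study's experiments vary.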
Journal Article
The Role of Work-Family Balance Policy for Enhancing Social Sustainability: A Choice Experiment Analysis of Koreans in their Twenties and Thirties
2019
Korea is facing problems, such as inequality within society and an aging population, that place a burden on public health expenditure. The active adoption of policies that promote work-family balance (WFB), such as parental leave and workplace childcare centers, is known to help solve these problems. However, there has, as yet, been little quantitative evidence accumulated to support this notion. This study used the choice experiment methodology on 373 Koreans in their twenties and thirties to estimate the level of utility derived from work-family balance policies. The results show that willingness to pay for parental leave was valued at 7.81 million Korean won, while it was 4.83 million won for workplace childcare centers. In particular, WFB policies were found to benefit workers of lower socioeconomic status or belonging to disadvantaged groups, such as women, those with low education levels, and those with low incomes. Furthermore, the utility derived from WFB policies was found to be greater among those who desire children than among those who do not. The results suggest that the proactive introduction of WFB policies will help solve problems such as inequality within society and population aging.
Journal Article
Multiridgelets for texture analysis
2011
Directional wavelets have orientation selectivity and are thus able to efficiently represent highly anisotropic elements such as line segments and edges. The ridgelet transform is a directional multi-resolution transform that has been successful in many image processing and texture analysis applications. The objective of this research is to develop a multi-ridgelet transform by applying the multiwavelet transform to the Radon transform so as to attain attractive improvements. By adapting cardinal orthogonal multiwavelets to the ridgelet transform, it is shown that the proposed cardinal multiridgelet transform (CMRT) possesses cardinality, approximate translation invariance, and approximate rotation invariance simultaneously, whereas no single ridgelet transform can hold all these properties at the same time. These properties are beneficial to image texture analysis, as demonstrated in three studies of texture analysis applications. Firstly, a texture database retrieval study on a portion of the Brodatz texture album demonstrated that the CMRT-based texture representation performed better for database retrieval than other directional wavelet methods. Secondly, a study of LCD mura defect detection, based on the classification of simulated abnormalities with a linear support vector machine classifier, showed that the CMRT-based analysis of defects provides efficient features for superior detection performance over other competitive methods. Lastly, and most importantly, a study on prostate cancer tissue image classification was conducted. With CMRT-based texture extraction, Gaussian kernel support vector machines were developed to discriminate prostate cancer Gleason grade 3 from grade 4. Based on a limited database of prostate specimens, one classifier was trained to achieve remarkable test performance. This approach is promising and worthy of full development.
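The ridgelet idea, applying a 1-D wavelet transform to Radon projections of the image, can be gestured at with a toy sketch. Everything below is an illustrative assumption: it uses only axis-aligned projections and a single Haar step, whereas the CMRT uses full-angle Radon data and cardinal multiwavelets.

```python
# Toy ridgelet-style analysis: Radon projections + one Haar wavelet level.

def radon_projections(image):
    """Axis-aligned Radon projections: sums along rows (0 deg) and columns (90 deg)."""
    rows = [sum(r) for r in image]
    cols = [sum(c) for c in zip(*image)]
    return {"0deg": rows, "90deg": cols}

def haar_step(signal):
    """One Haar level: pairwise averages (approximation) and differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def toy_ridgelet(image):
    """Wavelet-analyze each projection, giving direction-sensitive coefficients."""
    return {angle: haar_step(p) for angle, p in radon_projections(image).items()}
```

The detail coefficients per projection angle are what make ridgelet-type features sensitive to oriented texture, the property exploited in the retrieval, defect detection, and tissue classification studies.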
Dissertation
Enhancing Diagnosis through AI-driven Analysis of Reflectance Confocal Microscopy
by Leachman, Sancy A; Ludzik, Joanna; Hanson, Heidi A
in Dermatology; Diagnosis; Image resolution
2024
Reflectance Confocal Microscopy (RCM) is a non-invasive imaging technique used in biomedical research and clinical dermatology. It provides virtual high-resolution images of the skin and superficial tissues, reducing the need for physical biopsies. RCM employs a laser light source to illuminate the tissue, capturing the reflected light to generate detailed images of microscopic structures at various depths. Recent studies explored AI and machine learning, particularly CNNs, for analyzing RCM images. Our study proposes a segmentation strategy based on textural features to identify clinically significant regions, empowering dermatologists in effective image interpretation and boosting diagnostic confidence. This approach promises to advance dermatological diagnosis and treatment.
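The abstract's segmentation strategy based on textural features can be sketched in a simplified form: compute a local texture statistic over the image and threshold it to flag candidate regions. The window size, the variance statistic, and the threshold below are illustrative assumptions, not the study's method.

```python
# Toy texture-based segmentation: threshold local intensity variance.

def local_variance(image, r, c, size=3):
    """Variance of pixel intensities in a size x size window clipped to the image."""
    vals = [image[i][j]
            for i in range(max(0, r - size // 2), min(len(image), r + size // 2 + 1))
            for j in range(max(0, c - size // 2), min(len(image[0]), c + size // 2 + 1))]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def texture_mask(image, threshold=0.01):
    """Binary mask: 1 where local texture (variance) exceeds the threshold."""
    return [[1 if local_variance(image, r, c) > threshold else 0
             for c in range(len(image[0]))]
            for r in range(len(image))]
```

A mask like this would highlight textured regions of an RCM frame for closer inspection; richer texture descriptors and learned classifiers would replace the variance statistic in practice.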
Ultra-Long Sequence Distributed Transformer
2023
Transformer models trained on long sequences often achieve higher accuracy than those trained on short sequences. Unfortunately, conventional transformers struggle with long sequence training due to overwhelming computation and memory requirements. Existing methods for long sequence training offer limited speedup and memory reduction, and may compromise accuracy. This paper presents a novel and efficient distributed training method, the Long Short-Sequence Transformer (LSS Transformer), for training transformers on long sequences. It distributes a long sequence into segments among GPUs, with each GPU computing a partial self-attention for its segment. It then uses fused communication and a novel double gradient averaging technique to avoid the need to aggregate partial self-attention and to minimize communication overhead. We evaluated the performance of the LSS Transformer against the state-of-the-art Nvidia sequence parallelism on the Wikipedia enwik8 dataset. Results show that our proposed method leads to a 5.6x faster and 10.2x more memory-efficient implementation than state-of-the-art sequence parallelism on 144 Nvidia V100 GPUs. Moreover, our algorithm scales to an extreme sequence length of 50,112 on 3,456 GPUs, achieving 161% super-linear parallel efficiency and a throughput of 32 petaflops.
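The sequence-distribution step the abstract describes can be sketched as follows: a long token sequence is split into contiguous per-GPU segments and each segment computes its own partial self-attention locally. The uniform-attention stand-in below is a hedged assumption for illustration; the real LSS Transformer uses actual self-attention plus fused communication and double gradient averaging across segments.

```python
# Sketch of per-GPU sequence segmentation with local "attention" per segment.

def split_sequence(tokens, n_gpus):
    """Partition tokens into n_gpus contiguous segments (the last may be shorter)."""
    seg = -(-len(tokens) // n_gpus)  # ceiling division
    return [tokens[i:i + seg] for i in range(0, len(tokens), seg)]

def local_mean_attention(segment):
    """Toy stand-in for a segment's partial self-attention: each position
    attends uniformly to all positions within its own segment."""
    m = sum(segment) / len(segment)
    return [m] * len(segment)

def lss_forward(tokens, n_gpus):
    """Run each segment's local attention and concatenate the outputs."""
    out = []
    for segment in split_sequence(tokens, n_gpus):
        out.extend(local_mean_attention(segment))
    return out
```

Because each GPU only ever materializes attention over its own segment, memory per device stays roughly constant as the total sequence length grows, which is the source of the reported scaling.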
Deep Active Learning for Classifying Cancer Pathology Reports
2020
Background: Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of eleven active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network (CNN) as the text classification model. Results: We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples each iteration of active learning), there was no clear winner between the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples each iteration of active learning), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We found that compared to random sampling, active learning strongly helps performance on rare classes by focusing on underrepresented classes. Conclusions: Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the amount of labelled data to achieve the same performance as a dataset constructed using random sampling.
Web Resource
Unleashing the full potential of Hsp90 inhibitors as cancer therapeutics through simultaneous inactivation of Hsp90, Grp94, and TRAP1
2020
The Hsp90 family proteins Hsp90, Grp94, and TRAP1 are present in the cell cytoplasm, endoplasmic reticulum, and mitochondria, respectively; all play important roles in tumorigenesis by regulating protein homeostasis in response to stress. Thus, simultaneous inhibition of all Hsp90 paralogs is a reasonable strategy for cancer therapy. However, since the existing pan-Hsp90 inhibitor does not accumulate in mitochondria, the potential anticancer activity of pan-Hsp90 inhibition has not yet been fully examined in vivo. Analysis of The Cancer Genome Atlas database revealed that all Hsp90 paralogs were upregulated in prostate cancer. Inactivation of all Hsp90 paralogs induced mitochondrial dysfunction, increased cytosolic calcium, and activated calcineurin. Active calcineurin blocked prosurvival heat shock responses upon Hsp90 inhibition by preventing nuclear translocation of HSF1. The purine scaffold derivative DN401 inhibited all Hsp90 paralogs simultaneously and showed stronger anticancer activity than other Hsp90 inhibitors. Pan-Hsp90 inhibition increased cytotoxicity and suppressed mechanisms that protect cancer cells, suggesting that it is a feasible strategy for the development of potent anticancer drugs. The mitochondria-permeable drug DN401 is a newly identified in vivo pan-Hsp90 inhibitor with potent anticancer activity.
Cancer therapeutics: Extending a drug's reach
A new drug that blocks heat shock proteins (HSPs), helper proteins that are co-opted by cancer cells to promote tumor growth, shows promise for cancer treatment. Several drugs have targeted HSPs, since cancer cells are known to hijack these helper proteins to shield themselves from destruction by the body. However, the drugs have had limited success.
Hye-Kyung Park and Byoung Heon Kang at Ulsan National Institutes of Science and Technology in South Korea and coworkers noticed that the drugs were not absorbed into mitochondria, a key cellular compartment, and HSPs in this compartment were therefore not being blocked. They identified a new HSP inhibitor that can reach every cellular compartment and inhibit all HSPs. Testing in mice showed that this inhibitor effectively triggered death of tumor cells, and therefore shows promise for anti-cancer therapy.
Journal Article