Search Results

10 results for "Cherukuri, Teja Krishna"
Transfer learning based novel ensemble classifier for COVID-19 detection from chest CT-scans
Coronavirus Disease 2019 (COVID-19) is a deadly infection that affects the respiratory organs in humans as well as animals. By 2020, the disease had turned into a pandemic affecting millions of individuals across the globe. Conducting rapid tests for large numbers of suspected cases, to prevent the spread of the virus, has become a challenge. In the recent past, several deep learning-based approaches have been developed to automate the detection of COVID-19 infection from lung Computerized Tomography (CT) scan images. However, most of them rely on a single model's prediction for the final decision, which may or may not be accurate. In this paper, we propose a novel ensemble approach that aggregates the strengths of multiple deep neural network architectures before arriving at the final decision. We use pre-trained models, namely VGG16, VGG19, InceptionV3, ResNet50, ResNet50V2, InceptionResNetV2, Xception, and MobileNet, and fine-tune them on lung CT scan images. These trained models are then combined into a strong ensemble classifier that makes the final prediction. Our experiments show that the proposed ensemble approach is superior to existing ensemble approaches and sets state-of-the-art results for detecting COVID-19 infection from lung CT scan images.
•Using lung CT scan images for COVID-19 infection detection and pre-processing them for feature extraction.
•Applying the transfer learning paradigm by fine-tuning eight pre-trained models (VGG16, VGG19, ResNet50, ResNet50V2, InceptionV3, InceptionResNetV2, Xception, and MobileNet) with a frozen convolution base and a trainable classification head.
•Designing a composite neural network architecture and training it end-to-end.
•Evaluating the proposed model on the SARS-CoV-2 and COVID-CT datasets and comparing it with existing ensemble approaches in the literature.
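The abstract describes fine-tuning several ImageNet-pretrained backbones with frozen convolutional bases and trainable heads, then combining their predictions. A minimal sketch of that idea in PyTorch, using only two of the eight named backbones; the head size and soft-voting scheme are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
from torchvision import models

def make_finetune_model(backbone: nn.Module, feat_dim: int, n_classes: int = 2) -> nn.Module:
    """Freeze the convolutional base and attach a small trainable classification head."""
    for p in backbone.parameters():
        p.requires_grad = False                      # frozen convolution base
    head = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(), nn.Linear(256, n_classes))
    return nn.Sequential(backbone, head)

# Two of the eight backbones named in the abstract, as an example.
vgg16 = models.vgg16(weights="IMAGENET1K_V1")
vgg16.classifier = nn.Identity()                     # expose the 25088-d flattened features
resnet50 = models.resnet50(weights="IMAGENET1K_V1")
resnet50.fc = nn.Identity()                          # expose the 2048-d pooled features

members = [make_finetune_model(vgg16, 25088), make_finetune_model(resnet50, 2048)]

def ensemble_predict(x: torch.Tensor) -> torch.Tensor:
    """Soft-voting ensemble: average the softmax outputs of all fine-tuned members."""
    probs = [torch.softmax(m(x), dim=1) for m in members]
    return torch.stack(probs).mean(dim=0)
```

After each member is fine-tuned on the CT-scan training set, `ensemble_predict` would be applied to held-out scans; weighted voting or a learned aggregation head are equally plausible variants.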
Hinge attention network: A joint model for diabetic retinopathy severity grading
Diabetic Retinopathy is one of the prominent causes of permanent blindness in working-age, long-term diabetic patients. With the rising prevalence of diabetes, a large share of the population is at risk of permanent vision loss. Advancements in medical imaging have enabled the research community to focus on developing automated, computerized systems for diagnosing retinopathy in its early stages. However, this is a complex challenge due to high intra-class variation and the imbalanced data distribution across higher severity grades. In recent years, various deep learning-based models have been designed to automate retinopathy severity classification. In this work, we present a deep learning model with multiple attention stages called the Hinge Attention Network (HA-Net). The proposed model consists of a pre-trained VGG16 base that extracts initial spatial representations from retinal scan images, a spatial attention autoencoder that learns lesion-specific latent representations in the spatial dimensions, and a channel attention-based hinge neural network that captures category-discriminative features in the channel dimension and classifies the severity grade of retinopathy. In addition to the spatial and channel attention mechanisms, we use a Convolutional LSTM layer to prioritize the most important spatial maps before passing them to the hinge neural network. Together, these components enable HA-Net to make generalized and accurate predictions on unseen data. The effectiveness of the proposed model is demonstrated on two benchmark datasets, Kaggle APTOS 2019 and ISBI IDRiD. Extensive experiments on these datasets reveal that HA-Net outperforms several existing models, achieving an accuracy of 85.54% on Kaggle APTOS and 66.41% on IDRiD.
Lesion-aware attention with neural support vector machine for retinopathy diagnosis
Diabetic retinopathy (DR) is a severe eye disease that can lead to permanent blindness. Identifying DR in its early stages with computer-aided diagnosis (CAD) systems helps ophthalmologists provide appropriate treatment in time, thereby preventing many people from going blind. Due to intra-class variation and imbalanced data distribution, it is highly difficult to design a CAD system for DR severity diagnosis with good generalizability. In this article, we propose a multi-stage deep learning pipeline, lesion-aware attention with a neural support vector machine, for diabetic retinopathy diagnosis. The proposed pipeline consists of a pre-trained convolutional base for learning retinal image spatial representations, lesion-aware attention for weighting lesion-specific features, a convolutional autoencoder for learning latent attention representations, and a neural support vector machine for discrimination. The convolutional autoencoder and the neural support vector machine are trained jointly, end-to-end, to obtain category-specific, lesion-aware latent attention features, with the reconstruction and discrimination paths complementing each other. The proposed approach is validated on two benchmark retinal scan image datasets, Kaggle APTOS 2019 and ISBI 2018 IDRiD, for DR type and severity grade classification. Our experiments show that using lesion-aware attention together with the joint training of the autoencoder and the neural support vector machine boosts the performance of DR diagnosis models, outperforming existing works in the literature on DR severity grading. The proposed model achieved accuracies of 90.45% and 84.31% on the APTOS dataset and 79.85% and 63.24% on the IDRiD dataset for DR type and severity grade classification, respectively.
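The distinctive step here is training an autoencoder and a hinge-loss ("neural SVM") head on the same latent code. A rough sketch of such a joint objective in PyTorch; the layer sizes, pooling, and loss weighting are assumptions for illustration rather than the paper's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointAESVM(nn.Module):
    """Autoencoder whose latent code also feeds a linear 'neural SVM' classifier."""
    def __init__(self, in_ch: int = 512, latent: int = 128, n_classes: int = 5):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(in_ch, latent, 3, padding=1), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.decoder = nn.Sequential(nn.Linear(latent, in_ch), nn.ReLU())
        self.svm = nn.Linear(latent, n_classes)        # hinge-loss head

    def forward(self, feats: torch.Tensor):
        z = self.encoder(feats)                         # latent attention representation
        recon = self.decoder(z)                         # reconstruct a pooled feature summary
        return z, recon, self.svm(z)

def joint_loss(recon, target, scores, labels, alpha: float = 0.5):
    """Reconstruction (MSE) plus multi-class hinge loss, weighted by alpha."""
    rec = F.mse_loss(recon, target)
    hinge = F.multi_margin_loss(scores, labels)         # SVM-style margin loss
    return alpha * rec + (1 - alpha) * hinge

# Example use, with feats coming from an attention block:
#   z, recon, scores = model(feats)
#   loss = joint_loss(recon, feats.mean(dim=(2, 3)), scores, labels)
```

The single weighted loss is what couples the reconstruction and discrimination paths so they shape the latent features together.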
Visual attention based composite dense neural network for facial expression recognition
Facial Expression Recognition (FER) models have received special attention in computer vision and provide the basis for many real-time applications. This article proposes a deep learning model called the Visual Attention based Composite Dense Neural Network (VA-CDNN) for recognizing expressions from facial images. We extract eye-pair, mouth, and normalized face regions from facial images using localized facial landmark points. The eye-pair and mouth regions provide local information, while the normalized face provides holistic information about the facial expression. Each cropped region is passed independently through the pre-trained Xception ConvNet to obtain the most discriminative spatial representations, which serve as input to the proposed Visual Attention block. Rather than giving equal importance to every feature in the spatial representation, an attention weight is computed for each feature map to indicate how much attention it should receive. The attention-based features from all three regions are then fused into a compact, discriminative representation that leads to better identification of facial expressions, and a regularized dense neural network is trained on these features to identify the expression type. The efficacy and robustness of the attention-based approach are demonstrated through experiments on the benchmark JAFFE and CK+ datasets. The proposed VA-CDNN achieved test accuracies of 97.67% and 97.46% on the CK+ and JAFFE datasets, respectively. The results show that the proposed attention-based method is comparable to recent state-of-the-art models and performs consistently regardless of the number of expressions considered for recognition.
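The architecture weights each feature map of a region and then fuses the three region streams. A simplified per-stream attention-and-fusion sketch in PyTorch; the channel sizes and the squeeze-style weighting are assumptions, and the paper's exact attention block may differ:

```python
import torch
import torch.nn as nn

class FeatureMapAttention(nn.Module):
    """Assigns one attention weight per feature map instead of treating all maps equally."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(channels, channels // 4), nn.ReLU(),
                                   nn.Linear(channels // 4, channels), nn.Sigmoid())

    def forward(self, fmap: torch.Tensor) -> torch.Tensor:          # fmap: [B, C, H, W]
        pooled = fmap.mean(dim=(2, 3))                               # [B, C] summary per map
        weights = self.score(pooled)                                 # attention weight per map
        return (fmap * weights[:, :, None, None]).mean(dim=(2, 3))  # weighted, pooled features

class RegionFusion(nn.Module):
    """Fuse attention-weighted features from eye-pair, mouth, and normalized-face streams."""
    def __init__(self, channels: int = 2048, n_classes: int = 7):
        super().__init__()
        self.attn = nn.ModuleList(FeatureMapAttention(channels) for _ in range(3))
        self.classifier = nn.Sequential(nn.Linear(3 * channels, 512), nn.ReLU(),
                                        nn.Dropout(0.5), nn.Linear(512, n_classes))

    def forward(self, regions):                                      # list of 3 feature maps
        fused = torch.cat([att(f) for att, f in zip(self.attn, regions)], dim=1)
        return self.classifier(fused)
```

Here `regions` would hold the Xception feature maps of the eye-pair, mouth, and normalized-face crops; the dropout layer plays the role of the regularized dense classifier described in the abstract.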
MedTransNet: advanced gating transformer network for medical image classification
Accurate medical image classification poses a significant challenge in designing expert computer-aided diagnosis systems. While deep learning approaches have shown remarkable advances over traditional techniques, addressing inter-class similarity and intra-class dissimilarity across medical imaging modalities remains challenging. This work introduces the advanced gating transformer network (MedTransNet), a deep learning model tailored for precise medical image classification. MedTransNet utilizes channel and multi-gate attention mechanisms, coupled with residual interconnections, to learn category-specific attention representations from diverse medical imaging modalities. Additionally, the use of gradient centralization during training helps prevent overfitting and improves generalization, which is especially important in medical imaging applications where labeled data is often limited. Evaluation on benchmark datasets, including APTOS-2019, Figshare, and SARS-CoV-2, demonstrates the effectiveness of the proposed MedTransNet on tasks such as diabetic retinopathy severity grading, multi-class brain tumor classification, and COVID-19 detection. Experimental results show MedTransNet achieving 85.68% accuracy for retinopathy grading, 98.37% (±0.44) for tumor classification, and 99.60% for COVID-19 detection, surpassing recent deep learning models. MedTransNet holds promise for significantly improving medical image classification accuracy.
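One training detail called out in the abstract is gradient centralization, which subtracts the mean from each multi-dimensional weight gradient before the optimizer step. A small, generic illustration of how that operation can be applied around a standard PyTorch optimizer; this is a sketch of the general technique, not the paper's training code:

```python
import torch

def centralize_gradients(model: torch.nn.Module) -> None:
    """Subtract the per-filter mean from every multi-dimensional weight gradient."""
    for p in model.parameters():
        if p.grad is None or p.grad.dim() <= 1:
            continue                                   # skip biases / 1-D parameters
        dims = tuple(range(1, p.grad.dim()))           # all dims except the output dim
        p.grad -= p.grad.mean(dim=dims, keepdim=True)

# Typical use inside a training step (model, optimizer, loss assumed to exist):
#   loss.backward()
#   centralize_gradients(model)   # centralize gradients before updating weights
#   optimizer.step()
#   optimizer.zero_grad()
```

Centralizing the gradients constrains the update direction, which is commonly credited with a mild regularizing effect on small labeled datasets.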
M3T: Multi-Modal Medical Transformer to bridge Clinical Context with Visual Insights for Retinal Image Medical Description Generation
Automated retinal image medical description generation is crucial for streamlining medical diagnosis and treatment planning. Existing challenges include the reliance on learned retinal image representations, difficulties in handling multiple imaging modalities, and the lack of clinical context in visual representations. Addressing these issues, we propose the Multi-Modal Medical Transformer (M3T), a novel deep learning architecture that integrates visual representations with diagnostic keywords. Unlike previous studies focusing on specific aspects, our approach efficiently learns contextual information and semantics from both modalities, enabling the generation of precise and coherent medical descriptions for retinal images. Experimental studies on the DeepEyeNet dataset validate the success of M3T in meeting ophthalmologists' standards, demonstrating a substantial 13.5% improvement in BLEU@4 over the best-performing baseline model.
Guided Context Gating: Learning to leverage salient lesions in retinal fundus images
Effectively representing medical images, especially retinal images, presents a considerable challenge due to variations in the appearance, size, and contextual information of pathological signs called lesions. Precise discrimination of these lesions is crucial for diagnosing vision-threatening conditions such as diabetic retinopathy. While visual attention-based neural networks have been introduced to learn spatial context and channel correlations from retinal images, they often fall short in capturing localized lesion context. Addressing this limitation, we propose a novel attention mechanism called Guided Context Gating, a unique approach that integrates Context Formulation, Channel Correlation, and Guided Gating to learn global context, spatial correlations, and localized lesion context. Our qualitative evaluation against existing attention mechanisms emphasizes the superiority of Guided Context Gating in terms of explainability. Notably, experiments on the Zenodo-DR-7 dataset reveal a substantial 2.63% accuracy boost over advanced attention mechanisms and an impressive 6.53% improvement over the state-of-the-art Vision Transformer for assessing the severity grade of retinopathy, even with imbalanced and limited training samples for each class.
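The abstract decomposes the mechanism into context formulation, channel correlation, and guided gating. Without the paper's exact formulation, a rough PyTorch sketch of a gated attention block in that spirit might look as follows; all module shapes and the specific composition are illustrative assumptions:

```python
import torch
import torch.nn as nn

class GatedContextBlock(nn.Module):
    """Illustrative gating block: global context recalibrates channels, then a
    spatial gate highlights localized (lesion-like) regions of the feature map."""
    def __init__(self, channels: int):
        super().__init__()
        self.context = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                     nn.Conv2d(channels, channels, 1),
                                     nn.Sigmoid())                    # global context -> channel weights
        self.gate = nn.Sequential(nn.Conv2d(channels, 1, 1),
                                  nn.Sigmoid())                       # spatial gate over locations

    def forward(self, x: torch.Tensor) -> torch.Tensor:               # x: [B, C, H, W]
        x = x * self.context(x)                                        # channel-wise recalibration
        return x * self.gate(x)                                        # emphasize local salient regions
```

The two multiplicative stages mirror the idea of first modelling global/channel context and then gating spatial positions so that localized lesion evidence dominates the final representation.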
GCS-M3VLT: Guided Context Self-Attention based Multi-modal Medical Vision Language Transformer for Retinal Image Captioning
Retinal image analysis is crucial for diagnosing and treating eye diseases, yet generating accurate medical reports from images remains challenging due to variability in image quality and pathology, especially with limited labeled data. Previous Transformer-based models struggled to integrate visual and textual information under limited supervision. In response, we propose a novel vision-language model for retinal image captioning that combines visual and textual features through a guided context self-attention mechanism. This approach captures both intricate details and the global clinical context, even in data-scarce scenarios. Extensive experiments on the DeepEyeNet dataset demonstrate a 0.023 BLEU@4 improvement, along with significant qualitative advancements, highlighting the effectiveness of our model in generating comprehensive medical captions.
Multi-modal Imaging Genomics Transformer: Attentive Integration of Imaging with Genomic Biomarkers for Schizophrenia Classification
Schizophrenia (SZ) is a severe brain disorder marked by diverse cognitive impairments and abnormalities in brain structure, function, and genetics. Its complex symptoms and overlap with other psychiatric conditions challenge traditional diagnostic methods, necessitating advanced systems to improve precision. Existing studies have mostly focused on imaging data, such as structural and functional MRI, for SZ diagnosis; there has been less focus on integrating genomic features despite their potential for identifying heritable SZ traits. In this study, we introduce the Multi-modal Imaging Genomics Transformer (MIGTrans), which attentively integrates genomics with structural and functional imaging data to capture SZ-related neuroanatomical and connectome abnormalities. MIGTrans demonstrated improved SZ classification performance with an accuracy of 86.05% (±0.02), offering clear interpretations and identifying significant genomic locations and brain morphological/connectivity patterns associated with SZ.
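The core idea is attentive fusion of imaging-derived features with genomic biomarkers before classification. A compact cross-attention fusion sketch in PyTorch; the feature dimensions, single attention head, and pooling are assumptions, not MIGTrans's actual architecture:

```python
import torch
import torch.nn as nn

class ImagingGenomicsFusion(nn.Module):
    """Let imaging features attend over genomic biomarker embeddings before classifying."""
    def __init__(self, img_dim: int = 256, gen_dim: int = 64, n_classes: int = 2):
        super().__init__()
        self.gen_proj = nn.Linear(gen_dim, img_dim)              # map genomics into imaging space
        self.cross_attn = nn.MultiheadAttention(img_dim, num_heads=1, batch_first=True)
        self.classifier = nn.Linear(img_dim, n_classes)

    def forward(self, img_feats: torch.Tensor, gen_feats: torch.Tensor) -> torch.Tensor:
        # img_feats: [B, N_img, img_dim]  (e.g., regional sMRI/fMRI features)
        # gen_feats: [B, N_gen, gen_dim]  (e.g., genomic biomarker embeddings)
        gen = self.gen_proj(gen_feats)
        fused, _ = self.cross_attn(query=img_feats, key=gen, value=gen)
        return self.classifier(fused.mean(dim=1))                 # pool fused tokens, classify SZ vs. control
```

The attention weights returned by `cross_attn` are also what make this style of fusion interpretable: they indicate which genomic tokens each imaging region relied on.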
Spatial Sequence Attention Network for Schizophrenia Classification from Structural Brain MR Images
Schizophrenia is a debilitating, chronic mental disorder that significantly impacts an individual's cognitive abilities, behavior, and social interactions. It is characterized by subtle morphological changes in the brain, particularly in the gray matter, that are often imperceptible through manual observation, demanding an automated approach to diagnosis. This study introduces a deep learning methodology for classifying individuals with Schizophrenia. We achieve this by implementing a diversified attention mechanism known as Spatial Sequence Attention (SSA), designed to extract and emphasize significant feature representations from structural MRI (sMRI). We first employ the transfer learning paradigm, leveraging a pre-trained DenseNet to extract initial feature maps from the final convolutional block, which encode morphological alterations associated with Schizophrenia. These features are further processed by the proposed SSA to capture and emphasize intricate spatial interactions and relationships across volumes within the brain. Our experiments on a clinical dataset show that the proposed attention mechanism outperforms the existing Squeeze & Excitation Network for Schizophrenia classification.
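The described pipeline treats the spatial positions of the DenseNet feature volume as a sequence and applies attention across them. A simplified version of that step in PyTorch; the embedding size, single attention layer, and pooling are illustrative, and the paper's SSA block may be structured differently:

```python
import torch
import torch.nn as nn

class SpatialSequenceAttention(nn.Module):
    """Treat each spatial location of a feature volume as a sequence element and let
    self-attention emphasize locations relevant to classification."""
    def __init__(self, channels: int = 1024, heads: int = 4, n_classes: int = 2):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads=heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)
        self.classifier = nn.Linear(channels, n_classes)

    def forward(self, fmap: torch.Tensor) -> torch.Tensor:       # fmap: [B, C, D, H, W]
        seq = fmap.flatten(2).transpose(1, 2)                    # [B, D*H*W, C] spatial sequence
        attended, _ = self.attn(seq, seq, seq)                   # self-attention across positions
        pooled = self.norm(attended + seq).mean(dim=1)           # residual connection + average pooling
        return self.classifier(pooled)
```

With the frozen DenseNet supplying `fmap`, only this attention head and the classifier would be trained, matching the transfer-learning setup sketched in the abstract.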