26 results for "Zia, Tehseen"
A Physical-Enhanced Spatio-Temporal Graph Convolutional Network for River Flow Prediction
River flow forecasting remains a critical yet challenging task in hydrological science, owing to the inherent trade-offs between physics-based models and data-driven methods. While physics-based models offer interpretability and process-based insights, they often struggle with real-world complexity and adaptability. Conversely, purely data-driven models, though powerful in capturing data patterns, lack physical grounding and often underperform in extreme scenarios. To address this gap, we propose PESTGCN, a Physical-Enhanced Spatio-Temporal Graph Convolutional Network that integrates hydrological domain knowledge with the flexibility of graph-based learning. PESTGCN models the watershed system as a Heterogeneous Information Network (HIN), capturing various physical entities (e.g., gauge stations, rainfall stations, reservoirs) and their diverse interactions (e.g., spatial proximity, rainfall influence, and regulation effects) within a unified graph structure. To better capture the latent semantics, meta-path-based encoding is employed to model higher-order relationships. Furthermore, a hybrid attention mechanism incorporating both local temporal features and global spatial dependencies enables comprehensive sequence learning. Importantly, key variables from the HEC-HMS hydrological model are embedded into the framework to improve physical interpretability and generalization. Experimental results on four real-world benchmark watersheds demonstrate that PESTGCN achieves statistically significant improvements over existing state-of-the-art models, with relative reductions in MAE ranging from 5.3% to 13.6% across different forecast horizons. These results validate the effectiveness of combining physical priors with graph-based temporal modeling.
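As a rough illustration of the spatio-temporal graph idea summarised above, the sketch below combines a simple graph convolution over stations with a GRU over time and concatenates physical covariates (standing in for the HEC-HMS variables). It is a minimal sketch under assumed tensor shapes, not the authors' PESTGCN; the meta-path encoding and hybrid attention are omitted.

```python
import torch
import torch.nn as nn

class SpatioTemporalGCN(nn.Module):
    """Minimal sketch of a spatio-temporal graph network for river flow.
    `adj` is a row-normalised adjacency matrix over stations; physical
    covariates (e.g. HEC-HMS variables) are concatenated to node features."""
    def __init__(self, in_dim, phys_dim, hidden_dim):
        super().__init__()
        self.gc = nn.Linear(in_dim + phys_dim, hidden_dim)   # graph-convolution weights
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)                  # one-step flow forecast

    def forward(self, x, phys, adj):
        # x: (batch, time, nodes, in_dim), phys: (batch, time, nodes, phys_dim)
        h = torch.cat([x, phys], dim=-1)
        h = torch.relu(adj @ self.gc(h))                      # aggregate neighbour information per step
        b, t, n, d = h.shape
        h, _ = self.gru(h.permute(0, 2, 1, 3).reshape(b * n, t, d))
        return self.head(h[:, -1]).view(b, n)                 # flow prediction per gauge station

adj = torch.eye(5)                                            # toy 5-station graph (row-normalised)
model = SpatioTemporalGCN(in_dim=3, phys_dim=2, hidden_dim=32)
flow = model(torch.randn(2, 24, 5, 3), torch.randn(2, 24, 5, 2), adj)
```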
Decoupling Rainfall and Surface Runoff Effects Based on Spatio-Temporal Spectra of Wireless Channel State Information
Leveraging ubiquitous wireless signals for environmental sensing provides a highly promising pathway toward constructing low-cost and high-density flood monitoring systems. However, in real-world flood scenarios, the wireless channel is simultaneously affected by rainfall-induced signal attenuation and complex multipath effects caused by surface runoff (water accumulation). These two physical phenomena become intertwined in the received signals, resulting in severe feature ambiguity. This not only greatly limits the accuracy of environmental sensing but also hinders communication systems from performing effective channel compensation. How to disentangle these combined effects from a single wireless link represents a fundamental scientific challenge for achieving high-precision wireless environmental sensing and ensuring communication reliability under harsh conditions. To address this challenge, we propose a novel signal processing framework that aims to effectively decouple the effects of rainfall and surface runoff from Channel State Information (CSI) collected using commercial Wi-Fi devices. The core idea of our method lies in first constructing a two-dimensional CSI spatiotemporal spectrogram from continuously captured multicarrier CSI data. This spectrogram enables high-resolution visualization of the unique “fingerprints” of different physical effects—rainfall manifests as smooth background attenuation, whereas surface runoff appears as sparse high-frequency textures. Building upon this representation, we design and implement a Dual-Decoder Convolutional Autoencoder deep learning model. The model employs a shared encoder to learn the mixed CSI features, while two distinct decoder branches are responsible for reconstructing the global background component attributed to rainfall and the local texture component associated with surface runoff, respectively. Based on the decoupled signal components, we achieve simultaneous and highly accurate estimation of rainfall intensity (mean absolute error below 1.5 mm/h) and surface water accumulation (detection accuracy of 98%). Furthermore, when the decoupled and refined channel estimates are applied to a communication receiver for channel equalization, the Bit Error Rate (BER) is reduced by more than one order of magnitude compared to conventional equalization methods.
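A minimal PyTorch sketch of the dual-decoder convolutional autoencoder idea described above: a shared encoder over the CSI spectrogram and two decoders, one for the smooth rainfall background and one for the sparse runoff texture. Layer sizes, the sum-reconstruction objective and the sparsity weight are assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class DualDecoderCAE(nn.Module):
    """Shared encoder over a CSI spatio-temporal spectrogram with two decoder
    branches: a smooth rainfall background and a sparse runoff texture."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        def decoder():
            return nn.Sequential(
                nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1),
            )
        self.rain_decoder, self.runoff_decoder = decoder(), decoder()

    def forward(self, spectrogram):                    # (batch, 1, time, subcarrier)
        z = self.encoder(spectrogram)
        return self.rain_decoder(z), self.runoff_decoder(z)

# One plausible training objective: reconstruct the mixture as the sum of the
# two components, with a small sparsity penalty on the runoff texture.
model = DualDecoderCAE()
x = torch.randn(8, 1, 64, 64)                          # synthetic CSI spectrograms
rain, runoff = model(x)
loss = nn.functional.mse_loss(rain + runoff, x) + 1e-3 * runoff.abs().mean()
```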
A latent diffusion approach to visual attribution in medical imaging
Visual attribution in medical imaging seeks to make evident the diagnostically relevant components of a medical image, in contrast to the more common detection of diseased tissue deployed in standard machine vision pipelines (which are less straightforwardly interpretable/explainable to clinicians). We present a novel generative visual attribution technique that leverages latent diffusion models in combination with domain-specific large language models to generate normal counterparts of abnormal images. The discrepancy between the two then yields a map indicating the diagnostically relevant image components. To achieve this, we deploy image priors in conjunction with appropriate conditioning mechanisms to control the image generation process, including natural language text prompts acquired from medical science and applied radiology. We perform experiments and quantitatively evaluate our results on the COVID-19 Radiography Database, which contains labelled chest X-rays with differing pathologies, using the Fréchet Inception Distance (FID), Structural Similarity (SSIM) and Multi-Scale Structural Similarity (MS-SSIM) metrics computed between real and generated images. The resulting system also exhibits a range of latent capabilities, including zero-shot localized disease induction, which are evaluated with real examples from the CheXpert dataset.
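To make the generative-attribution workflow concrete, the sketch below generates a "normal" counterpart with a generic image-to-image latent diffusion pipeline and takes the pixel-wise difference as the attribution map. The generic Stable Diffusion checkpoint, prompt, strength and file name are assumptions standing in for the domain-specific, text-conditioned model the abstract describes.

```python
import numpy as np
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Generic checkpoint as a stand-in for a domain-specific latent diffusion model;
# requires a CUDA device for the half-precision pipeline.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to("cuda")

abnormal = Image.open("covid_xray.png").convert("RGB").resize((512, 512))  # hypothetical file
normal = pipe(prompt="chest X-ray with clear, healthy lungs",
              image=abnormal, strength=0.5, guidance_scale=7.5).images[0]

# Attribution map: per-pixel discrepancy between the abnormal image and its
# generated normal counterpart.
attribution = np.abs(np.asarray(abnormal, dtype=float) -
                     np.asarray(normal, dtype=float)).mean(axis=-1)
```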
An inherently interpretable deep learning model for local explanations using visual concepts
Over the past decade, deep learning has become the leading approach for various computer vision tasks and decision support systems. However, the opaque nature of deep learning models raises significant concerns about their fairness, reliability, and the underlying inferences they make. Many existing methods attempt to approximate the relationship between low-level input features and outcomes. However, humans tend to understand and reason based on high-level concepts rather than low-level input features. To bridge this gap, several concept-based interpretable methods have been developed. Most of these methods compute the importance of each discovered concept for a specific class. However, they often fail to provide local explanations. Additionally, these approaches typically rely on labeled concepts or learn directly from datasets, leading to the extraction of irrelevant concepts. They also tend to overlook the potential of these concepts to interpret model predictions effectively. This research proposes a two-stream model called the Cross-Attentional Fast/Slow Thinking Network (CA-SoftNet) to address these issues. The model is inspired by dual-process theory and integrates two key components: a shallow convolutional neural network (sCNN) as System-I for rapid, implicit pattern recognition and a cross-attentional concept memory network as System-II for transparent, controllable, and logical reasoning. Our evaluation across diverse datasets demonstrates the model’s competitive accuracy, achieving 85.6%, 83.7%, 93.6%, and 90.3% on CUB 200-2011, Stanford Cars, ISIC 2016, and ISIC 2017, respectively. This performance outperforms existing interpretable models and is comparable to non-interpretable counterparts. Furthermore, our novel concept extraction method facilitates identifying and selecting salient concepts. These concepts are then used to generate concept-based local explanations that align with human thinking. Additionally, the model’s ability to share similar concepts across distinct classes, such as in fine-grained classification, enhances its scalability for large datasets. This feature also induces human-like cognition and reasoning within the proposed framework.
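The following sketch illustrates the System-II component in spirit: image features attend over a learned concept memory via cross-attention, and the attention weights double as a concept-level local explanation. Dimensions, the concept count and the pooling are illustrative assumptions, not the published CA-SoftNet.

```python
import torch
import torch.nn as nn

class ConceptCrossAttention(nn.Module):
    """Image features (queries) attend over a learned concept memory (keys and
    values); the attention weights give per-token concept relevance."""
    def __init__(self, feat_dim=128, n_concepts=20, n_classes=200):
        super().__init__()
        self.concepts = nn.Parameter(torch.randn(n_concepts, feat_dim))   # concept memory
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, feats):                         # feats: (batch, tokens, feat_dim) from the sCNN
        mem = self.concepts.unsqueeze(0).expand(feats.size(0), -1, -1)
        ctx, weights = self.attn(query=feats, key=mem, value=mem)
        logits = self.classifier(ctx.mean(dim=1))
        return logits, weights                        # weights: concept-level local explanation

logits, concept_weights = ConceptCrossAttention()(torch.randn(4, 49, 128))
```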
Residual Recurrent Highway Networks for Learning Deep Sequence Prediction Models
A contemporary approach for acquiring the computational gains of depth in recurrent neural networks (RNNs) is to hierarchically stack multiple recurrent layers. However, such performance gains come at the cost of the challenging optimization of hierarchical RNNs (HRNNs), which are deep both hierarchically and temporally. Researchers have separately highlighted the significance of using shortcuts for learning deep hierarchical representations and deep temporal dependencies, yet no significant effort has been made to unify these findings into a single framework for learning deep HRNNs. We propose the residual recurrent highway network (R2HN), which contains highways within the temporal structure of the network for unimpeded information propagation, thus alleviating the vanishing gradient problem. Hierarchical structure learning is posed as a residual learning framework to prevent the performance degradation problem. The proposed R2HN contains significantly fewer data-dependent parameters than related methods. Experiments on language modeling (LM) tasks demonstrate that the proposed architecture leads to effective models. On LM experiments with the Penn TreeBank, the model achieved 60.3 perplexity and outperformed the baseline and related models that we tested.
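A minimal sketch of the two mechanisms the abstract names: a highway (gated) update along the temporal axis and residual connections across stacked recurrent layers. The parameterisation is illustrative, not the published R2HN.

```python
import torch
import torch.nn as nn

class HighwayRecurrentCell(nn.Module):
    """Gated (highway) state update along time."""
    def __init__(self, dim):
        super().__init__()
        self.transform = nn.Linear(2 * dim, dim)
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, x_t, h_prev):
        z = torch.cat([x_t, h_prev], dim=-1)
        h_tilde = torch.tanh(self.transform(z))
        t = torch.sigmoid(self.gate(z))
        return t * h_tilde + (1.0 - t) * h_prev       # highway along the temporal axis

class ResidualHighwayRNN(nn.Module):
    """Stack of highway recurrent cells with residual connections across layers."""
    def __init__(self, dim, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(HighwayRecurrentCell(dim) for _ in range(n_layers))

    def forward(self, x):                              # x: (batch, time, dim)
        states = [x.new_zeros(x.size(0), x.size(2)) for _ in self.layers]
        outputs = []
        for t in range(x.size(1)):
            inp = x[:, t]
            for i, cell in enumerate(self.layers):
                states[i] = cell(inp, states[i])
                inp = inp + states[i]                  # residual connection across layers
            outputs.append(inp)
        return torch.stack(outputs, dim=1)

out = ResidualHighwayRNN(dim=64)(torch.randn(2, 35, 64))
```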
Analyzing Transfer Learning of Vision Transformers for Interpreting Chest Radiography
The limited availability of medical imaging datasets is a critical limitation when using "data-hungry" deep learning to gain performance improvements. To deal with this issue, transfer learning has become a de facto standard, where a convolutional neural network (CNN) pre-trained on natural images (e.g., ImageNet) is fine-tuned on medical images. Meanwhile, pre-trained transformers, which are self-attention-based models, have become the de facto standard in natural language processing (NLP) and the state of the art in image classification due to their powerful transfer learning abilities. Inspired by the success of transformers in NLP and image classification, large-scale transformers (such as the vision transformer) are trained on natural images. Building on these recent developments, this research explores the efficacy of natural-image pre-trained transformers for medical images. Specifically, we analyze a pre-trained vision transformer on the CheXpert and pediatric pneumonia datasets, using standard CNN models, including VGGNet and ResNet, as baselines. By examining the acquired representations and results, we find that transfer learning from the pre-trained vision transformer yields improved results compared to pre-trained CNNs, demonstrating the greater transfer ability of transformers in medical imaging.
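The transfer-learning setup described above can be sketched as follows: a vision transformer and a ResNet baseline, both pre-trained on ImageNet, with their classification heads replaced for a two-class chest-radiograph label. The torchvision weight enums and the binary head are assumptions; the data pipeline and training loop are omitted.

```python
import torch.nn as nn
from torchvision import models

# ViT pre-trained on ImageNet, head replaced for a binary pneumonia label.
vit = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
vit.heads = nn.Linear(vit.hidden_dim, 2)               # new classification head

# CNN baseline (ResNet-50), same treatment.
resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
resnet.fc = nn.Linear(resnet.fc.in_features, 2)
```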
Face recognition with Bayesian convolutional networks for robust surveillance systems
Recognition of facial images is one of the most challenging research issues in surveillance systems due to problems including varying pose, expression, illumination, and resolution. The robustness of a recognition method relies strongly on the strength of the extracted features and the ability to deal with low-quality face images. The ability to learn robust features from raw face images makes deep convolutional neural networks (DCNNs) attractive for face recognition. DCNNs use softmax to quantify the model's confidence in a class for an input face image when making a prediction. However, softmax probabilities are not a true representation of model confidence and are often misleading in regions of feature space that may not be represented by the available training examples. The primary goal of this paper is to improve the efficacy of face recognition systems by dealing with false positives through model uncertainty. Experiments on open-source datasets show that accuracy improves by 3–4% when model uncertainty is employed over DCNNs and conventional machine learning techniques.
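A small sketch of the model-uncertainty idea using Monte Carlo dropout over face embeddings: dropout stays active at inference, and the variance of the sampled softmax outputs is used to reject likely false positives. The embedding dimension, identity count and rejection threshold are hypothetical.

```python
import torch
import torch.nn as nn

class MCDropoutHead(nn.Module):
    """Classification head with Monte Carlo dropout for uncertainty estimates."""
    def __init__(self, emb_dim=512, n_identities=100, p=0.5):
        super().__init__()
        self.drop = nn.Dropout(p)
        self.fc = nn.Linear(emb_dim, n_identities)

    def predict_with_uncertainty(self, embeddings, n_samples=20):
        self.train()                                    # keep dropout active at test time
        probs = torch.stack([
            torch.softmax(self.fc(self.drop(embeddings)), dim=-1)
            for _ in range(n_samples)
        ])
        mean = probs.mean(dim=0)
        uncertainty = probs.var(dim=0).sum(dim=-1)      # predictive variance per face
        return mean.argmax(dim=-1), uncertainty

head = MCDropoutHead()
identity, uncertainty = head.predict_with_uncertainty(torch.randn(4, 512))
accept = uncertainty < 0.05                             # hypothetical rejection threshold
```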
Transfer learning for histopathology images: an empirical study
Histopathology imaging is one of the key methods used to determine the presence of cancerous cells. However, determining results from such medical images is a tedious task because of their size, which may delay results by days. Even though CNNs are widely used to analyze medical images, they can only learn short-range dependencies and ignore long-range dependencies, which could be crucial for processing higher-dimensional histology images. Transformers, however, make use of a self-attention mechanism, which might be helpful for learning dependencies across an entire set of features. To process histology images, deep learning models require a large number of images, which is usually not available. Transfer learning, which is often used to deal with this issue, involves fine-tuning a pre-trained model on medical images. In this context, it is essential to analyze whether CNNs or transformers are more conducive to transfer learning. In this study, we performed an empirical evaluation of different pre-trained deep learning models for the classification of lung and colon cancer in histology images. Vision transformer and CNN models pre-trained on ImageNet are analyzed for the classification of histopathology images. We experimented on the LC25000 dataset, which consists of five classes, two belonging to colon cancer and three to lung cancer. The insights and observations obtained from an ablation study of different pre-trained models show that vision transformers perform better than CNN-based models for histopathology image classification using transfer learning. Moreover, the larger ViT-L32 vision transformer performs better than ViT-B32, which has fewer layers.
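A hedged sketch of the ViT-B/32 versus ViT-L/32 comparison on a five-class histology task: both backbones are frozen and only new heads are trained, which is one common transfer-learning recipe but not necessarily the exact protocol of the study.

```python
import torch.nn as nn
from torchvision import models

def build(model_fn, weights, n_classes=5):
    """Build an ImageNet-pretrained ViT with a frozen backbone and a new
    trainable head for the five LC25000 classes."""
    model = model_fn(weights=weights)
    for p in model.parameters():
        p.requires_grad = False                         # freeze ImageNet features
    model.heads = nn.Linear(model.hidden_dim, n_classes)  # new head stays trainable
    return model

vit_b32 = build(models.vit_b_32, models.ViT_B_32_Weights.IMAGENET1K_V1)
vit_l32 = build(models.vit_l_32, models.ViT_L_32_Weights.IMAGENET1K_V1)
```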
MDVA-GAN: multi-domain visual attribution generative adversarial networks
Some pixels of an input image carry rich information and give insights about a particular category during classification decisions. Visualization of these pixels is a well-studied problem in computer vision, called visual attribution (VA), which helps radiologists recognize abnormalities and identify a particular disease in a medical image. In recent years, several classification-based techniques for domain-specific attribute visualization have been proposed, but these techniques can only highlight a small subset of the most discriminative features, so their generated VA maps are inadequate for visualizing all effects in an input image. Owing to recent advancements in generative models, generative model-based VA techniques have been introduced that generate efficient VA maps and visualize all affected regions. In particular, generative adversarial network-based VA techniques have recently been proposed, in which researchers leverage advances in domain adaptation to learn a mapping for abnormal-to-normal medical image translation. As these approaches rely on a two-domain translation model, they would require training as many models as there are diseases in a medical dataset, which is a tedious and compute-intensive task. In this work, we introduce a unified multi-domain VA model that generates a VA map for more than one disease at a time. The proposed unified model takes an image from a particular domain and its domain label as input to generate a VA map and visualize all the regions affected by that particular disease. Experiments on the CheXpert dataset, a publicly available multi-disease chest radiograph dataset, and the TBX11K dataset show that the proposed model generates identical results.
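The multi-domain idea can be sketched as a single generator conditioned on the disease-domain label (broadcast as extra input channels, StarGAN-style), with the VA map taken as the difference between the input and the generated "normal" image. The architecture below is a toy illustration, not the published MDVA-GAN.

```python
import torch
import torch.nn as nn

class DomainConditionedGenerator(nn.Module):
    """Single generator for all disease domains: the one-hot domain label is
    broadcast to extra channels and concatenated with the input image."""
    def __init__(self, n_domains=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1 + n_domains, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, image, domain_label):            # image: (b,1,H,W), label: (b,n_domains)
        label_map = domain_label[:, :, None, None].expand(-1, -1, *image.shape[2:])
        normal = self.net(torch.cat([image, label_map], dim=1))
        va_map = (image - normal).abs()                 # highlights disease-affected regions
        return normal, va_map

gen = DomainConditionedGenerator()
normal, va_map = gen(torch.randn(2, 1, 64, 64), torch.eye(3)[:2])
```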
Counterfactual explanation of Bayesian model uncertainty
Artificial intelligence systems are becoming ubiquitous in everyday life as well as in high-risk environments, such as autonomous driving and medicine. The opaque nature of deep neural networks raises concerns about their adoption in high-risk environments, and it is important for researchers to explain how these models reach their decisions. Most existing methods rely on softmax to explain model decisions. However, softmax has been shown to be often misleading, particularly by giving unjustifiably high confidence even for samples far from the training data. To overcome this shortcoming, we propose using Bayesian model uncertainty for producing counterfactual explanations. In this paper, we compare the counterfactual explanations of models based on Bayesian uncertainty and on the softmax score. This work produces the minimal set of important features that maximally change the classifier output, in order to explain the decision-making process of the Bayesian model. We used the MNIST and Caltech-UCSD Birds 2011 datasets for experiments. The results show that the Bayesian model outperforms the softmax model and produces more concise and human-understandable counterfactuals.
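As a concrete illustration of counterfactual search against a "Bayesian" (MC-dropout) classifier, the sketch below optimises a small perturbation that pushes the mean predictive probability toward a target class while penalising its magnitude. The optimiser, hyperparameters and toy model are assumptions.

```python
import torch
import torch.nn as nn

def counterfactual(model, x, target, steps=100, lr=0.05, lam=0.1, n_mc=10):
    """Find a small perturbation delta so that the mean MC-dropout prediction
    of `model` on x + delta favours `target`, keeping delta sparse."""
    model.train()                                      # keep dropout active for MC sampling
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        probs = torch.stack([torch.softmax(model(x + delta), dim=-1) for _ in range(n_mc)])
        loss = -probs.mean(dim=0)[:, target].log().mean() + lam * delta.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (x + delta).detach(), delta.detach()        # counterfactual and its sparse change

# Toy MC-dropout classifier on MNIST-sized inputs.
toy = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(),
                    nn.Dropout(0.3), nn.Linear(128, 10))
x_cf, delta = counterfactual(toy, torch.rand(1, 1, 28, 28), target=3)
```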