Catalogue Search | MBRL

Image Matching from Handcrafted to Deep Features: A Survey

by Yan, Junchi , Ma, Jiayi , Jiang, Xingyu in Matching , Visual tasks

2021

As a fundamental and critical task in various visual applications, image matching identifies and then establishes correspondences between the same or similar structure/content in two or more images. Over the past decades, a growing number and diversity of methods have been proposed for image matching, particularly with the development of deep learning techniques in recent years. However, several questions remain open: which method is a suitable choice for a specific application, given different scenarios and task requirements, and how to design better image matching methods with superior accuracy, robustness, and efficiency. This encourages us to conduct a comprehensive and systematic review and analysis of those classical and latest techniques. Following the feature-based image matching pipeline, we first introduce feature detection, description, and matching techniques from handcrafted methods to trainable ones and provide an analysis of the development of these methods in theory and practice. Secondly, we briefly introduce several typical image matching-based applications for a comprehensive understanding of the significance of image matching. In addition, we provide a comprehensive and objective comparison of these classical and latest techniques through extensive experiments on representative datasets. Finally, we conclude with the current status of image matching technologies and deliver insightful discussions and prospects for future work. This survey can serve as a reference for (but not limited to) researchers and engineers in image matching and related fields.
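The matching stage of the feature-based pipeline described above can be illustrated with a small sketch. It assumes descriptors have already been computed as fixed-length vectors and uses mutual nearest-neighbour matching under Euclidean distance, which is one common baseline and not necessarily a method singled out by the survey. The function name `mutual_nearest_neighbors` is hypothetical.

```python
import numpy as np

def mutual_nearest_neighbors(desc_a, desc_b):
    """Return index pairs (i, j) where descriptor i in A and descriptor j in B
    are each other's nearest neighbour under Euclidean distance.
    Illustrative baseline only; not taken from the survey itself."""
    # Pairwise distance matrix of shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    nn_ab = dists.argmin(axis=1)  # best match in B for each descriptor in A
    nn_ba = dists.argmin(axis=0)  # best match in A for each descriptor in B
    # Keep a pair only when the choice is mutual.
    return [(i, int(j)) for i, j in enumerate(nn_ab) if nn_ba[j] == i]

# Toy descriptors (made-up values, for illustration only).
desc_a = np.array([[0.0, 0.0], [10.0, 10.0]])
desc_b = np.array([[10.0, 11.0], [0.0, 1.0], [5.0, 5.0]])
matches = mutual_nearest_neighbors(desc_a, desc_b)
```

The mutual check discards one-sided matches, such as the third descriptor in `desc_b`, which has a nearest neighbour in `desc_a` but is nobody's nearest neighbour in return.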

Journal Article

DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation

by Li, Zhenyu , Chen, Zehui , Liu, Xianming in Ablation , Competition , Convolution

2023

This paper aims to address the problem of supervised monocular depth estimation. We start with a meticulous pilot study to demonstrate that the long-range correlation is essential for accurate depth estimation. Moreover, the Transformer and convolution are good at long-range and close-range depth estimation, respectively. Therefore, we propose to adopt a parallel encoder architecture consisting of a Transformer branch and a convolution branch. The former can model global context with the effective attention mechanism and the latter aims to preserve the local information as the Transformer lacks the spatial inductive bias in modeling such contents. However, independent branches lead to a shortage of connections between features. To bridge this gap, we design a hierarchical aggregation and heterogeneous interaction module to enhance the Transformer features and model the affinity between the heterogeneous features in a set-to-set translation manner. Due to the unbearable memory cost introduced by the global attention on high-resolution feature maps, we adopt the deformable scheme to reduce the complexity. Extensive experiments on the KITTI, NYU, and SUN RGB-D datasets demonstrate that our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins. The effectiveness of each proposed module is elaborately evaluated through meticulous and intensive ablation studies.

Journal Article

Fully 1 × 1 Convolutional Network for Lightweight Image Super-resolution

by Jiang, Kui , Wu, Gang , Liu, Xianming in Ablation , Artificial neural networks , Computational efficiency

2024

Deep convolutional neural networks, particularly large models with large kernels (3 × 3 or more), have achieved significant progress in single image super-resolution (SISR) tasks. However, the heavy computational footprint of such models prevents their deployment in real-time, resource-constrained environments. Conversely, 1 × 1 convolutions have substantial computational efficiency, but struggle with aggregating local spatial representations, which is an essential capability for SISR models. In response to this dichotomy, we propose to harmonize the merits of both 3 × 3 and 1 × 1 kernels, and exploit their great potential for lightweight SISR tasks. Specifically, we propose a simple yet effective fully 1 × 1 convolutional network, named shift-Conv-based network (SCNet). By incorporating a parameter-free spatial-shift operation, the fully 1 × 1 convolutional network is equipped with a powerful representation capability and impressive computational efficiency. Extensive experiments demonstrate that SCNets, despite their fully 1 × 1 convolutional structure, consistently match or even surpass the performance of existing lightweight SR models that employ regular convolutions. The code and pretrained models can be found at https://github.com/Aitical/SCNet.
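The idea of pairing a parameter-free spatial shift with a 1 × 1 convolution can be sketched in a few lines of NumPy. This is a minimal illustration under assumed details: the five-way channel grouping, the zero padding, and the names `spatial_shift`, `conv1x1`, and `shift_conv` are hypothetical and are not taken from the SCNet code.

```python
import numpy as np

def spatial_shift(x, step=1):
    """Parameter-free shift on a (channels, height, width) array.
    Channels are split into five groups (an assumed scheme): four groups are
    shifted by `step` pixels right, left, down, and up with zero padding,
    and the remaining channels are left in place. Requires step >= 1."""
    c = x.shape[0]
    g = c // 5
    out = np.zeros_like(x)
    out[0 * g:1 * g, :, step:] = x[0 * g:1 * g, :, :-step]   # shift right
    out[1 * g:2 * g, :, :-step] = x[1 * g:2 * g, :, step:]   # shift left
    out[2 * g:3 * g, step:, :] = x[2 * g:3 * g, :-step, :]   # shift down
    out[3 * g:4 * g, :-step, :] = x[3 * g:4 * g, step:, :]   # shift up
    out[4 * g:] = x[4 * g:]                                  # unshifted
    return out

def conv1x1(x, weight):
    """1 x 1 convolution: mixes channels independently at every pixel.
    `weight` has shape (out_channels, in_channels)."""
    return np.einsum("oc,chw->ohw", weight, x)

def shift_conv(x, weight, step=1):
    """Shift first, then mix channels, so each output pixel sees its neighbours."""
    return conv1x1(spatial_shift(x, step), weight)

# A single bright pixel in every channel spreads into a cross-shaped footprint.
x = np.zeros((5, 3, 3))
x[:, 1, 1] = 1.0
y = shift_conv(x, np.ones((1, 5)))
```

After the shift, the 1 × 1 convolution at a given pixel combines values that originally sat at neighbouring positions, which is how a purely pointwise layer can aggregate local spatial information without adding parameters.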

Journal Article

Satellite Image Super-Resolution via Multi-Scale Residual Deep Neural Network

by Lu, Tao , Wang, Jiaming , Jiang, Junjun in Algorithms , Artificial neural networks , Clustering

2019

Recently, the application of satellite remote sensing images has become increasingly popular, but the images observed by satellite sensors are frequently low-resolution (LR). Thus, they cannot fully meet the requirements of object identification and analysis. To fully utilize the multi-scale characteristics of objects in remote sensing images, this paper presents a multi-scale residual neural network (MRNN). MRNN adopts the multi-scale nature of satellite images to reconstruct high-frequency information accurately for super-resolution (SR) satellite imagery. Patches of different sizes are first extracted from LR satellite images to fit objects of different scales. Large-, middle-, and small-scale deep residual neural networks are designed to simulate differently sized receptive fields for acquiring relatively global, contextual, and local information for prior representation. Then, a fusion network is used to refine the information from the different scales. MRNN fuses the complementary high-frequency information from differently scaled networks to reconstruct the desired high-resolution satellite object image, which is in line with human visual experience (“look in multi-scale to see better”). Experimental results on the SpaceNet satellite image and NWPU-RESISC45 databases show that the proposed approach outperformed several state-of-the-art SR algorithms in terms of objective and subjective image quality.

Journal Article

Deep Distillation Recursive Network for Remote Sensing Imagery Super-Resolution

by Jiang, Kui , Xiao, Jing , Yi, Peng in Algorithms , Artificial neural networks , Compensation

2018

Deep convolutional neural networks (CNNs) have been widely used and have achieved state-of-the-art performance in many image or video processing and analysis tasks. In particular, for image super-resolution (SR) processing, previous CNN-based methods have led to significant improvements compared with shallow learning-based methods. However, previous CNN-based algorithms with simple direct or skip connections perform poorly when applied to SR of remote sensing satellite images. In this study, a simple but effective CNN framework, namely deep distillation recursive network (DDRN), is presented for video satellite image SR. DDRN includes a group of ultra-dense residual blocks (UDB), a multi-scale purification unit (MSPU), and a reconstruction module. In particular, through the addition of rich interactive links in and between multiple-path units in each UDB, features extracted from multiple parallel convolution layers can be shared effectively. Compared with classical dense-connection-based models, DDRN possesses the following main properties. (1) DDRN contains more linking nodes with the same convolution layers. (2) A distillation and compensation mechanism, which performs feature distillation and compensation in different stages of the network, is also constructed. In particular, the high-frequency components lost during information propagation can be compensated in MSPU. (3) The final SR image can benefit from the feature maps extracted from UDB and the compensated components obtained from MSPU. Experiments on Kaggle Open Source Dataset and Jilin-1 video satellite images illustrate that DDRN outperforms the conventional CNN-based baselines and some state-of-the-art feature extraction approaches.

Journal Article

Towards fairness-aware and privacy-preserving enhanced collaborative learning for healthcare

by Bai, Guo , Zhang, Feilong , Ye, Qixiang in Algorithms

2025

The widespread integration of AI algorithms in healthcare has sparked ethical concerns, particularly regarding privacy and fairness. Federated Learning (FL) offers a promising solution to learn from a broad spectrum of patient data without directly accessing individual records, enhancing privacy while facilitating knowledge sharing across distributed data sources. However, healthcare institutions face significant variations in access to crucial computing resources, with resource budgets often linked to demographic and socio-economic factors, exacerbating unfairness in participation. While heterogeneous federated learning methods allow healthcare institutions with varying computational capacities to collaborate, they fail to address the performance gap between resource-limited and resource-rich institutions. As a result, resource-limited institutions may receive suboptimal models, further reinforcing disparities in AI-driven healthcare outcomes. Here, we propose a resource-adaptive framework for collaborative learning that dynamically adjusts to varying computational capacities, ensuring fair participation. Our approach enhances model accuracy, safeguards patient privacy, and promotes equitable access to trustworthy and efficient AI-driven healthcare solutions. Building a trustworthy and effective healthcare AI ecosystem requires a scheme that embodies fairness, communication-efficient characteristics, and enhanced privacy protection. Here, the authors develop a framework for healthcare institutions with varying resources to fairly collaborate on AI training while protecting patient privacy, ensuring equitable access to high-quality medical AI regardless of resource constraints.

Journal Article

Machine learning-based in-hospital mortality prediction of HIV/AIDS patients with Talaromyces marneffei infection in Guangxi, China

by Chen, Rongfeng , Meng, Sirun , Li, Yueqi in Acquired immune deficiency syndrome , Acquired Immunodeficiency Syndrome , Additives

2022

Talaromycosis is a serious regional disease endemic in Southeast Asia. In China, Talaromyces marneffei (T. marneffei) infection is mainly concentrated in the southern region, especially in Guangxi, and causes considerable in-hospital mortality in HIV-infected individuals. Currently, the factors that influence in-hospital death of HIV/AIDS patients with T. marneffei infection are not completely clear. Existing machine learning techniques can be used to develop a predictive model that identifies relevant prognostic factors for death, which appears to be essential to reducing in-hospital mortality. We prospectively enrolled HIV/AIDS patients with talaromycosis in the Fourth People's Hospital of Nanning, Guangxi, from January 2012 to June 2019. Clinical features were selected and used to train four different machine learning models (logistic regression, XGBoost, KNN, and SVM) to predict the treatment outcome of hospitalized patients, and 30% internal validation was used to evaluate the performance of the models. Machine learning model performance was assessed according to a range of learning metrics, including area under the receiver operating characteristic curve (AUC). The SHapley Additive exPlanations (SHAP) tool was used to explain the model. A total of 1927 HIV/AIDS patients with T. marneffei infection were included. The average in-hospital mortality rate was 13.3% (256/1927) from 2012 to 2019. The most common complication/coinfection was pneumonia (68.9%), followed by oral candida (47.5%) and tuberculosis (40.6%). Deceased patients showed higher CD4/CD8 ratios, aspartate aminotransferase (AST) levels, creatinine levels, urea levels, uric acid (UA) levels, lactate dehydrogenase (LDH) levels, total bilirubin levels, creatine kinase levels, white blood cell (WBC) counts, neutrophil counts, procalcitonin levels, and C-reactive protein (CRP) levels, and lower CD3+ T-cell counts, CD8+ T-cell counts, lymphocyte counts, platelet (PLT) counts, high-density lipoprotein cholesterol (HDL) levels, and hemoglobin (Hb) levels than surviving patients. The predictive XGBoost model exhibited 0.71 sensitivity, 0.99 specificity, and 0.97 AUC in the training dataset, and our outcome prediction model provided robust discrimination in the testing dataset, showing an AUC of 0.90 with 0.69 sensitivity and 0.96 specificity. The other three models were ruled out due to poor performance. Septic shock and respiratory failure were the most important predictive features, followed by uric acid, urea, platelets, and the AST/ALT ratios. The XGBoost machine learning model is a good predictor of the hospitalization outcome of HIV/AIDS patients with T. marneffei infection. The model may have potential application in mortality prediction and high-risk factor identification in the talaromycosis population.
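The reported metrics can be made concrete with a short sketch. It shows how sensitivity and specificity follow from binary predictions and checks the arithmetic behind the stated mortality rate (256/1927). The labels below are invented toy values, not study data, and the helper name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels where 1 = death.
    Assumes both classes appear at least once in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # deaths predicted as deaths
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # deaths missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # survivors predicted as survivors
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # survivors flagged as deaths
    return tp / (tp + fn), tn / (tn + fp)

# Toy example (made-up labels): 3 deaths, 5 survivors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)

# Mortality rate from the counts given in the abstract.
mortality_pct = round(100 * 256 / 1927, 1)
```

With a mortality rate near 13%, survivors dominate the cohort, so a model can reach high specificity while still missing a sizeable share of deaths. That is consistent with the gap between the reported sensitivity and specificity.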

Journal Article

Rethinking 3D-CNN in Hyperspectral Image Super-Resolution

by Ma, Qing , Wang, Wenbing , Liu, Xianming in 3D convolution , Classification , convolutional neural network

2023

Recently, CNN-based methods for hyperspectral image super-resolution (HSISR) have achieved outstanding performance. Due to the multi-band property of hyperspectral images, 3D convolutions are natural candidates for extracting spatial–spectral correlations. However, pure 3D CNN models are rarely seen, since they are generally considered to be too complex, to require large amounts of data to train, and to run the risk of overfitting on relatively small-scale hyperspectral datasets. In this paper, we question this common notion and propose Full 3D U-Net (F3DUN), a full 3D CNN model combined with the U-Net architecture. By introducing skip connections, the model becomes deeper and utilizes multi-scale features. Extensive experiments show that F3DUN can achieve state-of-the-art performance on HSISR tasks, indicating the effectiveness of the full 3D CNN on HSISR tasks, thanks to the carefully designed architecture. To further explore the properties of the full 3D CNN model, we develop a 3D/2D mixed model, a popular kind of model prior, called Mixed U-Net (MUN), which shares a similar architecture with F3DUN. Through analysis of F3DUN and MUN, we find that 3D convolutions give the model a larger capacity; that is, the full 3D CNN model can obtain better results than the 3D/2D mixed model with the same number of parameters when it is sufficiently trained. Moreover, experimental results show that the full 3D CNN model could achieve competitive results with the 3D/2D mixed model on a small-scale dataset, suggesting that 3D CNN is less sensitive to data scaling than previously believed. Extensive experiments on two benchmark datasets, CAVE and Harvard, demonstrate that our proposed F3DUN exceeds state-of-the-art HSISR methods both quantitatively and qualitatively.

Journal Article

Spectral-Spatial Feature Extraction of Hyperspectral Images Based on Propagation Filter

by Jiang, Xinwei , Chen, Zhikun , Cai, Zhihua in Algorithms , Classification , Feature extraction

2018

Recently, image-filtering based hyperspectral image (HSI) feature extraction has been widely studied. However, due to limited spatial resolution and feature distribution complexity, the problems of cross-region mixing after filtering and spectral discriminative reduction remain. To address these issues, this paper proposes a spectral-spatial propagation filter (PF) based HSI feature extraction method. The dimensionality (number of bands) of an HSI is typically high; therefore, principal component analysis (PCA) is first used to reduce the HSI dimensionality. Then, the principal components of the HSI are filtered with the PF. When cross-region mixture occurs in the image, the filter template reduces the weights assigned to the cross-region mixed pixels, which handles the issue simply and effectively. To validate the effectiveness of the proposed method, experiments are carried out on three common HSIs using support vector machine (SVM) classifiers with features learned by the PF. The experimental results demonstrate that the proposed method effectively extracts the spectral-spatial features of HSIs and significantly improves the accuracy of HSI classification.
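The PCA step that precedes filtering can be sketched as follows. This is a minimal illustration assuming the HSI is stored as a (height, width, bands) array; the propagation filter itself is not implemented here, and the function name `pca_reduce` and the random test cube are hypothetical.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Project each pixel's spectrum onto the top principal components.
    `cube` has shape (height, width, bands); the result has shape
    (height, width, n_components)."""
    h, w, bands = cube.shape
    pixels = cube.reshape(-1, bands).astype(np.float64)
    pixels -= pixels.mean(axis=0)  # centre each band
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    scores = pixels @ vt[:n_components].T
    return scores.reshape(h, w, n_components)

# Random stand-in for a small hyperspectral cube (not real data).
rng = np.random.default_rng(0)
cube = rng.normal(size=(4, 5, 10))
reduced = pca_reduce(cube, 3)
component_variance = reduced.reshape(-1, 3).var(axis=0)
```

Each of the resulting component images would then be passed to the spatial filter, so the filtering cost scales with the number of retained components and not with the original band count.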

Journal Article

Potential Inhibitors of Monkeypox Virus Revealed by Molecular Modeling Approach to Viral DNA Topoisomerase I

by Liao, Yanyan , Liang, Hao , An, Sanqi in Amino acids , Antibiotics , Antiviral agents

2023

The monkeypox outbreak has become a global public health emergency. The lack of valid and safe medicine is a crucial obstacle hindering the extermination of orthopoxvirus infections. The identification of potential inhibitors from natural products, including Traditional Chinese Medicine (TCM), by molecular modeling could expand the arsenal of antiviral chemotherapeutic agents. Monkeypox DNA topoisomerase I (TOP1) is a highly conserved viral DNA repair enzyme with a small size and low homology to human proteins. The protein model of viral DNA TOP1 was obtained by homology modeling. The reliability of the TOP1 model was validated by analyzing its Ramachandran plot and by determining the compatibility of the 3D model with its sequence using the Verify 3D and PROCHECK services. In order to identify potential inhibitors of TOP1, an integrated library of 4103 natural products was screened via Glide docking. Surface Plasmon Resonance (SPR) was further implemented to assay the complex binding affinity. Molecular dynamics simulations (100 ns) were combined with molecular mechanics Poisson–Boltzmann surface area (MM/PBSA) computations to reveal the binding mechanisms of the complex. As a result, three natural compounds were highlighted as potential inhibitors via docking-based virtual screening. Rosmarinic acid, myricitrin, quercitrin, and ofloxacin can bind TOP1 with KD values of 2.16 μM, 3.54 μM, 4.77 μM, and 5.46 μM, respectively, indicating a good inhibitory effect against MPXV. The MM/PBSA calculations revealed that rosmarinic acid had the lowest binding free energy at −16.18 kcal/mol. Myricitrin had a binding free energy of −13.87 kcal/mol, quercitrin had a binding free energy of −9.40 kcal/mol, and ofloxacin had a binding free energy of −9.64 kcal/mol. The outputs (RMSD/RMSF/Rg/SASA) also indicated that the systems were well-behaved towards the complex. Binding mode analysis showed that the selected compounds formed several key hydrogen bonds with TOP1 residues (TYR274, LYS167, GLY132, LYS133, etc.). TYR274 was predicted to be a pivotal residue for compound interactions in the binding pocket of TOP1. The results of the enrichment analyses illustrated the potential pharmacological networks of rosmarinic acid. The molecular modeling approach may be acceptable for the identification and design of novel poxvirus inhibitors; however, further studies are warranted to evaluate their therapeutic potential.

Journal Article
