Catalogue Search | MBRL
Explore the vast range of titles available.
192 result(s) for "You, Shaodi"
Efficient Hybrid Supervision for Instance Segmentation in Aerial Images
2021
Instance segmentation in aerial images is of great significance for remote sensing applications, and it is inherently more challenging because of cluttered backgrounds, extremely dense and small objects, and objects with arbitrary orientations. In addition, current mainstream CNN-based methods often suffer from a trade-off between labeling cost and performance. To address these problems, we present a hybrid supervision pipeline. In the pipeline, we design an ancillary segmentation model with a bounding box attention module and a bounding box filter module. It can generate accurate pseudo pixel-wise labels from real-world aerial images for training any instance segmentation model. Specifically, the bounding box attention module effectively suppresses noise from the cluttered background and improves the ability to segment small objects. The bounding box filter module removes the false positives caused by cluttered backgrounds and densely distributed objects. Our ancillary segmentation model locates objects pixel-wise instead of relying on horizontal bounding box predictions, which gives it better adaptability to arbitrarily oriented objects. Furthermore, oriented bounding box labels are utilized for handling arbitrarily oriented objects. Experiments on the iSAID dataset show that the proposed method achieves performance (32.1 AP) comparable to fully supervised methods (33.9 AP) and clearly higher than the weakly supervised setting (26.5 AP), while using only 10% of the pixel-wise labels.
Journal Article
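As a rough, hypothetical sketch of the bounding box attention idea described in the abstract above: feature activations outside the annotated boxes are down-weighted so that cluttered background contributes less to segmentation. The function name, array shapes, and attenuation factor are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def bbox_attention(features, boxes, outside_weight=0.1):
    """Down-weight feature activations outside every bounding box.

    features: (H, W, C) feature map
    boxes: list of (y0, x0, y1, x1) boxes in pixel coordinates
    """
    h, w, _ = features.shape
    # Soft mask: full weight inside boxes, attenuated outside
    mask = np.full((h, w), outside_weight, dtype=features.dtype)
    for y0, x0, y1, x1 in boxes:
        mask[y0:y1, x0:x1] = 1.0  # keep activations inside boxes
    return features * mask[:, :, None]

feats = np.ones((8, 8, 4))
out = bbox_attention(feats, [(2, 2, 6, 6)])
```

A box-filtering step in the same spirit would then discard predicted instances whose support falls mostly outside every box.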
Semi-Supervised Learning in Medical Images Through Graph-Embedded Random Forest
by
Liu, Zhenzhong
,
Gu, Lin
,
Zhao, Shen
in
Algorithms
,
Alzheimer's disease
,
central nervous system (CNS)
2020
In this paper, we propose a novel semi-supervised random forest to tackle the challenging problem of scarce annotations in medical image analysis. Observing that the bottleneck of the standard random forest is its biased information gain estimation, we replace it with a novel graph-embedded entropy that incorporates information from both labelled and unlabelled data. Empirical results show that our information gain is more reliable than the one used in the traditional random forest when labelled data are insufficient. By slightly modifying the training process of the standard random forest, our algorithm significantly improves performance while preserving the virtues of the random forest. Our method has shown superior performance with very limited data on both brain imaging analysis and machine learning benchmarks.
Journal Article
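A minimal sketch of the graph-embedded entropy idea from the abstract above: unlabelled points (marked `-1`) receive soft labels by one propagation step over a Gaussian affinity graph, and the entropy is computed from the pooled soft label distribution. The single-step propagation, the Gaussian kernel, and all parameter values are simplifying assumptions, not the authors' implementation.

```python
import numpy as np

def graph_embedded_entropy(X, y, n_classes, sigma=1.0):
    """Entropy estimate that also uses unlabelled points (y == -1)."""
    n = len(X)
    # Soft label matrix: one-hot for labelled, uniform for unlabelled
    P = np.full((n, n_classes), 1.0 / n_classes)
    for i, label in enumerate(y):
        if label >= 0:
            P[i] = np.eye(n_classes)[label]
    # Row-normalized Gaussian affinities between all points
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    W /= W.sum(1, keepdims=True)
    P = W @ P                      # one label-propagation step
    p = P.mean(0)                  # pooled class distribution
    p /= p.sum()
    return -(p * np.log2(p + 1e-12)).sum()
```

A split criterion could then compare this entropy before and after a candidate split, so unlabelled samples influence the information gain.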
Artificial Intelligence for Dunhuang Cultural Heritage Protection: The Project and the Dataset
2022
In this work, we introduce our project on Dunhuang cultural heritage protection using artificial intelligence. The Dunhuang Mogao Grottoes in China, also known as the Grottoes of the Thousand Buddhas, are a religious and cultural heritage site located on the Silk Road. The grottoes were built from the 4th century to the 14th century. After more than a thousand years, the decay inside the grottoes is serious. In addition, numerous historical records were destroyed over the years, making it difficult for archaeologists to reconstruct history. We aim to use modern computer vision and machine learning technologies to address these challenges. First, we propose to use deep networks to perform the restoration automatically. In our experiments, we find that automated restoration can provide quality comparable to manual restoration by an archaeologist. This can significantly speed up restoration given the enormous size of the historical paintings. Second, we propose to use detection and retrieval to further analyze the tremendously large number of objects, which would be unreasonable to label and analyze manually. Several state-of-the-art methods are rigorously tested and quantitatively compared across different criteria and categories. In this work, we created a new dataset, namely AI for Dunhuang, to facilitate the research. Version v1.0 of the dataset comprises data and labels for restoration, style transfer, detection, and retrieval. Specifically, the dataset has 10,000 images for restoration, 3455 for style transfer, and 6147 for property retrieval. Lastly, we propose to use style transfer to link and analyze styles over time, given that the grottoes were built over 1000 years by numerous artists. This makes it possible to analyze and study art styles across 1000 years and enables future research on cross-era style analysis. We benchmark representative methods and conduct a comparative study of the results for our solution.
The dataset will be publicly available along with this paper.
Journal Article
Unsupervised anomaly detection with compact deep features for wind turbine blade images taken by a drone
by
Kawakami, Rei
,
Ito, Masahiko
,
Harano, Tohru
in
Abnormalities
,
Anomalies
,
Artificial Intelligence
2019
Detecting anomalies in wind turbine blades from aerial images taken by drones can reduce the costs of periodic inspections. Deep learning is useful for image recognition, but it requires large amounts of training data, which are difficult to collect for rare abnormalities. In this paper, we propose a method to distinguish normal and abnormal parts of a blade by combining a one-class support vector machine, an unsupervised learning method, with deep features learned from a generic image dataset. The images taken by a drone are subsampled, projected into the feature space, and compressed using principal component analysis (PCA) to make them learnable. Experiments show that features in the lower layers of deep nets are useful for detecting anomalies in blade images.
Journal Article
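An illustrative sketch of the compression-then-scoring stage described above, using PCA via SVD on assumed "deep feature" vectors. For brevity the one-class SVM is replaced here by a simpler nearest-neighbor distance score over the compressed normal samples; that substitution, and all shapes, are assumptions for illustration only.

```python
import numpy as np

def fit_pca(X, k):
    """PCA by SVD: returns the mean and the top-k principal directions."""
    mu = X.mean(0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def anomaly_scores(train, test, k=2):
    """Compress features with PCA, then score each test sample by its
    distance to the nearest compressed normal training sample."""
    mu, V = fit_pca(train, k)
    Ztr = (train - mu) @ V.T
    Zte = (test - mu) @ V.T
    d = ((Zte[:, None, :] - Ztr[None, :, :]) ** 2).sum(-1) ** 0.5
    return d.min(1)  # high score = far from all normal samples
```

In the paper's setting, the compressed features would instead be fed to a one-class SVM trained only on normal blade patches.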
RGB‐guided hyperspectral image super‐resolution with deep progressive learning
by
Fu, Ying
,
Yan, Chenggang
,
Huang, Liwei
in
Artificial neural networks
,
Color imagery
,
computer vision
2024
Due to hardware limitations, existing hyperspectral (HS) cameras often suffer from low spatial/temporal resolution. Recently, it has become prevalent to super-resolve a low-resolution (LR) HS image into a high-resolution (HR) HS image with the guidance of an HR RGB (or multispectral) image. Previous approaches to this guided super-resolution task often model the intrinsic characteristics of the desired HR HS image using hand-crafted priors. More recently, researchers have paid more attention to deep learning methods with direct supervised or unsupervised learning, which exploit deep priors only from the training dataset or the testing data. In this article, an efficient convolutional neural network-based method is presented to progressively super-resolve an HS image with RGB image guidance. Specifically, a progressive HS image super-resolution network is proposed, which progressively super-resolves the LR HS image with pixel-shuffled HR RGB image guidance. Then, the super-resolution network is progressively trained with supervised pre-training and unsupervised adaptation, where supervised pre-training learns a general prior on the training data and unsupervised adaptation generalises the general prior to a specific prior for varied testing scenes. The proposed method can effectively exploit priors from the training dataset and from the testing HS and RGB images with a spectral-spatial constraint. It has good generalisation capability, especially for blind HS image super-resolution. Comprehensive experimental results show that the proposed deep progressive learning method outperforms existing state-of-the-art methods for HS image super-resolution in both non-blind and blind cases.
Journal Article
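The pixel-shuffle guidance mentioned in the abstract above can be illustrated by its inverse, space-to-depth: the HR RGB guide is rearranged to the LR grid with spatial detail moved into channels, so it can be concatenated with the LR HS image. This numpy sketch shows only that rearrangement; how the network consumes it is an assumption left out here.

```python
import numpy as np

def pixel_unshuffle(img, r):
    """Space-to-depth: rearrange an (H, W, C) image into
    (H/r, W/r, C*r*r), moving each r-by-r spatial block into channels."""
    h, w, c = img.shape
    assert h % r == 0 and w % r == 0
    x = img.reshape(h // r, r, w // r, r, c)
    # (H', r_y, W', r_x, C) -> (H', W', r_y, r_x, C) -> flatten last three
    return x.transpose(0, 2, 1, 3, 4).reshape(h // r, w // r, c * r * r)

rgb_hr = np.arange(48.0).reshape(4, 4, 3)   # toy HR RGB guide
guide_lr = pixel_unshuffle(rgb_hr, 2)       # (2, 2, 12), LR-aligned
```

The inverse operation (depth-to-space, i.e. pixel shuffle) is what such networks typically use to step the HS image up in resolution.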
VBMq: pursuit baremetal performance by embracing block I/O parallelism in virtualization
by
ZHANG, Diming
,
YOU, Shaodi
,
HUANG, Hao
in
Affinity
,
Central processing units
,
Computer Science
2018
Barely acceptable block I/O performance prevents virtualization from being widely used in the high-performance computing field. Although the virtio paravirtual framework brings great I/O performance improvements, there is a sharp performance degradation when accessing high-performance NAND-flash-based devices in a virtual machine, due to their data-parallel design. The primary cause is the lack of block I/O parallelism in hypervisors such as KVM and Xen. In this paper, we propose a novel design of the block I/O layer for virtualization, named VBMq. VBMq is based on the virtio paravirtual I/O model and aims to solve the block I/O parallelism issue in virtualization. It uses multiple dedicated I/O threads to handle I/O requests in parallel. Meanwhile, we use a polling mechanism to alleviate the overhead caused by frequent context switches between the VM and its hypervisor for notifications. Each dedicated I/O thread is assigned to a non-overlapping core to improve performance by avoiding unnecessary scheduling. In addition, we configure CPU affinity to optimize I/O completion for each request. The CPU affinity setting is very helpful for reducing the CPU cache miss rate and increasing CPU efficiency. The prototype system is based on the Linux 4.1 kernel and QEMU 2.3.1. Our measurements show that the proposed method scales gracefully in multi-core environments and provides performance up to 39.6x better than the baseline, approaching bare-metal performance.
Journal Article
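A toy sketch of the dispatch pattern described above: requests are spread round-robin over dedicated worker threads, each draining its own queue, rather than funneled through one shared ring. Real VBMq additionally pins each worker to a core (e.g. `sched_setaffinity` on Linux) and busy-polls instead of sleeping on notifications; both are omitted here for portability, and the doubling "I/O op" is a placeholder.

```python
import threading, queue

def run_parallel_io(requests, n_workers=2):
    """Dispatch requests round-robin to per-worker queues."""
    queues = [queue.Queue() for _ in range(n_workers)]
    results = []
    lock = threading.Lock()

    def worker(q):
        while True:
            req = q.get()
            if req is None:              # sentinel: shut down
                return
            with lock:
                results.append(req * 2)  # stand-in for a block I/O op

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()
    for i, req in enumerate(requests):   # round-robin dispatch
        queues[i % n_workers].put(req)
    for q in queues:
        q.put(None)                      # one sentinel per worker
    for t in threads:
        t.join()
    return sorted(results)
```

The per-worker queue is what removes the single-ring bottleneck; affinity and polling then keep each worker's cache and scheduling behavior predictable.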
Pedestrian detection with motion features via two-stream ConvNets
by
Trinh, Tu Tuan
,
Naemura, Takeshi
,
Kawakami, Rei
in
Artificial Intelligence
,
Artificial neural networks
,
Benchmarks
2018
Motion information can be important for detecting objects, but it has been used less for pedestrian detection, particularly with deep-learning-based methods. We propose a method that uses deep motion features as well as deep still-image features, following the success of two-stream convolutional networks, in which separate networks are trained for the spatial and temporal streams. To extract motion cues for detection that are differentiated from other background motions, the temporal stream takes as input the difference between frames that are weakly stabilized by optical flow. To make the networks applicable to bounding-box-level detection, the mid-level features are concatenated and combined with a sliding-window detector. We also introduce transfer learning from multiple sources in the two-stream networks, which transfers still-image and motion features from ImageNet and an action recognition dataset, respectively, to overcome the insufficiency of training data for convolutional neural networks in pedestrian datasets. We conducted an evaluation on two popular large-scale pedestrian benchmarks, namely the Caltech Pedestrian Detection Benchmark and the Daimler Mono Pedestrian Detection Benchmark, and observed a 10% improvement compared to the same method without motion features.
Journal Article
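Two pieces of the pipeline above can be sketched in a few lines: the temporal-stream input as differences of consecutive (assumed already weakly stabilized) frames, and the fusion step as channel-wise concatenation of mid-level features. The shapes and the plain subtraction are illustrative assumptions; the actual stabilization and networks are not reproduced here.

```python
import numpy as np

def temporal_input(frames):
    """Differences of consecutive frames (assumed weakly stabilized),
    used as the input to the temporal stream."""
    return np.stack([frames[i + 1] - frames[i]
                     for i in range(len(frames) - 1)])

def fuse(spatial_feat, temporal_feat):
    """Concatenate mid-level features from the two streams
    channel-wise, before the sliding-window detector."""
    return np.concatenate([spatial_feat, temporal_feat], axis=-1)

frames = np.arange(48.0).reshape(3, 4, 4)   # toy 3-frame clip
diffs = temporal_input(frames)              # (2, 4, 4)
```

Stabilizing before differencing is what lets the subtraction highlight pedestrian motion rather than camera motion.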
Modeling Weather Uncertainty for Multi-weather Co-Presence Estimation
2024
Images from outdoor scenes may be taken under various weather conditions. It is well studied that weather impacts the performance of computer vision algorithms and needs to be handled properly. However, existing algorithms model weather as a discrete status and estimate it using multi-label classification. In fact, physically, and specifically in meteorology, weather is modeled as a continuous and transitional status. Instead of directly applying hard classification as existing multi-weather classification methods do, we consider the physical formulation of multi-weather conditions and model the impact of physics-related parameters on learning from image appearance. In this paper, we start with a thorough revisit of the physical definition of weather and of how it can be described as a continuous machine learning and computer vision task. Namely, we propose to model weather uncertainty, where both the probability level and the co-existence of multiple weather conditions are considered. A Gaussian mixture model is used to encapsulate the weather uncertainty, and an uncertainty-aware multi-weather learning scheme is proposed based on prior-posterior learning. A novel multi-weather co-presence estimation transformer (MeFormer) is proposed. In addition, a new multi-weather co-presence estimation (MePe) dataset, with 14 fine-grained weather categories and 16,078 samples, is proposed to benchmark both the conventional multi-label weather classification task and the multi-weather co-presence estimation task. Large-scale experiments show that the proposed method achieves state-of-the-art performance and substantial generalization capability on both tasks. Moreover, modeling weather uncertainty also benefits adverse-weather semantic segmentation.
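The Gaussian mixture view of weather uncertainty can be illustrated with a 1-D toy: each mixture component stands for one weather condition, its weight for the co-presence level, and responsibilities give the soft (rather than hard-classified) assignment for an observation. The 1-D setting and all parameter values are assumptions for illustration, not the MeFormer model.

```python
import numpy as np

def gmm_density(x, means, sigmas, weights):
    """Density of a 1-D Gaussian mixture; weights encode how strongly
    each weather condition is co-present."""
    comps = weights * np.exp(-0.5 * ((x - means) / sigmas) ** 2) / (
        sigmas * np.sqrt(2 * np.pi))
    return comps.sum()

def responsibilities(x, means, sigmas, weights):
    """Soft assignment of observation x to each weather component."""
    comps = weights * np.exp(-0.5 * ((x - means) / sigmas) ** 2) / (
        sigmas * np.sqrt(2 * np.pi))
    return comps / comps.sum()

# Toy example: two co-present conditions centered at 0 and 5
means = np.array([0.0, 5.0])
sigmas = np.array([1.0, 1.0])
weights = np.array([0.5, 0.5])
r = responsibilities(2.5, means, sigmas, weights)  # ambiguous midpoint
```

At the midpoint the responsibilities split evenly, which is exactly the transitional, co-present status a hard multi-label classifier cannot express.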