Catalogue Search | MBRL

Towards High Performance Human Keypoint Detection

by Chen, Zhe , Zhang, Jing , Tao, Dacheng in Accuracy , Annotations , Benchmarks

2021

Human keypoint detection from a single image is very challenging due to occlusion, blur, illumination, and scale variance. In this paper, we address this problem from three aspects by devising an efficient network structure, proposing three effective training strategies, and exploiting four useful postprocessing techniques. First, we find that context information plays an important role in reasoning about human body configuration and invisible keypoints. Inspired by this, we propose a cascaded context mixer (CCM), which efficiently integrates spatial and channel context information and progressively refines them. Second, to maximize CCM’s representation capability, we develop a hard-negative person detection mining strategy and a joint-training strategy that exploits abundant unlabeled data. These strategies enable CCM to learn discriminative features from massive diverse poses. Third, we present several sub-pixel refinement techniques for postprocessing keypoint predictions to improve detection accuracy. Extensive experiments on the MS COCO keypoint detection benchmark demonstrate the superiority of the proposed method over representative state-of-the-art (SOTA) methods. Our single model achieves performance comparable to that of the winner of the 2018 COCO Keypoint Detection Challenge. The final ensemble model sets a new SOTA on this benchmark. The source code will be released at https://github.com/chaimi2013/CCM.

Journal Article

Remote Sensing of Endogenous Pigmentation by Inducible Synthetic Circuits in Grasses

by Kenney, Samuel , Pathak, Sunita , Calderon, Santiago in Agriculture , anthocyanin biosynthesis , Anthocyanins

2026

Plant synthetic biology holds great promise for engineering plants to meet future demands. Genetic circuits are being designed, built and tested in plants to demonstrate the proof of concept. However, developing these components in monocots, which the world relies on for grain, lags behind dicot models, such as Arabidopsis thaliana and Nicotiana benthamiana. Here, we show the successful adaptation of a ligand‐inducible sensor to activate an endogenous anthocyanin pathway in the C4 monocot model Setaria viridis. We identify two transcription factors that can be expressed as a single transcript that are sufficient to induce endogenous anthocyanin production in S. viridis protoplasts and whole plants in a constitutive or ligand‐inducible manner. We also test multiple ligands to overcome physical barriers to ligand uptake, identifying triamcinolone acetonide (TA) as a highly potent inducer of this system. Using hyperspectral imaging and a discriminative target characterisation method in a near‐remote configuration, we can non‐destructively detect anthocyanin production in leaves in response to ligands. This work demonstrates the use of inducible expression systems in monocots to manipulate endogenous pigmentation production for remote detection. Applying inducible anthocyanin production coupled with sensitive detection algorithms could enable crop plants to report on the status of field contamination or detect undesirable chemicals impacting agriculture, ushering in an era of agriculture‐based sensor systems.

Journal Article

CE-FPN: enhancing channel information for object detection

by Guo, Jingjuan , Shen, Haibo , Luo, Yihao in Aliasing , Computer Communication Networks , Computer Science

2022

Feature pyramid network (FPN) has been an efficient framework to extract multi-scale features in object detection. However, current FPN-based methods mostly suffer from the intrinsic flaw of channel reduction, which brings about the loss of semantic information. Moreover, the miscellaneous feature maps may cause serious aliasing effects. In this paper, we present a novel channel enhancement feature pyramid network (CE-FPN) to alleviate these problems. Specifically, inspired by sub-pixel convolution, we propose sub-pixel skip fusion (SSF) to perform both channel enhancement and upsampling. Replacing the original 1 × 1 convolution and linear upsampling, it mitigates the information loss due to channel reduction. Then we propose sub-pixel context enhancement (SCE) for extracting stronger feature representations, which is superior to other context methods due to the utilization of rich channel information by sub-pixel convolution. Furthermore, we introduce a channel attention guided module (CAG) to optimize the final integrated features on each level. It alleviates the aliasing effect with only a small computational burden. We evaluate our approaches on the Pascal VOC and MS COCO benchmarks. Extensive experiments show that CE-FPN achieves competitive performance and is more lightweight compared to state-of-the-art FPN-based detectors.

Journal Article
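
The abstract above builds on sub-pixel convolution. As a rough illustration only, the sketch below shows the rearrangement step of that operation (often called pixel shuffle), which trades channels for spatial resolution. It is not the CE-FPN SSF or SCE module: the function name `pixel_shuffle`, the nested-list tensor type, and the channel ordering (a commonly used convention) are assumptions made for this example.

```python
from typing import List

# A feature map stored as [channel][row][column]; a hypothetical stand-in for a real tensor.
Tensor3 = List[List[List[float]]]


def pixel_shuffle(x: Tensor3, r: int) -> Tensor3:
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r).

    Assumed channel layout: input channel c*r*r + i*r + j supplies the
    output pixel at sub-position (i, j) inside every r x r output cell.
    """
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Four 1x1 channels become one 2x2 channel when the upscale factor is 2.
features = [[[1.0]], [[2.0]], [[3.0]], [[4.0]]]
upsampled = pixel_shuffle(features, 2)
```

In a full sub-pixel convolution, a learned convolution would first produce the `C*r*r` channels; only the parameter-free rearrangement is shown here.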

Sub-Pixel Extraction of Laser Stripe Center Using an Improved Gray-Gravity Method

by Li, Yuehua , Zhou, Jingbo , Liu, Lijian in adaptive sampling regions , improved gray-gravity method , line structured light sensor

2017

Laser stripe center extraction is a key step for the profile measurement of line structured light sensors (LSLS). To accurately obtain the center coordinates at sub-pixel level, an improved gray-gravity method (IGGM) was proposed. Firstly, the center points of the stripe were computed using the gray-gravity method (GGM) for all columns of the image. By fitting these points using the moving least squares algorithm, the tangential vector, the normal vector and the radius of curvature can be robustly obtained. One rectangular region could be defined around each of the center points. Its two sides that are parallel to the tangential vector could alter their lengths according to the radius of the curvature. After that, the coordinate for each center point was recalculated within the rectangular region and in the direction of the normal vector. The center uncertainty was also analyzed based on the Monte Carlo method. The obtained experimental results indicate that the IGGM is suitable for both the smooth stripes and the ones with sharp corners. The high accuracy center points can be obtained at a relatively low computation cost. The measured results of the stairs and the screw surface further demonstrate the effectiveness of the method.

Journal Article
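
The first step named in the abstract above, computing stripe centers column by column with the gray-gravity method (GGM), can be sketched as an intensity-weighted centroid. This is a minimal sketch of the baseline GGM step only, not the improved method (no moving least squares fit, adaptive rectangular regions, or normal-direction recalculation). The function name `gray_gravity_centers`, the `threshold` parameter, and the list-of-rows image layout are assumptions for illustration.

```python
from typing import List, Optional


def gray_gravity_centers(
    image: List[List[float]], threshold: float = 0.0
) -> List[Optional[float]]:
    """Return the intensity-weighted row coordinate of the stripe in each column.

    Pixels at or below `threshold` are ignored; a column with no pixel
    above the threshold yields None.
    """
    n_rows = len(image)
    n_cols = len(image[0]) if n_rows else 0
    centers: List[Optional[float]] = []
    for col in range(n_cols):
        weighted_sum = 0.0
        total = 0.0
        for row in range(n_rows):
            gray = image[row][col]
            if gray > threshold:
                weighted_sum += row * gray
                total += gray
        centers.append(weighted_sum / total if total > 0 else None)
    return centers


# A tiny synthetic stripe: column 0 peaks at row 2, column 1 sits between rows 2 and 3.
image = [
    [0, 0, 0],
    [10, 0, 0],
    [30, 20, 0],
    [10, 20, 0],
    [0, 0, 0],
]
centers = gray_gravity_centers(image)
```

Because the centroid is a ratio of sums, the result is not restricted to integer rows, which is what gives the method its sub-pixel resolution.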

YOLOv7-TS: A Traffic Sign Detection Model Based on Sub-Pixel Convolution and Feature Fusion

by Wang, Yunlei , Zhang, Fukai , Wu, Xuan in Accuracy , Algorithms , Classification

2024

In recent years, significant progress has been witnessed in the field of deep learning-based object detection. As a subtask in the field of object detection, traffic sign detection has great potential for development. However, the existing object detection methods for traffic sign detection in real-world scenes are plagued by issues such as the omission of small objects and low detection accuracies. To address these issues, a traffic sign detection model named YOLOv7-Traffic Sign (YOLOv7-TS) is proposed based on sub-pixel convolution and feature fusion. Firstly, the up-sampling capability of the sub-pixel convolution integrating channel dimension is harnessed and a Feature Map Extraction Module (FMEM) is devised to mitigate the channel information loss. Furthermore, a Multi-feature Interactive Fusion Network (MIFNet) is constructed to facilitate enhanced information interaction among all feature layers, improving the feature fusion effectiveness and strengthening the perception ability of small objects. Moreover, a Deep Feature Enhancement Module (DFEM) is established to accelerate the pooling process while enriching the highest-layer feature. YOLOv7-TS is evaluated on two traffic sign datasets, namely CCTSDB2021 and TT100K. Compared with YOLOv7, YOLOv7-TS, with a smaller number of parameters, achieves a significant enhancement of 3.63% and 2.68% in the mean Average Precision (mAP) for each respective dataset, proving the effectiveness of the proposed model.

Journal Article

Gap Measurement Method for Railway Switch Machines Based on the Fusion of Deep Vision and Geometric Features

by Li, Hong , Zou, Yiyang , Feng, Qingsheng in Algorithms , Annotations , Datasets

2026

The gap dimension of a railway switch machine is a critical physical quantity for determining the locking status of railway turnouts. Under operating conditions characterized by heavy oil contamination, complex illumination, and equipment vibration, existing visual measurement methods often struggle to maintain stability and achieve sub-pixel precision. To address this issue, this paper proposes a gap measurement method based on the fusion of vision and geometric features (G-VFM). The method first utilizes a confidence-aware optimized YOLOv8 model to achieve robust localization of the gap region. Subsequently, an improved multi-channel U-Net is employed to extract soft-edge probability maps, based on which a 20-dimensional structured geometric descriptor is constructed. Finally, visual semantic features and geometric priors are fused for regression through an R34-Fusion two-stream residual network, and systematic errors are corrected using a weighted Huber loss combined with a piecewise linear calibration strategy. Test results on a constructed field dataset show that the proposed method achieves a Mean Absolute Error (MAE) of 0.0076 mm and a maximum error of 0.0193 mm. It achieves a 100% pass rate under an industrial tolerance of 0.02 mm, with an end-to-end inference time of 52.23 ms (~19.15 FPS), balancing both precision and efficiency. Further tests on illumination degradation, noise interference, and cross-batch evaluations indicate that the method maintains relatively stable performance across various complex scenarios. However, performance decreases significantly under extremely low-light conditions, suggesting that actual deployment may require integration with active lighting or multi-sensor fusion to ensure system reliability across all working conditions. Overall, this method achieves high-precision gap measurement under current experimental conditions and provides a feasible solution for vision-based switch machine status monitoring.

Journal Article
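
The abstract above mentions a weighted Huber loss without giving its form. As a hedged illustration, the sketch below computes the standard Huber loss with per-sample weights and a weighted mean. The weighting scheme, the `delta` default, and the function name `weighted_huber_loss` are assumptions; the paper's actual weights and calibration step are not described in this record.

```python
from typing import Sequence


def weighted_huber_loss(
    predictions: Sequence[float],
    targets: Sequence[float],
    weights: Sequence[float],
    delta: float = 1.0,
) -> float:
    """Weighted mean of the Huber loss over all samples.

    Residuals with |r| <= delta are penalised quadratically (0.5 * r^2);
    larger residuals are penalised linearly (delta * (|r| - 0.5 * delta)).
    """
    if not (len(predictions) == len(targets) == len(weights)):
        raise ValueError("inputs must have the same length")
    weighted_total = 0.0
    weight_sum = 0.0
    for pred, target, weight in zip(predictions, targets, weights):
        residual = abs(pred - target)
        if residual <= delta:
            loss = 0.5 * residual * residual
        else:
            loss = delta * (residual - 0.5 * delta)
        weighted_total += weight * loss
        weight_sum += weight
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive value")
    return weighted_total / weight_sum


# One small residual (quadratic branch) and one large residual (linear branch).
loss_value = weighted_huber_loss([1.0, 3.0], [1.5, 0.0], [1.0, 1.0], delta=1.0)
```

The linear branch limits the influence of large residuals, which is the usual reason for preferring this loss over squared error when outliers are expected.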

AROSICS: An Automated and Robust Open-Source Image Co-Registration Software for Multi-Sensor Satellite Data

by Hollstein, André , Scheffler, Daniel , Segl, Karl in Fourier shift theorem , geometric pre-processing , image co-registration

2017

Geospatial co-registration is a mandatory prerequisite when dealing with remote sensing data. Inter- or intra-sensoral misregistration will negatively affect any subsequent image analysis, specifically when processing multi-sensoral or multi-temporal data. In recent decades, many algorithms have been developed to enable manual, semi- or fully automatic displacement correction. Especially in the context of big data processing and the development of automated processing chains that aim to be applicable to different remote sensing systems, there is a strong need for efficient, accurate and generally usable co-registration. Here, we present AROSICS (Automated and Robust Open-Source Image Co-Registration Software), a Python-based open-source software including an easy-to-use user interface for automatic detection and correction of sub-pixel misalignments between various remote sensing datasets. It is independent of spatial or spectral characteristics and robust against high degrees of cloud coverage and spectral and temporal land cover dynamics. The co-registration is based on phase correlation for sub-pixel shift estimation in the frequency domain utilizing the Fourier shift theorem in a moving-window manner. A dense grid of spatial shift vectors can be created and automatically filtered by combining various validation and quality estimation metrics. Additionally, the software supports the masking of, e.g., clouds and cloud shadows to exclude such areas from spatial shift detection. The software has been tested on more than 9000 satellite images acquired by different sensors. The results are evaluated exemplarily for two inter-sensoral and two intra-sensoral use cases and show registration results in the sub-pixel range with root mean square error fits around 0.3 pixels and better.

Journal Article
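
The abstract above describes shift estimation by phase correlation using the Fourier shift theorem. The sketch below illustrates the idea in one dimension with a naive DFT and integer-level peak picking only. It does not use the AROSICS software or its API, and it omits the sub-pixel peak refinement, moving windows, and validation metrics the abstract mentions. The names `dft` and `estimate_shift` are hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(signal: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, value in enumerate(signal):
            acc += value * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def estimate_shift(reference: Sequence[float], target: Sequence[float]) -> int:
    """Estimate the circular shift s (0 <= s < n) with target[t] = reference[t - s]."""
    if len(reference) != len(target):
        raise ValueError("signals must have the same length")
    f_ref = dft(reference)
    f_tgt = dft(target)
    cross_power = []
    for a, b in zip(f_ref, f_tgt):
        product = a.conjugate() * b
        magnitude = abs(product)
        # Keep only the phase; skip bins with no energy to avoid dividing by zero.
        cross_power.append(product / magnitude if magnitude > 1e-12 else 0j)
    correlation = [value.real for value in dft(cross_power, inverse=True)]
    return max(range(len(correlation)), key=correlation.__getitem__)


reference = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
target = [0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0]  # reference moved right by 2 samples
shift = estimate_shift(reference, target)
```

Normalising the cross-power spectrum to unit magnitude leaves only the phase ramp introduced by the shift, so the inverse transform concentrates into a single peak at the displacement.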

Spatial-Temporal Sub-Pixel Mapping Based on Swarm Intelligence Theory

by Feng, Ruyi , Zhang, Liangpei , Zhong, Yanfei in Algorithms , clonal selection sub-pixel mapping (CSSM) , differential evolution sub-pixel mapping (DESM)

2016

In the past decades, sub-pixel mapping algorithms have been extensively developed due to the large number of different applications. However, most of the sub-pixel mapping algorithms are based on single-temporal images, and the results are usually compromised without auxiliary information due to the ill-posed problem of sub-pixel mapping. In this paper, a novel spatial-temporal sub-pixel mapping algorithm based on swarm intelligence theory is proposed for multitemporal remote sensing imagery. Swarm intelligence theory involves clonal selection sub-pixel mapping (CSSM), which evolves the solution by emulating the biological advantage of the human immune system, and differential evolution sub-pixel mapping (DESM), which optimizes the solution by intelligent operations and heuristic searching in the solution pool. In addition, considering the under-determined problem of sub-pixel mapping, the spatial-temporal sub-pixel mapping method is used to obtain the distribution information at a fine spatial resolution from the bitemporal image pair, which exactly regularizes the ill-posed problem. Furthermore, the short-interval temporal information and the fine spatial distribution information within the bitemporal image pair can be integrated for further use, such as timely and detailed land-cover change detection (LCCD). To validate the swarm-intelligence-based spatial-temporal sub-pixel mapping algorithm, it was compared with several traditional sub-pixel mapping algorithms in both synthetic and real image experiments. The experimental results confirm that the proposed algorithm outperforms the traditional approaches, achieving a better sub-pixel mapping result both qualitatively and quantitatively, as well as improving the subsequent LCCD performance.

Journal Article

Sub-Pixel Edge Detection of Circular Holes via Adaptive Filtering and Improved Zernike Moments

by Dong, Senlin , Zhang, Weitang in Accuracy , Algorithms , canny operator

2026

To meet the requirements of high accuracy in image edge localization and strong noise resistance for computer vision calibration and precise measurement, an improved Zernike moment sub-pixel high-precision measurement method for circular hole-like workpieces is proposed. Firstly, the Canny operator is used as a coarse edge detection algorithm, with the traditional Gaussian filter in the Canny operator replaced by an improved Laplacian edge-adaptive median filter. This approach demonstrates improved edge preservation compared to traditional and adaptive median filtering, especially under high-concentration noise. Then, a sub-pixel edge detection algorithm is applied to refine the edges, thus enhancing the edge localization accuracy. An improved Zernike moment sub-pixel detection algorithm is employed for precise edge point detection. The improved algorithm selects a Zernike moment parameter template with higher detection accuracy. Finally, the inner and outer diameters of the circular hole-like part are measured by fitting the profile using the least squares method. Experimental results on several different workpieces demonstrate that the proposed algorithm achieves higher accuracy than the traditional Zernike moment sub-pixel method, with an error reduction of 75.1%, meeting the precision requirements in modern industrial part manufacturing processes.

Journal Article
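
The final step in the abstract above fits the hole profile with the least squares method. The abstract does not say which formulation is used, so the sketch below assumes an algebraic (Kasa-style) circle fit, which is one common choice, applied to a list of edge points. The names `solve_3x3` and `fit_circle` are hypothetical, and the sample points are synthetic.

```python
import math
from typing import List, Sequence, Tuple


def solve_3x3(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [value] for row, value in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("points are collinear or too few to define a circle")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= factor * a[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(a[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (a[row][n] - known) / a[row][row]
    return solution


def fit_circle(points: Sequence[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Fit x^2 + y^2 + D*x + E*y + F = 0 in the least squares sense.

    Returns (center_x, center_y, radius).
    """
    sxx = sxy = syy = sx = sy = 0.0
    bx = by = b1 = 0.0
    for x, y in points:
        z = -(x * x + y * y)
        sxx += x * x
        sxy += x * y
        syy += y * y
        sx += x
        sy += y
        bx += x * z
        by += y * z
        b1 += z
    # Normal equations for the unknowns D, E, F.
    d, e, f = solve_3x3(
        [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]],
        [bx, by, b1],
    )
    cx, cy = -d / 2.0, -e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - f)


# Four synthetic edge points lying on a circle centered at (2, -1) with radius 3.
cx, cy, radius = fit_circle([(5.0, -1.0), (2.0, 2.0), (-1.0, -1.0), (2.0, -4.0)])
```

With sub-pixel edge points as input, the diameter follows as twice the fitted radius; fitting the inner and outer edges separately would give both diameters.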

Automated Quantification of Surface Water Inundation in Wetlands Using Optical Satellite Imagery

by Huang, Chengquan , DeVries, Ben , Lang, Megan in inundation , Landsat , sub-pixel water fraction

2017

We present a fully automated and scalable algorithm for quantifying surface water inundation in wetlands. Requiring no external training data, our algorithm estimates sub-pixel water fraction (SWF) over large areas and long time periods using Landsat data. We tested our SWF algorithm over three wetland sites across North America, including the Prairie Pothole Region, the Delmarva Peninsula and the Everglades, representing a gradient of inundation and vegetation conditions. We estimated SWF at 30-m resolution with accuracies ranging from a normalized root-mean-square-error of 0.11 to 0.19 when compared with various high-resolution ground and airborne datasets. SWF estimates were more sensitive to subtle inundated features compared to previously published surface water datasets, accurately depicting water bodies, large heterogeneously inundated surfaces, narrow water courses and canopy-covered water features. Despite this enhanced sensitivity, several sources of errors affected SWF estimates, including emergent or floating vegetation and forest canopies, shadows from topographic features, urban structures and unmasked clouds. The automated algorithm described in this article allows for the production of high temporal resolution wetland inundation data products to support a broad range of applications.

Journal Article
