142 result(s) for "segment anything model"
Remote Sensing Object Detection in the Deep Learning Era—A Review
Given the large volume of remote sensing images collected daily, automatic object detection and segmentation have been a consistent need in Earth observation (EO). However, objects of interest vary in shape, size, appearance, and reflecting properties: they differ not only because of their geographical diversity but also because they appear differently in images collected from different sensors (optical and radar) and platforms (satellite, aerial, and unmanned aerial vehicles (UAV)). Although a plethora of object detection methods exists in remote sensing, given the very fast development of deep learning, recent surveys of object detection methods are still lacking. In this paper, we aim to provide an update that informs researchers about the recent development of object detection methods and their close sibling in the deep learning era, instance segmentation. The covered methods span data at different scales and modalities, such as optical and synthetic aperture radar (SAR) images and digital surface models (DSM). Specific emphasis is placed on approaches addressing data and label limitations in this deep learning era. Further, we survey examples of remote sensing applications that have benefited from automatic object detection and discuss future trends of automatic object detection in EO.
Segment Anything Is Not Always Perfect: An Investigation of SAM on Different Real-world Applications
Recently, Meta AI Research introduced a general, promptable Segment Anything Model (SAM) pre-trained on an unprecedentedly large segmentation dataset (SA-1B). Without a doubt, the emergence of SAM will yield significant benefits for a wide array of practical image segmentation applications. In this study, we conduct a series of intriguing investigations into the performance of SAM across various applications, particularly in the fields of natural images, agriculture, manufacturing, remote sensing, and healthcare. We analyze and discuss the benefits and limitations of SAM, and present an outlook on its future development in segmentation tasks. By doing so, we aim to provide a comprehensive understanding of SAM's practical applications. This work is expected to provide insights that facilitate future research toward generic segmentation. Source code is publicly available at https://github.com/LiuTingWed/SAM-Not-Perfect.
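Below is a minimal sketch of the kind of out-of-the-box usage such an investigation evaluates: SAM's automatic mask generator applied to a single image via the public segment_anything package. The checkpoint filename, image path, and device are placeholder assumptions, not details from the paper.

```python
# Sketch only: zero-shot, fully automatic SAM mask generation.
# Checkpoint and image paths are illustrative placeholders.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # assumed local checkpoint
sam.to("cuda")  # or keep on CPU if no GPU is available

image = cv2.cvtColor(cv2.imread("sample.jpg"), cv2.COLOR_BGR2RGB)

mask_generator = SamAutomaticMaskGenerator(sam)
masks = mask_generator.generate(image)  # list of dicts: 'segmentation', 'area', 'bbox', ...

# Quick qualitative check: inspect the largest proposals.
for m in sorted(masks, key=lambda d: d["area"], reverse=True)[:5]:
    print(m["area"], m["bbox"], round(m["predicted_iou"], 3))
```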
Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation
Medical image analysis plays an important role in clinical diagnosis. In this paper, we examine the recent Segment Anything Model (SAM) on medical images, and report both quantitative and qualitative zero-shot segmentation results on nine medical image segmentation benchmarks, covering various imaging modalities, such as optical coherence tomography (OCT), magnetic resonance imaging (MRI), and computed tomography (CT), as well as different applications including dermatology, ophthalmology, and radiology. These benchmarks are representative and commonly used in model development. Our experimental results indicate that while SAM presents remarkable segmentation performance on images from the general domain, its zero-shot segmentation ability remains restricted for out-of-distribution images, e.g., medical images. In addition, SAM exhibits inconsistent zero-shot segmentation performance across different unseen medical domains. For certain structured targets, e.g., blood vessels, the zero-shot segmentation of SAM fails completely. In contrast, simple fine-tuning with a small amount of data can lead to remarkable improvements in segmentation quality, showing the great potential and feasibility of using fine-tuned SAM to achieve accurate medical image segmentation for precision diagnostics. Our study indicates the versatility of generalist vision foundation models on medical imaging and their great potential to achieve desired performance through fine-tuning, eventually addressing the challenges associated with accessing large and diverse medical datasets in support of clinical diagnostics.
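For concreteness, here is a small sketch of the zero-shot prompted setting the study reports on: SAM receives a bounding-box prompt around a target structure and the resulting mask is scored with the Dice coefficient. The file names, box coordinates, and metric handling are illustrative assumptions, not the paper's protocol.

```python
# Sketch only: box-prompted zero-shot SAM on a single medical slice.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # assumed checkpoint
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("ct_slice.png"), cv2.COLOR_BGR2RGB)  # placeholder slice
predictor.set_image(image)

box = np.array([120, 80, 260, 210])  # hypothetical (x0, y0, x1, y1) around the target structure
masks, scores, _ = predictor.predict(box=box, multimask_output=False)
pred = masks[0]

gt = cv2.imread("ct_slice_gt.png", cv2.IMREAD_GRAYSCALE) > 0  # reference annotation
dice = 2 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum() + 1e-8)
print(f"Dice = {dice:.3f}")
```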
Using the Segment Anything Model to Develop Control Pallet Loading System
Modern warehouse complexes need efficient and accurate pallet-loading control under highly dynamic conditions and a wide variety of objects. This paper proposes an approach based on the Segment Anything Model (SAM) for automatic image tagging and the YOLOv8 model for subsequent accurate segmentation. This combination provides both high processing speed and adaptability to changing lighting conditions, partial overlaps, and complex object geometry. The proposed algorithm tracks changes in the area of segmented zones to estimate the addition of new cargo. The experiments show that YOLOv8 provides the best balance between accuracy and performance (Dice = 0.88), outperforming Mask R-CNN and the newer YOLOv12. Additionally, the paper analyzes the models' robustness to noise and visual distortions. The presented solution has the potential for integration into next-generation industrial logistics systems, reducing the need for manual annotation and increasing the autonomy of loading control.
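The area-tracking idea can be illustrated with a short sketch: sum the segmented area per frame and flag a jump as newly added cargo. The threshold and the source of the masks are assumptions for illustration; the paper's full SAM-labeling plus YOLOv8 pipeline is not reproduced here.

```python
# Sketch only: flag newly added cargo from frame-to-frame growth in segmented area.
import numpy as np

def total_mask_area(masks: list) -> int:
    """Sum of foreground pixels over all instance masks of one frame."""
    return int(sum(np.asarray(m, dtype=bool).sum() for m in masks))

def cargo_added(prev_masks: list, curr_masks: list, rel_threshold: float = 0.05) -> bool:
    """True if the segmented area grew by more than rel_threshold (assumed 5%)."""
    prev_area = total_mask_area(prev_masks)
    curr_area = total_mask_area(curr_masks)
    if prev_area == 0:
        return curr_area > 0
    return (curr_area - prev_area) / prev_area > rel_threshold
```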
Leveraging Segment Anything Model (SAM) for Weld Defect Detection in Industrial Ultrasonic B-Scan Images
Automated ultrasonic testing (AUT) is a critical tool for infrastructure evaluation in industries such as oil and gas. While skilled operators manually analyze complex AUT data, artificial intelligence (AI)-based methods show promise for automating interpretation; improving their reliability and effectiveness, however, remains a significant challenge. This study employs the Segment Anything Model (SAM), a vision foundation model, to design an AI-assisted tool for weld defect detection in real-world ultrasonic B-scan images. It uses a proprietary dataset of B-scan images generated from AUT data collected during automated girth weld inspections of oil and gas pipelines, targeting a specific defect type: lack of fusion (LOF). The implementation integrates knowledge of the B-scan image context into the natural-image-based SAM 1 and SAM 2 through a fully automated, promptable process. As part of designing a practical AI-assistant tool, the experiments apply both vanilla and low-rank adaptation (LoRA) fine-tuning to the image encoder and mask decoder of different variants of both models, while keeping the prompt encoder unchanged. The results demonstrate that the method achieves improved performance compared to a previous study on the same dataset.
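As a concrete illustration of the LoRA technique mentioned above, the sketch below wraps a frozen nn.Linear with a low-rank update in PyTorch and indicates how it could be attached to the qkv projections of SAM's ViT image encoder. The rank, scaling, and attachment point are assumptions, not the study's configuration.

```python
# Sketch only: a minimal LoRA wrapper for a frozen linear layer.
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():    # freeze the pretrained weights
            p.requires_grad = False
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # start as a zero update so behavior is unchanged
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))

# Hypothetical attachment point, following the public segment_anything layout:
# for blk in sam.image_encoder.blocks:
#     blk.attn.qkv = LoRALinear(blk.attn.qkv, rank=4)
```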
SAMTrans: Leveraging the Segment Anything Model as a Bridge for Vehicle-Infrastructure Cross-domain Transfer to Enhance Roadside 3D Object Detection
Roadside LiDAR-based perception has the potential to provide comprehensive and high-precision sensing information, which is of critical importance for safety and efficiency in intelligent transportation systems. However, when deploying 3D roadside perception methods in new traffic scenarios, data acquisition and annotation are labor-intensive and costly. Unsupervised Domain Adaptation (UDA) can extract knowledge from existing data and transfer it to unknown domains for self-supervised model training, making it a promising approach to reduce the need for labeled roadside perception data. However, the effectiveness of UDA is constrained by the generalization capability of the pre-trained model from the source domain. Foundational pre-trained large models such as SAM hold the potential to achieve cross-domain generalization across diverse scenarios. Inspired by this insight, this paper proposes a Segment Anything Model Transfer (SAMTrans) approach to enhance model performance. First, we design a Bird's-Eye View (BEV) pseudo-image generation method to enable the Segment Anything Model (SAM) to interpret point cloud data. Additionally, we introduce a pseudo-label fusion strategy based on confidence scores and Gaussian kernel density estimation. Evaluated on popular 3D roadside benchmarks, the proposed method improves upon state-of-the-art performance. For instance, when integrated with the ST3D framework on the nuScenes to V2X-Seq task, our method boosts the 3D Average Precision (AP3D) from 30.1% to 40.6% under the Easy difficulty setting.
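A minimal sketch of what a BEV pseudo-image generation step could look like is given below: the point cloud is rasterized into a three-channel image (max height, mean intensity, point density) that an image model such as SAM can consume. Grid extents, resolution, and channel choices are assumptions, not the paper's design.

```python
# Sketch only: rasterize a LiDAR point cloud into a BEV pseudo-image.
import numpy as np

def points_to_bev(points: np.ndarray, x_range=(0.0, 80.0), y_range=(-40.0, 40.0), res=0.1) -> np.ndarray:
    """points: (N, 4) array of x, y, z, intensity -> uint8 HxWx3 pseudo-image."""
    h = int((y_range[1] - y_range[0]) / res)
    w = int((x_range[1] - x_range[0]) / res)
    bev = np.zeros((h, w, 3), dtype=np.float32)

    xs = ((points[:, 0] - x_range[0]) / res).astype(int)
    ys = ((points[:, 1] - y_range[0]) / res).astype(int)
    keep = (xs >= 0) & (xs < w) & (ys >= 0) & (ys < h)
    xs, ys, pts = xs[keep], ys[keep], points[keep]

    for x, y, (_, _, z, intensity) in zip(xs, ys, pts):
        bev[y, x, 0] = max(bev[y, x, 0], z)   # max-height channel
        bev[y, x, 1] += intensity             # summed intensity (averaged below)
        bev[y, x, 2] += 1.0                   # point-density channel
    counts = bev[:, :, 2]
    bev[:, :, 1] = np.divide(bev[:, :, 1], counts, out=np.zeros_like(counts), where=counts > 0)
    return (255 * bev / (bev.max(axis=(0, 1), keepdims=True) + 1e-6)).astype(np.uint8)
```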
Automated labeling and segmentation based on segment anything model: Quantitative analysis of bubbles in gas–liquid flow
The quantitative analysis of dispersed phases (bubbles, droplets, and particles) in multiphase flow systems represents a persistent technological challenge in petroleum engineering applications, including CO2-enhanced oil recovery, foam flooding, and unconventional reservoir development. Current characterization methods remain constrained by labor-intensive manual workflows and limited dynamic analysis capabilities, particularly for processing large-scale microscopy data and video sequences that capture critical transient behavior like gas cluster migration and droplet coalescence. These limitations hinder the establishment of robust correlations between pore-scale flow patterns and reservoir-scale production performance. This study introduces a novel computer vision framework that integrates foundation models with lightweight neural networks to address these industry challenges. Leveraging the segment anything model's zero-shot learning capability, we developed an automated workflow that achieves an efficiency improvement of approximately 29 times in bubble labeling compared to manual methods while maintaining less than 2% deviation from expert annotations. Engineering-oriented optimization ensures lightweight deployment with 94% segmentation accuracy, while the integrated quantification system precisely resolves gas saturation, shape factors, and interfacial dynamics, parameters critical for optimizing gas injection strategies and predicting phase redistribution patterns. Validated through microfluidic gas–liquid displacement experiments for discontinuous phase segmentation accuracy, this methodology enables precise bubble morphology quantification with broad application potential in multiphase systems, including emulsion droplet dynamics characterization and particle transport behavior analysis. This work bridges the critical gap between pore-scale dynamics characterization and reservoir-scale simulation requirements, providing a foundational framework for intelligent flow diagnostics and predictive modeling in next-generation digital oilfield systems.
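A small sketch of the quantification step described above: given per-bubble binary masks (e.g. from a SAM-based labeling stage), compute gas saturation over the field of view and a circularity-style shape factor per bubble. The shape-factor definition and field-of-view handling are assumptions for illustration, not the paper's formulas.

```python
# Sketch only: simple bubble quantification from binary masks.
import numpy as np
import cv2

def gas_saturation(bubble_masks: list, image_shape: tuple) -> float:
    """Fraction of the field of view covered by the union of bubble masks."""
    union = np.zeros(image_shape[:2], dtype=bool)
    for m in bubble_masks:
        union |= np.asarray(m, dtype=bool)
    return float(union.sum()) / union.size

def circularity(mask: np.ndarray) -> float:
    """Shape factor 4*pi*A / P**2; equals 1.0 for a perfect circle."""
    contours, _ = cv2.findContours(np.asarray(mask, dtype=np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return 0.0
    c = max(contours, key=cv2.contourArea)
    area = cv2.contourArea(c)
    perimeter = cv2.arcLength(c, True)
    return float(4 * np.pi * area / (perimeter ** 2 + 1e-8))
```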
When Remote Sensing Meets Foundation Model: A Survey and Beyond
Most deep-learning-based vision tasks rely heavily on crowd-labeled data, and training a deep neural network (DNN) is usually burdened by this laborious and time-consuming labeling paradigm. Recently, foundation models (FMs) have been proposed to learn richer features from multi-modal data. Moreover, a single foundation model enables zero-shot predictions on various vision tasks. These advantages make foundation models well suited to remote sensing images, where annotations are sparser. However, the inherent differences between natural images and remote sensing images hinder the application of foundation models. In this context, this paper provides a comprehensive review of common foundation models and domain-specific foundation models for remote sensing, and it summarizes the latest advances in vision foundation models, textually prompted foundation models, visually prompted foundation models, and heterogeneous foundation models. Despite the great potential of foundation models for vision tasks, open challenges concerning data, models, and tasks limit their performance on remote sensing images and keep them far from practical application. To address these challenges and reduce the performance gap between natural images and remote sensing images, this paper discusses the open problems and suggests potential directions for future advancements.
Exploring Semantic Prompts in the Segment Anything Model for Domain Adaptation
Robust segmentation in adverse weather conditions is crucial for autonomous driving. However, such scenes are difficult to recognize and expensive to annotate, resulting in poor performance. The recently proposed Segment Anything Model (SAM) can finely segment the spatial structure of scenes and provide powerful prior spatial information, showing great promise for resolving these problems. However, SAM cannot be applied directly because of differences in geographic scale and its non-semantic outputs. To address these issues, we propose SAM-EDA, which integrates SAM into an unsupervised domain adaptation mean-teacher segmentation framework. In this method, a "teacher-assistant" model provides semantic pseudo-labels, which fill in the holes in the fine spatial structure given by SAM and yield pseudo-labels close to the ground truth; these then guide the student model during learning. Here, the "teacher-assistant" model helps to distill knowledge. During testing, only the student model is used, greatly improving efficiency. We tested SAM-EDA on mainstream segmentation benchmarks under adverse weather conditions and obtained a more robust segmentation model.
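The pseudo-label fusion described above can be illustrated with a simple sketch: each SAM region is assigned one class by majority vote over a teacher model's per-pixel prediction, giving a spatially cleaner pseudo-label map. This is one plausible reading of the idea under assumed inputs, not the authors' exact algorithm.

```python
# Sketch only: fill SAM regions with the teacher's majority class.
import numpy as np

def fuse_sam_with_teacher(sam_masks: list, teacher_labels: np.ndarray,
                          ignore_index: int = 255) -> np.ndarray:
    """sam_masks: iterable of HxW boolean masks; teacher_labels: HxW integer class map."""
    fused = np.full_like(teacher_labels, ignore_index)
    for mask in sam_masks:
        mask = np.asarray(mask, dtype=bool)
        votes = teacher_labels[mask]
        votes = votes[votes != ignore_index]
        if votes.size == 0:
            continue
        fused[mask] = np.bincount(votes).argmax()  # majority class inside the SAM region
    return fused
```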
Evaluating the Efficacy of Segment Anything Model for Delineating Agriculture and Urban Green Spaces in Multiresolution Aerial and Spaceborne Remote Sensing Images
Segmentation of Agricultural Remote Sensing Images (ARSIs) stands as a pivotal component of the intelligent development path of agricultural information technology. Similarly, quick and effective delineation of urban green spaces (UGSs) in high-resolution images is increasingly needed as input to various urban simulation models. Numerous segmentation algorithms exist for ARSIs and UGSs; however, a model with exceptional generalization capability and accuracy remains elusive. Notably, the newly released Segment Anything Model (SAM) by META AI is gaining significant recognition in various domains for segmenting conventional images, yielding commendable results. Nevertheless, SAM's application to ARSI and UGS segmentation has been relatively limited. ARSIs and UGSs exhibit distinct image characteristics, such as prominent boundaries, larger frame sizes, and extensive data types and volumes, and there is currently a dearth of research on how SAM handles various ARSI and UGS image types and whether it delivers superior segmentation outcomes. Thus, as a novel attempt, we evaluate SAM's compatibility with a wide array of ARSI and UGS image types. The data acquisition platforms comprise both aerial and spaceborne sensors, and the study sites cover most regions of the United States, with images of varying resolutions and frame sizes. SAM's segmentation quality is significantly influenced by image content, and its stability and accuracy vary across images of different resolutions and sizes. In general, however, our findings indicate that resolution has a minimal impact on the effectiveness of conditional SAM-based segmentation, which maintains an overall segmentation accuracy above 90%. In contrast, the unsupervised segmentation approach exhibits performance issues, with around 55% of low-resolution images (3 m and coarser) showing lower accuracy. Frame size exerts a more substantial influence: as image size increases, the accuracy of unsupervised segmentation decreases rapidly, and conditional segmentation also degrades to some degree. Additionally, SAM's segmentation efficacy diminishes considerably for images with unclear edges and minimal color distinctions. Consequently, we propose enhancing SAM's capabilities by augmenting the training dataset and fine-tuning hyperparameters to align with the demands of ARSI and UGS image segmentation. Leveraging the multispectral nature and extensive data volumes of remote sensing images, secondary development of SAM can harness its formidable segmentation potential to elevate the overall standard of ARSI and UGS image segmentation.
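For reference, the kind of per-image scoring described above can be sketched as follows: compare a prompted ("conditional") SAM mask and the best automatic proposal against the same reference mask using IoU. Reading "conditional" as prompted and "unsupervised" as SAM's automatic mask generation is an interpretation, and the metric choice is an assumption rather than the paper's exact protocol.

```python
# Sketch only: IoU scoring of prompted vs automatic SAM outputs.
import numpy as np

def iou(pred: np.ndarray, ref: np.ndarray) -> float:
    pred, ref = np.asarray(pred, dtype=bool), np.asarray(ref, dtype=bool)
    union = np.logical_or(pred, ref).sum()
    return float(np.logical_and(pred, ref).sum()) / union if union else 0.0

def best_automatic_iou(auto_masks: list, ref: np.ndarray) -> float:
    """Best-matching proposal among automatic mask-generator outputs."""
    return max((iou(m, ref) for m in auto_masks), default=0.0)
```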