Search Results Heading

MBRLSearchResults

mbrl.module.common.modules.added.book.to.shelf
Title added to your shelf!
View what I already have on My Shelf.
Oops! Something went wrong.
Oops! Something went wrong.
While trying to add the title to your shelf something went wrong :( Kindly try again later!
Are you sure you want to remove the book from the shelf?
Oops! Something went wrong.
Oops! Something went wrong.
While trying to remove the title from your shelf something went wrong :( Kindly try again later!
    Done
    Filters
    Reset
  • Discipline
      Discipline
      Clear All
      Discipline
  • Is Peer Reviewed
      Is Peer Reviewed
      Clear All
      Is Peer Reviewed
  • Item Type
      Item Type
      Clear All
      Item Type
  • Subject
      Subject
      Clear All
      Subject
  • Year
      Year
      Clear All
      From:
      -
      To:
  • More Filters
122 result(s) for "Geng, Guohua"
Sort by:
A novel geometry image to accurately represent a surface by preserving mesh topology
Geometry images parameterise a mesh with a square domain and store the information in a single chart. A one-to-one correspondence between the 2D plane and the 3D model is convenient for processing 3D models. However, the parameterised vertices are not all located at the intersection of the gridlines the existing geometry images. Thus, errors are unavoidable when a 3D mesh is reconstructed from the chart. In this paper, we propose parameterise surface onto a novel geometry image that preserves the constraint of topological neighbourhood information at integer coordinate points on a 2D grid and ensures that the shape of the reconstructed 3D mesh does not change from supplemented image data. We find a collection of edges that opens the mesh into simply connected surface with a single boundary. The point distribution with approximate blue noise spectral characteristics is computed by capacity-constrained delaunay triangulation without retriangulation. We move the vertices to the constrained mesh intersection, adjust the degenerate triangles on a regular grid, and fill the blank part by performing a local affine transformation between each triangle in the mesh and image. Unlike other geometry images, the proposed method results in no error in the reconstructed surface model when floating-point data are stored in the image. High reconstruction accuracy is achieved when the xyz positions are in a 16-bit data format in each image channel because only rounding errors exist in the topology-preserving geometry images, there are no sampling errors. This method performs one-to-one mapping between the 3D surface mesh and the points in the 2D image, while foldovers do not appear in the 2D triangular mesh, maintaining the topological structure. This also shows the potential of using a 2D image processing algorithm to process 3D models.
Feature-preserving simplification framework for 3D point cloud
To obtain a higher simplification rate while retaining geometric features, a simplification framework for the point cloud is proposed. Firstly, multi-angle images of the original point cloud are obtained with a virtual camera. Then, feature lines of each image are extracted by deep neural network. Furthermore, according to the proposed mapping relationship between the acquired 2D feature lines and original point cloud, feature points of the point cloud are extracted automatically. Finally, the simplified point cloud is obtained by fusing feature points and simplified non-feature points. The proposed simplification method is applied to four data sets and compared with the other six algorithms. The experimental results demonstrate that our proposed simplification method has the superiority in terms of both retaining geometric features and high simplification rate.
CRPGAN: Learning image-to-image translation of two unpaired images by cross-attention mechanism and parallelization strategy
Unsupervised image-to-image translation (UI2I) tasks aim to find a mapping between the source and the target domains from unpaired training data. Previous methods can not effectively capture the differences between the source and the target domain on different scales and often leads to poor quality of the generated images, noise, distortion, and other conditions that do not match human vision perception, and has high time complexity. To address this problem, we propose a multi-scale training structure and a progressive growth generator method to solve UI2I task. Our method refines the generated images from global structures to local details by adding new convolution blocks continuously and shares the network parameters in different scales and also in the same scale of network. Finally, we propose a new Cross-CBAM mechanism (CRCBAM), which uses a multi-layer spatial attention and channel attention cross structure to generate more refined style images. Experiments on our collected Opera Face, and other open datasets Summer↔Winter, Horse↔Zebra, Photo↔Van Gogh, show that the proposed algorithm is superior to other state-of-art algorithms.
Link Aggregation for Skip Connection–Mamba: Remote Sensing Image Segmentation Network Based on Link Aggregation Mamba
The semantic segmentation of satellite and UAV remote sensing imagery is pivotal for address exploration, change detection, quantitative analysis and urban planning. Recent advancements have seen an influx of segmentation networks utilizing convolutional neural networks and transformers. However, the intricate geographical features and varied land cover boundary interferences in remote sensing imagery still challenge conventional segmentation networks’ spatial representation and long-range dependency capabilities. This paper introduces a novel U-Net-like network for UAV image segmentation. We developed a link aggregation Mamba at the critical skip connection stage of UNetFormer. This approach maps and aggregates multi-scale features from different stages into a unified linear dimension through four Mamba branches containing state-space models (SSMs), ultimately decoupling and fusing these features to restore the contextual relationships in the mask. Moreover, the Mix-Mamba module is incorporated, leveraging a parallel self-attention mechanism with SSMs to merge the advantages of a global receptive field and reduce modeling complexity. This module facilitates nonlinear modeling across different channels and spaces through multipath activation, catering to international and local long-range dependencies. Evaluations on public remote sensing datasets like LovaDA, UAVid and Vaihingen underscore the state-of-the-art performance of our approach.
MLGTM: Multi-Scale Local Geometric Transformer-Mamba Application in Terracotta Warriors Point Cloud Classification
As an important representative of ancient Chinese cultural heritage, the classification of Terracotta Warriors point cloud data aids in cultural heritage preservation and digital reconstruction. However, these data face challenges such as complex morphological and structural variations, sparsity, and irregularity. This paper proposes a method named Multi-scale Local Geometric Transformer-Mamba (MLGTM) to improve the accuracy and robustness of Terracotta Warriors point cloud classification tasks. To effectively capture the geometric information of point clouds, we introduce local geometric encoding, including local coordinates and feature information, effectively capturing the complex local morphology and structural variations of the Terracotta Warriors and extracting representative local features. Additionally, we propose a multi-scale Transformer-Mamba information aggregation module, which employs a dual-branch Transformer with a Mamba structure and finally aggregates them on multiple scales to effectively handle the sparsity and irregularity of the Terracotta Warriors point cloud data. We conducted experiments on several datasets, including the ModelNet40, ScanObjectNN, ShapeNetPart, ETH, and 3D Terracotta Warriors fragment datasets. The results show that our method significantly improves the classification task of Terracotta Warriors point clouds, demonstrating strong accuracy.
CPDC-MFNet: conditional point diffusion completion network with Muti-scale Feedback Refine for 3D Terracotta Warriors
Due to the antiquity and difficulty of excavation, the Terracotta Warriors have suffered varying degrees of damage. To restore the cultural relics to their original appearance, utilizing point clouds to repair damaged Terracotta Warriors has always been a hot topic in cultural relic protection. The output results of existing methods in point cloud completion often lack diversity. Probability-based models represented by Denoising Diffusion Probabilistic Models have recently achieved great success in the field of images and point clouds and can output a variety of results. However, one drawback of diffusion models is that too many samples result in slow generation speed. Toward this issue, we propose a new neural network for Terracotta Warriors fragments completion. During the reverse diffusion stage, we initially decrease the number of sampling steps to generate a coarse result. This preliminary outcome undergoes further refinement through a multi-scale refine network. Additionally, we introduce a novel approach called Partition Attention Sampling to enhance the representation capabilities of features. The effectiveness of the proposed model is validated in the experiments on the real Terracotta Warriors dataset and public dataset. The experimental results conclusively demonstrate that our model exhibits competitive performance in comparison to other existing models.
Semi-Supervised Bidirectional Long Short-Term Memory and Conditional Random Fields Model for Named-Entity Recognition Using Embeddings from Language Models Representations
Increasingly, popular online museums have significantly changed the way people acquire cultural knowledge. These online museums have been generating abundant amounts of cultural relics data. In recent years, researchers have used deep learning models that can automatically extract complex features and have rich representation capabilities to implement named-entity recognition (NER). However, the lack of labeled data in the field of cultural relics makes it difficult for deep learning models that rely on labeled data to achieve excellent performance. To address this problem, this paper proposes a semi-supervised deep learning model named SCRNER (Semi-supervised model for Cultural Relics’ Named Entity Recognition) that utilizes the bidirectional long short-term memory (BiLSTM) and conditional random fields (CRF) model trained by seldom labeled data and abundant unlabeled data to attain an effective performance. To satisfy the semi-supervised sample selection, we propose a repeat-labeled (relabeled) strategy to select samples of high confidence to enlarge the training set iteratively. In addition, we use embeddings from language model (ELMo) representations to dynamically acquire word representations as the input of the model to solve the problem of the blurred boundaries of cultural objects and Chinese characteristics of texts in the field of cultural relics. Experimental results demonstrate that our proposed model, trained on limited labeled data, achieves an effective performance in the task of named entity recognition of cultural relics.
Salient Object Detection in Optical Remote Sensing Images Based on Hierarchical Semantic Interaction
Existing salient object detection methods for optical remote sensing images still face certain limitations due to complex background variations, significant scale discrepancies among targets, severe background interference, and diverse topological structures. On the one hand, the feature transmission process often neglects the constraints and complementary effects of high-level features on low-level features, leading to insufficient feature interaction and weakened model representation. On the other hand, decoder architectures generally rely on simple cascaded structures, which fail to adequately exploit and utilize contextual information. To address these challenges, this study proposes a Hierarchical Semantic Interaction Module to enhance salient object detection performance in optical remote sensing scenarios. The module introduces foreground content modeling and a hierarchical semantic interaction mechanism within a multi-scale feature space, reinforcing the synergy and complementarity among features at different levels. This effectively highlights multi-scale and multi-type salient regions in complex backgrounds. Extensive experiments on multiple optical remote sensing datasets demonstrate the effectiveness of the proposed method. Specifically, on the EORSSD dataset, our full model integrating both CA and PA modules improves the max F-measure from 0.8826 to 0.9100 (↑2.74%), increases maxE from 0.9603 to 0.9727 (↑1.24%), and enhances the S-measure from 0.9026 to 0.9295 (↑2.69%) compared with the baseline. These results clearly demonstrate the effectiveness of the proposed modules and verify the robustness and strong generalization capability of our method in complex remote sensing scenarios.
GC-MLP: Graph Convolution MLP for Point Cloud Analysis
With the objective of addressing the problem of the fixed convolutional kernel of a standard convolution neural network and the isotropy of features making 3D point cloud data ineffective in feature learning, this paper proposes a point cloud processing method based on graph convolution multilayer perceptron, named GC-MLP. Unlike traditional local aggregation operations, the algorithm generates an adaptive kernel through the dynamic learning features of points, so that it can dynamically adapt to the structure of the object, i.e., the algorithm first adaptively assigns different weights to adjacent points according to the different relationships between the different points captured. Furthermore, local information interaction is then performed with the convolutional layers through a weight-sharing multilayer perceptron. Experimental results show that, under different task benchmark datasets (including ModelNet40 dataset, ShapeNet Part dataset, S3DIS dataset), our proposed algorithm achieves state-of-the-art for both point cloud classification and segmentation tasks.
Enhancing point cloud registration with transformer: cultural heritage protection of the Terracotta Warriors
Point cloud registration technology, by precisely aligning repair components with the original artifacts, can accurately record the geometric shape of cultural heritage objects and generate three-dimensional models, thereby providing reliable data support for the digital preservation, virtual exhibition, and restoration of cultural relics. However, traditional point cloud registration methods face challenges when dealing with cultural heritage data, including complex morphological and structural variations, sparsity and irregularity, and cross-dataset generalization. To address these challenges, this paper introduces an innovative method called Enhancing Point Cloud Registration with Transformer (EPCRT). Firstly, we utilize local geometric perception for positional encoding and combine it with a dynamic adjustment mechanism based on local density information and geometric angle encoding, enhancing the flexibility and adaptability of positional encoding to better characterize the complex local morphology and structural variations of artifacts. Additionally, we introduce a convolutional-Transformer hybrid module to facilitate interactive learning of artifact point cloud features, effectively achieving local–global feature fusion and enhancing detail capture capabilities, thus effectively handling the sparsity and irregularity of artifact point cloud data. We conduct extensive evaluations on the 3DMatch, ModelNet, KITTI, and MVP-RG datasets, and validate our method on the Terracotta Warriors cultural heritage dataset. The results demonstrate that our method has significant performance advantages in handling the complexity of morphological and structural variations, sparsity and irregularity of relic data, and cross-dataset generalization.