73 result(s) for "Context block"
Deep atrous context convolution generative adversarial network with corner key point extracted feature for nuts classification
Deep learning-based nut classification has emerged as a viable way to automate the detection and categorization of different nut varieties in the food processing and agriculture sectors. Conventional techniques for classifying nuts mostly rely on handcrafted features such as texture, color, shape, or edges. These features frequently fail to capture an image’s full complexity, particularly when nuts show only small visual differences. This research proposes the Deep Atrous Context Convolution Generative Adversarial Network (DAC-GAN) model, which categorizes eight classes of nuts: Brazil nut, cashew, peanut, pecan, pistachio, chestnut, macadamia, and walnut. This research uses the Common Nut Kaggle dataset, with 4,000 images across the 8 nut classes. The DAC-GAN approach overcomes the difficulty of limited labelled data for nut classification by using a DCGAN to produce high-quality synthetic nut images that supplement the dataset. The DCGAN comprises a discriminator block and a generator block. The discriminator learns to differentiate between synthetic and real images, while the generator produces realistic nut images from random noise. The real images, along with the DCGAN-generated images, are processed with feature filtering methods to extract the Corner Key Points Featured (CKPF) nut images. To further enhance feature selection, CKPF edges are extracted from the images, providing unique, geometrically distinctive corners for representation learning. For effective feature extraction and model learning, the CKPF nut images are processed with atrous convolution, which captures intricate details by expanding the receptive field without losing resolution. The novelty of this work lies in combining the filtering step with atrous convolution to acquire spatial features from the nut images at various resolutions.
Atrous convolution is refined by appending pre-context and post-context blocks that add image-level information to the features. The effectiveness of the DAC-GAN model was validated on the traditionally augmented dataset, with all existing filtered images and CNN models. The results show that DAC-GAN achieves a high accuracy of 99.83% for nut type classification. The superiority of the DAC-GAN method over traditional approaches is demonstrated by extensive experiments on augmented and DCGAN-generated datasets, in which it achieves higher classification accuracy and generalization across a variety of nut types. The outcome demonstrates that DCGAN together with atrous convolution has the potential to be an effective tool for automating nut sorting in the food industry.
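The abstract's central claim, that atrous (dilated) convolution widens the receptive field without reducing resolution, can be illustrated in one dimension. The following is a minimal sketch, not the DAC-GAN implementation: the function names, the zero padding, and the odd-kernel restriction are assumptions made for illustration.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def atrous_conv1d(signal, kernel, dilation=1):
    """1D dilated cross-correlation with zero padding ("same" output length).

    Assumes an odd-length kernel so the padding is symmetric.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel length")
    pad = dilation * (k - 1) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    # Each output position reads k taps spaced `dilation` samples apart.
    return [
        sum(w * padded[pos + i * dilation] for i, w in enumerate(kernel))
        for pos in range(len(signal))
    ]


signal = [float(v) for v in range(10)]
kernel = [1.0, 1.0, 1.0]
dense = atrous_conv1d(signal, kernel, dilation=1)    # spans 3 samples
dilated = atrous_conv1d(signal, kernel, dilation=2)  # spans 5 samples
print(len(dense), len(dilated), receptive_field(3, 2))
```

Both outputs have the same length as the input, while the dilated kernel covers five input samples with the same three weights.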
Steel surface defect detection based on global context block
Steel is the most basic raw material in China’s industrial production, which plays a great role in promoting China’s industrialization process. Therefore, it is of great significance to detect defective steel and the surface quality of steel. In order to further improve the detection accuracy of steel surface defects, this paper proposes a steel surface defect detection algorithm based on global context block. In this paper, a global context module is introduced based on the UNet++ network model to achieve accurate segmentation and classification of complex steel surface defects. The results show that the improved UNet++ network model achieves a dice coefficient of 94.67% on the steel surface defect dataset provided by the Kaggle competition platform. Compared with semantic segmentation models such as UNet, LinkNet, and UNet++, the segmentation effect is more accurate. Therefore, the deep learning model based on the improved UNet++ can learn more semantic features from industrial steel images, so as to obtain more accurate steel defect information. This method can be a big help for real-world applications like defect detection in industrial images.
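The Dice coefficient reported above measures the overlap between a predicted mask and a ground-truth mask. A minimal sketch for flattened binary masks follows; the function name and the smoothing term `eps` are assumptions for illustration and are not taken from the paper.

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for flattened binary masks.

    `eps` (an assumed smoothing term) avoids division by zero when both
    masks are empty, in which case the score is defined here as 1.0.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * intersection + eps) / (total + eps)


pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 0, 0]
score = dice_coefficient(pred_mask, true_mask)  # one shared pixel out of 2 + 1
print(round(score, 4))
```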
GC-YOLOv3: You Only Look Once with Global Context Block
In order to make the classification and regression of single-stage detectors more accurate, an object detection algorithm named Global Context You-Only-Look-Once v3 (GC-YOLOv3) is proposed based on the You-Only-Look-Once (YOLO) in this paper. Firstly, a better cascading model with learnable semantic fusion between a feature extraction network and a feature pyramid network is designed to improve detection accuracy using a global context block. Secondly, the information to be retained is screened by combining three different scaling feature maps together. Finally, a global self-attention mechanism is used to highlight the useful information of feature maps while suppressing irrelevant information. Experiments show that our GC-YOLOv3 reaches a maximum of 55.5 object detection mean Average Precision (mAP)@0.5 on Common Objects in Context (COCO) 2017 test-dev and that the mAP is 5.1% higher than that of the YOLOv3 algorithm on Pascal Visual Object Classes (PASCAL VOC) 2007 test set. Therefore, experiments indicate that the proposed GC-YOLOv3 model exhibits optimal performance on the PASCAL VOC and COCO datasets.
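For orientation, a global context block is commonly described in three steps: pool the feature map into one context vector using attention weights over positions, pass that vector through a small bottleneck, and add the result back to every position. The sketch below follows that generic description only. It is not the GC-YOLOv3 code, the weight arguments are hypothetical, and the normalization layer usually placed inside the bottleneck is omitted for brevity.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def global_context_block(x, w_key, w_down, w_up):
    """x: C channels, each a list of N flattened positions.

    w_key: C weights (one attention logit per position),
    w_down: R x C bottleneck weights, w_up: C x R expansion weights.
    All weights are hypothetical inputs for this sketch.
    """
    channels, positions = len(x), len(x[0])
    # 1) Context modeling: attention weights shared by all channels.
    logits = [sum(w_key[c] * x[c][n] for c in range(channels))
              for n in range(positions)]
    attn = softmax(logits)
    context = [sum(attn[n] * x[c][n] for n in range(positions))
               for c in range(channels)]
    # 2) Transform: bottleneck with ReLU (normalization omitted).
    hidden = [max(0.0, sum(w * v for w, v in zip(row, context)))
              for row in w_down]
    delta = [sum(w * h for w, h in zip(row, hidden)) for row in w_up]
    # 3) Fusion: broadcast-add the same vector to every position.
    return [[x[c][n] + delta[c] for n in range(positions)]
            for c in range(channels)]


features = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
fused = global_context_block(
    features, w_key=[0.0, 0.0], w_down=[[1.0, 0.0]], w_up=[[1.0], [0.5]]
)
print(fused)
```

With zero key weights the attention is uniform, so the context vector is simply the per-channel mean; learned key weights would instead emphasize informative positions.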
Multi-scale context U-net for breast cancer postoperative radiotherapy in patients with brachial plexus
Purpose: In this study, we propose a Multi-Scale Context U-net (MSC-U-net) network model for the precise and automated segmentation of the brachial plexus in patients who have undergone postoperative radiotherapy for breast cancer. Methods: A total of 389 patients who underwent postoperative radiotherapy for breast cancer were included in the training set, while 55 patients were included in the test set. The network model proposed in this study was optimized and trained to achieve the most accurate segmentation results. The performance of the model was evaluated using the Dice Similarity Coefficient (DSC) and the 95% Hausdorff distance (HD95). To further validate the effectiveness of the segmentation, comparison and ablation experiments were conducted. Subsequently, the clinical practicability was assessed within a clinical setting. Results: MSC-U-net achieved precise segmentation results with DSC and HD95 values of 79.16 ± 0.05% and 6.95 ± 0.76 mm, respectively, in the test set. In comparison experiments with four other classical networks, the model showed significant statistical differences (p < 0.05) in performance. Ablation experiments further confirmed that the MSC-U-net network model achieved the best segmentation performance (p < 0.05). The radiation oncologists’ subjective evaluations also demonstrated the clinical applicability of the MSC-U-net network model. There was no statistically significant difference between manual segmentation and model segmentation in terms of segmentation accuracy and radiation dose (p > 0.05). The above results demonstrate the superior performance of the MSC-U-net network model in medical image segmentation, and also indicate its effectiveness in clinical applications.
Conclusions: In the context of segmenting the brachial plexus in localization CT images of breast cancer patients after postoperative radiotherapy, MSC-U-net has demonstrated exceptional performance, significantly minimizing the manual segmentation accuracy issues caused by human factors. This network exhibits high efficiency in automatic segmentation and a high level of accuracy. Notably, the MSC-U-net network holds significant importance in the advancement of radiotherapy automation, and it also offers valuable insights for future research in the field of automatic organ segmentation. Innovation: (1) The structure of Non-Local Block and SE Block is reconsidered, and Multi-Scale Context Block is proposed. The multi-scale self-attention mechanism is used to model the global context, and the MLP is used to extract important features to effectively improve the segmentation accuracy. (2) In this study, we employed the nearest neighbor interpolation upsampling layer to replace the traditional transposed convolution layer. This design effectively mitigates the common checkerboard artifacts issues during the upsampling process and enables the generation of smoother segmentation boundaries, thereby significantly improving segmentation accuracy. (3) The accuracy of automatic segmentation of the brachial plexus nerve in postoperative breast cancer radiotherapy patients was greatly improved by the proposed method.
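Innovation (2) replaces transposed convolution with nearest neighbor interpolation for upsampling. As a rough illustration of why this avoids checkerboard patterns, the sketch below copies each input value into a uniform block, so every output pixel receives exactly one contribution. The function name and the list-of-lists image layout are assumptions, and the sketch is not the MSC-U-net layer itself.

```python
def upsample_nearest(grid, factor=2):
    """Nearest neighbor upsampling of a 2D grid (list of rows).

    Every input value is repeated `factor` times along both axes, so all
    output pixels are covered evenly; no overlapping kernel footprints
    are summed, unlike in a strided transposed convolution.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


small = [[1, 2],
         [3, 4]]
large = upsample_nearest(small, factor=2)
for row in large:
    print(row)
```

In a network, such a layer is typically followed by an ordinary convolution that learns to refine the enlarged map.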
Enhancing pediatric distal radius fracture detection: optimizing YOLOv8 with advanced AI and machine learning techniques
Background: In emergency departments, residents and physicians interpret X-rays to identify fractures, with distal radius fractures being the most common in children. Skilled radiologists typically ensure accurate readings in well-resourced hospitals, but rural areas often lack this expertise, leading to lower diagnostic accuracy and potential delays in treatment. Machine learning systems offer promising solutions by detecting subtle features that non-experts might miss. Recent advancements, including YOLOv8 and its attention-mechanism models, YOLOv8-AM, have shown potential in automated fracture detection. This study aims to refine the YOLOv8-AM model to improve the detection of distal radius fractures in pediatric patients by integrating targeted improvements and new attention mechanisms. Methods: We enhanced the YOLOv8-AM model to improve pediatric wrist fracture detection, maintaining the YOLOv8 backbone while integrating attention mechanisms such as the Convolutional Block Attention Module (CBAM) and the Global Context (GC) block. We optimized the model through hyperparameter tuning, implementing data cleaning, augmentation, and normalization techniques using the GRAZPEDWRI-DX dataset. This process addressed class imbalances and significantly improved model performance, with mean Average Precision (mAP) increasing from 63.6 to 66.32%. Results and discussion: The iYOLOv8 models demonstrated substantial improvements in performance metrics. The iYOLOv8 + GC model achieved the highest precision at 97.2%, with an F1-score of 67% and an mAP50 of 69.5%, requiring only 3.62 h of training time. In comparison, the iYOLOv8 + ECA model reached 96.7% precision, significantly reducing training time from 8.54 to 2.16 h. The various iYOLOv8-AM models achieved an average accuracy of 96.42% in fracture detection, although performance for detecting bone anomalies and soft tissues was lower due to dataset constraints.
The improvements highlight the model’s effectiveness in pathological detection of the pediatric distal radius, suggesting that integrating these AI models into clinical practice could significantly enhance diagnostic efficiency. Conclusion: Our improved YOLOv8-AM model, incorporating the GC attention mechanism, demonstrated superior speed and accuracy in pediatric distal radius fracture detection while reducing training time. Future research should explore additional features to further enhance detection capabilities in other musculoskeletal areas, as this model has the potential to adapt to various fracture types with appropriate training. Clinical trial number: Not applicable.
SOCA-PRNet: Spatially Oriented Attention-Infused Structured-Feature-Enabled PoseResNet for 2D Human Pose Estimation
In the recent era, 2D human pose estimation (HPE) has become an integral part of advanced computer vision (CV) applications, particularly in understanding human behaviors. Despite challenges such as occlusion, unfavorable lighting, and motion blur, advancements in deep learning have significantly enhanced the performance of 2D HPE by enabling automatic feature learning from data and improving model generalization. Given the crucial role of 2D HPE in accurately identifying and classifying human body joints, optimization is imperative. In response, we introduce the Spatially Oriented Attention-Infused Structured-Feature-enabled PoseResNet (SOCA-PRNet) for enhanced 2D HPE. This model incorporates a novel element, Spatially Oriented Attention (SOCA), designed to enhance accuracy without significantly increasing the parameter count. Leveraging the strength of ResNet34 and integrating Global Context Blocks (GCBs), SOCA-PRNet precisely captures detailed human poses. Empirical evaluations demonstrate that our model outperforms existing state-of-the-art approaches, achieving a Percentage of Correct Keypoints at 0.5 (PCKh@0.5) of 90.877 at a 50% threshold and a Mean Precision (Mean@0.1) score of 41.137. These results underscore the potential of SOCA-PRNet in real-world applications such as robotics, gaming, and human–computer interaction, where precise and efficient 2D HPE is paramount.
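PCKh@0.5, the metric quoted above, counts a predicted keypoint as correct when it lies within half of the head segment length from its ground-truth location. The sketch below shows that rule for a single person; the function name, the tuple layout, and the sample coordinates are hypothetical and are not taken from the paper's evaluation code.

```python
import math


def pckh(predicted, ground_truth, head_size, alpha=0.5):
    """Percentage of keypoints within alpha * head_size of the ground truth.

    `predicted` and `ground_truth` are equal-length lists of (x, y) tuples;
    `head_size` is the head segment length in the same pixel units.
    """
    if len(predicted) != len(ground_truth) or not ground_truth:
        raise ValueError("need matching, non-empty keypoint lists")
    threshold = alpha * head_size
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return 100.0 * correct / len(ground_truth)


# Hypothetical example: head size 20 px, so the threshold is 10 px.
truth = [(50.0, 50.0), (80.0, 120.0), (30.0, 200.0), (90.0, 210.0)]
guess = [(53.0, 54.0), (80.0, 131.0), (30.0, 200.0), (96.0, 218.0)]
score_pckh = pckh(guess, truth, head_size=20.0)
print(score_pckh)
```

Benchmark scores average this quantity over all annotated people and joints, which the single-person sketch leaves out.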
Object Detection Algorithm of UAV Aerial Photography Image Based on Anchor-Free Algorithms
To address the difficult extraction of small-target feature information, complex backgrounds, and variable target scales in unmanned aerial vehicle (UAV) aerial photography images, this paper proposes an anchor-free target detection algorithm based on fully convolutional one-stage object detection (FCOS). For the problem of complex backgrounds, a global context module is introduced into the ResNet50 network, which is combined with feature pyramid networks (FPN) as the backbone feature extraction network to enhance the feature representation of targets in complex backgrounds. To address the difficult detection of small targets, an adaptive feature balancing sub-network is designed to filter the invalid information generated at all levels of feature fusion, strengthen multi-layer features, and improve the model’s ability to recognize small targets. To address variable target scales, complete intersection over union (CIoU) loss is used to optimize the regression loss and strengthen the model’s ability to locate multi-scale targets. The algorithm is evaluated quantitatively and qualitatively on the VisDrone dataset. The experiments show that the proposed algorithm improves average precision (AP) by 4.96% compared with the baseline FCOS algorithm and runs at 35 frames per second (FPS), confirming that it achieves satisfactory detection performance and real-time inference speed and effectively reduces missed and false detections of targets in UAV aerial images.
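The complete IoU loss mentioned above extends plain IoU with a penalty on the distance between box centers and a penalty on aspect-ratio mismatch. The sketch below follows the commonly published form of the loss for two axis-aligned boxes; it is not the paper's training code, and the function name and the `(x1, y1, x2, y2)` box layout are assumptions.

```python
import math


def ciou_loss(box_pred, box_true):
    """CIoU loss = 1 - IoU + rho^2 / c^2 + alpha * v for (x1, y1, x2, y2) boxes.

    Assumes both boxes have positive width and height.
    """
    px1, py1, px2, py2 = box_pred
    tx1, ty1, tx2, ty2 = box_true
    # Plain IoU.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union
    # Squared distance between box centers.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(px2, tx2) - min(px1, tx1)) ** 2 \
       + (max(py2, ty2) - min(py1, ty1)) ** 2
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((tx2 - tx1) / (ty2 - ty1))
        - math.atan((px2 - px1) / (py2 - py1))
    ) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v


loss_same = ciou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0))
loss_shifted = ciou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
print(loss_same, round(loss_shifted, 4))
```

Unlike plain IoU, the center-distance term still gives a useful signal when the boxes do not overlap at all, which is one reason it is used for box regression.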
Attentive Multi-Scale Features with Adaptive Context PoseResNet for Resource-Efficient Human Pose Estimation
Human Pose Estimation (HPE) remains challenging due to scale variation, occlusion, and high computational costs. Standard methods often struggle to capture detailed spatial information when keypoints are obscured, and they typically rely on computationally expensive deconvolution layers for upsampling, making them inefficient for real-time or resource-constrained scenarios. We propose AMFACPose (Attentive Multi-scale Features with Adaptive Context PoseResNet) to address these limitations. Specifically, our architecture incorporates Coordinate Convolution 2D (CoordConv2d) to retain explicit spatial context, alleviating the loss of coordinate information in conventional convolutions. To reduce computational overhead while maintaining accuracy, we utilize Depthwise Separable Convolutions (DSCs), separating spatial and pointwise operations. At the core of our approach is an Adaptive Feature Pyramid Network (AFPN), which replaces costly deconvolution-based upsampling by efficiently aggregating multi-scale features to handle diverse human poses and body sizes. We further introduce Dual-Gate Context Blocks (DGCBs) that refine global context to manage partial occlusions and cluttered backgrounds. The model integrates Squeeze-and-Excitation (SE) blocks and the Spatial–Channel Refinement Module (SCRM) to emphasize the most informative feature channels and spatial regions, which is particularly beneficial for occluded or overlapping keypoints. For precise keypoint localization, we replace dense heatmap predictions with coordinate classification using Multi-Layer Perceptron (MLP) heads. Experiments on the COCO and CrowdPose datasets demonstrate that AMFACPose surpasses the existing 2D HPE methods in both accuracy and computational efficiency. Moreover, our implementation on edge devices achieves real-time performance while preserving high accuracy, confirming the suitability of AMFACPose for resource-constrained pose estimation in both benchmark and real-world environments.
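The saving from depthwise separable convolutions, which split a convolution into a per-channel spatial filter and a 1x1 pointwise mix, can be seen from a parameter count. The sketch below uses the standard formulas without bias terms; the layer sizes are hypothetical and not taken from AMFACPose.

```python
def standard_conv_params(kernel_size, c_in, c_out):
    """Weights in a standard convolution: one k x k filter per (in, out) pair."""
    return kernel_size * kernel_size * c_in * c_out


def depthwise_separable_params(kernel_size, c_in, c_out):
    """Depthwise k x k filter per input channel, then a 1x1 pointwise mix."""
    depthwise = kernel_size * kernel_size * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
full = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
print(full, separable, round(full / separable, 2))
```

For this assumed layer the separable form needs roughly one eighth of the weights, and the ratio grows with the kernel size and the number of output channels.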
GhostConv+CA-YOLOv8n: a lightweight network for rice pest detection based on the aggregation of low-level features in real-world complex backgrounds
Deep learning models for rice pest detection often face performance degradation in real-world field environments due to complex backgrounds and limited computational resources. Existing approaches suffer from two critical limitations: (1) inadequate feature representation under occlusion and scale variations, and (2) excessive computational costs for edge deployment. To overcome these limitations, this paper introduces GhostConv+CA-YOLOv8n, a lightweight object detection framework that incorporates several innovative features. GhostConv replaces standard convolutional operations with computationally efficient ghost modules in the YOLOv8n backbone, reducing parameters by 40,458 while maintaining feature richness. A Context Aggregation (CA) module is applied to the large and medium-sized feature maps output by the YOLOv8n neck; this module enhances low-level feature representation by fusing global and local context, which is particularly effective for detecting occluded pests in complex environments. Shape-IoU, which improves bounding box regression by accounting for target morphology, and Slide Loss, which addresses class imbalance by dynamically adjusting sample weighting during training, are also employed. In comprehensive evaluations on the Ricepest15 dataset, GhostConv+CA-YOLOv8n achieves 89.959% precision and 82.258% recall, improvements of 3.657% and 11.59% over the YOLOv8n baseline, with 1.34% fewer model parameters, while maintaining a high mAP (94.527% vs. 84.994% for the baseline). Furthermore, the model shows strong generalization, achieving 4.49%, 5.452%, and 3.407% improvements in F1-score, precision, and recall, respectively, on the IP102 benchmark. This study bridges the gap between accuracy and efficiency for in-field pest detection, providing a practical solution for real-time rice monitoring in smart agriculture systems.
Multilayer Tensor Factorization with Applications to Recommender Systems
Recommender systems have been widely adopted by electronic commerce and entertainment industries for individualized prediction and recommendation, which benefit consumers and improve business intelligence. In this article, we propose an innovative method, namely the recommendation engine of multilayers (REM), for tensor recommender systems. The proposed method utilizes the structure of a tensor response to integrate information from multiple modes, and creates an additional layer of nested latent factors to accommodate between-subjects dependency. One major advantage is that the proposed method is able to address the “cold-start” issue in the absence of information from new customers, new products or new contexts. Specifically, it provides more effective recommendations through sub-group information. To achieve scalable computation, we develop a new algorithm for the proposed method, which incorporates a maximum block improvement strategy into the cyclic blockwise-coordinate-descent algorithm. In theory, we investigate algorithmic properties for convergence from an arbitrary initial point and local convergence, along with the asymptotic consistency of estimated parameters. Finally, the proposed method is applied in simulations and IRI marketing data with 116 million observations of product sales. Numerical studies demonstrate that the proposed method outperforms existing competitors in the literature.