417 result(s) for "three-dimensional convolutional neural network"
Non-Invasive Estimation of Gleason Score by Semantic Segmentation and Regression Tasks Using a Three-Dimensional Convolutional Neural Network
The Gleason score (GS) is essential in categorizing prostate cancer risk from biopsy. The aim of this study was to propose a two-class GS classification (GS < 7 and GS ≥ 7) methodology using a three-dimensional convolutional neural network with semantic segmentation to predict GS non-invasively from multiparametric magnetic resonance images (MRIs). Four datasets of T2-weighted images and apparent diffusion coefficient maps, with and without semantic segmentation, were used for training and testing. All images and lesion information were taken from the training cohort of the Society of Photo-Optical Instrumentation Engineers, the American Association of Physicists in Medicine, and the National Cancer Institute (SPIE–AAPM–NCI) PROSTATEx Challenge dataset, which comprises publicly available prostate MRIs. Precision, recall, overall accuracy, and area under the receiver operating characteristic curve (AUROC) were calculated from this dataset. Our data revealed that the GS ≥ 7 precision (0.73 ± 0.13) and GS < 7 recall (0.82 ± 0.06) were significantly higher with semantic segmentation (p < 0.05). Moreover, the AUROC in segmentation volume was higher than that in normal volume (ADC map: 0.70 ± 0.05 vs. 0.69 ± 0.08; T2WI: 0.71 ± 0.07 vs. 0.63 ± 0.08). However, there were no significant differences in overall accuracy between the segmentation and normal volumes. This study yielded a diagnostic method for non-invasive GS estimation from MRIs.
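As a hedged illustration of the metrics reported above (not the authors' code; the labels and scores below are invented, not drawn from the PROSTATEx cohort), two-class precision/recall and a rank-based AUROC can be computed as:

```python
import numpy as np

def precision_recall(y_true, y_pred, positive=1):
    """Precision and recall for one class (here: GS >= 7 as the positive class)."""
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    return tp / (tp + fp), tp / (tp + fn)

def auroc(y_true, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation; assumes distinct scores."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = y_true == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Hypothetical lesion-level labels (1 = GS >= 7) and model scores
y = np.array([0, 0, 1, 1, 0, 1])
scores = np.array([0.2, 0.4, 0.8, 0.6, 0.3, 0.9])
p, r = precision_recall(y, (scores >= 0.5).astype(int))
```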
Stable polyp-scene classification via subsampling and residual learning from an imbalanced large dataset
This Letter presents a stable polyp-scene classification method with low false-positive (FP) detection. Precise automated polyp detection during colonoscopies is essential for preventing colon-cancer deaths, so there is demand for a computer-assisted diagnosis (CAD) system to assist colonoscopists. A high-performance CAD system with spatiotemporal feature extraction via a three-dimensional convolutional neural network (3D CNN), trained on a limited dataset, achieved about 80% detection accuracy in actual colonoscopic videos, so further improvement of a 3D CNN with larger training data is feasible. However, the ratio between polyp and non-polyp scenes is quite imbalanced in a large colonoscopic video dataset, and this imbalance leads to unstable polyp detection. To circumvent this, the authors propose an efficient and balanced learning technique for deep residual learning: at the beginning of each epoch, their method randomly selects a subset of non-polyp scenes equal in number to the still images of polyp scenes. Furthermore, they introduce post-processing for stable polyp-scene classification, which reduces the FPs that occur in practical application. They evaluate several residual networks with a large polyp-detection dataset consisting of 1027 colonoscopic videos. In the scene-level evaluation, the proposed method achieves stable polyp-scene classification with 0.86 sensitivity and 0.97 specificity.
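The per-epoch balanced subsampling step can be sketched in a few lines (a minimal illustration, not the authors' implementation; the frame names and counts are hypothetical):

```python
import random

def balanced_epoch(polyp_frames, non_polyp_frames, seed=None):
    """At the start of each epoch, draw a random subset of non-polyp frames
    equal in size to the polyp frames, then shuffle the labelled union."""
    rng = random.Random(seed)
    subset = rng.sample(non_polyp_frames, k=len(polyp_frames))
    epoch = [(f, 1) for f in polyp_frames] + [(f, 0) for f in subset]
    rng.shuffle(epoch)
    return epoch

# Hypothetical 50:1 imbalance between non-polyp and polyp scenes
polyps = [f"polyp_{i}" for i in range(100)]
non_polyps = [f"bg_{i}" for i in range(5000)]
epoch = balanced_epoch(polyps, non_polyps, seed=0)
```

Because a fresh subset is drawn each epoch, the network eventually sees most non-polyp scenes without any single epoch being dominated by them.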
Ethnic dance movement instruction guided by artificial intelligence and 3D convolutional neural networks
This study aims to explore the potential application of artificial intelligence in ethnic dance action instruction and achieve movement recognition by utilizing the three-dimensional convolutional neural networks (3D-CNNs). In this study, the 3D-CNNs is introduced and combined with a residual network (ResNet), resulting in a proposed 3D-ResNet-based ethnic dance movement recognition model. The model operates in three stages. First, it collects data and constructs a dataset featuring movements from six specific ethnic dances, namely Miao, Dai, Tibetan, Uygur, Mongolian, and Yi. Second, 3D-ResNet is used to identify and classify these ethnic dance movements. Lastly, the model’s performance is evaluated. Experiments on the self-built dataset and NTU-RGBD60 database show that the proposed 3D-ResNet-based model’s accuracy is above 95%. This model performs well in movement recognition tasks, showing remarkable advantages in different dance types. It exhibits good versatility and adaptability to various cultural contexts, providing advanced technical support for ethnic dance instruction. The main contribution of this study is to identify and analyze six specific ethnic dances, verify the universality and adaptability of the proposed 3D-ResNet-based model, and offer reference and support for cross-cultural dance instruction.
Privacy-Preserved Fall Detection Method with Three-Dimensional Convolutional Neural Network Using Low-Resolution Infrared Array Sensor
Due to the rapid aging of the population in recent years, the number of elderly people in hospitals and nursing homes is increasing, resulting in a shortage of staff. The situation of elderly residents therefore requires real-time attention, especially when dangerous events such as falls occur; if staff cannot detect and respond to them promptly, the consequences can be serious. Many kinds of human motion detection systems have been developed for this situation, most based on portable devices attached to the user’s body or on external sensing devices such as cameras. However, portable devices can be inconvenient for users, while optical cameras are affected by lighting conditions and raise privacy issues. In this study, a human motion detection system using a low-resolution infrared array sensor was developed to protect the safety and privacy of people who need care in hospitals and nursing homes. The proposed system overcomes the above limitations and has a wide range of applications. Using a three-dimensional convolutional neural network, the system can detect eight kinds of motions, of which falling is the most dangerous. In experiments with 16 participants and cross-validation of fall detection, the proposed method achieved an accuracy of 98.8% and an F1-measure of 94.9%, which are 1% and 3.6% higher, respectively, than those of a long short-term memory network, demonstrating the feasibility of real-time practical application.
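A minimal sketch of the two quantitative ingredients here, assuming an 8×8 sensor resolution and 16-frame clips (both illustrative choices, not details specified in the abstract):

```python
import numpy as np

# Hypothetical clip shaping: the 3D CNN consumes a short sequence of
# low-resolution (here 8x8) infrared frames as one spatio-temporal volume.
frames = [np.random.rand(8, 8) for _ in range(16)]    # 16 thermal frames
clip = np.stack(frames)[np.newaxis, np.newaxis, ...]  # (batch, channel, T, H, W)

def f1_measure(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, as reported alongside accuracy."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```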
Attention Mechanism and Depthwise Separable Convolution Aided 3DCNN for Hyperspectral Remote Sensing Image Classification
Hyperspectral Remote Sensing Image (HRSI) classification based on Convolutional Neural Networks (CNNs) has become one of the hot topics in the field of remote sensing. However, the high-dimensional information and limited training samples of hyperspectral remote sensing images are prone to the Hughes phenomenon. Meanwhile, processing high-dimensional information consumes significant time and computing power, and the extracted features may not be representative, resulting in unsatisfactory classification efficiency and accuracy. To solve these problems, an attention mechanism and depthwise separable convolution are introduced into the three-dimensional convolutional neural network (3DCNN), and 3DCNN-AM and 3DCNN-AM-DSC are proposed for HRSI classification. Firstly, three hyperspectral datasets (Indian Pines, University of Pavia, and University of Houston) are used to analyze the effect of patch size and dataset allocation ratio (training set : validation set : test set) on the performance of 3DCNN and 3DCNN-AM. Secondly, to improve efficiency, principal component analysis (PCA) and autoencoder (AE) dimension-reduction methods are applied to reduce data dimensionality and maximize the classification accuracy of the 3DCNN, although this step itself takes time. Furthermore, the 3DCNN-AM and 3DCNN-AM-DSC models are applied to classify the three classic HRSI datasets. Lastly, classification accuracy and time consumption are evaluated. The results indicate that 3DCNN-AM can improve classification accuracy and reduce computing time on the dimension-reduced dataset, and that 3DCNN-AM-DSC can reduce training time by up to 91.77% without greatly reducing classification accuracy. The results on the three classic hyperspectral datasets illustrate that 3DCNN-AM-DSC improves classification performance and reduces the time required for model training; it may offer a new way to tackle hyperspectral datasets in HRSI classification tasks without dimensionality reduction.
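The parameter saving behind depthwise separable convolution is easy to verify by counting weights (biases omitted; the channel and kernel sizes below are illustrative, not the paper's configuration):

```python
def conv3d_params(c_in, c_out, k):
    """Standard 3D convolution: one k*k*k kernel per (input, output) channel pair."""
    return c_in * c_out * k ** 3

def depthwise_separable3d_params(c_in, c_out, k):
    """Depthwise stage (one k^3 kernel per input channel) followed by a
    1x1x1 pointwise convolution that mixes channels."""
    return c_in * k ** 3 + c_in * c_out

# Hypothetical layer: 32 input channels, 64 output channels, 3x3x3 kernels
std = conv3d_params(32, 64, 3)                  # 32 * 64 * 27 = 55296
dsc = depthwise_separable3d_params(32, 64, 3)   # 32 * 27 + 32 * 64 = 2912
```

For this layer the separable variant needs roughly 19× fewer weights, which is the mechanism behind the reported training-time reduction.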
Three-dimensional convolutional neural network-based classification of chronic kidney disease severity using kidney MRI
A three-dimensional convolutional neural network model was developed to classify the severity of chronic kidney disease (CKD) using magnetic resonance imaging (MRI) Dixon-based T1-weighted in-phase (IP)/opposed-phase (OP)/water-only (WO) imaging. Seventy-three patients with severe renal dysfunction (estimated glomerular filtration rate [eGFR] < 30 mL/min/1.73 m², CKD stage G4–5); 172 with moderate renal dysfunction (30 ≤ eGFR < 60 mL/min/1.73 m², CKD stage G3a/b); and 76 with mild renal dysfunction (eGFR ≥ 60 mL/min/1.73 m², CKD stage G1–2) participated in this study. The model was applied to the right, left, and both kidneys, as well as to each imaging method (T1-weighted IP/OP/WO images). The best performance was obtained when using bilateral kidneys and IP images, with an accuracy of 0.862 ± 0.036. The overall accuracy was better for the bilateral kidney models than for the unilateral kidney models. Our deep learning approach using kidney MRI can be applied to classify patients with CKD based on the severity of kidney disease.
A three-dimensional prediction method of stiffness properties of composites based on deep learning
Determining the macroscopic mechanical properties of composite materials with complex microstructures efficiently and accurately is significant in many fields. We propose a deep learning method based on a three-dimensional convolutional neural network (3D CNN) to predict the elastic coefficients of composite materials with inclusions of arbitrary sizes, shapes, and material parameters. 3D datasets are generated, and a storage algorithm is proposed to reduce the large storage costs of 3D data. A general framework for 3D CNN models is constructed, and numerical experiments are carried out using 3D CNNs of various scales. Our results demonstrate that the scale of the fully connected part is the key factor in the prediction ability of 3D CNNs for this task. We also demonstrate that our method can effectively save computational time compared with traditional numerical methods, such as the finite element method, in large-scale prediction tasks.
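The abstract does not specify the storage algorithm; one plausible scheme for voxelized microstructures dominated by a single matrix phase is coordinate-list (sparse) storage, sketched here purely as an assumption:

```python
import numpy as np

def encode_sparse(voxels):
    """Store only the coordinates and values of non-matrix voxels
    (assumes the matrix phase is labelled 0 and dominates the volume)."""
    idx = np.argwhere(voxels != 0)
    vals = voxels[voxels != 0]
    return voxels.shape, idx, vals

def decode_sparse(shape, idx, vals):
    """Rebuild the dense voxel grid for feeding into the 3D CNN."""
    out = np.zeros(shape, dtype=vals.dtype)
    out[tuple(idx.T)] = vals
    return out

# Hypothetical 32^3 microstructure with one small cubic inclusion
vol = np.zeros((32, 32, 32), dtype=np.int8)
vol[10:14, 10:14, 10:14] = 1
shape, idx, vals = encode_sparse(vol)
restored = decode_sparse(shape, idx, vals)
```

Here only 64 of 32 768 voxels are stored, a large saving whenever inclusions occupy a small volume fraction.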
SVM directed machine learning classifier for human action recognition network
Understanding human behavior and recognizing human actions are essential components of effective surveillance video analysis for guaranteeing public safety. However, existing approaches such as three-dimensional convolutional neural networks (3D CNN) and two-stream neural networks (2SNN) face computational hurdles due to their heavy parameterization. In this paper, we offer HARNet, a specialized lightweight residual 3D CNN built on directed acyclic graphs and created expressly to handle these issues and achieve effective human action recognition. The suggested method presents an innovative pipeline for creating spatial motion data from raw video inputs, which eases latent representation learning of human motions. This generated input is fed into HARNet, which processes spatial and motion information efficiently in a single stream, maximizing the benefits of both types of cues. Traditional machine learning classifiers are then used to further improve the discriminative capacity of the learned features: the latent representations in HARNet’s fully connected layer serve as deep-learned features and are fed into a Support Vector Machine (SVM) classifier to accomplish action recognition. To evaluate the proposed HARNet-SVM method, empirical tests were run on the commonly used UCF101, HMDB51, and KTH action recognition datasets. The experimental results show that our method is superior to other state-of-the-art approaches, achieving considerable performance increases of 2.75% on UCF101, 10.94% on HMDB51, and 0.18% on the KTH dataset.
Our findings demonstrate the usefulness of HARNet’s lightweight design and highlight the significance of combining SVM classifiers with deep-learned features for accurate and computationally efficient human activity recognition in surveillance videos. This work contributes to the advancement of surveillance technology, making video analysis in real-world applications safer and more dependable.
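The deep-features-into-SVM step can be illustrated with a minimal linear SVM trained by hinge-loss subgradient descent (a stand-in for the paper's classifier, not its implementation; the clustered random data below replaces real HARNet features):

```python
import numpy as np

def train_linear_svm(X, y, lam=1e-2, lr=0.1, epochs=200):
    """Minimise hinge loss + L2 penalty by subgradient descent.
    y must be in {-1, +1}; rows of X are (hypothetical) FC-layer features."""
    rng = np.random.default_rng(0)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                       # inside margin: push and shrink
                w = (1 - lr * lam) * w + lr * y[i] * X[i]
                b += lr * y[i]
            else:                                # correct side: only shrink
                w = (1 - lr * lam) * w
    return w, b

# Two linearly separable feature clusters standing in for two action classes
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.5, (20, 8)), rng.normal(2, 0.5, (20, 8))])
y = np.array([-1] * 20 + [1] * 20)
w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
```

In practice a library SVM (with kernel options) would replace this sketch; the point is only that a max-margin classifier sits on top of frozen deep features.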
A novel parameter dense three-dimensional convolution residual network method and its application in classroom teaching
Improving the rationality and accuracy of classroom quality analysis is crucial in modern education. Traditional methods, such as questionnaires and manual recordings, are resource-intensive and subjective, leading to inconsistent results. As a solution, computer vision (CV) technologies have emerged as powerful tools for real-time classroom monitoring. This study proposes a novel Dense 3D Convolutional Residual Network (D3DCNN_ResNet) to recognize students' expressions and behaviors in English classrooms. The proposed method combines Single Shot Multibox Detector (SSD) for target detection with an improved D3DCNN_ResNet model. The network applies 3D convolution in both spatial and temporal domains, with shortcut connections from residual blocks to increase network depth. Dense connections are introduced to enhance the flow of high- and low-level features. The model was tested on two datasets: the CK+ dataset for expression recognition and the KTH dataset for behavior recognition. The experiments show that the proposed method is highly efficient in optimizing model training and improving recognition accuracy. On the CK+ dataset, the model achieved an expression recognition accuracy of 97.94%, while on the KTH dataset, the behavior recognition accuracy reached 98.86%. The combination of residual blocks and dense connections reduced feature redundancy and improved gradient flow, leading to better model performance. The results demonstrate that the D3DCNN_ResNet is well-suited for classroom quality analysis and has the potential to enhance teaching strategies by providing real-time feedback on student engagement.
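The two connection styles combined in D3DCNN_ResNet behave differently: residual shortcuts add a block's input back to its output, while dense connections concatenate earlier feature maps so later layers see both low- and high-level features. A toy numpy sketch of the distinction (not the D3DCNN_ResNet itself, and in 2D vectors rather than 3D feature volumes for brevity):

```python
import numpy as np

def layer(x, w):
    """A single ReLU feature transform standing in for a conv layer."""
    return np.maximum(0, x @ w)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 16))
w1, w2 = rng.normal(size=(16, 16)), rng.normal(size=(16, 16))

# Residual shortcut: output = input + transformed input,
# so gradients can bypass the transform during backprop.
residual_out = x + layer(layer(x, w1), w2)

# Dense connection: earlier features are concatenated, not added,
# so the channel count grows and nothing is overwritten.
h1 = layer(x, w1)
dense_out = np.concatenate([x, h1, layer(h1, w2)], axis=1)
```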
Rethinking Evaluation Metrics in Hydrological Deep Learning: Insights from Torrent Flow Velocity Prediction
Accurate estimation of flow velocities in torrents and steep rivers is essential for flood risk assessment, sediment transport analysis, and the sustainable management of water resources. While deep learning models are increasingly applied to such tasks, their evaluation often depends on statistical metrics that may yield conflicting interpretations. The objective of this study is to clarify how different evaluation metrics influence the interpretation of hydrological deep learning models. We analyze two models of flow velocity prediction in a torrential creek in Taiwan. Although the models differ in architecture, the critical distinction lies in the datasets used: the first model was trained on May–June data, whereas the second model incorporated May–August data. Four performance metrics were examined—root mean square error (RMSE), Nash–Sutcliffe efficiency (NSE), Willmott’s index of agreement (d), and mean absolute percentage error (MAPE). Quantitatively, the first model attained RMSE = 0.0471 m/s, NSE = 0.519, and MAPE = 7.78%, whereas the second model produced RMSE = 0.0572 m/s, NSE = 0.678, and MAPE = 11.56%. The results reveal a paradox. The first model achieved lower RMSE and MAPE, indicating predictions closer to the observed values, but its NSE fell below the 0.65 threshold often cited by reviewers as grounds for rejection. In contrast, the second model exceeded this NSE threshold and would likely be considered acceptable, despite producing larger errors in absolute terms. This paradox highlights the novelty of the study: model evaluation outcomes can be driven more by data variability and the choice of metric than by model architecture. This underscores the risk of misinterpretation if a single metric is used in isolation. For sustainability-oriented hydrology, robust assessment requires reporting multiple metrics and interpreting them in a balanced manner to support disaster risk reduction, resilient water management, and climate adaptation.
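The four metrics, and the paradox described above, can be reproduced directly (the velocity series below are invented to mirror the effect and are not the Taiwan creek data):

```python
import numpy as np

def rmse(obs, sim):
    return np.sqrt(np.mean((obs - sim) ** 2))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    return 1 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def willmott_d(obs, sim):
    """Willmott's index of agreement."""
    num = np.sum((obs - sim) ** 2)
    den = np.sum((np.abs(sim - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return 1 - num / den

def mape(obs, sim):
    return 100 * np.mean(np.abs((obs - sim) / obs))

# Identical errors applied to a calm and a flashy observation record (m/s):
# NSE rewards the record whose observations vary more, even though the
# absolute errors are the same.
err = np.array([0.04, -0.04, 0.04, -0.04])
calm = np.array([0.50, 0.52, 0.48, 0.51])    # low observed variance
flashy = np.array([0.30, 0.70, 0.20, 0.80])  # high observed variance
```

With these numbers the RMSE of both records is identical, yet the flashy record scores an NSE near 1 while the calm record's NSE is negative, which is exactly the metric-driven paradox the study highlights.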