3,590 results for "Sample weights"
A novel data-driven method based on sample reliability assessment and improved CNN for machinery fault diagnosis with non-ideal data
Numerous data-driven methods have been proposed in recent years, but most focus on innovations in models and algorithms and rarely address optimization from the perspective of data and samples. However, the reliability of sample quality directly determines the effectiveness of machine learning models. In this paper, a novel data-driven method based on sample reliability assessment (SRA) and an improved convolutional neural network (ICNN) is designed for mechanical fault diagnosis. First, multinomial logistic regression (MLR) is used to construct the assessment model, and a statistical approach known as the influence function is used to compute the sample weights efficiently. Then, the ICNN with an improved loss function is proposed, based on the strategies of sample weights, class weights, and early stopping. Compared with traditional deep learning models, the ICNN better eliminates the negative impact of problems arising during model training, including sample quality imbalance, class imbalance, and overfitting, thereby improving fault diagnosis performance. Finally, the trained ICNN automatically extracts fault characteristics and performs fault diagnosis with compressed time–frequency images as input. Experiments on a benchmark dataset and a gear dataset from a practical experimental platform verify the superiority of the proposed fault diagnosis method.
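The combination of per-sample and per-class weights in an improved loss function can be sketched as a weighted cross-entropy. This is a generic NumPy illustration, not the paper's ICNN; all names and numbers are hypothetical:

```python
import numpy as np

def weighted_cross_entropy(probs, labels, sample_w, class_w):
    """Cross-entropy loss with per-sample and per-class weights.

    probs:    (n, k) predicted class probabilities
    labels:   (n,)   integer class labels
    sample_w: (n,)   per-sample reliability weights
    class_w:  (k,)   per-class weights (e.g. inverse class frequency)
    """
    n = probs.shape[0]
    p_true = probs[np.arange(n), labels]           # probability of the true class
    losses = -np.log(np.clip(p_true, 1e-12, 1.0))  # per-sample cross-entropy
    w = sample_w * class_w[labels]                 # combined weight per sample
    return np.sum(w * losses) / np.sum(w)          # weight-normalized mean loss

# toy usage: two classes, second sample down-weighted as unreliable
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
labels = np.array([0, 1, 0])
sample_w = np.array([1.0, 0.3, 1.0])   # low weight = low assessed reliability
class_w = np.array([0.75, 1.5])        # minority class weighted up
loss = weighted_cross_entropy(probs, labels, sample_w, class_w)
```

Down-weighting an unreliable sample shrinks its contribution to the gradient, which is the mechanism by which sample-quality problems are suppressed during training.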
A multi-distance laser-induced breakdown spectroscopy data classification method based on deep convolutional neural network and spectral sample weight optimization
Laser-induced breakdown spectroscopy (LIBS) is a stand-off chemical analysis technique. In scenarios where the LIBS detection distance varies (e.g. Mars exploration), the distance effect poses a significant challenge to data analysis. In our prior work, a deep convolutional neural network (CNN) model was developed to directly process LIBS multi-distance spectra, achieving high classification accuracy even without conventional “distance correction”. The present study proposes a spectral sample weight optimization strategy to further improve the CNN model training process. Unlike the default equal-weight scheme, the new strategy tailors a specific weight value for every training spectral sample. On an eight-distance LIBS dataset acquired with the MarSCoDe duplicate instrument, the CNN model with the new weighting strategy achieves a maximum testing accuracy of 92.06%, an improvement of 8.45 percentage points over our original CNN model. Beyond accuracy, three supplementary metrics also demonstrate the superiority of the new strategy: precision, recall and F1-score increase by an average of 6.4, 7.0 and 8.2 percentage points, respectively. Moreover, the training time per epoch under the weight optimization strategy is almost identical to that of the original equal-weight scheme. These results indicate that the proposed methodology has great application potential in planetary exploration and in other LIBS scenarios involving varying detection distances.
Integration of former child and adolescent study participants into a national health online panel for a longitudinal study on young adult mental health
Background Longitudinal studies are essential for understanding health trajectories and determinants over time. Successfully re-engaging and monitoring participants at key stages, such as the transition from adolescence to adulthood, is crucial. This stage of life is characterized by many changes and challenges and is considered critical for the manifestation of mental health problems. This study addresses the challenges of contacting, re-activating, and ensuring comparability of individuals in the transition to adulthood, relative to the initial population-based sample last contacted up to 9 years earlier. Methods In 2024, former participants of the “German Health Interview and Examination Survey for Children and Adolescents” (KiGGS) aged 16–25 years were invited to register in the new online panel infrastructure “Health in Germany” and to participate in the “Study on Mental Health in Emerging Adulthood” (JEPSY) via a push-to-online panel approach. Logistic regression models identified predictors of registration and study participation. Weighting adjustments were applied to analyse and correct for selective participation. To assess potential selection bias, the life satisfaction of JEPSY participants was benchmarked against a representative sample. Results Among 11,737 invitees, 4,451 (37.9%) registered, and of these, 3,063 (68.8%) completed the JEPSY survey. For participants aged 11+ years, higher probabilities of registration and participation were evident for females (OR = 2.61 [95% CI: 2.37–2.88]; OR = 2.85 [2.56–3.18]) and for individuals with higher occupational (OR = 1.08 [1.04–1.13]; OR = 1.06 [1.01–1.11]) or educational status (OR = 1.13 [1.08–1.17]; OR = 1.13 [1.08–1.17]). Individuals with more emotional problems were less likely to register (OR = 0.98 [0.97–0.99]) and to participate (OR = 0.98 [0.97–0.99]). Weighting adjustments reduced biases but increased statistical variance. 
Benchmarking life satisfaction revealed no significant differences between JEPSY participants and their representative counterparts. Conclusions This study demonstrated the feasibility of re-activating former participants and integrating them into a modern online panel. While re-activation was successful for a substantial proportion of invitees, specific challenges (e.g., selective re-participation) remain. The findings provide insights for refining re-contact and retention strategies to improve representativeness and quality of longitudinal mental health research, ultimately enabling more accurate monitoring of health trajectories and their determinants over time.
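The weighting adjustment for selective participation can be illustrated with a minimal inverse-probability-of-participation sketch. All numbers here are hypothetical, and the study's actual adjustment procedure may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
female = rng.random(n) < 0.5                 # population: 50% female
# outcome, e.g. a life-satisfaction score, differs slightly by group
outcome = np.where(female, 7.0, 6.0) + rng.normal(0, 1, n)

# selective participation: females respond far more often (cf. the OR above)
p_participate = np.where(female, 0.40, 0.20)
responded = rng.random(n) < p_participate

naive_mean = outcome[responded].mean()       # biased toward the female mean
w = 1.0 / p_participate[responded]           # inverse-probability weights
weighted_mean = np.average(outcome[responded], weights=w)
true_mean = outcome.mean()                   # population target, about 6.5
```

The weighted mean moves back toward the population value, but the unequal weights also inflate the variance of the estimate, matching the bias-variance trade-off noted in the results.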
Reinforcement learning-based cost-sensitive classifier for imbalanced fault classification
Fault classification plays a crucial role in industrial process monitoring. In datasets collected from real-life industrial processes, the data distribution is usually imbalanced: they contain a large amount of normal data (the majority class) and only a small amount of faulty data (the minority class), a situation known as the imbalanced fault classification problem. To solve this problem, a novel reinforcement learning (RL)-based cost-sensitive classifier (RLCC) built on policy gradient is proposed in this paper. In RLCC, a novel cost-sensitive learning strategy based on policy gradient and the actor-critic framework of RL is developed; this strategy adaptively learns the cost matrix and dynamically yields the sample weights. In addition, RLCC uses a newly designed reward to train the sample-weight learner and the classifier in an alternating iterative manner, which makes RLCC highly flexible and effective for imbalanced fault classification. The effectiveness and practicability of the proposed RLCC method are verified through application to a real-world dataset and an industrial process benchmark.
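RLCC learns its cost matrix adaptively via policy gradient, which is beyond a short sketch; the fixed baseline it generalizes, inverse-class-frequency sample weighting, looks like this (labels here are hypothetical):

```python
import numpy as np

def inverse_frequency_weights(labels, n_classes):
    """Per-sample weights inversely proportional to class frequency,
    so each class contributes equal total weight to the loss."""
    counts = np.bincount(labels, minlength=n_classes)
    class_w = len(labels) / (n_classes * counts)   # "balanced" heuristic
    return class_w[labels]

# imbalanced toy labels: 8 normal samples (class 0), 2 faulty (class 1)
labels = np.array([0] * 8 + [1] * 2)
w = inverse_frequency_weights(labels, n_classes=2)
# normal samples get weight 0.625, faulty samples 2.5
```

A static scheme like this cannot react to how hard individual samples are; the adaptive, reward-driven weight updates in RLCC are what distinguish it from this baseline.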
Accounting for spatially biased sampling effort in presence-only species distribution modelling
Aim Presence-only datasets represent an important source of information on species' distributions. Collections of presence-only data, however, are often spatially biased, particularly along roads and near urban populations. These biases can lead to inaccurate inferences and predicted distributions. We demonstrate a new approach to accounting for effort bias in presence-only data by explicitly incorporating sampling biases into species distribution modelling. Location Alberta, Canada. Methods First, we used logistic regression to model sampling effort of recorded rare vascular plants, bryophytes and butterflies in Alberta. Second, we simulated presence/absence data for nine 'virtual' species based on three relative occurrence thresholds – common, rare and very rare – for each taxonomic group. We sampled these virtual species using our bias model to represent typical sampling effort characteristic of presence-only datasets. We then modelled the distributions of these virtual species using logistic regression and attempted to recover their original simulated distributions using a sample weighting term (prior weight) estimated as the inverse of the probability of sampling. Bias-adjusted model estimates were compared to those obtained from random samples and from biased samples without adjustment. We also compared prior-weight adjustment to the bias-file and target-group background approaches in Maxent. Results Sample weighting recovered regression coefficients and mapped predictions estimated from unbiased presence-only data and improved model predictive accuracy as evaluated by regression and correlation coefficients, sensitivity and specificity. Similar model improvements were achieved using the Maxent bias-file method, but results were inconsistent for the target-group background approach. Main conclusions These results suggest that sample weighting can be used to account for spatially biased presence-only datasets in species distribution modelling. 
The framework presented is potentially widely applicable due to availability of online biodiversity databases and the flexibility of the approach.
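The prior weight, the inverse of the probability of sampling, can be shown in a stylized numeric sketch. The roadside figures below are hypothetical, not from the study:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000                                   # simulated species occurrences
near_road = rng.random(n) < 0.3              # 30% of occurrences lie near roads

# sampling effort bias: occurrences near roads are recorded far more often
p_sample = np.where(near_road, 0.8, 0.1)
recorded = rng.random(n) < p_sample

naive_share = near_road[recorded].mean()     # effort bias inflates this share
w = 1.0 / p_sample[recorded]                 # prior weight = 1 / Pr(sampled)
weighted_share = np.average(near_road[recorded], weights=w)
```

Each record stands in for `1 / Pr(sampled)` unobserved occurrences, so the weighted share of roadside records falls back to the true 30%, which is the same mechanism the prior weights exploit inside the regression.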
The Influence of Processing Parameters on the Mechanical Properties of PLA 3D Printed Parts
In this paper, the effect of two process parameters on the mechanical properties of tensile specimens made by FDM was studied. A commercially available PLA filament (produced by Prusa) was used as raw material, from which several sets of specimens were produced; the varied parameters were the raster angle (RA) relative to the longitudinal axis of the specimen and the overflow (OF). Three printing angles were chosen, 0°, 22.5° and 45°, with each set of specimens made at an OF of 95%, 100% and 105%, respectively. The printed layer had a standard thickness of 0.2 mm. To analyse the mechanical properties, the specimen sets were subjected to tensile testing on an Instron 3382 machine and the results were interpreted comparatively. Additionally, the fracture surfaces of the specimens were analysed with a stereomicroscope. Two-way repeated-measures ANOVA of the experimental data indicated that both parameters and their interaction significantly influence specimen weight, whereas the mechanical properties (modulus of elasticity, yield strength, tensile strength, yield elongation and tensile elongation) were not significantly influenced by either parameter or their interaction. In this context, regardless of raster angle, an overflow of 95% provides the same mechanical properties as an overflow of 105%, but at a lower specimen weight.
The China Mental Health Survey: II. Design and field procedures
The China Mental Health Survey (CMHS), carried out from July 2013 to March 2015, was the first nationally representative community survey of mental disorders and mental health services in China to use computer-assisted personal interviewing (CAPI). Face-to-face interviews were completed in the homes of respondents selected through a nationally representative, multi-stage, disproportionate stratified sampling procedure. Sample selection was integrated with the National Chronic Disease and Risk Factor Surveillance Survey administered by the National Centre for Chronic and Non-communicable Disease Control and Prevention in 2013, which made it possible to obtain both physical and mental health information on the Chinese community population. A one-stage data collection design was used in the CMHS to obtain information on mental disorders, including mood disorders, anxiety disorders, and substance use disorders, while a two-stage design was applied for schizophrenia and other psychotic disorders, and for dementia. A total of 28,140 respondents completed the survey, for an overall response rate of 72.9%. This paper describes the survey mode, fieldwork organization, procedures, and the sample design and weighting of the CMHS. Detailed information is presented on the establishment of a new payment scheme for interviewers, the results of quality control in both stages, and the evaluation of the weighting.
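A disproportionate stratified design implies unequal design weights (inverse inclusion probabilities). A stylized two-stratum sketch, with made-up numbers rather than CMHS figures, shows why weighting matters for national estimates:

```python
import numpy as np

# hypothetical strata (e.g. urban vs. rural) sampled disproportionately
N_h = np.array([900_000, 100_000])   # population size per stratum
n_h = np.array([1_000, 1_000])       # equal sample sizes -> unequal weights
prev_h = np.array([0.05, 0.10])      # observed prevalence in each sample

design_w = N_h / n_h                 # design weight = 1 / inclusion probability

# naive pooled prevalence ignores the design and over-represents stratum 2
naive_prev = (n_h * prev_h).sum() / n_h.sum()          # 0.075

# design-weighted prevalence restores each stratum's population share
prev_hat = (design_w * n_h * prev_h).sum() / (design_w * n_h).sum()  # 0.055
```

The weighted estimate matches the population mixture (90% of people in stratum 1), which is the role the CMHS weighting plays for its disproportionate sample.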
LIFWCM: local information-based fuzzy weighted C-means algorithm for image segmentation
Image segmentation aims to partition an image into non-overlapping regions that are coherent in appearance. Although the fuzzy C-means (FCM) algorithm is widely used for its simplicity and efficiency, it treats each pixel independently and is therefore sensitive to noise. We propose LIFWCM, a local information-based fuzzy weighted C-means algorithm that assigns a single-pass, data-driven weight to each pixel by aggregating neighborhood intensity variation and positional overlap, and then integrates these weights into the standard FCM objective and a spatially aware membership refinement. This design suppresses the influence of noisy and boundary pixels while preserving details with low computational overhead. Across six experiments on synthetic images and natural images from the Image Processing Toolbox and BSDS500, LIFWCM consistently improves segmentation quality under heavy noise. On the BSDS500 image with 30% salt-and-pepper noise, LIFWCM attains 98.96% segmentation accuracy, exceeding the best baseline, and surpassing classical FCM variants. LIFWCM also achieves higher MPA (0.94) and MIoU (0.82) than competing methods, while converging in a few iterations. These results demonstrate that LIFWCM is robust to high-intensity noise, preserves fine structures, and remains efficient due to one-time weight computation, making it suitable for real-world noisy images with complex structures.
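The centroid update under per-pixel weights can be sketched with a generic weighted FCM. LIFWCM's neighborhood-based weight construction is not reproduced here; the weights and data below are illustrative only:

```python
import numpy as np

def weighted_fcm(X, w, k, m=2.0, n_iter=60, seed=0):
    """Generic weighted fuzzy C-means: per-sample weights w scale each
    point's contribution to the centroid update."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), k))
    U /= U.sum(axis=1, keepdims=True)            # memberships, rows sum to 1
    for _ in range(n_iter):
        Um = w[:, None] * U**m                   # weighted fuzzified memberships
        C = (Um.T @ X) / Um.sum(axis=0)[:, None]  # weighted centroid update
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # standard FCM membership update
    return C

# 1-D "pixel intensities": two groups plus one bright noise pixel at 5.0
X = np.array([[0.0], [0.1], [0.2], [0.9], [1.0], [1.1], [5.0]])
w_down = np.array([1, 1, 1, 1, 1, 1, 0.1])       # noise pixel down-weighted
w_flat = np.ones(7)                              # plain FCM behaviour
C_down = weighted_fcm(X, w_down, k=2)
C_flat = weighted_fcm(X, w_flat, k=2)
```

Down-weighting the noisy pixel keeps the upper centroid near its true group, while equal weights let the outlier drag it upward, which is the noise-robustness mechanism the weights provide.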
Optimizing weighted k-means clustering with gradient-based methods
Clustering methods are essential in medical and data-centric research, helping to reveal underlying patterns without the need for labelled data. This study introduces a gradient-based K-means framework that jointly refines centroids, sample weights, and covariance matrices. In contrast to traditional weighted K-means, which treats these components separately, the proposed method enables a more cohesive and adaptive optimization strategy. By incorporating Mahalanobis distance to account for feature correlations and applying dynamic weighting, the approach is well-suited for complex clinical datasets. Tests on real-world medical data show that this method outperforms standard clustering algorithms, offering improved accuracy and more clearly defined cluster structures.
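A minimal sketch of gradient-based weighted K-means under a Mahalanobis metric follows, with the sample weights and covariance held fixed for simplicity, whereas the framework described above refines all three jointly; data and parameter values are illustrative:

```python
import numpy as np

def grad_weighted_kmeans(X, w, C0, cov, lr=0.1, n_iter=200):
    """Weighted K-means fitted by gradient descent on the centroids,
    using Mahalanobis distance defined by a fixed covariance matrix."""
    P = np.linalg.inv(cov)                        # precision matrix
    C = C0.astype(float).copy()
    for _ in range(n_iter):
        diff = X[:, None, :] - C[None, :, :]      # (n, k, d) residuals
        d2 = np.einsum('nkd,de,nke->nk', diff, P, diff)  # squared Mahalanobis
        a = d2.argmin(axis=1)                     # hard cluster assignments
        for j in range(len(C)):
            mask = a == j
            if mask.any():
                # grad of sum_i w_i (x_i - c_j)^T P (x_i - c_j) w.r.t. c_j
                g = -2.0 * ((w[mask, None] * (X[mask] - C[j])) @ P).sum(axis=0)
                C[j] -= lr * g / w[mask].sum()    # weight-normalized step
    return C, a

# two well-separated 2-D blobs; initial centroids from coordinate quantiles
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
w = np.ones(len(X))
C0 = np.quantile(X, [0.25, 0.75], axis=0)
C, a = grad_weighted_kmeans(X, w, C0, cov=np.eye(2))
```

Because every component enters one differentiable objective, the same gradient machinery extends naturally to updating the weights and covariance as well, which is the joint optimization the method advocates.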
Estimation of Forest Disturbance from Retrospective Observations in a Broad-Scale Inventory
Understanding the extent and timing of forest disturbances and their impacts is critical to formulating effective management and policy responses. Broad-scale inventory programs provide key estimates of forest parameters that indicate the extent and severity of disturbance impacts. Here, we review the use of a post-stratified estimator in a panelized design, in the context of disturbance observations that are collected retrospectively. We further develop a sample weight adjustment that is requisite for proper estimation of the extent and timing of disturbances. We tested the weight adjustment technique in a Monte Carlo simulation using populations from areas of Arkansas, California, and Maine in the US. We found that the estimated area of disturbance using the weight adjustment technique agreed well with the true population values and performed considerably better than the conventional post-stratified estimation approach. The proliferation of panelized forest inventory designs globally suggests that accurate estimates of the areal extent and timing of disturbances will often require that weighting adjustment techniques be employed in the estimation process.
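The intuition behind such a weight adjustment can be sketched as follows: in a panelized design, a disturbance in year t is observable retrospectively only by panels measured in year t or later, so the area weights of reporting plots must be inflated by the inverse of that observable fraction. This is a stylized example with hypothetical numbers, not the paper's exact estimator:

```python
import numpy as np

total_area = 1_000_000.0                 # hectares represented by the sample
n_plots = 500                            # equal-probability inventory plots
base_w = total_area / n_plots            # 2,000 ha of area weight per plot

# 5-panel rotating design: one panel (100 plots) measured in each of years 1..5
panel_year = np.repeat(np.arange(1, 6), n_plots // 5)

# a disturbance in year t can only be reported by plots measured in year >= t
t = 4
q_t = (panel_year >= t).mean()           # observable fraction of sample: 0.4

n_reporting = 6                          # plots reporting a year-t disturbance
naive_area = n_reporting * base_w        # ignores partial observability
adjusted_area = n_reporting * base_w / q_t   # inflate weights by 1 / q_t
```

Without the adjustment, disturbance area in recent years is systematically understated, because most of the sample was measured before the disturbance could be observed.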