Catalogue Search | MBRL
3,539 result(s) for "supervised model"
Predictive maintenance in Industry 4.0: a survey of planning models and machine learning techniques
by Panjanathan, Rukmani; Hector, Ida
in Artificial Intelligence; Artificial intelligence algorithms; Data Mining and Machine Learning
2024
Equipment downtime resulting from maintenance in various sectors around the globe has become a major concern. Conventional reactive maintenance methods have become inadequate for addressing interruptions and enhancing operational efficiency. Therefore, it is necessary to acknowledge the constraints associated with reactive maintenance and the growing need for proactive approaches that detect possible breakdowns early. The need to optimise asset management and reduce costly downtime emerges from industry demand. The work highlights the use of Internet of Things (IoT)-enabled Predictive Maintenance (PdM) as a revolutionary strategy across many sectors. This article presents a picture of a future in which the use of IoT technology and sophisticated analytics will enable the prediction and proactive mitigation of probable equipment failures. This literature study thoroughly explores the complex steps and techniques necessary for the development and implementation of efficient PdM solutions. The study offers useful insights into the optimisation of maintenance methods and the enhancement of operational efficiency by analysing current information and approaches. The article outlines essential stages in the application of PdM, encompassing underlying design factors, data preparation, feature selection, and decision modelling. Additionally, the study discusses a range of ML models and methodologies for monitoring conditions. In order to enhance maintenance plans, it is necessary to prioritise ongoing study and improvement in the field of PdM. The incorporation of IoT, Artificial Intelligence (AI), and advanced analytics offers significant potential for boosting PdM capabilities and keeping companies competitive in the global economy.
Journal Article
Impact of Dataset Size on Classification Performance: An Empirical Evaluation in the Medical Domain
by Al-Baity, Heyam; Abou Elwafa, Afnan; Dris, Alanoud Bin
in Accuracy; Algorithms; Classification
2021
Dataset size is considered a major concern in the medical domain, where lack of data is a common occurrence. This study aims to investigate the impact of dataset size on the overall performance of supervised classification models. We examined the performance of six models widely used in the medical field, namely support vector machine (SVM), neural networks (NN), C4.5 decision tree (DT), random forest (RF), AdaBoost (AB), and naïve Bayes (NB), on eighteen small medical UCI datasets. We further implemented three dataset size reduction scenarios on two large datasets and analyzed the performance of the models when trained on each resulting dataset with respect to accuracy, precision, recall, F-score, specificity, and area under the ROC curve (AUC). Our results indicated that the overall performance of classifiers depends on how well a dataset represents the original distribution rather than on its size. Moreover, we found that the most robust models for limited medical data are AB and NB, followed by SVM, and then RF and NN, while the least robust model is DT. Furthermore, an interesting observation is that a machine learning model's robustness to limited data does not necessarily imply that it provides the best performance compared to other models.
Journal Article
Using Satellite Telemetry and Aerial Counts to Estimate Space Use by Grey Seals around the British Isles
by Duck, Callan; McConnell, Bernie; Matthiopoulos, Jason
in Animal, plant and microbial ecology; Animals; Applied ecology
2004
1. In the UK, resolving conflicts between the conservation of grey seals, the management of fish stocks and marine exploitation requires knowledge of the seals' use of space. We present a map of grey seal usage around the British Isles based on satellite telemetry data from adult animals and haul-out survey data. 2. Our approach combined modelling and interpolation. To model the seals' association with particular coastal sites (the haul-outs), we divided the population into sub-populations associated with 24 haul-out groups. Haul-out-specific maps of accessibility were used to supervise usage estimation from satellite telemetry. The mean and variance of seal numbers at each haul-out group were obtained from haul-out counts. The aggregate map of usage for the entire population was produced by adding together the haul-out-specific usage maps, weighted by mean number of animals using that haul-out. 3. Seal usage was primarily concentrated (i) off the northern coasts of the British Isles, (ii) closer to the coast than might be expected purely on the basis of accessibility from the haul-outs and (iii) in a limited number of marine hot-spots. 4. Although our results currently represent the best estimate of how grey seals use the marine environment around Britain, they are neither definitive nor equally precise for all haul-outs. Further data collection should focus on the south-west of the British Isles and aerial counts should be repeated for all haul-outs. 5. Synthesis and applications. This work provides environmental managers with current estimates of grey seal usage and describes a methodology for maximizing data efficiency. Our results could guide government departments in licensing marine exploitation by the oil industry, in estimating grey seal predation pressure on vulnerable or economically important prey and in delineating marine special areas of conservation (SAC).
Our finding that grey seal usage is characterized by a limited number of hot-spots means that the species is particularly suited to localized conservation efforts.
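The aggregation step described in point 2 of the abstract above can be sketched as a weighted sum of per-haul-out usage grids. This is a minimal illustration, not the authors' implementation: the function name `aggregate_usage`, the grid layout and all numbers are hypothetical.

```python
def aggregate_usage(usage_maps, mean_counts):
    """Sum per-haul-out usage grids, each weighted by its mean seal count.

    usage_maps  -- list of 2-D grids (lists of rows), all the same shape
    mean_counts -- one weight per grid (hypothetical mean animals per haul-out)
    """
    if not usage_maps or len(usage_maps) != len(mean_counts):
        raise ValueError("need one weight per usage map")
    n_rows, n_cols = len(usage_maps[0]), len(usage_maps[0][0])
    total = [[0.0] * n_cols for _ in range(n_rows)]
    for grid, weight in zip(usage_maps, mean_counts):
        for r in range(n_rows):
            for c in range(n_cols):
                total[r][c] += weight * grid[r][c]
    return total


# Two made-up 2x2 grids; each cell holds the share of usage in that cell.
maps = [
    [[0.5, 0.5], [0.0, 0.0]],
    [[0.0, 0.25], [0.25, 0.5]],
]
population_map = aggregate_usage(maps, mean_counts=[100, 40])
print(population_map)
```

Each cell of the result is an expected number of animals, so a haul-out with more seals contributes proportionally more to the population-level map.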
Journal Article
Supervised Multi-Layer Conditional Variational Auto-Encoder for Process Modeling and Soft Sensor
2023
Variational auto-encoders (VAE) have been widely used in process modeling due to the ability of deep feature extraction and noise robustness. However, the construction of a supervised VAE model still faces huge challenges. The data generated by the existing supervised VAE models are unstable and uncontrollable due to random resampling in the latent subspace, meaning the performance of prediction is greatly weakened. In this paper, a new multi-layer conditional variational auto-encoder (M-CVAE) is constructed by injecting label information into the latent subspace to control the output data generated towards the direction of the actual value. Furthermore, the label information is also used as the input with process variables in order to strengthen the correlation between input and output. Finally, a neural network layer is embedded in the encoder of the model to achieve online quality prediction. The superiority and effectiveness of the proposed method are demonstrated by two real industrial process cases that are compared with other methods.
Journal Article
A deep learning approach to dysarthric utterance classification with BiLSTM-GRU, speech cue filtering, and log mel spectrograms
2024
Assessing the intelligibility of dysarthric speech, characterized by intricate speaking rhythms, presents formidable challenges. Current techniques for objectively testing speech intelligibility are burdensome and subjective, particularly struggling with dysarthric spoken utterances. To tackle these hurdles, our method conducts an ablation analysis across speakers afflicted with speech impairment. We utilize a unified approach that incorporates both auditory and visual elements to improve the classification of dysarthric spoken utterances. In our quest to enhance spoken utterance recognition, we propose employing two distinctive extractive transformer-based approaches. Initially, we employ SepFormer to refine the speech signal, prioritizing the enhancement of signal clarity. Subsequently, we feed the improved audio samples into Swin transformer after converting them into log mel spectrograms. Additionally, we harness the power of the Swin transformer for visual classification, trained on a dataset of 14 million annotated images from ImageNet. The pre-trained scores from the Swin transformer are utilized as input for the deep bidirectional long short-term memory with gated recurrent unit (deep BiLSTM-GRU) model, facilitating the classification of spoken utterances. Our proposed deep BiLSTM-GRU model for spoken utterance classification yields impressive results on the EasyCall speech corpus, encompassing cognitive characteristics across spoken utterances ranging from 10 to 20, delivered by both healthy individuals and those with dysarthria. Notably, our results showcase an accuracy of 98.56% for 20 utterances in male speakers, 95.11% in female speakers, and 97.64% in combined male and female speakers. Across diverse scenarios, our approach consistently achieves remarkable accuracy, surpassing other contemporary methods, all without necessitating data augmentation.
Journal Article
Joint graph and reduced flexible manifold embedding for scalable semi-supervised learning
2023
Recently, graph-based semi-supervised learning (GSSL) has received much attention. On the other hand, less attention has been paid to the problem of large-scale GSSL for inductive multi-class classification. Existing scalable GSSL methods rely on a hard linear constraint. They cannot predict the labelling of test samples, or use predefined graphs, which limits their applications and performance. In this paper, we propose an inductive algorithm that can handle large databases by using anchors. The main contribution compared to existing scalable semi-supervised models is the integration of the anchor graph computation into the learned model. We develop a criterion to jointly estimate the unlabeled sample labels, the mapping of the feature space to the label space, and the affinity matrix of the anchor graph. Furthermore, the fusion of labels and features of anchors is used to construct the graph. Using the projection matrix, it can also predict the labels of the test samples by linear transformation. Experimental results on the large datasets NORB, RCV1 and Covtype show the effectiveness, scalability and superiority of the proposed method. The code of the proposed method can be found at the following link https://github.com/ZoulfikarIB/SGRFME .
Journal Article
Assessing the performance of machine learning models for default prediction under missing data and class imbalance: A simulation study
2024
In the field of machine learning, robust model performance is essential for accurate predictions and informed decision-making. One critical challenge that hampers the performance of machine learning algorithms is the presence of missing data. Missing values are ubiquitous in real-world datasets and can substantially impact the performance of predictive models. This study explored the impact of increasing levels of missing values on the performance of machine learning models. Simulated samples with missing values ranging from 5% to 50% were generated, and various models were evaluated accordingly. The results demonstrated a consistent trend of deteriorating model performance as the amount of missing values increases. Higher levels of missing values lead to decreased accuracy scores across all models. Among the models evaluated, decision trees (DT) and random forests (RF) consistently demonstrated high accuracy scores across all sampling techniques, showcasing their robustness in handling missing values. Logistic regression (LR) also performed relatively well, showing consistent performance across different levels of missing values. On the other hand, stochastic gradient descent classifier (SGDC), K-nearest neighbours (kNN), and naïve Bayes (NB) models consistently exhibited lower accuracy scores across all sampling techniques, indicating limitations in handling missing values even when the dataset was more balanced. Furthermore, the study highlights the superiority of the SMOTE (Synthetic Minority Over-sampling Technique) sampling technique compared to the under-sampling approach. Models trained using SMOTE consistently achieved higher accuracy scores across all levels of missing values. This suggests that SMOTE sampling effectively handles imbalanced datasets and enhances classification performance, particularly when dealing with missing values.
As the quest for accurate predictions gains paramount importance, addressing the pervasive challenge of missing data emerges as a cornerstone for unlocking the true potential of machine learning in real-world applications.
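The simulation setup in the abstract above (samples with 5% to 50% missing values) can be illustrated with a short sketch. The abstract does not state how values were removed, so this assumes cells go missing completely at random; `mask_at_random`, `missing_fraction` and the toy data are hypothetical names and values, not taken from the study.

```python
import random


def mask_at_random(rows, rate, seed=0):
    """Return a copy of `rows` with round(rate * n_cells) cells set to None.

    Assumption: cells are removed completely at random (MCAR).
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    masked = [list(row) for row in rows]  # copy so the input stays intact
    cells = [(r, c) for r, row in enumerate(masked) for c in range(len(row))]
    rng = random.Random(seed)
    for r, c in rng.sample(cells, round(rate * len(cells))):
        masked[r][c] = None
    return masked


def missing_fraction(rows):
    """Share of cells that are None."""
    values = [v for row in rows for v in row]
    return sum(v is None for v in values) / len(values)


# Toy table: 20 rows x 5 columns of made-up numeric features.
data = [[float(r * 5 + c) for c in range(5)] for r in range(20)]
for rate in (0.05, 0.25, 0.50):
    print(rate, missing_fraction(mask_at_random(data, rate)))
```

A fixed seed keeps each missingness level reproducible, which matters when the same masked samples are reused to compare several models.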
Journal Article
DLCPD-25: A Large-Scale and Diverse Dataset for Crop Disease and Pest Recognition
2025
The accurate identification of crop pests and diseases is critical for global food security, yet the development of robust deep learning models is hindered by the limitations of existing datasets. To address this gap, we introduce DLCPD-25, a new large-scale, diverse, and publicly available benchmark dataset. We constructed DLCPD-25 by integrating 221,943 images from both online sources and extensive field collections, covering 23 crop types and 203 distinct classes of pests, diseases, and healthy states. A key feature of this dataset is its realistic complexity, including images from uncontrolled field environments and a natural long-tail class distribution, which contrasts with many existing datasets collected under controlled conditions. To validate its utility, we pre-trained several state-of-the-art self-supervised learning models (MAE, SimCLR v2, MoCo v3) on DLCPD-25. The learned representations, evaluated via linear probing, demonstrated strong performance, with the SimCLR v2 framework achieving a top accuracy of 72.1% and an F1 score (Macro F1) of 71.3% on a downstream classification task. Our results confirm that DLCPD-25 provides a valuable and challenging resource that can effectively support the training of generalizable models, paving the way for the development of comprehensive, real-world agricultural diagnostic systems.
Journal Article
Innovative Research on Illustration Design Integrating Color Science and Image Processing Technology
2024
How to create intelligently and efficiently has become a hot research topic in the field of illustration design. In this paper, starting from the acceptance intention to satisfy users’ favorites and needs, we propose an intelligent analysis method of style based on a deep clustering model and use the categorized style attributes as a reference to disperse designers’ creative thinking. In addition, a semi-supervised semantic segmentation method is introduced to construct an automatic coloring model for sketches, and the automatic coloring function is realized by incorporating the color label recognition process, which further improves the efficiency of designers’ work. The results of clustering analysis of the STL dataset show that the ACC (78.39%), NMI (67.78%), and ARI (62.93%) metrics of the VIT+K-Means model are the highest among the feature extraction methods compared. In addition, adding a pseudo-label retraining process for this model further improves the results of the three metrics by 9.13%, 9.97%, and 10.78%, and the visual analysis experiments on the clusters also verified the performance enhancement effect. In the comparative analysis of illustration coloring models, the AdvSSL scheme with semi-supervised strategy achieves the best performance in the semantic segmentation task, with an improvement of 4.31% and 2.29% over the SSAN scheme and the S4GCN scheme, respectively. The experts’ coloring evaluation shows that the improved sketch coloring model has the highest results in all four dimensions compared to the traditional Tag2Pix model.
Journal Article
Self-Relation Attention and Temporal Awareness for Emotion Recognition via Vocal Burst
2022
Speech emotion recognition (SER) is one of the most exciting topics many researchers have recently been involved in. Although much research has been conducted recently on this topic, work on emotion recognition via non-verbal speech (known as the vocal burst) is still sparse. The vocal burst is concise and has meaningless content, which makes it harder to deal with than verbal speech. Therefore, in this paper, we propose a self-relation attention and temporal awareness (SRA-TA) module to tackle this problem with vocal bursts, which can capture long-term dependencies and focus on the salient parts of the audio signal as well. Our proposed method contains three main stages. Firstly, the latent features are extracted using a self-supervised learning model from the raw audio signal and its Mel-spectrogram. After the SRA-TA module is utilized to capture the valuable information from latent features, all features are concatenated and fed into ten individual fully-connected layers to predict the scores of 10 emotions. Our proposed method achieves a mean concordance correlation coefficient (CCC) of 0.7295 on the test set, ranking first in the high-dimensional emotion task of the 2022 ACII Affective Vocal Burst Workshop & Challenge.
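For readers unfamiliar with the metric reported in the abstract above, the concordance correlation coefficient can be computed as in the sketch below. It uses the standard definition by Lin with population (divide-by-n) statistics; the function name `concordance_cc` and the sample values are illustrative and are not taken from the paper or the challenge code.

```python
def concordance_cc(pred, target):
    """Lin's concordance correlation coefficient of two equal-length sequences.

    CCC = 2 * cov(pred, target) / (var(pred) + var(target) + (mean diff)^2)
    """
    n = len(pred)
    if n == 0 or n != len(target):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_t = sum((t - mean_t) ** 2 for t in target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target)) / n
    denom = var_p + var_t + (mean_p - mean_t) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov / denom


# Made-up scores: a constant offset lowers CCC even though correlation is perfect.
print(concordance_cc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(concordance_cc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

Unlike Pearson correlation, CCC penalizes differences in mean and scale between predictions and targets. A mean CCC over ten emotions would be the average of ten such values.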
Journal Article