Catalogue Search | MBRL
48 result(s) for "F1 score"
The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation
2020
Background
To evaluate binary classifications and their confusion matrices, scientific researchers can employ several statistical rates, according to the goal of the experiment they are investigating. Despite this being a crucial issue in machine learning, no widespread consensus has yet been reached on a single measure of choice. Accuracy and F1 score computed on confusion matrices have been (and still are) among the most popular metrics adopted in binary classification tasks. However, these statistical measures can dangerously show overoptimistic inflated results, especially on imbalanced datasets.
Results
The Matthews correlation coefficient (MCC), instead, is a more reliable statistical rate which produces a high score only if the prediction obtained good results in all of the four confusion matrix categories (true positives, false negatives, true negatives, and false positives), proportionally both to the size of positive elements and the size of negative elements in the dataset.
Conclusions
In this article, we show how MCC produces a more informative and truthful score in evaluating binary classifications than accuracy and F1 score, by first explaining the mathematical properties, and then the asset of MCC in six synthetic use cases and in a real genomics scenario. We believe that the Matthews correlation coefficient should be preferred to accuracy and F1 score in evaluating binary classification tasks by all scientific communities.
Journal Article
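The contrast drawn in the abstract above can be illustrated with a small sketch. The confusion-matrix counts below are made up for illustration and are not taken from the article; the function name `binary_metrics` is likewise hypothetical. The formulas are the standard definitions of accuracy, F1 score and MCC.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, F1, MCC) for one binary confusion matrix."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total if total else 0.0
    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0
    # MCC uses all four cells; it is set to 0 when any marginal sum is empty.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return accuracy, f1, mcc

# Hypothetical imbalanced dataset: 95 positives, 5 negatives.
# The classifier labels almost everything as positive.
accuracy, f1, mcc = binary_metrics(tp=94, fn=1, tn=1, fp=4)
print(f"accuracy={accuracy:.3f}  F1={f1:.3f}  MCC={mcc:.3f}")
```

On these invented counts accuracy and F1 both come out above 0.9 while MCC stays near 0.3, because only one of the five negatives is classified correctly.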
Analysis of Weed Growth in Rabi Crop Agriculture Using Deep Convolutional Neural Networks
by
Mishra, Anand Muni
,
Shahare, Yogesh
,
Gautam, Vinay
in
Deep convolutional Neural Network
,
EfficientNet-B7
,
F1 Score
2021
Weed interference during crop establishment is a severe problem for wheat in North India [22.9734° N, 78.6569° E]. In situ remote detection for precision herbicide application minimizes the risk of both crop damage and herbicide input. This research paper presents a comparative study of crop growth and its effect at three different places in Madhya Pradesh [24.5840° N, 81.5020° E], India [20.5937° N, 78.9629° E]. The weed species included pigweed (Amaranthaceae), goosefoot (Chenopodiaceae), wild oat species (Poaceae), livid amaranth (Amaranthus blitum L.), fat hen [Chenopodiaceae (L.) Wild.], and Bermuda grass (Poaceae L.), a significant weed for rabi crop production in India with sensitivity to clopyralid, is the best available put up broadleaf herbicide. The aim of the study was to assess the accuracy of four different CNN architectures in detecting images of weeds growing in competition with rabi crops at 3 sites in Madhya Pradesh. Four CNNs were compared, including object detection-based ResNet-50, image classification-based VGGNet-16, Inception v4 and EfficientNet-B7; the EfficientNet-B7 networks were trained to detect either leaves or canopies Everlasting of weeds. Image classification using ResNet-50 and VGGNet-16 was largely unsuccessful during validation with whole images (F1-score < 0.04). CNN training using cropped images improved Eternal Broad Fall detection during validation for VGGNet (F1-score = 0.77) and ResNet-50 (F1-score = 0.62). The rabi crop weed leaf-trained Inception v4 and EfficientNet-B7 achieved the highest F1-scores (0.94 and 0.96, respectively). Leaf-based EfficientNet-B7 increased false positives, although such errors could be overcome with extra training images for network desensitization training. Image-based remote sensing of rabi crops identified EfficientNet-B7 as the most viable CNN for detecting weeds growing in competition with the crop.
Journal Article
A Comprehensive Analysis on Detecting Chronic Kidney Disease by Employing Machine Learning Algorithms
by
Shikder, Fahim
,
Hoque, Md. Ashraful
,
Rahman Dip, Rezuanur
in
Accuracy
,
Algorithms
,
chronic kidney disease
2022
INTRODUCTION: Chronic Kidney Disease refers to the slow, progressive deterioration of kidney functions. However, the impairment is irreversible and imperceptible up until the disease reaches one of the later stages, demanding early detection and initiation of treatment in order to ensure a good prognosis and prolonged life. In this aspect, machine learning algorithms have proven to be promising, and point towards the future of disease diagnosis.
OBJECTIVES: We aim to apply different machine learning algorithms for the purpose of assessing and comparing their accuracies and other performance parameters for the detection of chronic kidney disease.
METHODS: The ‘chronic kidney disease dataset’ from the machine learning repository of University of California, Irvine, has been harnessed, and eight supervised machine learning models have been developed by utilizing the Python programming language for the detection of the disease.
RESULTS: A comparative analysis is portrayed among eight machine learning models by evaluating different performance parameters like accuracy, precision, sensitivity, F1 score and ROC-AUC. Among the models, Random Forest displayed the highest accuracy of 99.75%.
CONCLUSION: We observed that machine learning algorithms can contribute significantly to the domain of predictive analysis of chronic kidney disease, and can assist in developing a robust computer-aided diagnosis system to aid the healthcare professionals in treating the patients properly and efficiently.
Journal Article
Detection of Autism Spectrum Disorder in Children Using Machine Learning Techniques
by
Krishnan, Deepa
,
Purkayastha, Diya
,
Vakadkar, Kaushik
in
Accuracy
,
Advanced Computing and Data Sciences
,
Algorithms
2021
Autism Spectrum Disorder (ASD) is a neurological disorder which might have a lifelong impact on the language learning, speech, cognitive, and social skills of an individual. Its symptoms usually show up in the developmental stages, i.e., within the first two years after birth, and it impacts around 1% of the population globally [https://www.autism-society.org/whatis/facts-and-statistics/. Accessed 25 Dec 2019]. ASD is mainly caused by genetics or by environmental factors; however, its conditions can be improved by detecting and treating it at earlier stages. In the current times, clinical standardized tests are the only methods which are being used, to diagnose ASD. This not only requires prolonged diagnostic time but also faces a steep increase in medical costs. To improve the precision and time required for diagnosis, machine learning techniques are being used to complement the conventional methods. We have applied models such as Support Vector Machines (SVM), Random Forest Classifier (RFC), Naïve Bayes (NB), Logistic Regression (LR), and KNN to our dataset and constructed predictive models based on the outcome. The main objective of our paper is to thus determine if the child is susceptible to ASD in its nascent stages, which would help streamline the diagnosis process. Based on our results, Logistic Regression gives the highest accuracy for our selected dataset.
Journal Article
A Study on Recognition of Different Kinds of Instruments in Marine Engine Room Based on SE-Mix Up YOLOv5
by
Zang, Shaokang
,
Gan, Huibing
,
Hong, Geer
in
Computer vision
,
Datasets
,
Engine room instruments
2025
The application of machine vision in identifying various types of instruments holds significant research value in enhancing the intelligence level of ship engine rooms. This paper presents the development of a SE-MIXUP YOLO model based on the YOLOv5 algorithm, capable of recognizing different types of instruments in complex ship engine room environments. First, instrument images of different types from different ship engine rooms were collected as a self-built data set, and the image data set was enhanced according to the actual situation of the engine room. In this study, the YOLOv5 model was trained based on the PyTorch framework in the Anaconda virtual environment. The analysis shows that the model achieves an average precision (mAP) of 1.00 when the confidence threshold is 0.970; the average F1 score is 0.84 when the confidence threshold is 0.596. The model can effectively identify various instruments such as pressure gauges, thermometers, and level gauges in complex ship engine room environments, verifying its good environmental adaptability and robustness.
Journal Article
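The abstract above reports an F1 score at a particular confidence threshold. As a rough sketch of how such a threshold can be chosen, the code below sweeps candidate thresholds and keeps the one with the highest F1. It is a simplified binary-classification version (no bounding-box matching as a detector would need), the scores and labels are invented, and the function names are hypothetical; it is not the authors' procedure.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 when every score >= threshold counts as a positive prediction."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def best_threshold(scores, labels):
    """Try each observed score as a threshold; return (threshold, F1) with the highest F1."""
    candidates = sorted(set(scores))
    return max(
        ((t, f1_at_threshold(scores, labels, t)) for t in candidates),
        key=lambda pair: pair[1],
    )

# Made-up confidence scores and ground-truth labels (1 = instrument present).
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

threshold, f1 = best_threshold(scores, labels)
print(f"best threshold={threshold:.2f}  F1={f1:.3f}")
```

Lowering the threshold trades false negatives for false positives, so F1 typically peaks at an intermediate value rather than at either extreme.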
Machine learning approaches for anomaly detection of water quality on a real-world data set
by
Logofătu, Doina
,
Leon, Florin
,
Muharemi, Fitore
in
Algorithms
,
Anomalies
,
Artificial neural networks
2019
Accurate detection of water quality changes is a crucial task for water companies. Water supply companies must provide safe drinking water. Nowadays, in different areas, we find sensitive sensors which monitor data over time. Normally the data registered by the sensors carry a meaning, for example that an event may have occurred. Sometimes the data are ill-understood, and stating whether there is an event is difficult. This work describes several approaches to identifying changes or anomalies occurring in water quality time series data. This work also discusses and proposes a solution to some challenges when dealing with time series data. The following models are applied to water quality data: logistic regression, linear discriminant analysis, support vector machines (SVM), artificial neural network (ANN), deep neural network (DNN), recurrent neural network (RNN) and long short-term memory (LSTM). The performance evaluation is conducted using the F-score metric. A simulation study is conducted to check the performance of each algorithm using F-score. Solving imbalanced data is basically intentionally biasing the data to get interesting results instead of accurate results. The results show that all algorithms are vulnerable, although SVM, ANN and logistic regression tend to be a little less vulnerable, while DNN, RNN and LSTM are very vulnerable.
Journal Article
Predicting the Onset of Diabetes with Machine Learning Methods
2023
The number of people suffering from diabetes in Taiwan has continued to rise in recent years. According to the statistics of the International Diabetes Federation, about 537 million people worldwide (10.5% of the global population) suffer from diabetes, and it is estimated that 643 million people will develop the condition (11.3% of the total population) by 2030. If this trend continues, the number will jump to 783 million (12.2%) by 2045. At present, the number of people with diabetes in Taiwan has reached 2.18 million, with an average of one in ten people suffering from the disease. In addition, according to the Bureau of National Health Insurance in Taiwan, the prevalence rate of diabetes among adults in Taiwan has reached 5% and is increasing each year. Diabetes can cause acute and chronic complications that can be fatal. Meanwhile, chronic complications can result in a variety of disabilities or organ decline. If holistic treatments and preventions are not provided to diabetic patients, it will lead to the consumption of more medical resources and a rapid decline in the quality of life of society as a whole. In this study, based on the outpatient examination data of a Taipei Municipal medical center, 15,000 women aged between 20 and 80 were selected as the subjects. These women were patients who had gone to the medical center during 2018–2020 and 2021–2022 with or without the diagnosis of diabetes. This study investigated eight different characteristics of the subjects, including the number of pregnancies, plasma glucose level, diastolic blood pressure, sebum thickness, insulin level, body mass index, diabetes pedigree function, and age. After sorting out the complete data of the patients, this study used Microsoft Machine Learning Studio to train the models of various kinds of neural networks, and the prediction results were used to compare the predictive ability of the various parameters for diabetes. 
Finally, this study found that after comparing the models using two-class logistic regression as well as the two-class neural network, two-class decision jungle, or two-class boosted decision tree for prediction, the best model was the two-class boosted decision tree, as its area under the curve could reach a score of 0.991, which was better than other models.
Journal Article
Automated segmentation of endometriosis using transfer learning technique version 2; peer review: 1 approved
2022
Background: This paper focuses on segmenting the exact location of endometriosis using the state-of-the-art technique known as U-Net. Endometriosis is a progressive disorder that has a significant impact on women. The lesion-like appearance that grows inside the uterus and sheds for every periodical cycle is known as endometriosis. If the lesion exists and is transferred to other locations in the women's reproductive system, it may lead to a serious problem. Besides radiologists, deep learning techniques exist for recognizing the presence and aggravation of endometriosis.
Methods: The proposed method known as structural similarity analysis of endometriosis (SSAE) identifies the similarity between pathologically identified and annotated images obtained from standardized dataset known as GLENDA v1.5 by implementing two systematic approaches. The first approach is based on semantic segmentation and the second approach uses statistical analysis. Semantic segmentation is a cutting-edge technology for identifying exact locations by performing pixel-level classification. In semantic segmentation, U-Net is a transfer-learning architecture that works effectively for biomedical image classification. The SSAE implements the U-Net architecture for segmenting endometriosis based on the region of occurrence. The second approach proves the similarity between pathologically identified images and the corresponding annotated images using a statistical evaluation. Statistical analysis was performed using calculation of both the mean and standard deviation of all four regions by implementing systematic sampling procedure.
Results: The SSAE obtains the intersection over union value of 0.72 and the F1 score of 0.74 for the trained dataset. The means of both the laparoscopic and annotated images for all regions were similar. Consequently, the SSAE facilitated the presence of abnormalities in a specific region.
Conclusions: The proposed SSAE approach identifies the affected region using U-Net architecture and systematic sampling procedure.
Journal Article
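The segmentation abstract above reports both intersection over union (IoU) and F1 score. A minimal sketch of how the two are computed for a single pair of binary masks follows. The 4x4 masks are invented and the function name `iou_and_f1` is hypothetical; the article's own evaluation code is not shown in this listing.

```python
def iou_and_f1(pred, truth):
    """Pixel-level IoU and F1 for two equally sized 2-D lists of 0/1 values."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # pixel marked in both masks
            elif p:
                fp += 1      # marked only in the prediction
            elif t:
                fn += 1      # marked only in the annotation
    union = tp + fp + fn
    if union == 0:
        return 1.0, 1.0      # both masks empty: treated here as a perfect match
    return tp / union, 2 * tp / (2 * tp + fp + fn)

# Invented annotation and prediction: same shape, shifted down by one row.
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]

iou, f1 = iou_and_f1(pred, truth)
print(f"IoU={iou:.3f}  F1={f1:.3f}")
```

For a single mask pair the two values are tied by F1 = 2·IoU / (1 + IoU); figures averaged over many images or regions need not satisfy that identity.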
Evaluation of rural tourism development level using BERT-enhanced deep learning model and BP algorithm
2024
To address the insufficient expressive capabilities of traditional methods in assessing the development level of rural tourism, this study explores the fusion application of the Bidirectional Encoder Representations from Transformers (BERT) and the Back Propagation (BP) algorithm to enhance the accuracy and comprehensiveness of rural tourism development assessment. Firstly, this study introduces the BERT deep learning model and its applications in natural language processing, alongside the role of the BP algorithm in pattern recognition and predictive analysis. Subsequently, a framework for assessing rural tourism development levels, integrating BERT and the BP algorithm, is proposed. This framework collects multidimensional rural tourism-related data and utilizes the BERT model for sentiment analysis and topic extraction from textual data. Empirical analysis of rural tourism development in a specific region validates the effectiveness of the proposed approach. Experimental results demonstrate: (1) The model achieves an accuracy of 84.33% and an F1 score of 85.33% on the publicly available Laptop dataset, with a processing time of 20 s, significantly outperforming other methods. Compared to traditional approaches, the proposed method accurately captures correlations between textual information and numerical data, thereby enhancing the credibility and accuracy of assessment results. (2) From the ablation study results, it is evident that removing any component from the model leads to performance degradation. Specifically, removing the Bidirectional Gated Recurrent Unit (BiGRU) reduces accuracy and F1 scores to 78.21% and 76.33% on the Laptop dataset, and to 85.10% and 70.45% on the Tourist_F dataset. Removing Text Convolutional Neural Network (CNN) reduces accuracy and F1 scores to 79.34% and 77.56% on the Laptop dataset, and to 86.25% and 72.11% on the Tourist_F dataset. 
The most significant performance decline occurs upon removing BERT, with accuracy and F1 scores decreasing to 70.14% and 66.43% on the Laptop dataset, and to 78.45% and 60.33% on the Tourist_F dataset. These results indicate that BiGRU, TextCNN, and BERT each contribute significantly to the model’s performance. This study provides substantial support for advancing sustainable development in rural tourism, offering significant practical and innovative value.
Journal Article
Detection, identification and alert of wild animals in surveillance videos using deep learning
2024
Purpose: With the rapid advancement of lifestyle and technology, human lives are becoming increasingly threatened. Accidents, exposure to dangerous substances and animal strikes are all possible threats. Human lives are increasingly being harmed as a result of attacks by wild animals. Further investigation into the cases reported revealed that such events can be detected early on. Techniques such as machine learning and deep learning will be used to solve this challenge. The upgraded VGG-16 model with deep learning-based detection is appropriate for such real-time applications because it overcomes the low accuracy and poor real-time performance of traditional detection methods and detects medium- and long-distance objects more accurately. Many organizations use various safety and security measures, particularly CCTV/video surveillance systems, to address physical security concerns. CCTV/video monitoring systems are quite good at visually detecting a range of attacks associated with suspicious behavior on the premises and in the workplace. Many have indeed begun to use automated systems such as video analytics solutions such as motion detection, object/perimeter detection, face recognition and artificial intelligence/machine learning, among others. Anomaly identification can be performed with the data collected from the CCTV cameras. The camera surveillance can generate enormous quantities of data, which is laborious and expensive to screen for the species of interest. Many cases have been recorded where wild animals enter public places, causing havoc and damaging lives and property. There are many cases where people have lost their lives to wild attacks. The conventional approach of sifting through images by eye can be expensive and risky.
Therefore, an automated wild animal detection system is required to avoid these circumstances.
Design/methodology/approach: The proposed system consists of a wild animal detection module, a classifier and an alarm module, for which video frames are fed as input and the output is prediction results. Frames extracted from videos are pre-processed and then delivered to the neural network classifier as filtered frames. The classifier module categorizes the identified animal into one of the several categories. An email or WhatsApp notice is issued to the appropriate authorities or users based on the classifier outcome.
Findings: Evaluation metrics are used to assess the quality of a statistical or machine learning model. Any system will include a review of machine learning models or algorithms. A number of evaluation measures can be performed to put a model to the test. Among them are classification accuracy, logarithmic loss, confusion matrix and other metrics. The model must be evaluated using a range of evaluation metrics. This is because a model may perform well when one measurement from one evaluation metric is used but perform poorly when another measurement from another evaluation metric is used. We must utilize evaluation metrics to guarantee that the model is running correctly and optimally.
Originality/value: The output of conv5_3 will be of size 7*7*512 in the ImageNet VGG-16 in Figure 4, which operates on images of size 224*224*3. Therefore, the parameters of fc6 with a flattened input size of 7*7*512 and an output size of 4,096 are 4,096, 7*7*512. With reshaped parameters of dimensions 4,096*7*7*512, the comparable convolutional layer conv6 has a 7*7 kernel size and 4,096 output channels. The parameters of fc7 with an input size of 4,096 (i.e. the output size of fc6) and an output size of 4,096 are 4,096, 4,096. The input can be thought of as a one-of-a-kind image with 4,096 input channels. 
With reshaped parameters of dimensions 4,096*1*1*4,096, the comparable convolutional layer conv7 has a 1*1 kernel size and 4,096 output channels. It is clear that conv6 has 4,096 filters, each with dimensions 7*7*512, and conv7 has 4,096 filters, each with dimensions 1*1*4,096. These filters are numerous, large and computationally expensive. To remedy this, the authors opt to reduce both their number and the size of each filter by subsampling parameters from the converted convolutional layers. Conv6 will use 1,024 filters, each with dimensions 3*3*512. Therefore, the parameters are subsampled from 4,096*7*7*512 to 1,024*3*3*512. Conv7 will use 1,024 filters, each with dimensions 1*1*1,024. Therefore, the parameters are subsampled from 4,096*1*1*4,096 to 1,024*1*1*1,024.
Journal Article
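The filter dimensions quoted in the abstract above can be checked with a little arithmetic. The sketch below works on shapes only and counts weights before and after subsampling. The step sizes (every 4th filter, every 3rd spatial position) are an assumption chosen because they reproduce the stated before and after shapes; the abstract itself does not say how the subsampling is carried out, and the names used here are hypothetical.

```python
from math import prod

def subsampled_shape(shape, steps):
    """Shape left after keeping every `step`-th index along each axis."""
    return tuple(len(range(0, dim, step)) for dim, step in zip(shape, steps))

# Shapes from the abstract, ordered (filters, height, width, input channels).
fc6_as_conv = (4096, 7, 7, 512)
fc7_as_conv = (4096, 1, 1, 4096)

# Assumed steps: keep every 4th filter; for conv6 also every 3rd spatial
# position (indices 0, 3, 6 of 7); for conv7 also every 4th input channel.
conv6 = subsampled_shape(fc6_as_conv, (4, 3, 3, 1))
conv7 = subsampled_shape(fc7_as_conv, (4, 1, 1, 4))

for name, before, after in [("conv6", fc6_as_conv, conv6),
                            ("conv7", fc7_as_conv, conv7)]:
    print(f"{name}: {before} -> {after}, "
          f"{prod(before):,} -> {prod(after):,} weights")
```

Under these assumed steps conv6 shrinks from roughly 102.8 million weights to about 4.7 million, and conv7 from about 16.8 million to about 1.0 million.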