Catalogue Search | MBRL

Least Ambiguous Set-Valued Classifiers With Bounded Error Levels

by Sadinle, Mauricio , Lei, Jing , Wasserman, Larry in Ambiguity , Ambiguous observation , Asymptotic properties

2019

In most classification tasks, there are observations that are ambiguous and therefore difficult to correctly label. Set-valued classifiers output sets of plausible labels rather than a single label, thereby giving a more appropriate and informative treatment to the labeling of ambiguous instances. We introduce a framework for multiclass set-valued classification, where the classifiers guarantee user-defined levels of coverage or confidence (the probability that the true label is contained in the set) while minimizing the ambiguity (the expected size of the output). We first derive oracle classifiers assuming the true distribution to be known. We show that the oracle classifiers are obtained from level sets of the functions that define the conditional probability of each class. Then we develop estimators with good asymptotic and finite sample properties. The proposed estimators build on existing single-label classifiers. The optimal classifier can sometimes output the empty set, but we provide two solutions to fix this issue that are suitable for various practical needs. Supplementary materials for this article are available online.

Journal Article
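
The level-set idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' estimator: the function names `calibrate_threshold` and `label_set`, the held-out probabilities, and the single shared threshold are all assumptions made for illustration. The sketch keeps every label whose estimated conditional probability reaches a threshold, and picks that threshold so that the true label is covered on at least a 1 - alpha fraction of held-out points.

```python
import math

def calibrate_threshold(true_label_probs, alpha):
    """Pick a threshold t so that at least a (1 - alpha) fraction of the
    held-out points satisfy p(true label | x) >= t (empirical coverage)."""
    scores = sorted(true_label_probs)
    n = len(scores)
    # At most floor(alpha * n) of the lowest-scoring points may be left uncovered.
    k = min(math.floor(alpha * n), n - 1)
    return scores[k]

def label_set(class_probs, threshold):
    """Level set: every label whose estimated probability reaches the threshold."""
    return {label for label, p in class_probs.items() if p >= threshold}

# Hypothetical held-out probabilities assigned to the true label by some
# single-label classifier (illustrative numbers only).
held_out = [0.3, 0.35, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99]
t = calibrate_threshold(held_out, alpha=0.2)

confident = label_set({"a": 0.9, "b": 0.07, "c": 0.03}, t)    # one clear label
ambiguous = label_set({"a": 0.5, "b": 0.45, "c": 0.05}, t)    # two plausible labels
nothing = label_set({"a": 0.35, "b": 0.33, "c": 0.32}, t)     # no label reaches t
```

The last call returns the empty set, which is the situation the abstract says the optimal classifier can run into and for which the paper proposes two remedies.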

An All-Inclusive Machine Learning and Deep Learning Method for Forecasting Cardiovascular Disease in Bangladeshi Population

by Ghosh, Hritwik , Mandava, Manjula , Vinta, Surendra Reddy in Accuracy , Algorithms , Cardiovascular disease

2024

INTRODUCTION: Cardiovascular disease (CVD) is a major and pressing concern for the healthcare sector globally. According to a survey conducted by the WHO, CVDs cause 17.9 million deaths worldwide every year. The lack of early prediction of CVDs is a significant factor contributing to patient deaths. Predicting CVDs is a challenging task for medical practitioners as it requires a high level of medical analysis skills and extensive knowledge. OBJECTIVES: We believe that improving prediction accuracy can significantly reduce the risk caused by CVDs and help medical practitioners better diagnose patients. METHODS: In this study, we created a CVD prediction model using an ML approach. We utilized various algorithms, including logistic regression, Gaussian Naive Bayes, Bernoulli Naive Bayes, SVM, KNN, optimized KNN, X Gradient Boosting, and random forest, to analyze and predict CVDs. RESULTS: Our prediction model achieved an accuracy of 96.7%, indicating its effectiveness in predicting CVDs. DL algorithms can also assist in identifying, classifying, and quantifying patterns in medical images, improving patient evaluation and diagnosis based on prior medical history and evaluation patterns. CONCLUSION: Furthermore, deep learning algorithms can help in developing new drugs at minimum cost by reducing the number of clinical research trials, using prior prediction of the drug's efficacy.

Journal Article

Leveraging Machine Learning for Fraudulent Social Media Profile Detection

by Ramdas, Soorya , Agnes, Neenu N. T. in Decision Tree Classifier , Dummy Classifier , MLP (MultiLayer Perceptron) Classifier

2024

Fake social media profiles are responsible for various cyber-attacks, spreading fake news, identity theft, business and payment fraud, abuse, and more. This paper aims to explore the potential of Machine Learning in detecting fake social media profiles by employing various Machine Learning algorithms, including the Dummy Classifier, Support Vector Classifier (SVC), Support Vector Classifier (SVC) kernels, Random Forest classifier, Random Forest Regressor, Decision Tree Classifier, Decision Tree Regressor, MultiLayer Perceptron classifier (MLP), MultiLayer Perceptron (MLP) Regressor, Naïve Bayes classifier, and Logistic Regression. For a comprehensive evaluation of the performance and accuracy of different models in detecting fake social media profiles, it is essential to consider confusion matrices, sampling techniques, and various metric calculations. Additionally, incorporating extended computations such as root mean squared error, mean absolute error, mean squared error and cross-validation accuracy can further enhance the overall performance of the models.

Journal Article
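
The evaluation quantities named in the abstract above (confusion matrix, mean absolute error, mean squared error, root mean squared error) can be computed as in the following minimal sketch. The labels are made-up illustrative data with 1 meaning "fake profile"; the helper names `confusion_counts` and `error_metrics` are hypothetical and are not taken from the paper.

```python
import math

def confusion_counts(y_true, y_pred):
    """Binary confusion-matrix counts, treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),
    }

def error_metrics(y_true, y_pred):
    """Mean absolute error, mean squared error and root mean squared error."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}

# Illustrative labels only: 1 = fake profile, 0 = genuine profile.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]

counts = confusion_counts(y_true, y_pred)
errors = error_metrics(y_true, y_pred)
accuracy = (counts["tp"] + counts["tn"]) / len(y_true)
```

For hard 0/1 predictions, MAE and MSE both equal the misclassification rate, so the error measures add information mainly for the regressor variants that output continuous scores.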

Two-Stage Hybrid Data Classifiers Based on SVM and kNN Algorithms

by Demidova, Liliya A. in Algorithms , Classification , Classifiers

2021

The paper considers a solution to the problem of developing two-stage hybrid SVM-kNN classifiers with the aim of increasing the data classification quality by refining the classification decisions near the class boundary defined by the SVM classifier. In the first stage, the SVM classifier with default parameter values is developed. Here, the training dataset is designed on the basis of the initial dataset. When developing the SVM classifier, a binary SVM algorithm or one-class SVM algorithm is used. Based on the results of the training of the SVM classifier, two variants of the training dataset are formed for the development of the kNN classifier: a variant that uses all objects from the original training dataset located inside the strip dividing the classes, and a variant that uses only those objects from the initial training dataset that are located inside the area containing all misclassified objects from the class dividing strip. In the second stage, the kNN classifier is developed using the new training dataset described above. The values of the parameters of the kNN classifier are determined during training to maximize the data classification quality. The data classification quality using the two-stage hybrid SVM-kNN classifier was assessed using various indicators on the test dataset. If the kNN classifier improves the quality of classification near the class boundary defined by the SVM classifier, the two-stage hybrid SVM-kNN classifier is recommended for further use. The experimental results obtained on various datasets confirm the feasibility of using two-stage hybrid SVM-kNN classifiers in the data classification problem.

Journal Article
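
The two-stage scheme described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the first stage is stood in for by an already-trained linear decision function with hypothetical weights `w` and bias `b`, the dividing strip is taken to be the region where the absolute decision value is below 1, and only the first training-set variant (all training objects inside the strip) is shown.

```python
import math
from collections import Counter

def svm_score(x, w, b):
    """Decision value of a linear SVM (w and b are assumed to be pre-trained)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def knn_predict(x, train, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def hybrid_predict(x, w, b, train, k=3):
    """Stage 1: trust the SVM outside the strip. Stage 2: inside the strip,
    defer to a kNN classifier trained only on objects that lie in the strip."""
    score = svm_score(x, w, b)
    strip = [(p, y) for p, y in train if abs(svm_score(p, w, b)) < 1.0]
    if abs(score) >= 1.0 or len(strip) < k:
        return 1 if score >= 0 else -1
    return knn_predict(x, strip, k)

# Hypothetical 2-D data with labels +1 / -1 and a hypothetical boundary x0 = 0.
w, b = (1.0, 0.0), 0.0
train = [((2.0, 0.0), 1), ((-2.0, 0.0), -1), ((0.5, 0.0), 1), ((0.3, 1.0), -1),
         ((0.2, 0.8), -1), ((-0.4, 0.0), -1), ((0.6, 0.5), 1)]

far_point = hybrid_predict((3.0, 3.0), w, b, train)      # decided by the SVM
near_point = hybrid_predict((0.25, 0.9), w, b, train)    # decided by the kNN
```

For the second query the linear score is positive, yet two of its three nearest strip neighbours carry the label -1, so the kNN stage overrides the SVM decision near the boundary.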

Nearest neighbors distance ratio open-set classifier

by Rocha, Anderson , Penatti, Otávio A. B. , Mendes Júnior, Pedro R. in Artificial Intelligence , Benchmarks , Classification

2017

In this paper, we propose a novel multiclass classifier for the open-set recognition scenario. This scenario is the one in which there are no a priori training samples for some classes that might appear during testing. Many applications are inherently open set. Consequently, successful closed-set solutions in the literature are not always suitable for real-world recognition problems. The proposed open-set classifier extends the Nearest-Neighbor (NN) classifier. Nearest neighbors are simple, parameter independent, multiclass, and widely used for closed-set problems. The proposed Open-Set NN (OSNN) method incorporates the ability to recognize samples belonging to classes that are unknown at training time, making it suitable for open-set recognition. In addition, we explore evaluation measures for open-set problems, properly measuring the resilience of methods to unknown classes during testing. For validation, we consider large freely-available benchmarks with different open-set recognition regimes and demonstrate that the proposed OSNN significantly outperforms its counterparts in the literature.

Journal Article
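
The abstract above does not spell out the decision rule, so the following is only one plausible reading of a "nearest neighbors distance ratio" rule, offered as a hedged sketch. It assumes that the distance to the nearest training sample is divided by the distance to the nearest sample of a different class, and that the query is rejected as unknown when this ratio exceeds a threshold. The function name `osnn_predict`, the threshold value 0.8, and the toy data are assumptions.

```python
import math

def osnn_predict(x, train, ratio_threshold=0.8):
    """Return the nearest neighbour's label, or "unknown" when the nearest
    sample of another class is almost as close as the nearest neighbour."""
    nearest_point, nearest_label = min(train, key=lambda item: math.dist(x, item[0]))
    other_classes = [item for item in train if item[1] != nearest_label]
    if not other_classes:
        # Only one known class: no ratio can be formed, fall back to plain NN.
        return nearest_label
    d_nearest = math.dist(x, nearest_point)
    d_other = min(math.dist(x, point) for point, _ in other_classes)
    if d_other == 0:
        # Identical points with conflicting labels: ratio is undefined.
        return "unknown"
    return nearest_label if d_nearest / d_other <= ratio_threshold else "unknown"

# Hypothetical training samples from two known classes.
train = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((10.0, 0.0), "b"), ((10.0, 1.0), "b")]

close_to_a = osnn_predict((1.0, 0.0), train)    # clearly nearer to class "a"
in_between = osnn_predict((5.0, 0.5), train)    # equally far from both classes
close_to_b = osnn_predict((9.0, 1.0), train)    # clearly nearer to class "b"
```

A query that sits midway between the known classes gets a ratio near 1 and is rejected, which is how a nearest-neighbour rule can be made to abstain on samples that fit no training class well.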

Comparative study of machine learning algorithms for Kannada twitter sentimental analysis

by Bhuyyar, Rani , Ijeri, Dakshayani , Burkaposh, Sayed Salman in Algorithms , Comparative studies , Computer Communication Networks

2024

Analyzing clients' reviews from various online platforms helps businesses improve. These users' opinions can be analyzed using sentiment analysis. Sentiment analysis on Indian languages is tedious work because of the wide diversity of languages in India. Kannada is one of the prominent languages in India: 43 million people in India use Kannada as their native language, and it ranks 27th among the top 30 languages of the world. Because very little work has been carried out on Indian languages, especially Kannada, more work is required to process the Kannada language across different domains. Previous work on sentiment analysis for Kannada reported an accuracy of about 72%. In this work, we therefore present a comparative study of various machine learning algorithms for Kannada Twitter sentiment analysis. Experiments on live Twitter data found that the Multinomial Naive Bayes classifier performed best, with an accuracy of 75%.

Journal Article

A hybrid hierarchical framework for classification of breast density using digitized film screen mammograms

by Bhadauria, H. S. , Thakur, Shruti , Kumar, Indrajeet in Breast , Breast cancer , Classification

2017

In the present work, a hybrid hierarchical framework for classification of breast density using digitized film screen mammograms has been proposed. For designing an efficient classification framework, 480 MLO view digitized screen film mammographic images are taken from the DDSM dataset. ROIs of fixed size, i.e. 128 × 128 pixels, are cropped from the center area of the breast (i.e. the area where glandular ducts are prominent). A total of 292 texture features based on statistical methods, signal processing based methods and transform domain based methods are computed for each ROI. The computed feature vector is subjected to PCA for dimensionality reduction. The reduced feature space is fed to the classification module. In this work, 4-class breast density classification has been conducted using a hierarchical framework where the first classifier is used to classify an unknown test ROI into the B-I / other class. If the test ROI is predicted as other class, it is input to the second classifier for classification into the B-II / dense class. If the test ROI is predicted as belonging to the dense class, it is input to the third classifier for classification into the B-III / B-IV class. In this work, five hierarchical classifier designs consisting of 3 PCA-kNN, 3 PCA-PNN, 3 PCA-ANN, 3 PCA-NFC and 3 PCA-SVM classifiers have been proposed. The maximum OCA value obtained is 80.4% using PCA-NFC in the hierarchical approach. Further, the best performing individual classifiers are combined in a hierarchical framework to design a hybrid hierarchical framework for classification of breast density using digitized screen film mammograms. The proposed hybrid hierarchical framework yields an OCA value of 84.1%. The result achieved by the proposed hybrid hierarchical framework is quite promising, and the framework can be used in a clinical environment for differentiation between different breast density patterns.

Journal Article
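
The three-level cascade described in the abstract above (B-I versus other, then B-II versus dense, then B-III versus B-IV) can be sketched as a chain of binary decisions. The sketch below is illustrative only: the three classifiers are hypothetical stand-ins that threshold a single made-up "density score", whereas the paper uses PCA-reduced texture features with kNN, PNN, ANN, NFC or SVM classifiers at each level.

```python
def classify_density(features, is_b1, is_b2, is_b3):
    """Route one ROI through up to three binary classifiers.

    is_b1, is_b2 and is_b3 are callables returning True when the ROI belongs
    to B-I, B-II and B-III respectively (hypothetical stand-ins)."""
    if is_b1(features):          # level 1: B-I vs other
        return "B-I"
    if is_b2(features):          # level 2: B-II vs dense
        return "B-II"
    # level 3: B-III vs B-IV
    return "B-III" if is_b3(features) else "B-IV"

# Hypothetical classifiers that threshold a single scalar density score.
def is_b1(score):
    return score < 0.25

def is_b2(score):
    return score < 0.50

def is_b3(score):
    return score < 0.75

labels = [classify_density(s, is_b1, is_b2, is_b3) for s in (0.1, 0.4, 0.6, 0.9)]
```

Because each level is an independent callable, a hybrid design in the sense of the abstract amounts to passing a different best-performing classifier at each level.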

On the analytical study of the service quality of Indian Railways under soft-computing paradigm

by Karpenko, Mykola , Singh, Aarti , Singh, Anupama in Classifiers , Decision support systems , Machine learning

2024

Indian Railway Catering and Tourism Corporation (IRCTC) is among the busiest railway reservation systems, since Indian Railways (IR) is a vital and economical mode of transportation in India. Hence, rating of the trains seems to be a critical aspect of selecting an appropriate train for travelling. In this study, we have considered 7 vital attributes of 500 popular trains and rated their performance based on 7 important related attributes. For this purpose, we have employed 2 different approaches to analyse the train attributes, which eventually contribute to the overall performance of the trains. Here, we have developed a rule based rough set decision support system to analyse the criticality of the train attributes while rating the train performance. Furthermore, we have also used 3 Machine Learning (ML) model estimators: Extra Trees Classifier (ETC), Support Vector Machine Classifier (SVMC) and Multinomial Naive Bayes Classifier (MNBC), and performed their comparative analysis with respect to 7 performance metrics while predicting the overall train rating based.

Journal Article

Dementia classification using MR imaging and clinical data with voting based machine learning models

by Bharati, Subrato , Thanh, Dang Ngoc Hoang , Podder, Prajoy in Accuracy , Classification , Classifiers

2022

Dementia is one of the leading causes of severe cognitive decline; it induces memory loss and impairs the daily life of millions of people worldwide. In this work, we consider the classification of dementia using magnetic resonance (MR) imaging and clinical data with machine learning models. We adapt univariate feature selection in the MR data pre-processing step as a filter-based feature selection. Bagged decision trees are also implemented to estimate the important features for achieving good classification accuracy. Several ensemble learning-based machine learning approaches, namely gradient boosting (GB), extreme gradient boost (XGB), voting-based, and random forest (RF) classifiers, are considered for the diagnosis of dementia. Moreover, we propose voting-based classifiers that train on an ensemble of numerous basic machine learning models, such as the extra trees classifier, RF, GB, and XGB. The implementation of a voting-based approach is one of the important contributions, and the performance of the different classifiers is evaluated in terms of precision, accuracy, recall, and F1 score. Moreover, the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) are used as metrics for comparing these classifiers. Experimental results show that the voting-based classifiers often perform better than RF, GB, and XGB in terms of precision, recall, and accuracy, thereby indicating the promise of differentiating dementia from imaging and clinical data.

Journal Article
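
The voting-based approach mentioned in the abstract above can be illustrated with a minimal hard-voting sketch. The per-model predictions are made-up numbers standing in for the outputs of already-trained RF, GB and XGB models; the function names `majority_vote` and `vote_all` are hypothetical, and the paper may use a different voting scheme (for example, soft voting over probabilities).

```python
from collections import Counter

def majority_vote(predictions):
    """Label predicted by the largest number of base models for one sample."""
    return Counter(predictions).most_common(1)[0][0]

def vote_all(per_model_predictions):
    """Combine several models' prediction lists into one list of voted labels."""
    # zip(*...) regroups the predictions so that each tuple holds one sample.
    return [majority_vote(sample) for sample in zip(*per_model_predictions)]

# Hypothetical predictions (1 = dementia, 0 = no dementia) from three models.
rf_pred = [1, 0, 1, 1]
gb_pred = [1, 1, 0, 1]
xgb_pred = [0, 0, 1, 1]

voted = vote_all([rf_pred, gb_pred, xgb_pred])
```

With an odd number of base models and two classes, a tie cannot occur; with other configurations a tie-breaking rule would have to be chosen explicitly.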

Decision-making in clinical diagnostic for brain tumor detection based on advanced machine learning algorithm

by Jiang, Ensong , Huang, Tangsen , Yin, Xiangdong in ada boost classifier , brain tumor classification , gaussian process classifier

2025

Brain tumors, abnormal growths in the brain or spinal canal, can be benign or malignant, causing symptoms like headaches, seizures, and cognitive decline by disrupting brain function. Therefore, developing reliable predictive models for diagnosis and prognosis is crucial. In this paper, brain tumors are predicted using machine learning models enhanced by an optimizer, namely Escaping Bird Search Optimization. The optimized models build on the Ada Boost Classifier, Gaussian Process Classifier, and Support Vector Classifier (SVC) and are named ADEB, GPEB, and SVEB, respectively; their predictive power was assessed on a few databases. The best single model across all databases is the SVC, with an average accuracy of 0.981, while among the enhanced models, SVEB, the optimized SVC, attained the highest accuracy of all models, reaching 0.990. These findings underscore the role of optimization techniques and demonstrate the effectiveness of machine learning in predicting brain cancers. The improved performance of the enhanced SVC model, SVEB, suggests it could offer a reliable approach for accurate brain tumor prediction. This could lead to earlier diagnosis and better patient outcomes in the field of neuro-oncology.

Journal Article
