Catalogue Search | MBRL

Leveraging Machine Learning for Fraudulent Social Media Profile Detection

by Ramdas, Soorya , Agnes, Neenu N. T. in Decision Tree Classifier , Dummy Classifier , MLP (MultiLayer Perceptron) Classifier

2024

Fake social media profiles are responsible for various cyber-attacks, spreading fake news, identity theft, business and payment fraud, abuse, and more. This paper aims to explore the potential of Machine Learning in detecting fake social media profiles by employing various Machine Learning algorithms, including the Dummy Classifier, Support Vector Classifier (SVC), Support Vector Classifier (SVC) kernels, Random Forest classifier, Random Forest Regressor, Decision Tree Classifier, Decision Tree Regressor, MultiLayer Perceptron classifier (MLP), MultiLayer Perceptron (MLP) Regressor, Naïve Bayes classifier, and Logistic Regression. For a comprehensive evaluation of the performance and accuracy of different models in detecting fake social media profiles, it is essential to consider confusion matrices, sampling techniques, and various metric calculations. Additionally, incorporating extended computations such as root mean squared error, mean absolute error, mean squared error and cross-validation accuracy can further enhance the overall performance of the models.
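As a rough illustration of the evaluation quantities the abstract names (confusion matrices, accuracy-style metrics, and the mean squared, root mean squared, and mean absolute errors), the sketch below computes them in plain Python. The function names and the toy labels are hypothetical and are not taken from the paper; 1 is assumed to mean a fake profile and 0 a genuine one.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, fp, tn, fn) for a binary task."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def regression_errors(y_true, y_pred):
    """MSE, RMSE and MAE between targets and predictions."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}


# Hypothetical labels for illustration only (1 = fake, 0 = genuine).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(classification_metrics(*counts))
print(regression_errors(y_true, y_pred))
```

These quantities describe how a fitted model behaves on held-out data; they are measurements and do not by themselves change the model.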

Journal Article

Potential of Snapshot-Type Hyperspectral Imagery Using Support Vector Classifier for the Classification of Tomatoes Maturity

by Hong, Young-Ki , Lee, Ki-Beom , Kim, Kyoung-Chul in Agriculture , Automation , Cameras

2022

Automation is needed in tomato hydroponic greenhouses because of the aging of farmers, the shrinking share of agricultural workers in the population, COVID-19, and other factors. Agricultural robots are an attractive route to automation in a hydroponic greenhouse, but developing them requires crop monitoring techniques. In this study, therefore, we aimed to develop a maturity classification model for tomatoes using a support vector classifier (SVC) and snapshot-type hyperspectral imaging (VIS: 460–600 nm (16 bands) and Red-NIR: 600–860 nm (15 bands)). Spectral data were obtained from the surfaces of 258 tomatoes harvested in January and February 2022. Spectral data related to the maturity stages of tomatoes were selected by correlation analysis. Four spectral datasets were prepared: VIS data (16 bands), Red-NIR data (15 bands), combined VIS and Red-NIR data (31 bands), and selected spectral data (6 bands). An SVC was trained on each dataset, and we evaluated the performance of the trained classification models. The SVC based on VIS data achieved a classification accuracy of 79% and an F1-score of 88% when classifying tomato maturity into six stages (Green, Breaker, Turning, Pink, Light-red, and Red). The developed model was also tested in a hydroponic greenhouse, where it classified the maturity stages with a classification accuracy of 75% and an F1-score of 86%.
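The band-selection step mentioned in the abstract (choosing spectral bands by correlation analysis) could look roughly like the sketch below. It assumes Pearson correlation and an ordinal encoding of the maturity stages (0 = Green through 5 = Red); the abstract specifies neither, and the function names and toy reflectance values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_bands(spectra, stages, k):
    """Return the indices of the k bands with the largest |r| against the stage index.

    spectra: list of samples, each a list of per-band values.
    stages:  ordinal maturity stage per sample (an assumed encoding).
    """
    n_bands = len(spectra[0])
    scores = []
    for band in range(n_bands):
        column = [sample[band] for sample in spectra]
        scores.append((abs(pearson(column, stages)), band))
    # Strongest correlation first; ties broken by band index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(band for _, band in scores[:k])


# Toy data for illustration only: 5 samples x 3 bands.
spectra = [
    [0.1, 0.5, 0.9],
    [0.2, 0.5, 0.7],
    [0.3, 0.5, 0.5],
    [0.4, 0.5, 0.3],
    [0.5, 0.5, 0.1],
]
stages = [0, 1, 2, 3, 4]

print(select_bands(spectra, stages, k=2))
```

The selected columns would then be passed to the classifier in place of the full set of bands.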

Journal Article

Startup Success Prediction with PCA-Enhanced Machine Learning Models

by Choi, Youngkeun in ENGINEERING, MULTIDISCIPLINARY , Investment decision-making , Machine learning

2024

This study evaluates the effectiveness of various machine learning algorithms in predicting startup success and explores the performance improvement achieved by applying Principal Component Analysis (PCA) to the models. By analyzing logistic regression, support vector classifier (SVC), XGBoost, and other supervised learning algorithms, the study demonstrates that PCA enhances the generalization performance of most models. Notably, Support Vector Classifier (SVC) showed an accuracy of 0.78, precision of 0.83, recall of 0.73, and F1 score of 0.74 without PCA, but performance significantly improved with PCA, recording an accuracy of 0.90, precision of 0.90, recall of 0.89, and F1 score of 0.89. Academically, this research contributes to the literature by examining how dimension reduction can boost the accuracy of machine learning models for startup success prediction, providing a valuable intersection of machine learning and venture capital studies. Practically, it offers investors AI-driven decision-making tools to enhance the precision of investment evaluations and better identify startups with high growth potential. Despite its contributions, this study is limited by the specific dataset used, suggesting that future research could explore various datasets and alternative dimension reduction techniques. Future studies could also assess real-time data application and incorporate deep learning models to improve predictive performance in startup success evaluation.

Journal Article

Application of machine learning models in predicting insomnia severity: an integrative approach with constitution of traditional Chinese medicine

by Li, Jing , Li, Shenguang , Zhu, Po in Algorithms , constitution of traditional Chinese medicine , Constitutions

2023

Objective: This study sought to explore the utility of machine learning models in predicting insomnia severity based on Traditional Chinese Medicine (TCM) constitution classifications, with an aim to discuss the potential applications of such models in the treatment and prevention of insomnia. Methods: We analyzed a dataset of 165 insomnia patients from the Shanghai Minhang District Integrated Traditional Chinese and Western Medicine Hospital. TCM constitution was assessed using a standardized Constitution in Chinese Medicine (CCM) scale. Sleep quality, or insomnia severity, was evaluated using the Spiegel Sleep Questionnaire (SSQ). Machine learning models, including Random Forest Classifier (RFC), Support Vector Classifier (SVC), and K-Nearest Neighbors (KNN), were utilized. These models were optimized using the Grid Search algorithm and were trained and tested on stratified patient data, with the TCM constitution classifications serving as primary predictors. Results: The RFC outperformed the others, achieving a weighted average accuracy, precision, recall, and F1-score of 0.91, 0.94, 0.92, and 0.92 respectively. It also effectively classified the severity of insomnia with high area under receiver operating characteristic curve (AUC-ROC) values. Feature importance analysis demonstrated the Damp-heat constitution as the most influential predictor, followed by Yang-deficiency, Qi-depression, Qi-deficiency, and Blood-stasis constitutions. Conclusion: The results demonstrate the potent utility of machine learning, specifically RFC, coupled with TCM constitution classifications in predicting insomnia severity. Notably, constitution classifications such as Damp-heat and Yang-deficiency emerged as crucial determinants, emphasizing their potential in guiding targeted insomnia treatments. This approach enables the development of more personalized and efficient interventions, thereby enhancing patient outcomes.

Journal Article

SVM-Based Blood Exam Classification for Predicting Defining Factors in Metabolic Syndrome Diagnosis

by Sotiropoulos, Dionisios N. , Tsihrintzis, George A. , Panagoulias, Dimitrios P. in Artificial intelligence , Automation , Biomarkers

2022

Biomarkers have already been proposed as powerful classification features for use in the training of neural network-based and other machine learning and artificial intelligence-based prognostic models in the scientific field of personalized nutrition. In this paper, we construct and study cascaded SVM-based classifiers for automated metabolic syndrome diagnosis. Specifically, using blood exams, we achieve an average accuracy of about 84% in correctly classifying body mass index. Similarly, cascaded SVM-based classifiers achieve a 74% accuracy in correctly classifying systolic blood pressure. Next, we propose and implement a system that achieves an 84% accuracy in metabolic syndrome prediction. The proposed system relies not only on prediction of the body mass index but also on prediction from blood exams of total cholesterol, triglycerides and glucose. For the aim of self-completeness of the paper, the key concepts with regard to metabolic syndrome are summarized, and a review of previous related work is included. Finally, conclusions are drawn and indications for related future research are outlined.

Journal Article

Decoding emotions and unveiling stress: a non-invasive approach through sequential feature extraction and multiclass classifiers

by Bhoi, Akash Kumar , Panigrahi, Ranjit , K.S, Hareesha in Accuracy , Algorithms , Biological and Medical Physics

2024

Purpose: Stress is widespread in the modern world. It is a complex fusion of psychological and physiological tension that leads to various health issues, such as heart disease, high blood pressure, and widespread anxiety. Although monitoring emotions, especially stress, is critically challenging, advancements in machine learning have paved the way for unraveling the complexities of human emotions and detecting early signs of stress. Methods: In this exploratory study, we introduce an innovative framework built on a Sequential Feature Extractor (SFE), which is combined with k-Nearest Neighbor (KNN), linear Support Vector Classifier (SVC), Support Vector Machine (SVM), and Logistic Regression (LR). The model identifies seven crucial features in this context through refined preprocessing methods. Results: The SFE + KNN model stands out by leveraging its attributes, displaying remarkable precision and an F1-Score of 88.00% when detecting stress. Furthermore, concerning individual emotions, this model excels in various ways. The SFE + SVM methodology accurately identifies Transient emotions at a rate of 94.00% and flags Baseline emotions with a perfect score of 100.00%. Amusement is identified with 79.00% accuracy using SFE + LR. Meanwhile, the SFE + SVC approach recognizes Stress at 84.00% and Meditation at 92.00%. These results underscore the model’s capability to untangle the complex tapestry of human sentiments and stress responses successfully. Conclusions: The study utilizes the publicly available WESAD Dataset and achieves impressive accuracy levels in detecting stress and various emotions. The approach taken in this study contributes to understanding human emotional experiences and coping mechanisms, leading to improved resilience and emotional intelligence.

Journal Article

Soil Moisture Investigation Utilizing Machine Learning Approach Based Experimental Data and Landsat5-TM Images: A Case Study in the Mega City Beijing

by Qian, Xu , Qu, Yue , Tan, Jinqiang in Artificial intelligence , Case studies , China

2018

The characteristics of soil moisture content (SMC) distribution in an area must be analyzed for the design and construction of sponge cities. Combining remote sensing data with experimental data, this paper establishes a machine learning model to reveal the characteristics of SMC. Taking Beijing as an example, the SMC distribution was obtained and its characteristics were analyzed after training and validation. A comparison of different machine learning methods shows that the support vector classifier (SVC) method trained with remote sensing and grayscale data achieves the highest accuracy (76.69%). The calculation results show that the districts with the highest and lowest SMC values in Beijing are Xicheng District (19.94%) and Daxing District (11.04%), respectively. The mean SMC value of Beijing is 15.65%. The SMC distribution in Beijing shows that the soil in the west and north is relatively wet, while the soil in the east and south is relatively dry. Therefore, it is suggested that timely monitoring of the SMC of vegetation-covered areas in the north and west should be carried out. Water conservation facilities also need to be established as urban construction develops in the south and east areas.

Journal Article

A Trade-off between ML and DL Techniques in Natural Language Processing

by Tank, Parth , Katre, Neha , Singh, Bhavesh in Algorithms , Classification , Classifiers

2021

The domain of Natural Language Processing covers various tasks, such as classification, text generation, and language modeling. The data processed using word embeddings, or vectorizers, is then used to train Machine Learning and Deep Learning algorithms. In order to observe the tradeoff between these two types of algorithms with respect to the data available, the accuracy obtained, and other factors, a binary classification is undertaken to distinguish between insincere and regular questions on Quora. A dataset called Quora Insincere Questions Classification was used to train various machine learning and deep learning models. A Bidirectional Long Short-Term Memory (LSTM) network was trained, with the text processed using Global Vectors for Word Representation (GloVe). Machine Learning algorithms such as Extreme Gradient Boosting classifier, Gaussian Naive Bayes, and Support Vector Classifier (SVC) were trained on text processed with the TF-IDF vectorizer. This paper also presents an evaluation of the above algorithms on the basis of precision, recall, and F1 score metrics.

Journal Article

Analysis of sentiments on the onset of Covid-19 using Machine Learning Techniques

by González-Briones, Alfonso , Arya, Vishakha , Mishra, Amit Kumar in Classification , Classifiers , Coronaviruses

2022

The novel coronavirus (Covid-19) pandemic has struck the whole world and is one of the most striking topics on social media platforms. The outpouring of sentiment on social media carries varied thoughts, opinions, and emotions about the Covid-19 disease, as users express what they are currently feeling. Analyzing sentiments helps to yield better results. Data were gathered from different blogging sites such as Facebook, Twitter, Weibo, YouTube, and Instagram, of which Twitter is the largest repository. Videos, text, and audio were also collected from repositories. Sentiment analysis uses opinion mining to acquire the sentiments of its users and categorizes them as positive, negative, or neutral. Analytical and machine learning classification is applied to 3586 tweets collected in different time frames. In this paper, sentiment analysis was performed on tweets accumulated during the Covid-19 (coronavirus disease) pandemic. Tweets are collected from the Twitter database using Hydrator, a web-based application. Data preprocessing removes noise and outliers from the raw data. The Natural Language Toolkit (NLTK) is used for text classification for sentiment analysis and to calculate the subjective polarity score, counts, and sentiment distribution. N-grams, continuous sequences of words in a text or document, are used in text mining and Natural Language Processing; uni-grams, bi-grams, and tri-grams are applied for statistical computation. Term frequency-inverse document frequency (TF-IDF) is a feature extraction technique that converts textual data into numeric form. The vectorized data are fed to our model to obtain insights from linguistic data. Linear SVC, MultinomialNB, GBM, and Random Forest classifiers are applied with the TF-IDF features in our proposed model. Linear Support Vector classification performs better than the other two classifiers. Results depict that RF performs better.
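The n-gram and TF-IDF steps described in the abstract can be sketched in plain Python as follows. This uses the textbook weighting tf × log(N / df); libraries typically apply smoothed or normalized variants, and the abstract does not say which was used. The function names and the toy tokenized tweets are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-word sequences of a token list, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n_values=(1, 2, 3)):
    """Map each tokenized document to a {term: weight} dict.

    Terms are uni-, bi- and tri-grams by default. The weight is
    (term count / terms in document) * log(number of docs / docs containing term).
    """
    term_lists = [[g for n in n_values for g in ngrams(doc, n)] for doc in docs]

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for terms in term_lists:
        doc_freq.update(set(terms))

    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Toy tokenized tweets for illustration only.
docs = [
    ["stay", "home", "stay", "safe"],
    ["vaccine", "news", "today"],
    ["stay", "safe", "today"],
]

for vector in tfidf(docs):
    # Show the three highest-weighted terms of each document.
    print(sorted(vector.items(), key=lambda kv: -kv[1])[:3])
```

The resulting weight dictionaries play the role of the numeric feature vectors that a classifier such as a linear SVC would be trained on.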

Journal Article

Application of Machine Learning Algorithms for Detection of Vulnerability in Web Applications

by Balaraju, Manjuprasad , Mathalli Narasimha, Vijayalakshmi , Swamy, S. Narasimha in Advances in Computational Intelligence for Artificial Intelligence , Algorithms , Applications programs

2023

The Internet is a world-class network that connects systems and electronic devices. As per the report, 4.66 billion people in the world use the internet for one purpose or another. The internet also provides a wide range of web applications, which provide vast benefits to society and the users. Nowadays, cyberattacks like denial of service (DoS), SQL injections, brute force, and phishing attacks on websites, web applications, and web of things are more common. During the development phase, these security issues need to be addressed efficiently. These internet-based applications store very critical, valuable, and important information related to user credentials, financial, biometric, payment information, etc. The adversary tries to find vulnerabilities and exploit them to capture the information related to users and devices. The adversary can also damage the applications and stop them from working. This paper illustrates and analyses the different types of vulnerabilities in detail. Also, this work provides possible solutions to the various attacks. The data for the analysis are collected through the NESSUS tool. The analysis is carried out using Random Forest Classifier, Multinomial Naïve Bayes, Linear SVC, and Logistic Regression. In this work, Linear SVC has 91% accuracy in identifying the type of vulnerability. The algorithm also shows an accuracy of 98% in giving the solutions for the type of attack.

Journal Article
