6 result(s) for "Automated model retraining"
Hybrid MLOps framework for automated lifecycle management of adaptive phishing detection models
Phishing detection models degrade quickly due to drift, adversarial evasion, and fairness issues. Existing MLOps platforms mainly automate deployment and monitoring. Prior works have examined SHAP-based monitoring, retraining, or fairness audits separately, but lack an integrated theory of resilience for adversarial environments. We introduce the Hybrid MLOps Framework (HAMF), a system designed to embed resilience and ethical governance into the lifecycle of phishing detection models. HAMF is ‘hybrid’ because it unifies proactive and reactive adaptation, combines automation with stakeholder oversight, and embeds resilience within ethical governance. HAMF treats resilience as an integrated lifecycle property, designed to simultaneously preserve model accuracy, fairness, and stakeholder trust amidst concept drift. Methodologically, HAMF implements this through a hybrid control cycle that fuses four key capabilities: SHAP-guided feature replacement, event-driven retraining, fairness-triggered audits, and structured human feedback. Unlike conventional pipelines, where these functions are isolated, HAMF treats them as interdependent, first-class triggers. Empirical evaluations on large-scale phishing streams demonstrate HAMF’s superior performance: the framework detects drift within 18 seconds, restores F1 scores above 0.99 post-attack, reduces subgroup disparities by over 60%, and scales to over 2,300 requests per second with sub-50 ms latency. These results validate HAMF’s design, demonstrating that embedding resilience and ethical alignment into the MLOps lifecycle is both effective and scalable.
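The event-driven retraining described in this abstract can be illustrated with a minimal sketch: monitor a sliding window of prediction outcomes and fire a retraining event when the error rate drifts above a baseline. All names and thresholds below are hypothetical illustrations, not HAMF's actual interface; HAMF's real control cycle also folds in SHAP-guided and fairness-triggered signals.

```python
from collections import deque

class DriftTrigger:
    """Minimal event-driven retraining trigger: fires when the recent
    error rate exceeds a baseline by a fixed margin. A stand-in for one
    of HAMF's lifecycle triggers (names and thresholds are illustrative)."""

    def __init__(self, window=100, baseline_error=0.02, margin=0.05):
        self.window = deque(maxlen=window)  # most recent outcomes only
        self.baseline_error = baseline_error
        self.margin = margin

    def observe(self, prediction_correct: bool) -> bool:
        """Record one prediction outcome; return True when retraining
        should be triggered."""
        self.window.append(0 if prediction_correct else 1)
        if len(self.window) < self.window.maxlen:
            return False  # not enough evidence yet
        error_rate = sum(self.window) / len(self.window)
        return error_rate > self.baseline_error + self.margin
```

In a production pipeline, a `True` return would enqueue a retraining job rather than retrain inline, keeping detection latency low.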
Using Automated Machine Learning to Predict the Mortality of Patients With COVID-19: Prediction Model Development Study
During a pandemic, it is important for clinicians to stratify patients and decide who receives limited medical resources. Machine learning models have been proposed to accurately predict COVID-19 disease severity. Previous studies have typically tested only one machine learning algorithm and limited performance evaluation to area under the curve analysis. To obtain the best results possible, it may be important to test different machine learning algorithms to find the best prediction model. In this study, we aimed to use automated machine learning (autoML) to train various machine learning algorithms. We selected the model that best predicted patients' chances of surviving a SARS-CoV-2 infection. In addition, we identified which variables (ie, vital signs, biomarkers, comorbidities, etc) were the most influential in generating an accurate model. Data were retrospectively collected from all patients who tested positive for COVID-19 at our institution between March 1 and July 3, 2020. We collected 48 variables from each patient within 36 hours before or after the index time (ie, real-time polymerase chain reaction positivity). Patients were followed for 30 days or until death. Patients' data were used to build 20 machine learning models with various algorithms via autoML. The performance of machine learning models was measured by analyzing the area under the precision-recall curve (AUPRC). Subsequently, we established model interpretability via Shapley additive explanations and partial dependence plots to identify and rank variables that drove model predictions. Afterward, we conducted dimensionality reduction to extract the 10 most influential variables. AutoML models were retrained by only using these 10 variables, and the output models were evaluated against the model that used 48 variables. Data from 4313 patients were used to develop the models. The best model that was generated by using autoML and 48 variables was the stacked ensemble model (AUPRC=0.807).
The two best independent models were the gradient boost machine and extreme gradient boost models, which had an AUPRC of 0.803 and 0.793, respectively. The deep learning model (AUPRC=0.73) was substantially inferior to the other models. The 10 most influential variables for generating high-performing models were systolic and diastolic blood pressure, age, pulse oximetry level, blood urea nitrogen level, lactate dehydrogenase level, D-dimer level, troponin level, respiratory rate, and Charlson comorbidity score. After the autoML models were retrained with these 10 variables, the stacked ensemble model still had the best performance (AUPRC=0.791). We used autoML to develop high-performing models that predicted the survival of patients with COVID-19. In addition, we identified important variables that correlated with mortality. This is proof of concept that autoML is an efficient, effective, and informative method for generating machine learning-based clinical decision support tools.
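The AUPRC metric used throughout this study can be computed as average precision, i.e. the precision at each recall step summed over the positive examples. A minimal pure-Python sketch (the study itself used an autoML platform's built-in evaluation, not this code):

```python
def average_precision(y_true, scores):
    """Area under the precision-recall curve via step-wise average
    precision: rank examples by score, accumulate precision at each
    true positive, and normalize by the number of positives."""
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])
    tp = fp = 0
    total_pos = sum(y_true)
    ap = 0.0
    for _score, label in pairs:
        if label:
            tp += 1
            ap += tp / (tp + fp)  # precision at this recall step
        else:
            fp += 1
    return ap / total_pos
```

Unlike ROC AUC, this metric stays informative under heavy class imbalance (here, the minority class is 30-day mortality), which is presumably why the authors chose it.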
End-to-end data-driven modeling framework for automated and trustworthy short-term building energy load forecasting
Conventional automated machine learning (AutoML) technologies fall short in preprocessing low-quality raw data and adapting to varying indoor and outdoor environments, reducing the accuracy of short-term building energy load forecasts. Moreover, their predictions are not transparent because of their black-box nature. Hence, the building field currently lacks an AutoML framework capable of data quality enhancement, environment self-adaptation, and model interpretation. To address this research gap, an improved AutoML-based end-to-end data-driven modeling framework is proposed. The framework applies Bayesian optimization to find an optimal data preprocessing process that improves the quality of raw data, bridging a gap left by conventional AutoML technologies, which cannot automatically handle missing data and outliers. A sliding window-based model retraining strategy is utilized to achieve environment self-adaptation, contributing to the accuracy enhancement of AutoML technologies. Moreover, an approach based on local interpretable model-agnostic explanations (LIME) is developed to interpret predictions made by the improved framework, overcoming the poor interpretability of conventional AutoML technologies. The performance of the improved framework in forecasting one-hour-ahead cooling loads is evaluated using two years of operational data from a real building. The accuracy of the improved framework increases by 4.24%–8.79% compared with four conventional frameworks, for buildings with not only high-quality but also low-quality operational data. Furthermore, it is demonstrated that the developed model interpretation approach can effectively explain the predictions of the improved framework. The improved framework offers a novel perspective on creating accurate and reliable AutoML frameworks tailored to building energy load prediction and similar tasks.
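The sliding window-based retraining strategy can be sketched as follows: the model is refit on only the most recent observations, so old operating conditions fall out of the training set as the environment changes. This toy uses a moving average in place of the paper's actual data-driven model; the class name and window size are illustrative assumptions.

```python
from collections import deque
from statistics import mean

class SlidingWindowForecaster:
    """Toy sliding-window retraining loop for load forecasting: the
    'model' (a moving average, standing in for a real learner) is refit
    on only the most recent `window` observations, so the forecaster
    adapts as indoor/outdoor conditions drift."""

    def __init__(self, window=24):
        self.history = deque(maxlen=window)  # stale samples drop out

    def update(self, load):
        """Ingest a new hourly load reading; implicitly 'retrains'."""
        self.history.append(load)

    def forecast_next_hour(self):
        if not self.history:
            raise ValueError("no data observed yet")
        return mean(self.history)
```

With a real learner, `update` would trigger a refit on the window contents instead of relying on the deque alone; the adaptation principle is the same.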
ChatGPT as a Stable and Fair Tool for Automated Essay Scoring
The evaluation of open-ended questions is typically performed by human instructors using predefined criteria to uphold academic standards. However, manual grading presents challenges, including high costs, rater fatigue, and potential bias, prompting interest in automated essay scoring systems. While automated essay scoring tools can assess content, coherence, and grammar, discrepancies between human and automated scoring have raised concerns about their reliability as standalone evaluators. Large language models like ChatGPT offer new possibilities, but their consistency and fairness in feedback remain underexplored. This study investigates whether ChatGPT can provide stable and fair essay scoring—specifically, whether identical student responses receive consistent evaluations across multiple AI interactions using the same criteria. The study was conducted in two marketing courses at an engineering school in Chile, involving 40 students. Results showed that ChatGPT, when unprompted or using minimal guidance, produced volatile grades and shifting criteria. Incorporating the instructor’s rubric reduced this variability but did not eliminate it. Only after providing an example-rich rubric, a standardized output format, low temperature settings, and a normalization process based on decision tables did ChatGPT-4o demonstrate consistent and fair grading. Based on these findings, we developed a scalable algorithm that automatically generates effective grading rubrics and decision tables with minimal human input. The added value of this work lies in the development of a scalable algorithm capable of automatically generating normalized rubrics and decision tables for new questions, thereby extending the accessibility and reliability of automated assessment.
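The normalization step based on decision tables can be illustrated with a small sketch: per-criterion rubric scores are totalled and mapped through fixed bands to a final grade, removing residual run-to-run variability in the raw LLM output. The band boundaries, criterion names, and the 1–7 grade scale below are hypothetical, not the paper's actual tables.

```python
# Hypothetical decision table: (minimum total score, final grade),
# checked from the top band down. Grades use a 1-7 scale as an
# illustrative assumption.
DECISION_TABLE = [
    (9, 7.0),
    (6, 5.0),
    (3, 4.0),
    (0, 1.0),
]

def normalize_grade(rubric_scores, table=DECISION_TABLE):
    """Map raw per-criterion rubric scores (e.g. content, coherence,
    grammar) to a final grade via a decision table, so identical
    responses always receive the same grade."""
    total = sum(rubric_scores.values())
    for min_total, grade in table:
        if total >= min_total:
            return grade
    raise ValueError(f"decision table does not cover total {total!r}")
```

Because the table is deterministic, two AI interactions that emit the same per-criterion scores necessarily produce the same final grade, which is the stability property the study was after.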
Blind method for discovering number of clusters in multidimensional datasets by regression on linkage hierarchies generated from random data
Determining the intrinsic number of clusters in a multidimensional dataset is a commonly encountered problem in exploratory data analysis. Unsupervised clustering algorithms often rely on specification of the cluster number as an input parameter, but it is typically not known a priori. Many methods have been proposed to estimate cluster number, including statistical and information-theoretic approaches such as the gap statistic, but these methods are not always reliable when applied to non-normally distributed datasets containing outliers or noise. In this study, I propose a novel method called hierarchical linkage regression, which uses regression to estimate the intrinsic number of clusters in a multidimensional dataset. The method operates on the hypothesis that the organization of data into clusters can be inferred from the hierarchy generated by partitioning the dataset, and therefore does not depend directly on the specific values of the data or their distribution, but on their relative ranking within the partitioned set. Moreover, the technique does not require empirical data to train on: it can use synthetic data generated from random distributions to fit the regression coefficients. The trained hierarchical linkage regression model is able to infer cluster number in test datasets of varying complexity and differing distributions, for image, text and numeric data, using the same regression model without retraining. The method performs favourably against other cluster number estimation techniques, and is robust to parameter changes, as demonstrated by sensitivity analysis. The apparent robustness and generalizability of hierarchical linkage regression make it a promising tool for unsupervised exploratory data analysis and discovery.
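The linkage hierarchy the method regresses on can be sketched for the 1-D single-linkage case: agglomerate points, record the height of each merge, and note that cluster structure shows up in the ranking of those heights (a large final merge suggests well-separated clusters). This is a simplified illustration, not the paper's multidimensional pipeline or its regression stage.

```python
def single_linkage_heights(points):
    """Agglomerate 1-D points by single linkage and return the sequence
    of merge heights. Hierarchical linkage regression uses the relative
    ranking of such heights, not raw data values, as regression inputs."""
    clusters = [[p] for p in sorted(points)]
    heights = []
    while len(clusters) > 1:
        # In sorted 1-D data the single-linkage distance between
        # adjacent clusters is the gap between their nearest members.
        best_i, best_d = 0, float("inf")
        for i in range(len(clusters) - 1):
            d = clusters[i + 1][0] - clusters[i][-1]
            if d < best_d:
                best_i, best_d = i, d
        heights.append(best_d)
        clusters[best_i] = clusters[best_i] + clusters.pop(best_i + 1)
    return heights
```

For two well-separated groups, the last height dwarfs the earlier ones; regressing on such rank patterns is what lets the method generalize across data types without retraining.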
An adaptable automated visual inspection scheme through online learning
In the manufacturing industry, there is a growing need for an adaptable automated visual inspection (AVI) system that can perform different inspection tasks without excessive retuning or retraining effort. This paper presents an automated visual inspection scheme that improves the adaptability of an AVI system. We propose an adaptable inspection model composed of two sub-models: one for localizing the region of useful features and the other for defect classification. The localization sub-model contains invariant features common to all inspection samples; through an edge-based geometric template-matching process, it locates a verification region containing the subject of inspection, such as a clip or a screw in an assembly piece. Through principal component analysis (PCA), the verification sub-model is constructed from the reconstruction error distribution of non-defective samples and can consequently be used to identify defective samples. In addition, an efficient online training algorithm is proposed for constructing the verification sub-model during system operation; it requires minimal manual inspection effort while ensuring model training sufficiency. Through a case study, the proposed AVI scheme demonstrates its capability of self-tuning while inspecting different parts or under different operating conditions in an assembly process. This adaptability will help increase the benefit and functionality of AVI techniques in the manufacturing industry.
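The PCA-based verification sub-model can be sketched as a reconstruction-error anomaly detector: fit principal components on non-defective samples only, then flag a sample whose reconstruction error falls outside the training error distribution. The mean-plus-three-standard-deviations threshold below is a common illustrative choice, not necessarily how the paper fits its error distribution, and the class name is hypothetical.

```python
import numpy as np

class PCAVerifier:
    """Verification sub-model sketch: PCA fit on non-defective samples;
    a sample is flagged defective when its reconstruction error exceeds
    a threshold derived from the training error distribution
    (assumption: mean + 3*std)."""

    def fit(self, X, n_components=2):
        X = np.asarray(X, dtype=float)
        self.mean_ = X.mean(axis=0)
        # Principal axes from the SVD of the centered training data.
        _, _, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = Vt[:n_components]
        errors = self._errors(X)
        self.threshold_ = errors.mean() + 3 * errors.std()
        return self

    def _errors(self, X):
        Xc = np.asarray(X, dtype=float) - self.mean_
        recon = Xc @ self.components_.T @ self.components_
        return np.linalg.norm(Xc - recon, axis=1)  # residual per sample

    def is_defective(self, x):
        return bool(self._errors([x])[0] > self.threshold_)
```

Only non-defective samples are needed for fitting, which is what makes the paper's online training during system operation practical: defect labels never enter the construction of the verification sub-model.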