Catalogue Search | MBRL

Data mining and learning analytics : applications in educational research

by ElAtia, Samira, 1973- editor , Ipperciel, Donald, 1967- editor , Zaiane, Osmar R., 1965- editor in Education Research Statistical methods. , Educational statistics Data processing. , Data mining.

Book

Student Performance Prediction Model based on Supervised Machine Learning Algorithms

by Salah Hashim, Ali , Akeel Awadh, Wid , Khalaf Hamoud, Alaa in Algorithms , Colleges & universities , Data mining

2020

Higher education institutions aim to forecast student success, which is an important research subject. Forecasting student success can enable teachers to prevent students from dropping out before final examinations, identify those who need additional help and boost institution ranking and prestige. Machine learning techniques in educational data mining aim to develop a model for discovering meaningful hidden patterns and exploring useful information from educational settings. The key traditional characteristics of students (demographic, academic background and behavioural features) are the main factors that can represent the training dataset for supervised machine learning algorithms. In this study, we compared the performances of several supervised machine learning algorithms, such as Decision Tree, Naïve Bayes, Logistic Regression, Support Vector Machine, K-Nearest Neighbour, Sequential Minimal Optimisation and Neural Network. We trained a model by using datasets provided by courses in the bachelor study programmes of the College of Computer Science and Information Technology, University of Basra, for academic years 2017-2018 and 2018-2019 to predict student performance on final examinations. Results indicated that the logistic regression classifier is the most accurate in predicting the exact final grades of students (68.7% for passed and 88.8% for failed).

Journal Article

Learning analytics in higher education : current innovations, future potential, and practical applications

by Lester, Jaime, editor in Academic achievement Evaluation. , Education, Higher Evaluation. , Educational evaluation Data processing.

Book

A framework to capture the dependency between prerequisite and advanced courses in higher education

by Hriez, Raghda Fawzey , Al-Naymat, Ghazi in Advanced Courses , Algorithms , Colleges & universities

2021

Depicting the reason for the mismatch between instructor expectations of students’ performance in advanced courses and their actual performance has been a challenging issue for a long time, which raises the question of why such a mismatch exists. An implicit reason for this mismatch is the student’s weakness in prerequisite course skills. To solve this challenge, this research proposes a new graph mining algorithm combined with statistical analysis to reveal the dependency relationships between Course Learning Outcomes (CLOs) of prerequisite and advanced courses. In addition, a new model is built to predict students’ performance in the advanced courses based on prerequisites. The contributions of this research are threefold: (1) Modeling: three models are built based on bipartite graphs. The first model is a bipartite graph to model the relationships between the CLOs of different courses. This bipartite graph is constructed using a new calculated dependency measure based on a statistical analysis of students’ grades. Then the relationships between the Learning Outcomes are discovered by extracting the maximal bipartite cliques in the graph. The second model is built to model the relationships between students and the prerequisite CLOs. The maximal bipartite cliques are then extracted from this graph to discover the maximal set of students who share the same study weaknesses. The third model is built to model the relationships between students and the advanced CLOs. The maximal bipartite cliques are then extracted to discover the maximal set of students who are expected to share the same study weaknesses in the advanced course. Therefore, the same remedial actions can be used towards this group of students. (2) Algorithm: A new maximal bipartite cliques enumeration algorithm is proposed to extract the targeted patterns and relationships between CLOs themselves and between students and CLOs. (3) Applicability: The proposed models and algorithm have been applied using a real educational data set collected from one university. Other real datasets are used to conduct an empirical evaluation to assess the maximal bipartite enumeration algorithm’s correctness, the running time of the inflation and enumeration steps, and the overhead of the inflation algorithm on the size of the generated general graph. The evaluation proves that the proposed algorithm is accurate, efficient, effective, and applicable to real-world graphs more than the traditional algorithm.

Journal Article

Educational data mining: prediction of students' academic performance using machine learning algorithms

by Yağcı, Mustafa in Academic Achievement , Accuracy , Algorithms

2022

Educational data mining has become an effective tool for exploring the hidden relationships in educational data and predicting students' academic achievements. This study proposes a new model based on machine learning algorithms to predict the final exam grades of undergraduate students, taking their midterm exam grades as the source data. The performances of the random forests, nearest neighbour, support vector machines, logistic regression, Naïve Bayes, and k-nearest neighbour algorithms, which are among the machine learning algorithms, were calculated and compared to predict the final exam grades of the students. The dataset consisted of the academic achievement grades of 1854 students who took the Turkish Language-I course in a state university in Turkey during the fall semester of 2019–2020. The results show that the proposed model achieved a classification accuracy of 70–75%. The predictions were made using only three types of parameters: midterm exam grades, department data and faculty data. Such data-driven studies are very important in terms of establishing a learning analysis framework in higher education and contributing to the decision-making processes. Finally, this study presents a contribution to the early prediction of students at high risk of failure and determines the most effective machine learning methods.

Journal Article

Educational Data Classification Framework for Community Pedagogical Content Management using Data Mining

by Siddique, Imran , Ahmed, Muhammad , Babur, Dr in Algorithms , Attention , Classification

2019

Recent years have witnessed a significant surge in the awareness and use of social media, especially community Question and Answer (Q&A) websites, by academicians and professionals. These sites are large repositories of data, paving the way to new avenues of research through applications of data mining and data analysis that investigate trending topics and the topics that attract the most user attention. Educational Data Mining (EDM) techniques can be used to unveil the potential of community Q&A websites. Conventional Educational Data Mining approaches are concerned with generating data in systematic ways and mining it for knowledge discovery to improve educational processes. This paper presents a novel idea: exploring data already generated by millions of users with a variety of expertise in their particular domains on a common platform such as StackOverFlow (SO), a community Q&A website where users post questions and receive answers about particular problems. This study presents an EDM framework to classify community data into Software Engineering subjects. The framework classifies SO posts according to academic courses, along with their best solutions, to accommodate learners. Moreover, it gives teachers, instructors, educators and other EDM stakeholders insight into commonly occurring subject-related problems so that they can design and manage their course delivery and teaching accordingly. The data mining framework preprocesses data using NLP techniques and applies machine learning algorithms to classify it. Among all classifiers, SVM performs best, with 72.06% accuracy. Evaluation measures such as precision, recall and F-1 score are also used to evaluate the best-performing classifier.

Journal Article

Unpacking the "Black Box" of AI in Education

by Rebecca Eynon , Catherine Chiabaut , Nabeel Gillani in Algorithms , Artificial Intelligence , artificial intelligence in education

2023

Recent advances in Artificial Intelligence (AI) have sparked renewed interest in its potential to improve education. However, AI is a loose umbrella term that refers to a collection of methods, capabilities, and limitations, many of which are often not explicitly articulated by researchers, education technology companies, or other AI developers. In this paper, we seek to clarify what "AI" is and the potential it holds to both advance and hamper educational opportunities that may improve the human condition. We offer a basic introduction to different methods and philosophies underpinning AI, discuss recent advances, explore applications to education, and highlight key limitations and risks. We conclude with a set of questions that educationalists may ask as they encounter AI in their research and practice. Our hope is to make often jargon-laden terms and concepts accessible, so that all are equipped to understand, interrogate, and ultimately shape the development of human-centered AI in education.

Journal Article

A Comparison of Undersampling, Oversampling, and SMOTE Methods for Dealing with Imbalanced Classification in Educational Data Mining

by Wongvorachan, Tarid , He, Surina , Bulut, Okan in Accuracy , Algorithms , Bias

2023

Educational data mining is capable of producing useful data-driven applications (e.g., early warning systems in schools or the prediction of students’ academic achievement) based on predictive models. However, the class imbalance problem in educational datasets could hamper the accuracy of predictive models as many of these models are designed on the assumption that the predicted class is balanced. Although previous studies proposed several methods to deal with the imbalanced class problem, most of them focused on the technical details of how to improve each technique, while only a few focused on the application aspect, especially for the application of data with different imbalance ratios. In this study, we compared several sampling techniques to handle the different ratios of the class imbalance problem (i.e., moderately or extremely imbalanced classifications) using the High School Longitudinal Study of 2009 dataset. For our comparison, we used random oversampling (ROS), random undersampling (RUS), and the combination of the synthetic minority oversampling technique for nominal and continuous (SMOTE-NC) and RUS as a hybrid resampling technique. We used the Random Forest as our classification algorithm to evaluate the results of each sampling technique. Our results show that random oversampling for moderately imbalanced data and hybrid resampling for extremely imbalanced data seem to work best. The implications for educational data mining applications and suggestions for future research are discussed.

Journal Article

Predicting Student Performance Using Data Mining and Learning Analytics Techniques: A Systematic Literature Review

by Namoun, Abdallah , Alshanqiti, Abdullah in Academic achievement , academic performance , Data mining

2021

The prediction of student academic performance has drawn considerable attention in education. However, although the learning outcomes are believed to improve learning and teaching, prognosticating the attainment of student outcomes remains underexplored. A decade of research work conducted between 2010 and November 2020 was surveyed to present a fundamental understanding of the intelligent techniques used for the prediction of student performance, where academic success is strictly measured using student learning outcomes. The electronic bibliographic databases searched include ACM, IEEE Xplore, Google Scholar, Science Direct, Scopus, Springer, and Web of Science. Eventually, we synthesized and analyzed a total of 62 relevant papers with a focus on three perspectives, (1) the forms in which the learning outcomes are predicted, (2) the predictive analytics models developed to forecast student learning, and (3) the dominant factors impacting student outcomes. The best practices for conducting systematic literature reviews, e.g., PICO and PRISMA, were applied to synthesize and report the main results. The attainment of learning outcomes was measured mainly as performance class standings (i.e., ranks) and achievement scores (i.e., grades). Regression and supervised machine learning models were frequently employed to classify student performance. Finally, student online learning activities, term assessment grades, and student academic emotions were the most evident predictors of learning outcomes. We conclude the survey by highlighting some major research challenges and suggesting a summary of significant recommendations to motivate future works in this field.

Journal Article

Educational data mining to predict students' academic performance: A survey study

by Nisar, Muhammad Wasif , Hussain, Amir , Batool, Saba in Academic Achievement , Algorithms , Computer Appl. in Social and Behavioral Sciences

2023

Educational data mining is an emerging interdisciplinary research area involving both education and informatics. It has become an imperative research area due to many advantages that educational institutions can achieve. Along these lines, various data mining techniques have been used to improve learning outcomes by exploring large-scale data that come from educational settings. One of the main problems is predicting the future achievements of students before they take final exams, so that we can proactively help students achieve better performance and prevent dropouts. Therefore, many efforts have been made to solve the problem of student performance prediction in the context of educational data mining. In this paper, we provide readers with a comprehensive understanding of student performance prediction and compare approximately 260 studies in the last 20 years with respect to i) major factors highly affecting student performance prediction, ii) kinds of data mining techniques including prediction and feature selection algorithms, and iii) frequently used data mining tools. The findings of the comprehensive analysis show that ANN and Random Forest are the most commonly used data mining algorithms, while WEKA is found to be a trending tool for students’ performance prediction. Students’ academic records and demographic factors are the best attributes to predict performance. The study proves that irrelevant features in the dataset reduce the prediction results and increase model processing time. Therefore, almost half of the studies used feature selection techniques before building prediction models. This study attempts to provide useful and valuable information to researchers interested in advancing educational data mining. The study directs future researchers to achieve highly accurate prediction results in different scenarios using different available inputs or techniques. The study also helps institutions apply data mining techniques to predict and improve student outcomes by providing additional assistance on time.

Journal Article
