Catalogue Search | MBRL

Metoda Badania

by Panek, Tomasz in Data collection

2015

Journal Article

AI-based smart prediction of clinical disease using random forest classifier and Naive Bayes

by Kaliappan, M , Jackins, V , Vimal, S in Algorithms , Artificial intelligence , Classification

2021

Healthcare practice involves collecting many kinds of patient data that help the doctor correctly diagnose the patient's health condition. These data may be simple symptoms observed by the subject, an initial diagnosis by a physician, or a detailed test result from a laboratory. Thus far, these data have been used only for analysis by a doctor, who then ascertains the disease using his or her personal medical expertise. Artificial intelligence, in the form of Naive Bayes and random forest classification algorithms, has been used to classify several disease datasets (diabetes, heart disease, and cancer) to check whether the patient is affected by the disease. The performance of both algorithms on the disease data is analysed and compared. The simulation results show the effectiveness of the classification techniques on a dataset, as well as the nature and complexity of the dataset used.

Journal Article

Training and assessing classification rules with imbalanced data

by Menardi, Giovanna , Torelli, Nicola in Accuracy , Artificial Intelligence , Chemistry and Earth Sciences

2014

The problem of modeling binary responses by using cross-sectional data has been addressed with a number of satisfying solutions that draw on both parametric and nonparametric methods. However, there exist many real situations where one of the two responses (usually the most interesting for the analysis) is rare. It has been widely reported that this class imbalance heavily compromises the process of learning, because the model tends to focus on the prevalent class and to ignore the rare events. However, a skewed distribution of the classes affects not only the estimation of the classification model but also the evaluation of its accuracy, because the scarcity of data leads to poor estimates of the model’s accuracy. In this work, the effects of class imbalance on model training and model assessment are discussed. Moreover, a unified and systematic framework for dealing with the problem of imbalanced classification is proposed, based on a smoothed bootstrap re-sampling technique. The proposed technique is founded on a sound theoretical basis, and an extensive empirical study shows that it outperforms the other main remedies for imbalanced learning problems.

Journal Article

Updated framework for monitoring adult learning

by Sekmokas, Mantas , von Erlach, Emanuel , Rojas González, Gara in education

2024

The Network on Labour market, economic, and social outcomes of learning (LSO Expert Network) has diligently worked on the selection of indicators for monitoring adult learning policies. Their inaugural theoretical framework on adult learning, published in 2013, covered a broad spectrum of policy areas. This comprehensive scope reflected both the focus of existing data sources and the challenges encountered in data collection efforts. Over the past decade, significant policy shifts have occurred, reshaping adult learning systems both domestically and internationally. Concurrently, there have been improvements in the availability and frequency of data pertaining to adult learning. In response to these developments, this working paper presents an updated theoretical framework on adult learning, aiming to enhance the identification of statistical data concerning adult learning systems and facilitate the selection of pertinent indicators for monitoring purposes. Additionally, the paper offers detailed insights into national priorities and practices within this domain.

Paper

Harmonisation of demographic and socio-economic variables in cross-national survey research

by Hoffmeyer-Zlotnik, Jürgen H. P in Data collection

2008

Journal Article

Studying user income through language, behaviour and affect in social media

by Lampos, Vasileios , Volkova, Svitlana , Bachrach, Yoram in Affect , Age differences , Artificial intelligence

2015

Automatically inferring user demographics from social media posts is useful for both social science research and a range of downstream applications in marketing and politics. We present the first extensive study where user behaviour on Twitter is used to build a predictive model of income. We apply non-linear methods for regression, i.e. Gaussian Processes, achieving strong correlation between predicted and actual user income. This allows us to shed light on the factors that characterise income on Twitter and analyse their interplay with user emotions and sentiment, perceived psycho-demographics and language use expressed through the topics of their posts. Our analysis uncovers correlations between different feature categories and income. Some reflect common belief (e.g. higher perceived education and intelligence indicate higher earnings) or known differences (e.g. gender and age differences), while others are novel findings (e.g. higher income users express more fear and anger, whereas lower income users express more of the time emotion and opinions).

Journal Article

An Interdisciplinary Mixed-Methods Approach to Analyzing Urban Spaces: The Case of Urban Walkability and Bikeability

by Puetz, Inga , Kyriakou, Kalliopi , Resch, Bernd in Bicycling , Cities , City Planning

2020

Human-centered approaches are of particular importance when analyzing urban spaces in technology-driven fields, because understanding how people perceive and react to their environments depends on several dynamic and static factors, such as traffic volume, noise, safety, urban configuration, and greenness. Analyzing and interpreting emotions against the background of environmental information can provide insights into the spatial and temporal properties of urban spaces and their influence on citizens, such as urban walkability and bikeability. In this study, we present a comprehensive mixed-methods approach to geospatial analysis that utilizes wearable sensor technology for emotion detection and combines information from sources that correct or complement each other. This includes objective data from wearable physiological sensors combined with an eDiary app, first-person perspective videos from a chest-mounted camera, georeferenced interviews, and post-hoc surveys. Across two studies, we identified and geolocated pedestrians’ and cyclists’ moments of stress and relaxation in the city centers of Salzburg and Cologne. Despite open methodological questions, we conclude that mapping wearable sensor data, complemented with other sources of information (all of which are indispensable for evidence-based urban planning), offers tremendous potential for gaining useful insights into urban spaces and their impact on citizens.

Journal Article

Long-term prediction of rockburst hazard in deep underground openings using three robust data mining techniques

by Roohollah Shirani Faradonbeh , Taheri, Abbas in Casualties , Compressive strength , Criteria

2019

The rockburst phenomenon is the extreme release of strain energy stored in the surrounding rock mass, which can lead to casualties, damage to underground structures and equipment, and ultimately endanger the economic viability of the project. Considering the complex mechanism of rockburst and the large number of factors affecting it, the conventional criteria cannot be used generally and with high reliability. Hence, there is a need to develop new models with high accuracy and ease of use in practice. This study focuses on the applicability of three novel data mining techniques, including emotional neural network (ENN), gene expression programming (GEP), and the decision tree-based C4.5 algorithm, along with five conventional criteria, to predict the occurrence of rockburst in a binary condition. To do so, a total of 134 rockburst events were compiled from various case studies, and the models were established based on training datasets and the input parameters of maximum tangential stress, uniaxial tensile strength, uniaxial compressive strength, and elastic energy index. The prediction strength of the constructed models was evaluated by feeding the testing datasets to the models and measuring the root mean squared error (RMSE) and the percentage of successful prediction (PSP). The results showed the high accuracy and applicability of all three new models; however, the GA-ENN and the GEP methods outperformed the C4.5 method. In addition, the elastic energy index (EEI) criterion was found to be the most accurate of the conventional criteria and, with results similar to those of the C4.5 model, can be used easily in practical applications. Finally, a sensitivity analysis was carried out and the maximum tangential stress was identified as the most influential parameter, which could be a guide for rockburst prediction.

Journal Article

Discriminative machine learning for maximal representative subsampling

by Tüscher, Oliver , Hauptmann, Tony , Nathan, Laksan in 631/477/2811 , 639/705/117 , Bias

2023

Biased population samples pose a prevalent problem in the social sciences. Therefore, we present two novel methods that are based on positive-unlabeled learning to mitigate bias. Both methods leverage auxiliary information from a representative data set and train machine learning classifiers to determine the sample weights. The first method, named maximum representative subsampling (MRS), uses a classifier to iteratively remove instances, by assigning a sample weight of 0, from the biased data set until it aligns with the representative one. The second method is a variant of MRS, Soft-MRS, that iteratively adapts sample weights instead of removing samples completely. To assess the effectiveness of our approach, we induced artificial bias in a public census data set and examined the corrected estimates. We compare the performance of our methods against existing techniques, evaluating the ability of sample weights created with Soft-MRS or MRS to minimize differences and improve downstream classification tasks. Lastly, we demonstrate the applicability of the proposed methods in a real-world study of resilience research, exploring the influence of resilience on voting behavior. Through our work, we address the issue of bias in social science, amongst others, and provide a versatile methodology for bias reduction based on machine learning. Based on our experiments, we recommend using MRS for downstream classification tasks and Soft-MRS for downstream tasks where the relative bias of the dependent variable is relevant.

Journal Article

The baseline examinations of the German National Cohort (NAKO): recruitment protocol, response, and weighting

by Rospleszcz, Susanne , Wolf, Kathrin , Rach, Stefan in Adult , Age composition , Age groups

2025

The German National Cohort (NAKO) is the largest population-based epidemiologic cohort study in Germany and investigates the causes of the most common chronic diseases. Between 2014 and 2019, a total of 1.3 million residents aged 20–69 years from 16 German regions were randomly selected from the general population and invited to participate following a highly standardized recruitment protocol. The overall response was 15.6% and differed considerably across study centers (7.6–30.7%). Females were more likely to participate than males (17.5% vs. 14.1%) and participation increased with age (10.2% in age group “<29 years” up to 20.7% in age group “>60 years”). Across all study regions, response was highest in rural areas (22.3%), followed by towns and suburbs (17.2%), and was lowest in cities (14.5%). Compared with the general population in the respective study regions, participants with low and medium education are underrepresented in the NAKO sample, while highly educated participants are overrepresented. Participants with non-German nationality and with a migration background are also underrepresented. Participants living in single households are underrepresented, while participants from larger households (2 or more persons) are overrepresented compared to the general population. Survey weights that account for the sampling design and adjust for differences in the distribution of age, sex, nationality (German vs. non-German), migration status, education, and household size are made available to researchers along with the study data.

Journal Article
