Catalogue Search | MBRL
1,429 result(s) for "MISSING VALUES"
Estimation of Stochastic Processes with Missing Observations
by Moklyachuk, Mikhail; Masyutka, Oleksandr; Sidei, Maria
in Missing observations (Statistics), Stochastic processes
2019
We propose results of the investigation of the problem of mean square optimal estimation of linear functionals constructed from unobserved values of stationary stochastic processes. Estimates are based on observations of the processes with additive stationary noise process. The aim of the book is to develop methods for finding the optimal estimates of the functionals in the case where some observations are missing. Formulas for computing values of the mean-square errors and the spectral characteristics of the optimal linear estimates of functionals are derived in the case of spectral certainty, where the spectral densities of the processes are exactly known. The minimax robust method of estimation is applied in the case of spectral uncertainty, where the spectral densities of the processes are not known exactly while some classes of admissible spectral densities are given. The formulas that determine the least favourable spectral densities and the minimax spectral characteristics of the optimal estimates of functionals are proposed for some special classes of admissible densities.
CBRL and CBRC: Novel Algorithms for Improving Missing Value Imputation Accuracy Based on Bayesian Ridge Regression
by Hamad, Safwat; M. Mostafa, Samih; S. Eladimy, Abdelrahman
in Accuracy, Algorithms, Bayesian analysis
2020
In most scientific studies such as data analysis, the existence of missing data is a critical problem, and selecting the appropriate approach to deal with missing data is a challenge. In this paper, the authors perform a fair comparative study of some practical imputation methods used for handling missing values against two proposed imputation algorithms. The proposed algorithms depend on the Bayesian Ridge technique under two different feature selection conditions. The proposed algorithms differ from the existing approaches in that they accumulate the imputed features; those imputed features are incorporated within the Bayesian Ridge equation for predicting the missing values in the next incomplete selected feature. The authors applied the proposed algorithms to eight datasets with different amounts of missing values created from different missingness mechanisms. The performance was measured in terms of imputation time, root-mean-square error (RMSE), coefficient of determination (R2), and mean absolute error (MAE). The results showed that the performance varies depending on the percentage of missing values, the size of the dataset, and the missingness mechanism. In addition, the performance of the proposed methods is slightly better.
Journal Article
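The cumulative idea described in the abstract above (each imputed feature joins the predictor set for the next incomplete feature) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' CBRL or CBRC algorithm: it uses a plain closed-form ridge regression in place of Bayesian Ridge, it orders the incomplete columns by their number of missing entries, and the names `ridge_fit` and `cumulative_impute` are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalised intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def cumulative_impute(data, alpha=1.0):
    """Impute NaNs column by column, reusing imputed columns as predictors."""
    data = np.asarray(data, dtype=float).copy()
    missing = np.isnan(data)
    # Start from the fully observed columns.
    predictors = [j for j in range(data.shape[1]) if not missing[:, j].any()]
    if not predictors:
        raise ValueError("need at least one complete column")
    # Assumed ordering (not from the paper): fewest missing entries first.
    incomplete = sorted(
        (j for j in range(data.shape[1]) if missing[:, j].any()),
        key=lambda j: missing[:, j].sum(),
    )
    for j in incomplete:
        observed = ~missing[:, j]
        # Fit on rows where column j is observed, predict where it is missing.
        w, b = ridge_fit(data[observed][:, predictors], data[observed, j], alpha)
        data[missing[:, j], j] = data[missing[:, j]][:, predictors] @ w + b
        predictors.append(j)  # the imputed column joins the predictor set
    return data

example = np.array([
    [0.0, 1.0, 1.0],
    [1.0, 3.0, 4.0],
    [2.0, 5.0, np.nan],
    [3.0, 7.0, 10.0],
    [4.0, np.nan, 13.0],
])
filled = cumulative_impute(example, alpha=1e-8)
```

Here the second column is imputed from the first, and the third column is then imputed from both, including the value just filled in.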
Missing Value Imputation Method for Multiclass Matrix Data Based on Closed Itemset
by Suzuki, Natsumi; Okada, Yoshifumi; Tada, Mayu
in Algorithms, closed itemset, Computational efficiency
2022
Handling missing values in matrix data is an important step in data analysis. To date, many methods to estimate missing values based on data pattern similarity have been proposed. Most previously proposed methods perform missing value imputation based on data trends over the entire feature space. However, individual missing values are likely to show similarity to data patterns in local feature space. In addition, most existing methods focus on single class data, while multiclass analysis is frequently required in various fields. Missing value imputation for multiclass data must consider the characteristics of each class. In this paper, we propose two methods based on closed itemsets, CIimpute and ICIimpute, to achieve missing value imputation using local feature space for multiclass matrix data. CIimpute estimates missing values using closed itemsets extracted from each class. ICIimpute is an improved method of CIimpute in which an attribute reduction process is introduced. Experimental results demonstrate that attribute reduction considerably reduces computational time and improves imputation accuracy. Furthermore, it is shown that, compared to existing methods, ICIimpute provides superior imputation accuracy but requires more computational time.
Journal Article
Missing value imputation: a review and analysis of the literature (2006–2017)
2020
Missing value imputation (MVI) has been studied for several decades as the basic solution method for incomplete dataset problems, specifically those where some data samples contain one or more missing attribute values. This paper aims at reviewing and analyzing related studies carried out in recent decades, from the experimental design perspective. Altogether, 111 journal papers published from 2006 to 2017 are reviewed and analyzed. In addition, several technical issues encountered during the MVI process are discussed, such as the choice of datasets, missing rates and missingness mechanisms, and the MVI techniques and evaluation metrics employed. The results of the analysis of these issues allow limitations in the existing body of literature to be identified, from which some directions for future research can be gleaned.
Journal Article
Measuring and accounting for strategic abstentions in the US Senate, 1989-2012
2015
Strategic abstentions—in which legislators abstain from votes for ideological reasons—are a poorly understood feature of legislative voting records. The paper discusses a spatial model for legislators' revealed preferences that accounts for abstentions when missing values are non-ignorable and allows us to measure the pervasiveness of strategic abstention by identifying legislators who consistently engage in strategic abstentions, as well as bills for which the ideology of legislators is a key driver of abstentions. We illustrate the performance of our model through the analysis of the 101st–112th US Senates.
Journal Article
Missing Data Methods: Cross-sectional Methods and Applications
Volume 27 of "Advances in Econometrics", entitled "Missing Data Methods", contains 16 chapters authored by specialists in the field, covering topics such as: Missing-Data Imputation in Nonstationary Panel Data Models; Markov Switching Models in Empirical Finance; Bayesian Analysis of Multivariate Sample Selection Models Using Gaussian Copulas; Consistent Estimation and Orthogonality; and Likelihood-Based Estimators for Endogenous or Truncated Samples in Standard Stratified Sampling.
Identify the most appropriate imputation method for handling missing values in clinical structured datasets: a systematic review
by Hosseinzadeh, Elham; Afkanpour, Marziyeh; Tabesh, Hamed
in Analysis, Biomedical Research - methods, Biomedical Research - standards
2024
Background and objectives
Comprehending the research dataset is crucial for obtaining reliable and valid outcomes. Health analysts must have a deep comprehension of the data being analyzed. This comprehension allows them to suggest practical solutions for handling missing data, in a clinical data source. Accurate handling of missing values is critical for producing precise estimates and making informed decisions, especially in crucial areas like clinical research. With data's increasing diversity and complexity, numerous scholars have developed a range of imputation techniques. To address this, we conducted a systematic review to introduce various imputation techniques based on tabular dataset characteristics, including the mechanism, pattern, and ratio of missingness, to identify the most appropriate imputation methods in the healthcare field.
Materials and methods
We searched four information databases, namely PubMed, Web of Science, Scopus, and IEEE Xplore, for articles published up to September 20, 2023, that discussed imputation methods for addressing missing values in a clinically structured dataset. Our investigation of selected articles focused on four key aspects: the mechanism, pattern, ratio of missingness, and various imputation strategies. By synthesizing insights from these perspectives, we constructed an evidence map to recommend suitable imputation methods for handling missing values in a tabular dataset.
Results
Out of 2955 articles, 58 were included in the analysis. The findings from the development of the evidence map, based on the structure of the missing values and the types of imputation methods used in the extracted items from these studies, revealed that 45% of the studies employed conventional statistical methods, 31% utilized machine learning and deep learning methods, and 24% applied hybrid imputation techniques for handling missing values.
Conclusion
Considering the structure and characteristics of missing values in a clinical dataset is essential for choosing the most appropriate data imputation technique, especially within conventional statistical methods. Accurately estimating missing values to reflect reality enhances the likelihood of obtaining high-quality and reusable data, contributing significantly to precise medical decision-making processes. Performing this review study creates a guideline for choosing the most appropriate imputation methods in data preprocessing stages to perform analytical processes on structured clinical datasets.
Highlights
• The evidence map emphasized the importance of considering missing data characteristics when choosing an imputation method, providing insights for researchers in method selection.
• The distinction between statistical and learning-based approaches highlighted the strengths and considerations of each method based on data structure.
• Understanding missing data structures can enhance the quality and reliability of data imputation techniques, and improve medical decision-making accuracy.
• Simulation studies are crucial for validating imputation techniques and enhancing their robustness in practical applications.
• Considering missing data mechanisms, patterns, and ratios can aid researchers in making informed decisions on selecting appropriate imputation methods, leading to high-quality, and reusable data for precise medical decision-making.
Journal Article
Multiple Imputation for Multilevel Data with Continuous and Binary Variables
by Audigier, Vincent; Debray, Thomas P. A.; van Buuren, Stef
in Binary data, Binary system, Cluster analysis
2018
We present and compare multiple imputation methods for multilevel continuous and binary data where variables are systematically and sporadically missing. The methods are compared from a theoretical point of view and through an extensive simulation study motivated by a real dataset comprising multiple studies. The comparisons show that these multiple imputation methods are the most appropriate to handle missing values in a multilevel setting and why their relative performances can vary according to the missing data pattern, the multilevel structure and the type of missing variables. This study shows that valid inferences can only be obtained if the dataset includes a large number of clusters. In addition, it highlights that heteroscedastic multiple imputation methods provide more accurate inferences than homoscedastic methods, which should be reserved for data with few individuals per cluster. Finally, guidelines are given to choose the most suitable multiple imputation method according to the structure of the data.
Journal Article
A nifty collaborative analysis to predicting a novel tool (DRFLLS) for missing values estimation
by Al-Janabi, Samaher; Alkaim, Ayad F.
in Algorithms, Artificial Intelligence, Computational Intelligence
2020
One of the important trends in intelligent data analysis will be the growing importance of data processing. But this point faces problems similar to those of data mining (i.e., high-dimensional data, missing value imputation and data integration); one of the challenges in missing value estimation methods is how to select the optimal number of nearest neighbors of those values. This paper explores the feasibility of building a novel tool, called developed random forest and local least squares (DRFLLS), to estimate missing values in various datasets. By developing the random forest algorithm, seven categories of similarity measures were defined. These categories are Pearson similarity coefficient, simple similarity, and fuzzy similarity (M1, M2, M3, M4 and M5). They are sufficient to estimate the optimal number of neighborhoods of missing values in this application. Thereafter, local least squares (LLS) is used to estimate the missing values. Imputation accuracy can be measured in different ways: Pearson correlation (PC) and NRMSE. The optimal number of neighborhoods is then associated with the highest value of PC and the smallest value of NRMSE. The experiments were carried out on six datasets obtained from different disciplines. For datasets with a low rate of missing values, the best estimate of the number of nearest neighbors was given by DRFPC, followed by DRFFSM1, when r = 4; for datasets with a high rate of missing values, the best estimate was given by DRFFSM5, followed by DRFFSM3. After that, the missing values were estimated by LLS, and the accuracy of the results was measured by NRMSE and Pearson correlation. For a given dataset, the DRF correlation function that yields the smallest NRMSE, or the highest PC, is the better function for that dataset.
Journal Article
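The two accuracy measures named in the abstract above, NRMSE and Pearson correlation (PC), can be sketched as below. The abstract does not state how NRMSE is normalised; dividing the RMSE by the population standard deviation of the true values is an assumption made here, and the function names `nrmse` and `pearson` are hypothetical.

```python
import math
from statistics import mean, pstdev

def nrmse(true_vals, imputed_vals):
    """RMSE divided by the standard deviation of the true values (assumed normalisation)."""
    rmse = math.sqrt(mean((t - p) ** 2 for t, p in zip(true_vals, imputed_vals)))
    return rmse / pstdev(true_vals)

def pearson(true_vals, imputed_vals):
    """Pearson correlation between true and imputed values."""
    mt, mp = mean(true_vals), mean(imputed_vals)
    cov = sum((t - mt) * (p - mp) for t, p in zip(true_vals, imputed_vals))
    var_t = sum((t - mt) ** 2 for t in true_vals)
    var_p = sum((p - mp) ** 2 for p in imputed_vals)
    return cov / math.sqrt(var_t * var_p)

# Held-out true values and the values an imputation method produced for them.
true_vals = [1.0, 2.0, 3.0, 4.0]
imputed_vals = [1.0, 2.0, 3.0, 5.0]
score_nrmse = nrmse(true_vals, imputed_vals)
score_pc = pearson(true_vals, imputed_vals)
```

Under this convention a lower NRMSE and a higher PC both indicate better imputation, which matches the selection rule the abstract describes.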
DBSCANI: Noise-Resistant Method for Missing Value Imputation
2016
Data quality is an important concern in data mining. The validity of mining algorithms is reduced if the data is not of good quality. The quality of data can be assessed in terms of missing values (MV) as well as noise present in the data set. Various imputation techniques have been studied in MV research, but little attention has been given to noise in earlier work. Moreover, to the best of our knowledge, no one has used density-based spatial clustering of applications with noise (DBSCAN) for MV imputation. This paper proposes a novel technique, density-based imputation (DBSCANI), built on density-based clustering to deal with incomplete values in the presence of noise. The density-based clustering algorithm proposed by Kriegel groups objects according to their density in spatial databases. The high-density regions are known as clusters, and the low-density regions correspond to the noise objects in the data set. Experiments were performed on the Iris data set from the life science domain and Jain's (2D) data set from the shape data sets. The performance of the proposed method is evaluated using root mean square error (RMSE) and compared with existing K-means imputation (KMI). Results show that our method is more noise resistant than KMI on the data sets under study.
Journal Article