Catalogue Search | MBRL

Estimation of Stochastic Processes with Missing Observations

by Moklyachuk, Mikhail; Masyutka, Oleksandr; Sidei, Maria in Missing observations (Statistics), Stochastic processes

2019

We present the results of an investigation into the mean-square optimal estimation of linear functionals constructed from unobserved values of stationary stochastic processes. Estimates are based on observations of the processes with an additive stationary noise process. The aim of the book is to develop methods for finding the optimal estimates of the functionals in the case where some observations are missing. Formulas for computing values of the mean-square errors and the spectral characteristics of the optimal linear estimates of functionals are derived in the case of spectral certainty, where the spectral densities of the processes are exactly known. The minimax robust method of estimation is applied in the case of spectral uncertainty, where the spectral densities of the processes are not known exactly but some classes of admissible spectral densities are given. Formulas that determine the least favourable spectral densities and the minimax spectral characteristics of the optimal estimates of functionals are proposed for some special classes of admissible densities.

eBook

CBRL and CBRC: Novel Algorithms for Improving Missing Value Imputation Accuracy Based on Bayesian Ridge Regression

by Hamad, Safwat; M. Mostafa, Samih; S. Eladimy, Abdelrahman in Accuracy, Algorithms, Bayesian analysis

2020

In most scientific studies, such as data analysis, the existence of missing data is a critical problem, and selecting the appropriate approach to deal with missing data is a challenge. In this paper, the authors perform a fair comparative study of some practical imputation methods used for handling missing values against two proposed imputation algorithms. The proposed algorithms depend on the Bayesian Ridge technique under two different feature selection conditions. They differ from existing approaches in that they accumulate the imputed features: each imputed feature is incorporated into the Bayesian Ridge equation used to predict the missing values in the next incomplete selected feature. The authors applied the proposed algorithms to eight datasets with different amounts of missing values created by different missingness mechanisms. The performance was measured in terms of imputation time, root-mean-square error (RMSE), coefficient of determination (R2), and mean absolute error (MAE). The results showed that the performance varies depending on the percentage of missing values, the size of the dataset, and the missingness mechanism. In addition, the performance of the proposed methods is slightly better than that of the compared methods.

Journal Article
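
The accuracy metrics named in the abstract above (RMSE, MAE, and R2) can be sketched as follows. This is a minimal illustration that uses the standard textbook definitions, not the authors' evaluation code; the function names and the sample values are hypothetical.

```python
import math

def rmse(true, imputed):
    # Root-mean-square error between the held-out true values and the imputed ones.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true, imputed)) / len(true))

def mae(true, imputed):
    # Mean absolute error.
    return sum(abs(t - p) for t, p in zip(true, imputed)) / len(true)

def r2(true, imputed):
    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    mean_t = sum(true) / len(true)
    ss_res = sum((t - p) ** 2 for t, p in zip(true, imputed))
    ss_tot = sum((t - mean_t) ** 2 for t in true)
    return 1.0 - ss_res / ss_tot

# Hypothetical values: the entries that were hidden, and what an imputer filled in.
true_values = [2.0, 4.0, 6.0, 8.0]
imputed_values = [2.5, 3.5, 6.5, 7.5]

print(rmse(true_values, imputed_values))
print(mae(true_values, imputed_values))
print(r2(true_values, imputed_values))
```

Lower RMSE and MAE and a higher R2 indicate imputed values closer to the hidden ones.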

Missing Value Imputation Method for Multiclass Matrix Data Based on Closed Itemset

by Suzuki, Natsumi; Okada, Yoshifumi; Tada, Mayu in Algorithms, closed itemset, Computational efficiency

2022

Handling missing values in matrix data is an important step in data analysis. To date, many methods to estimate missing values based on data pattern similarity have been proposed. Most previously proposed methods perform missing value imputation based on data trends over the entire feature space. However, individual missing values are likely to show similarity to data patterns in local feature space. In addition, most existing methods focus on single class data, while multiclass analysis is frequently required in various fields. Missing value imputation for multiclass data must consider the characteristics of each class. In this paper, we propose two methods based on closed itemsets, CIimpute and ICIimpute, to achieve missing value imputation using local feature space for multiclass matrix data. CIimpute estimates missing values using closed itemsets extracted from each class. ICIimpute is an improved method of CIimpute in which an attribute reduction process is introduced. Experimental results demonstrate that attribute reduction considerably reduces computational time and improves imputation accuracy. Furthermore, it is shown that, compared to existing methods, ICIimpute provides superior imputation accuracy but requires more computational time.

Journal Article

Missing value imputation: a review and analysis of the literature (2006–2017)

by Lin, Wei-Chao; Tsai, Chih-Fong in Artificial intelligence, Big Data, Data analysis

2020

Missing value imputation (MVI) has been studied for several decades as the basic solution method for incomplete dataset problems, specifically those where some data samples contain one or more missing attribute values. This paper aims to review and analyze related studies carried out in recent decades from the experimental design perspective. Altogether, 111 journal papers published from 2006 to 2017 are reviewed and analyzed. In addition, several technical issues encountered during the MVI process are discussed, such as the choice of datasets, missing rates and missingness mechanisms, and the MVI techniques and evaluation metrics employed. The results of the analysis of these issues allow limitations in the existing body of literature to be identified, based upon which some directions for future research can be gleaned.

Journal Article
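
The abstract above refers to missing rates and missingness mechanisms as experimental design choices. The sketch below shows one common way such an experiment can be set up: hiding a fixed fraction of cells uniformly at random, which corresponds to the "missing completely at random" (MCAR) mechanism. MCAR is an assumption chosen for illustration and is not named in the record; `mask_mcar` and the sample table are hypothetical.

```python
import random

def mask_mcar(rows, missing_rate, seed=0):
    # Hide a fraction of cells chosen uniformly at random (MCAR assumption).
    # Returns a masked copy (hidden cells become None) and the set of hidden positions.
    rng = random.Random(seed)
    n_rows, n_cols = len(rows), len(rows[0])
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    n_missing = round(missing_rate * len(cells))
    hidden = set(rng.sample(cells, n_missing))
    masked = [
        [None if (i, j) in hidden else rows[i][j] for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return masked, hidden

# Hypothetical complete table: 5 rows x 4 columns = 20 cells.
complete = [[float(i * 4 + j) for j in range(4)] for i in range(5)]
masked, hidden = mask_mcar(complete, missing_rate=0.25)

n_hidden = sum(value is None for row in masked for value in row)
print(n_hidden, "of", len(complete) * len(complete[0]), "cells hidden")
```

Keeping the hidden positions makes it possible to compare imputed values with the original ones afterwards.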

Measuring and accounting for strategic abstentions in the US Senate, 1989-2012

by Moser, Scott; Rodríguez, Abel in Bills, Congressional elections, Congressional legislation

2015

Strategic abstentions—in which legislators abstain from votes for ideological reasons—are a poorly understood feature of legislative voting records. The paper discusses a spatial model for legislators' revealed preferences that accounts for abstentions when missing values are non-ignorable and allows us to measure the pervasiveness of strategic abstention by identifying legislators who consistently engage in strategic abstentions, as well as bills for which the ideology of legislators is a key driver of abstentions. We illustrate the performance of our model through the analysis of the 101st–112th US Senates.

Journal Article

Missing Data Methods: Cross-sectional Methods and Applications

by David M. Drukker in BUSINESS & ECONOMICS, Econometrics, Missing observations (Statistics)

2011

Volume 27 of "Advances in Econometrics", entitled "Missing Data Methods", contains 16 chapters authored by specialists in the field, covering topics such as: Missing-Data Imputation in Nonstationary Panel Data Models; Markov Switching Models in Empirical Finance; Bayesian Analysis of Multivariate Sample Selection Models Using Gaussian Copulas; Consistent Estimation and Orthogonality; and Likelihood-Based Estimators for Endogenous or Truncated Samples in Standard Stratified Sampling.

eBook

Identify the most appropriate imputation method for handling missing values in clinical structured datasets: a systematic review

by Hosseinzadeh, Elham; Afkanpour, Marziyeh; Tabesh, Hamed in Analysis, Biomedical Research - methods, Biomedical Research - standards

2024

Background and objectives: Comprehending the research dataset is crucial for obtaining reliable and valid outcomes. Health analysts must have a deep comprehension of the data being analyzed. This comprehension allows them to suggest practical solutions for handling missing data in a clinical data source. Accurate handling of missing values is critical for producing precise estimates and making informed decisions, especially in crucial areas like clinical research. With data's increasing diversity and complexity, numerous scholars have developed a range of imputation techniques. To address this, we conducted a systematic review to introduce various imputation techniques based on tabular dataset characteristics, including the mechanism, pattern, and ratio of missingness, to identify the most appropriate imputation methods in the healthcare field.

Materials and methods: We searched four information databases, namely PubMed, Web of Science, Scopus, and IEEE Xplore, for articles published up to September 20, 2023, that discussed imputation methods for addressing missing values in a clinically structured dataset. Our investigation of selected articles focused on four key aspects: the mechanism, pattern, ratio of missingness, and various imputation strategies. By synthesizing insights from these perspectives, we constructed an evidence map to recommend suitable imputation methods for handling missing values in a tabular dataset.

Results: Out of 2955 articles, 58 were included in the analysis. The findings from the development of the evidence map, based on the structure of the missing values and the types of imputation methods used in the extracted items from these studies, revealed that 45% of the studies employed conventional statistical methods, 31% utilized machine learning and deep learning methods, and 24% applied hybrid imputation techniques for handling missing values.

Conclusion: Considering the structure and characteristics of missing values in a clinical dataset is essential for choosing the most appropriate data imputation technique, especially within conventional statistical methods. Accurately estimating missing values to reflect reality enhances the likelihood of obtaining high-quality and reusable data, contributing significantly to precise medical decision-making processes. Performing this review study creates a guideline for choosing the most appropriate imputation methods in data preprocessing stages to perform analytical processes on structured clinical datasets.

Highlights:
• The evidence map emphasized the importance of considering missing data characteristics when choosing an imputation method, providing insights for researchers in method selection.
• The distinction between statistical and learning-based approaches highlighted the strengths and considerations of each method based on data structure.
• Understanding missing data structures can enhance the quality and reliability of data imputation techniques, and improve medical decision-making accuracy.
• Simulation studies are crucial for validating imputation techniques and enhancing their robustness in practical applications.
• Considering missing data mechanisms, patterns, and ratios can aid researchers in making informed decisions on selecting appropriate imputation methods, leading to high-quality and reusable data for precise medical decision-making.

Journal Article

Multiple Imputation for Multilevel Data with Continuous and Binary Variables

by Audigier, Vincent; Debray, Thomas P. A.; van Buuren, Stef in Binary data, Binary system, Cluster analysis

2018

We present and compare multiple imputation methods for multilevel continuous and binary data where variables are systematically and sporadically missing. The methods are compared from a theoretical point of view and through an extensive simulation study motivated by a real dataset comprising multiple studies. The comparisons show why these multiple imputation methods are the most appropriate for handling missing values in a multilevel setting and why their relative performance can vary according to the missing data pattern, the multilevel structure, and the type of missing variables. This study shows that valid inferences can only be obtained if the dataset includes a large number of clusters. In addition, it highlights that heteroscedastic multiple imputation methods provide more accurate inferences than homoscedastic methods, which should be reserved for data with few individuals per cluster. Finally, guidelines are given for choosing the most suitable multiple imputation method according to the structure of the data.

Journal Article

A nifty collaborative analysis to predicting a novel tool (DRFLLS) for missing values estimation

by Al-Janabi, Samaher; Alkaim, Ayad F. in Algorithms, Artificial Intelligence, Computational Intelligence

2020

One of the important trends in intelligent data analysis will be the growing importance of data processing. However, data processing faces problems similar to those of data mining (i.e., high-dimensional data, missing value imputation, and data integration); one of the challenges in missing value estimation methods is how to select the optimal number of nearest neighbors of the missing values. This paper explores the feasibility of building a novel tool, called developed random forest and local least squares (DRFLLS), to estimate missing values in various datasets. By developing the random forest algorithm, seven categories of similarity measures were defined: the Pearson similarity coefficient, simple similarity, and fuzzy similarity (M1, M2, M3, M4 and M5). They are sufficient to estimate the optimal number of neighbors of missing values in this application. Local least squares (LLS) is then used to estimate the missing values. Imputation accuracy is measured in two ways: Pearson correlation (PC) and NRMSE. The optimal number of neighbors is the one associated with the highest value of PC and the smallest value of NRMSE. The experiments were carried out on six datasets obtained from different disciplines. For datasets with a low rate of missing values, the best estimate of the number of nearest neighbors was given by DRFPC, followed by DRFFSM1 when r = 4; for datasets with a high rate of missing values, the best estimate was given by DRFFSM5, followed by DRFFSM3. After that, the missing values were estimated by LLS, and the accuracy of the results was measured by NRMSE and Pearson correlation. For a given dataset, the DRF correlation function that yields the smallest NRMSE and the highest PC is the better function for that dataset.

Journal Article

DBSCANI: Noise-Resistant Method for Missing Value Imputation

by Singh, Sandeep Kumar; Purwar, Archana in Algorithms, Clustering, Computer science

2016

Data quality is an important concern in data mining. The validity of mining algorithms is reduced if the data is not of good quality. The quality of data can be assessed in terms of missing values (MV) as well as noise present in the data set. Various imputation techniques have been studied for MV, but little attention has been paid to noise in earlier work. Moreover, to the best of our knowledge, no one has used density-based spatial clustering of applications with noise (DBSCAN) for MV imputation. This paper proposes a novel technique, density-based imputation (DBSCANI), built on density-based clustering to deal with incomplete values in the presence of noise. The density-based clustering algorithm proposed by Kriegel groups objects according to their density in spatial databases. The high-density regions are known as clusters, and the low-density regions refer to the noise objects in the data set. Experiments were performed on the Iris data set from the life science domain and Jain's (2D) data set from the shape data sets. The performance of the proposed method is evaluated using root mean square error (RMSE) and compared with the existing K-means imputation (KMI). Results show that our method is more noise resistant than KMI on the data sets studied.

Journal Article
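
The idea described in the abstract above, imputing from dense clusters while ignoring noise objects, can be sketched as follows. This is one plausible reading under stated assumptions, not the authors' DBSCANI procedure: complete rows are clustered with a small textbook DBSCAN, each incomplete row is matched to its nearest clustered row on the observed features, and the gap is filled with that cluster's mean. The functions `dbscan` and `impute`, the parameters, and the sample data are all hypothetical.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    # Textbook DBSCAN: returns one label per point, NOISE for low-density points.
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points)) if math.dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # core point: keep growing the cluster
        cluster += 1
    return labels

def impute(rows, eps, min_pts):
    # Assumption: fill each gap with the mean of the nearest cluster, skipping noise rows.
    # Raises ValueError if DBSCAN finds no cluster at all.
    complete = [r for r in rows if None not in r]
    labels = dbscan(complete, eps, min_pts)
    clustered = [i for i, lab in enumerate(labels) if lab != NOISE]
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [k for k, v in enumerate(row) if v is not None]
        nearest = min(
            clustered,
            key=lambda i: sum((complete[i][k] - row[k]) ** 2 for k in observed),
        )
        members = [complete[i] for i in clustered if labels[i] == labels[nearest]]
        filled.append([
            v if v is not None else sum(m[k] for m in members) / len(members)
            for k, v in enumerate(row)
        ])
    return filled

rows = [
    [1.0, 1.0], [1.2, 0.8], [0.8, 1.2], [1.0, 1.2],  # dense group A
    [1.0, 9.0],                                      # isolated noise object
    [5.0, 5.0], [5.2, 4.8], [4.8, 5.2], [5.0, 5.2],  # dense group B
    [1.1, None],                                     # row with a missing value
]
filled = impute(rows, eps=0.6, min_pts=3)
print(filled[-1])
```

In this toy data the isolated row `[1.0, 9.0]` is as close to the incomplete row on the observed feature as group A is, but because DBSCAN marks it as noise it does not distort the imputed value.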
