Catalogue Search | MBRL
132,208 result(s) for "COMPONENT ANALYSIS"
Applying dimension reduction to EEG data by Principal Component Analysis reduces the quality of its subsequent Independent Component decomposition
by Artoni, Fiorenzo; Makeig, Scott; Delorme, Arnaud
in Adult, Brain - physiology, Brain research
2018
Independent Component Analysis (ICA) has proven to be an effective data driven method for analyzing EEG data, separating signals from temporally and functionally independent brain and non-brain source processes and thereby increasing their definition. Dimension reduction by Principal Component Analysis (PCA) has often been recommended before ICA decomposition of EEG data, both to minimize the amount of required data and computation time. Here we compared ICA decompositions of fourteen 72-channel single subject EEG data sets obtained (i) after applying preliminary dimension reduction by PCA, (ii) after applying no such dimension reduction, or else (iii) applying PCA only. Reducing the data rank by PCA (even to remove only 1% of data variance) adversely affected both the numbers of dipolar independent components (ICs) and their stability under repeated decomposition. For example, decomposing a principal subspace retaining 95% of original data variance reduced the mean number of recovered ‘dipolar’ ICs from 30 to 10 per data set and reduced median IC stability from 90% to 76%. PCA rank reduction also decreased the numbers of near-equivalent ICs across subjects. For instance, decomposing a principal subspace retaining 95% of data variance reduced the number of subjects represented in an IC cluster accounting for frontal midline theta activity from 11 to 5. PCA rank reduction also increased uncertainty in the equivalent dipole positions and spectra of the IC brain effective sources. These results suggest that when applying ICA decomposition to EEG data, PCA rank reduction should best be avoided.
• It is currently a common practice to apply dimension reduction to EEG data using PCA before performing ICA decomposition.
• We tested the quality of Independent Components (ICs) after different levels of rank reduction to a principal subspace.
• PCA rank reduction adversely affected dipolarity and stability of ICs accounting for brain and known non-brain processes.
• PCA rank reduction also increased inter-subject variance in IC source locations (by equivalent dipole fitting) and spectra.
• For EEG data at least, PCA rank reduction should be avoided or carefully tested before applying it as a preprocessing step.
Journal Article
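The rank-reduction step discussed in the entry above ("a principal subspace retaining 95% of original data variance") can be illustrated with a small sketch. It only shows how a variance threshold translates into a number of retained principal components; it is not the authors' pipeline. The function name `components_for_variance` and the eight-value variance spectrum are hypothetical and chosen for illustration.

```python
def components_for_variance(eigenvalues, fraction):
    """Return the smallest number of leading principal components whose
    cumulative variance reaches `fraction` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= fraction:
            return k
    return len(ordered)


# Hypothetical per-component variances for an 8-channel recording
# (illustrative numbers only, not taken from the study).
spectrum = [40.0, 25.0, 15.0, 10.0, 5.0, 3.0, 1.5, 0.5]

k95 = components_for_variance(spectrum, 0.95)  # components kept at 95% variance
k99 = components_for_variance(spectrum, 0.99)  # components kept at 99% variance
print(k95, k99)
```

In this toy spectrum, even a 99% threshold discards one of the eight dimensions, which is the kind of rank reduction the abstract reports as harmful to the subsequent ICA decomposition.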
Independent component analysis: An introduction
Independent component analysis (ICA) is a widely-used blind source separation technique. ICA has been applied to many applications. ICA is usually utilized as a black box, without understanding its internal details. Therefore, in this paper, the basics of ICA are provided to show how it works to serve as a comprehensive source for researchers who are interested in this field. This paper starts by introducing the definition and underlying principles of ICA. Additionally, different numerical examples in a step-by-step approach are demonstrated to explain the preprocessing steps of ICA and the mixing and unmixing processes in ICA. Moreover, different ICA algorithms, challenges, and applications are presented.
Journal Article
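The mixing and unmixing processes mentioned in the entry above can be sketched with a two-source toy example. The sketch assumes the usual linear model, in which observations are a mixing matrix applied to the sources and an unmixing matrix undoes it. Here the unmixing matrix is simply the inverse of a known mixing matrix; an actual ICA algorithm has to estimate it from the mixed data alone. All names and numbers are hypothetical.

```python
def matvec(m, v):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def inverse_2x2(m):
    """Invert a 2x2 matrix; raises if the matrix is singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# Hypothetical mixing matrix A and three time samples of two sources.
A = [[2.0, 1.0],
     [1.0, 1.0]]
sources = [[1.0, -1.0], [0.5, 2.0], [-2.0, 0.0]]

mixed = [matvec(A, s) for s in sources]      # mixing: x = A s
W = inverse_2x2(A)                           # ideal unmixing matrix
recovered = [matvec(W, x) for x in mixed]    # unmixing: s = W x
print(recovered)
```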
Independent Component Analysis via Distance Covariance
2017
This article introduces a novel statistical framework for independent component analysis (ICA) of multivariate data. We propose methodology for estimating mutually independent components, and a versatile resampling-based procedure for inference, including misspecification testing. Independent components are estimated by combining a nonparametric probability integral transformation with a generalized nonparametric whitening method based on distance covariance that simultaneously minimizes all forms of dependence among the components. We prove the consistency of our estimator under minimal regularity conditions and detail conditions for consistency under model misspecification, all while placing assumptions on the observations directly, not on the latent components. U statistics of certain Euclidean distances between sample elements are combined to construct a test statistic for mutually independent components. The proposed measures and tests are based on both necessary and sufficient conditions for mutual independence. We demonstrate the improvements of the proposed method over several competing methods in simulation studies, and we apply the proposed ICA approach to two real examples and contrast it with principal component analysis.
Journal Article
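The dependence measure named in the entry above, distance covariance, can be sketched for two univariate samples. This is the standard sample statistic built from double-centred pairwise distance matrices, not the article's ICA estimator or its resampling procedure. The function names and the data are hypothetical.

```python
import math


def centred_distances(values):
    """Pairwise absolute distances, double-centred by row, column and grand means."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]  # equal to column means (symmetric matrix)
    grand_mean = sum(row_means) / n
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_covariance(x, y):
    """Sample distance covariance of two equally long univariate samples."""
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    n = len(x)
    a = centred_distances(x)
    b = centred_distances(y)
    squared = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative rounding


def covariance(x, y):
    """Ordinary (Pearson-type) sample covariance, for contrast."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((p - mx) * (q - my) for p, q in zip(x, y)) / len(x)


x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]                 # fully dependent on x, but not linearly

print(covariance(x, y))                # zero: linear covariance misses the dependence
print(distance_covariance(x, y))       # positive: distance covariance detects it
```

The contrast between the two printed values is the reason a distance-covariance criterion can target "all forms of dependence" rather than linear correlation only.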
Consistency and differences between centrality measures across distinct classes of networks
2019
The roles of different nodes within a network are often understood through centrality analysis, which aims to quantify the capacity of a node to influence, or be influenced by, other nodes via its connection topology. Many different centrality measures have been proposed, but the degree to which they offer unique information, and whether it is advantageous to use multiple centrality measures to define node roles, is unclear. Here we calculate correlations between 17 different centrality measures across 212 diverse real-world networks, examine how these correlations relate to variations in network density and global topology, and investigate whether nodes can be clustered into distinct classes according to their centrality profiles. We find that centrality measures are generally positively correlated to each other, the strength of these correlations varies across networks, and network modularity plays a key role in driving these cross-network variations. Data-driven clustering of nodes based on centrality profiles can distinguish different roles, including topological cores of highly central nodes and peripheries of less central nodes. Our findings illustrate how network topology shapes the pattern of correlations between centrality measures and demonstrate how a comparative approach to network centrality can inform the interpretation of nodal roles in complex networks.
Journal Article
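The comparison described in the entry above, correlating centrality measures across the nodes of a network, can be sketched on a toy graph. The sketch computes only two measures (degree and closeness) and uses the Pearson coefficient. Both choices are assumptions made for illustration, as the abstract does not say which coefficient was used. The five-node graph is hypothetical.

```python
import math
from collections import deque

# Hypothetical undirected graph as an adjacency mapping.
graph = {
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a"],
    "d": ["a", "e"],
    "e": ["d"],
}


def degree_centrality(g):
    """Number of neighbours of each node."""
    return {node: len(neigh) for node, neigh in g.items()}


def closeness_centrality(g):
    """(n - 1) divided by the sum of shortest-path lengths (connected graph assumed)."""
    result = {}
    for start in g:
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            node = queue.popleft()
            for nxt in g[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        result[start] = (len(g) - 1) / sum(dist.values())
    return result


def pearson(u, v):
    """Pearson correlation of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    return cov / math.sqrt(sum((p - mu) ** 2 for p in u) * sum((q - mv) ** 2 for q in v))


degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
nodes = sorted(graph)
r = pearson([degree[n] for n in nodes], [closeness[n] for n in nodes])
print(degree, closeness, r)
```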
Behavior change due to COVID-19 among dental academics—The theory of planned behavior: Stresses, worries, training, and pandemic severity
2020
The COVID-19 pandemic led to major life changes. We assessed the psychological impact of COVID-19 on dental academics globally and on changes in their behaviors.
We invited dental academics to complete a cross-sectional, online survey from March to May 2020. The survey was based on the Theory of Planned Behavior (TPB). The survey collected data on participants' stress levels (using the Impact of Event Scale), attitude (fears and worries because of COVID-19, extracted by Principal Component Analysis (PCA)), perceived control (resulting from training on public health emergencies), norms (country-level COVID-19 fatality rate), and personal and professional backgrounds. We used multilevel regression models to assess the association between the study outcome variables (frequent handwashing and avoidance of crowded places) and explanatory variables (stress, attitude, perceived control and norms).
1862 academics from 28 countries participated in the survey (response rate = 11.3%). Of those, 53.4% were female, 32.9% were <46 years old and 9.9% had severe stress. PCA extracted three main factors: fear of infection, worries because of professional responsibilities, and worries because of restricted mobility. These factors had significant dose-dependent association with stress and were significantly associated with more frequent handwashing by dental academics (B = 0.56, 0.33, and 0.34) and avoiding crowded places (B = 0.55, 0.30, and 0.28). Low country fatality rates were significantly associated with more handwashing (B = -2.82) and avoiding crowded places (B = -6.61). Training on public health emergencies was not significantly associated with behavior change (B = -0.01 and -0.11).
COVID-19 had a considerable psychological impact on dental academics. There was a direct, dose-dependent association between change in behaviors and worries but no association between these changes and training on public health emergencies. More change in behaviors was associated with lower country COVID-19 fatality rates. Fears and stresses were associated with greater adoption of preventive measures against the pandemic.
Journal Article
Independent EEG Sources Are Dipolar
2012
Independent component analysis (ICA) and blind source separation (BSS) methods are increasingly used to separate individual brain and non-brain source signals mixed by volume conduction in electroencephalographic (EEG) and other electrophysiological recordings. We compared results of decomposing thirteen 71-channel human scalp EEG datasets by 22 ICA and BSS algorithms, assessing the pairwise mutual information (PMI) in scalp channel pairs, the remaining PMI in component pairs, the overall mutual information reduction (MIR) effected by each decomposition, and decomposition 'dipolarity' defined as the number of component scalp maps matching the projection of a single equivalent dipole with less than a given residual variance. The least well-performing algorithm was principal component analysis (PCA); best performing were AMICA and other likelihood/mutual information based ICA methods. Though these and other commonly-used decomposition methods returned many similar components, across 18 ICA/BSS algorithms mean dipolarity varied linearly with both MIR and with PMI remaining between the resulting component time courses, a result compatible with an interpretation of many maximally independent EEG components as being volume-conducted projections of partially-synchronous local cortical field activity within single compact cortical domains. To encourage further method comparisons, the data and software used to prepare the results have been made available (http://sccn.ucsd.edu/wiki/BSSComparison).
Journal Article
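The pairwise mutual information (PMI) used as a criterion in the entry above can be sketched for two discrete sequences. This is a simple plug-in estimate from observed frequencies. EEG signals are continuous, and the abstract does not say how the study estimated mutual information, so the discretised inputs and the function name are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equally long sequences of discrete symbols."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # joint symbol counts
    px = Counter(xs)               # marginal counts for xs
    py = Counter(ys)               # marginal counts for ys
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Hypothetical binarised signals.
channel = [0, 0, 1, 1]
mi_identical = mutual_information(channel, channel)         # fully shared information
mi_independent = mutual_information(channel, [0, 1, 0, 1])  # no shared information
print(mi_identical, mi_independent)
```

A decomposition that lowers this quantity between component pairs, relative to channel pairs, achieves the mutual information reduction (MIR) that the abstract reports.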
Inferring Population Structure and Admixture Proportions in Low-Depth NGS Data
2018
Meisner and Albrechtsen present two methods for inferring population structure and admixture proportions in low-depth next-generation sequencing (NGS) data.
We here present two methods for inferring population structure and admixture proportions in low-depth next-generation sequencing (NGS) data. Inference of population structure is essential in both population genetics and association studies, and is often performed using principal component analysis (PCA) or clustering-based approaches. NGS methods provide large amounts of genetic data but are associated with statistical uncertainty, especially for low-depth sequencing data. Models can account for this uncertainty by working directly on genotype likelihoods of the unobserved genotypes. We propose a method for inferring population structure through PCA in an iterative heuristic approach of estimating individual allele frequencies, where we demonstrate improved accuracy in samples with low and variable sequencing depth for both simulated and real datasets. We also use the estimated individual allele frequencies in a fast non-negative matrix factorization method to estimate admixture proportions. Both methods have been implemented in the PCAngsd framework available at http://www.popgen.dk/software/.
Journal Article
Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains
2018
Existing approaches for multivariate functional principal component analysis are restricted to data on the same one-dimensional interval. The presented approach focuses on multivariate functional data on different domains that may differ in dimension, such as functions and images. The theoretical basis for multivariate functional principal component analysis is given in terms of a Karhunen-Loève Theorem. For the practically relevant case of a finite Karhunen-Loève representation, a relationship between univariate and multivariate functional principal component analysis is established. This offers an estimation strategy to calculate multivariate functional principal components and scores based on their univariate counterparts. For the resulting estimators, asymptotic results are derived. The approach can be extended to finite univariate expansions in general, not necessarily orthonormal bases. It is also applicable for sparse functional data or data with measurement error. A flexible R implementation is available on CRAN. The new method is shown to be competitive to existing approaches for data observed on a common one-dimensional domain. The motivating application is a neuroimaging study, where the goal is to explore how longitudinal trajectories of a neuropsychological test score covary with FDG-PET brain scans at baseline. Supplementary material, including detailed proofs, additional simulation results, and software is available online.
Journal Article
Robust principal component analysis for accurate outlier sample detection in RNA-Seq data
2020
Background
High-throughput RNA sequencing is a powerful approach to study gene expression. Because data acquisition involves complex multi-step protocols, a sample may deviate extremely from other samples of the same treatment group owing to technical variation or true biological differences. The high dimensionality of the data, combined with few biological replicates, makes it challenging to detect those samples accurately, and this issue is currently not well studied in the literature. Robust statistics is a family of theories and techniques that aim to detect outliers by first fitting the majority of the data and then flagging data points that deviate from it. Robust statistics have been widely used in multivariate data analysis for outlier detection in chemometrics and engineering. Here we apply robust statistics to RNA-seq data analysis.
Results
We report the use of two robust principal component analysis (rPCA) methods, PcaHubert and PcaGrid, to detect outlier samples in multiple simulated and real biological RNA-seq data sets with positive control outlier samples. PcaGrid achieved 100% sensitivity and 100% specificity in all the tests using positive control outliers with varying degrees of divergence. We applied rPCA methods and classical principal component analysis (cPCA) on an RNA-Seq data set profiling gene expression of the external granule layer in the cerebellum of control and conditional SnoN knockout mice. Both rPCA methods detected the same two outlier samples but cPCA failed to detect any. We performed differentially expressed gene detection before and after outlier removal as well as with and without batch effect modeling. We validated gene expression changes using quantitative reverse transcription PCR and used the result as reference to compare the performance of eight different data analysis strategies. Removing outliers without batch effect modeling performed the best in terms of detecting biologically relevant differentially expressed genes.
Conclusions
rPCA implemented in the PcaGrid function is an accurate and objective method to detect outlier samples. It is well suited for high-dimensional data with small sample sizes like RNA-seq data. Outlier removal can significantly improve the performance of differential gene detection and downstream functional analysis.
Journal Article
Principal Component Analyses (PCA)-based findings in population genetic studies are highly biased and must be reevaluated
2022
Principal Component Analysis (PCA) is a multivariate analysis that reduces the complexity of datasets while preserving data covariance. The outcome can be visualized on colorful scatterplots, ideally with only a minimal loss of information. PCA applications, implemented in well-cited packages like EIGENSOFT and PLINK, are extensively used as the foremost analyses in population genetics and related fields (e.g., animal and plant or medical genetics). PCA outcomes are used to shape study design, identify, and characterize individuals and populations, and draw historical and ethnobiological conclusions on origins, evolution, dispersion, and relatedness. The replicability crisis in science has prompted us to evaluate whether PCA results are reliable, robust, and replicable. We analyzed twelve common test cases using an intuitive color-based model alongside human population data. We demonstrate that PCA results can be artifacts of the data and can be easily manipulated to generate desired outcomes. PCA adjustment also yielded unfavorable outcomes in association studies. PCA results may not be reliable, robust, or replicable as the field assumes. Our findings raise concerns about the validity of results reported in the population genetics literature and related fields that place a disproportionate reliance upon PCA outcomes and the insights derived from them. We conclude that PCA may have a biasing role in genetic investigations and that 32,000-216,000 genetic studies should be reevaluated. An alternative mixed-admixture population genetic model is discussed.
Journal Article