Search Results Heading

MBRLSearchResults

mbrl.module.common.modules.added.book.to.shelf
Title added to your shelf!
View what I already have on My Shelf.
Oops! Something went wrong.
Oops! Something went wrong.
While trying to add the title to your shelf something went wrong :( Kindly try again later!
Are you sure you want to remove the book from the shelf?
Oops! Something went wrong.
Oops! Something went wrong.
While trying to remove the title from your shelf something went wrong :( Kindly try again later!
    Done
    Filters
    Reset
  • Language
      Language
      Clear All
      Language
  • Subject
      Subject
      Clear All
      Subject
  • Item Type
      Item Type
      Clear All
      Item Type
  • Discipline
      Discipline
      Clear All
      Discipline
  • Year
      Year
      Clear All
      From:
      -
      To:
  • More Filters
46 result(s) for "Gorlov, Ivan P."
Sort by:
Why does the X chromosome lag behind autosomes in GWAS findings?
The X-chromosome is among the largest human chromosomes. It differs from autosomes by a number of important features including hemizygosity in males, an almost complete inactivation of one copy in females, and unique patterns of recombination. We used data from the Catalog of Published Genome Wide Association Studies to compare densities of the GWAS-detected SNPs on the X-chromosome and autosomes. The density of GWAS-detected SNPs on the X-chromosome is 6-fold lower compared to the density of the GWAS-detected SNPs on autosomes. Differences between the X-chromosome and autosomes cannot be explained by differences in the overall SNP density, lower X-chromosome coverage by genotyping platforms or low call rate of X-chromosomal SNPs. Similar differences in the density of GWAS-detected SNPs were found in female-only GWASs (e.g. ovarian cancer GWASs). We hypothesized that the lower density of GWAS-detected SNPs on the X-chromosome compared to autosomes is not a result of a methodological bias, e.g. differences in coverage or call rates, but has a real underlying biological reason–a lower density of functional SNPs on the X-chromosome versus autosomes . This hypothesis is supported by the observation that (i) the overall SNP density of X-chromosome is lower compared to the SNP density on autosomes and that (ii) the density of genic SNPs on the X-chromosome is lower compared to autosomes while densities of intergenic SNPs are similar.
Strength of selection in lung tumors correlates with clinical features better than tumor mutation burden
Single nucleotide substitutions are the most common type of somatic mutations in cancer genome. The goal of this study was to use publicly available somatic mutation data to quantify negative and positive selection in individual lung tumors and test how strength of directional and absolute selection is associated with clinical features. The analysis found a significant variation in strength of selection (both negative and positive) among tumors, with median selection tending to be negative even though tumors with strong positive selection also exist. Strength of selection estimated as the density of missense mutations relative to the density of silent mutations showed only a weak correlation with tumor mutation burden. In the “all histology together” analysis we found that absolute strength of selection was strongly correlated with all clinically relevant features analyzed. In histology-stratified analysis selection was strongest in small cell lung cancer. Selection in adenocarcinoma was somewhat higher compared to squamous cell carcinoma. The study suggests that somatic mutation- based quantifying of directional and absolute selection in individual tumors can be a useful biomarker of tumor aggressiveness.
Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1
To identify risk variants for lung cancer, we conducted a multistage genome-wide association study. In the discovery phase, we analyzed 315,450 tagging SNPs in 1,154 current and former (ever) smoking cases of European ancestry and 1,137 frequency-matched, ever-smoking controls from Houston, Texas. For replication, we evaluated the ten SNPs most significantly associated with lung cancer in an additional 711 cases and 632 controls from Texas and 2,013 cases and 3,062 controls from the UK. Two SNPs, rs1051730 and rs8034191, mapping to a region of strong linkage disequilibrium within 15q25.1 containing PSMA4 and the nicotinic acetylcholine receptor subunit genes CHRNA3 and CHRNA5 , were significantly associated with risk in both replication sets. Combined analysis yielded odds ratios of 1.32 ( P < 1 × 10 −17 ) for both SNPs. Haplotype analysis was consistent with there being a single risk variant in this region. We conclude that variation in a region of 15q25.1 containing nicotinic acetylcholine receptors genes contributes to lung cancer risk.
Ancestry inference using principal component analysis and spatial analysis: a distance-based analysis to account for population substructure
Background Accurate inference of genetic ancestry is of fundamental interest to many biomedical, forensic, and anthropological research areas. Genetic ancestry memberships may relate to genetic disease risks. In a genome association study, failing to account for differences in genetic ancestry between cases and controls may also lead to false-positive results. Although a number of strategies for inferring and taking into account the confounding effects of genetic ancestry are available, applying them to large studies (tens thousands samples) is challenging. The goal of this study is to develop an approach for inferring genetic ancestry of samples with unknown ancestry among closely related populations and to provide accurate estimates of ancestry for application to large-scale studies. Methods In this study we developed a novel distance-based approach, Ancestry Inference using Principal component analysis and Spatial analysis (AIPS) that incorporates an Inverse Distance Weighted (IDW) interpolation method from spatial analysis to assign individuals to population memberships. Results We demonstrate the benefits of AIPS in analyzing population substructure, specifically related to the four most commonly used tools EIGENSTRAT, STRUCTURE, fastSTRUCTURE, and ADMIXTURE using genotype data from various intra-European panels and European-Americans. While the aforementioned commonly used tools performed poorly in inferring ancestry from a large number of subpopulations, AIPS accurately distinguished variations between and within subpopulations. Conclusions Our results show that AIPS can be applied to large-scale data sets to discriminate the modest variability among intra-continental populations as well as for characterizing inter-continental variation. The method we developed will protect against spurious associations when mapping the genetic basis of a disease. Our approach is more accurate and computationally efficient method for inferring genetic ancestry in the large-scale genetic studies.
Exposure-inducible genes may contribute to missingness in RNAseq-based gene expression analyses
Missing gene expression values are a common issue in RNAseq-based analyses of gene expression. However, an analysis of genetic and environmental factors contributing to data missingness in RNAseq-based assessment of gene expression has never been conducted. In this study we tried to identify factors in RNAseq data missingness. We used RNAseq data from 66 lung adenocarcinoma tumors and corresponding adjacent normal lung tissues. We found a strong negative association between the gene expression level and missingness, supporting the idea that the borderline expression level is a key contributor to missingness. In a more detailed analysis, the relationship between gene expression and missingness was more complex: while the expected negative association between missingness and the expression level was observed for genes with low missingness, mean expression spiked at the right end of the distribution which included genes with very high missingness. We hypothesized that genes with a high missing rate include not only genes with borderline expression but also genes with high expression in some individuals but no expression in others (true biological missingness, TBM). The results of the comparative analysis of missingness in smokers and nonsmokers, an examination of the proportion of known tobacco smoke-sensitive genes by missing rate, and gene enrichment analysis support the hypothesis. We argue that it would be beneficial first to check data for the presence of genes with true biological missingness. The presence of highly expressed genes with missingness is an indication of TBM related to inter-individual variation in gene expression level. The results of our analysis call for caution in indiscriminatory imputation of missing values. When true biological missingness is present, it is advisable to identify genes with true biological missingness and analyze them separately because including such genes in imputation will lead to a bias: expression values will be assigned to a subset of the genes that are not expressed.
Gene expression in tumor and adjacent normal tissues in lung adenocarcinoma subtypes
Background Lung adenocarcinoma (LUAD) has several histologically distinct subtypes that differ by a number of clinical features including patient survival. Molecular mechanisms underlying histological and clinical differences between subtypes remain poorly understood. Methods We conducted a comparative analyses of gene expression in acinar, lepidic, papillary and solid subtypes, as well as mucinous adenocarcinoma. We used a novel, more efficient approach to identify subtype-specific genes. We compared the mean gene expression level separately for tumors and adjacent normal tissue with pure or a highly represented (≥ 75%) subtype of interest to the mean expression in tumors where the subtype of interest was not present. We also performed tumor to adjacent normal tissue comparisons and identified genes differentially expressed between tumor and adjacent normal tissues for each subtype. Results The number of subtype-specific genes varied from 1 for the acinar to 482 for the papillary subtype. Comparative analysis of gene expression in adjacent normal tissues also identified subtype-specific genes, 38 in total. Gene set enrichment analysis identified oxidative phosphorylation as a biological function associated with papillary, and immune response - with solid subtype. Using data on differential expression between tumor and adjacent normal tissue among the subtype-specific genes and existing evidence for association with lung carcinogenesis, we have identified several candidate subtype-specific driver genes. Conclusio n We identified subtype-specific genes, biological functions, and potential drivers of subtype-specific carcinogenesis for LUAD subtypes. The study showed importance of gene expression in adjacent normal tissue for subtype-specific tumorigenesis.
Allelic Spectra of Risk SNPs Are Different for Environment/Lifestyle Dependent versus Independent Diseases
Genome-wide association studies (GWAS) have generated sufficient data to assess the role of selection in shaping allelic diversity of disease-associated SNPs. Negative selection against disease risk variants is expected to reduce their frequencies making them overrepresented in the group of minor (<50%) alleles. Indeed, we found that the overall proportion of risk alleles was higher among alleles with frequency <50% (minor alleles) compared to that in the group of major alleles. We hypothesized that negative selection may have different effects on environment (or lifestyle)-dependent versus environment (or lifestyle)-independent diseases. We used an environment/lifestyle index (ELI) to assess influence of environmental/lifestyle factors on disease etiology. ELI was defined as the number of publications mentioning \"environment\" or \"lifestyle\" AND disease per 1,000 disease-mentioning publications. We found that the frequency distributions of the risk alleles for the diseases with strong environmental/lifestyle components follow the distribution expected under a selectively neutral model, while frequency distributions of the risk alleles for the diseases with weak environmental/lifestyle influences is shifted to the lower values indicating effects of negative selection. We hypothesized that previously selectively neutral variants become risk alleles when environment changes. The hypothesis of ancestrally neutral, currently disadvantageous risk-associated alleles predicts that the distribution of risk alleles for the environment/lifestyle dependent diseases will follow a neutral model since natural selection has not had enough time to influence allele frequencies. The results of our analysis suggest that prediction of SNP functionality based on the level of evolutionary conservation may not be useful for SNPs associated with environment/lifestyle dependent diseases.
Gene characteristics predicting missense, nonsense and frameshift mutations in tumor samples
Background Because driver mutations provide selective advantage to the mutant clone, they tend to occur at a higher frequency in tumor samples compared to selectively neutral (passenger) mutations. However, mutation frequency alone is insufficient to identify cancer genes because mutability is influenced by many gene characteristics, such as size, nucleotide composition, etc. The goal of this study was to identify gene characteristics associated with the frequency of somatic mutations in the gene in tumor samples. Results We used data on somatic mutations detected by genome wide screens from the Catalog of Somatic Mutations in Cancer (COSMIC). Gene size, nucleotide composition, expression level of the gene, relative replication time in the cell cycle, level of evolutionary conservation and other gene characteristics (totaling 11) were used as predictors of the number of somatic mutations. We applied stepwise multiple linear regression to predict the number of mutations per gene. Because missense, nonsense, and frameshift mutations are associated with different sets of gene characteristics, they were modeled separately. Gene characteristics explain 88% of the variation in the number of missense, 40% of nonsense, and 23% of frameshift mutations. Comparisons of the observed and expected numbers of mutations identified genes with a higher than expected number of mutations– positive outliers. Many of these are known driver genes. A number of novel candidate driver genes was also identified. Conclusions By comparing the observed and predicted number of mutations in a gene, we have identified known cancer-associated genes as well as 111 novel cancer associated genes. We also showed that adding the number of silent mutations per gene reported by genome/exome wide screens across all cancer type (COSMIC data) as a predictor substantially exceeds predicting accuracy of the most popular cancer gene predicting tool - MutsigCV.
SNP characteristics and validation success in genome wide association studies
Genome wide association studies (GWASs) have identified tens of thousands of single nucleotide polymorphisms (SNPs) associated with human diseases and characteristics. A significant fraction of GWAS findings can be false positives. The gold standard for true positives is an independent validation. The goal of this study was to identify SNP features associated with validation success. Summary statistics from the Catalog of Published GWASs were used in the analysis. Since our goal was an analysis of reproducibility, we focused on the diseases/phenotypes targeted by at least 10 GWASs. GWASs were arranged in discovery-validation pairs based on the time of publication, with the discovery GWAS published before validation. We used four definitions of the validation success that differ by stringency. Associations of SNP features with validation success were consistent across the definitions. The strongest predictor of SNP validation was the level of statistical significance in the discovery GWAS. The magnitude of the effect size was associated with validation success in a non-linear manner. SNPs with risk allele frequencies in the range 30–70% showed a higher validation success rate compared to rarer or more common SNPs. Missense, 5’UTR, stop gained, and SNPs located in transcription factor binding sites had a higher validation success rate compared to intergenic, intronic and synonymous SNPs. There was a positive association between validation success and the level of evolutionary conservation of the sites. In addition, validation success was higher when discovery and validation GWASs targeted the same ethnicity. All predictors of validation success remained significant in a multivariate logistic regression model indicating their independent contribution. To conclude, we identified SNP features predicting validation success of GWAS hits. These features can be used to select SNPs for validation and downstream functional studies.