Catalogue Search | MBRL

The network experience : new value from Smart Business Networks

by Vervest, Peter H. M , Liere, Diederik W. van , Zheng, Li, 1964- in Computer networks Management.

Book

Extracting health-related causality from twitter messages using natural language processing

by Doan, Son , Li, Peter W. , Torii, Manabu in Annotations , Artificial intelligence , Automation

2019

Background: Twitter messages (tweets) cover many topics from daily life, including health-related topics. Analysis of health-related tweets would help us understand health conditions and concerns encountered in our daily lives. In this paper we evaluate an approach to extracting causalities from tweets using natural language processing (NLP) techniques. Methods: Lexico-syntactic patterns based on dependency parser outputs are used for causality extraction. We focused on three health-related topics: “stress”, “insomnia”, and “headache.” A large dataset consisting of 24 million tweets is used. Results: The proposed approach achieved an average precision between 74.59% and 92.27% in comparison with human annotations. Conclusions: Manual analysis of the extracted causalities reveals interesting findings about how Twitter users express themselves on health-related topics.

Journal Article

Recent Segmental Duplications in the Human Genome

by Gu, Zhiping , Schwartz, Stuart , Bailey, Jeffrey A. in Alleles , Analysis , Base Sequence

2002

Primate-specific segmental duplications are considered important in human disease and evolution. The inability to distinguish between allelic and duplication sequence overlap has hampered their characterization as well as assembly and annotation of our genome. We developed a method whereby each public sequence is analyzed at the clone level for overrepresentation within a whole-genome shotgun sequence. This test has the ability to detect duplications larger than 15 kilobases irrespective of copy number, location, or high sequence similarity. We mapped 169 large regions flanked by highly similar duplications. Twenty-four of these hot spots of genomic instability have been associated with genetic disease. Our analysis indicates a highly nonrandom chromosomal and genic distribution of recent segmental duplications, with a likely role in expanding protein diversity.

Journal Article

Common Oncogene Mutations and Novel SND1-BRAF Transcript Fusion in Lung Adenocarcinoma from Never Smokers

by Sun, ZhiFu , Jeon, Hyo-Sung , Liyanage, Hema in 38/91 , 45/77 , 49/47

2015

Lung adenocarcinomas from never smokers account for approximately 15 to 20% of all lung cancers and these tumors often carry genetic alterations that are responsive to targeted therapy. Here we examined mutation status in 10 oncogenes among 89 lung adenocarcinomas from never smokers. We also screened for oncogene fusion transcripts in 20 of the 89 tumors by RNA-Seq. In total, 62 tumors had mutations in at least one of the 10 oncogenes, including EGFR (49 cases, 55%), K-ras (5 cases, 6%), BRAF (4 cases, 5%), PIK3CA (3 cases, 3%) and ERBB2 (4 cases, 5%). In addition to ALK fusions identified by IHC/FISH in four cases, two previously known fusions involving EZR-ROS1 and KIF5B-RET were identified by RNA-Seq as well as a third novel fusion transcript that was formed between exons 1–9 of SND1 and exons 2 to 3′ end of BRAF. This in-frame fusion was observed in 3/89 tested tumors and 2/64 additional never smoker lung adenocarcinoma samples. Ectopic expression of SND1-BRAF in H1299 cells increased phosphorylation levels of MEK/ERK, cell proliferation and spheroid formation compared to parental mock-transfected control. Jointly, our results suggest a potential role of the novel BRAF fusion in lung cancer development and therapy.

Journal Article

ReliefSeq: A Gene-Wise Adaptive-K Nearest-Neighbor Feature Selection Tool for Finding Gene-Gene Interactions and Main Effects in mRNA-Seq Gene Expression Data

by White, Bill C. , Oberg, Ann L. , McKinney, Brett A. in Algorithms , Artificial Intelligence , Binomial distribution

2013

Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. 
ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to detect both main effects and interaction effects. Software Availability: http://insilico.utulsa.edu/ReliefSeq.php.

Journal Article

Rationalizing governance of genetically modified products in developing countries

by Kearns, Peter , Murphy, Denis J , Li, Yun-He in 631/61 , 631/61/447 , Agriculture

2018

Journal Article

ReliefSeq: A Gene-Wise Adaptive-K Nearest-Neighbor Feature Selection Tool for Finding Gene-Gene Interactions and Main Effects in mRNA-Seq Gene Expression Data: e81527

by Grill, Diane E , Poland, Gregory A , Li, Peter W

2013

Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. 
ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to detect both main effects and interaction effects. Software Availability: http://insilico.utulsa.edu/ReliefSeq.php.

Journal Article

Recent segmental duplications in the human genome

by Gu, Zhiping , Schwartz, Stuart , Bailey, Jeffrey A. in Analysis , Chromosome mapping , Genetic algorithms

2002

Journal Article

Human chromosome 7: DNA sequence and Biology

by Farra, Chantal G. , Kim, Hyung-Goo , Kwasnicka, Dorota in Analysis , Chromosome mapping , Genetic disorders

2003