Catalogue Search | MBRL
1,855 result(s) for "Li, Peter W."
Extracting health-related causality from twitter messages using natural language processing
2019
Background
Twitter messages (tweets) contain various types of topics in our daily life, which include health-related topics. Analysis of health-related tweets would help us understand health conditions and concerns encountered in our daily lives. In this paper we evaluate an approach to extracting causalities from tweets using natural language processing (NLP) techniques.
Methods
Lexico-syntactic patterns based on dependency parser outputs are used for causality extraction. We focused on three health-related topics: “stress”, “insomnia”, and “headache.” A large dataset consisting of 24 million tweets is used.
Results
The results show that the proposed approach achieved an average precision between 74.59% and 92.27% in comparison with human annotations.
Conclusions
Manual analysis of extracted causalities in tweets reveals interesting findings about expressions on health-related topics posted by Twitter users.
Journal Article
Recent Segmental Duplications in the Human Genome
2002
Primate-specific segmental duplications are considered important in human disease and evolution. The inability to distinguish between allelic and duplication sequence overlap has hampered their characterization as well as assembly and annotation of our genome. We developed a method whereby each public sequence is analyzed at the clone level for overrepresentation within a whole-genome shotgun sequence. This test has the ability to detect duplications larger than 15 kilobases irrespective of copy number, location, or high sequence similarity. We mapped 169 large regions flanked by highly similar duplications. Twenty-four of these hot spots of genomic instability have been associated with genetic disease. Our analysis indicates a highly nonrandom chromosomal and genic distribution of recent segmental duplications, with a likely role in expanding protein diversity.
Journal Article
Common Oncogene Mutations and Novel SND1-BRAF Transcript Fusion in Lung Adenocarcinoma from Never Smokers
2015
Lung adenocarcinomas from never smokers account for approximately 15 to 20% of all lung cancers and these tumors often carry genetic alterations that are responsive to targeted therapy. Here we examined mutation status in 10 oncogenes among 89 lung adenocarcinomas from never smokers. We also screened for oncogene fusion transcripts in 20 of the 89 tumors by RNA-Seq. In total, 62 tumors had mutations in at least one of the 10 oncogenes, including EGFR (49 cases, 55%), K-ras (5 cases, 6%), BRAF (4 cases, 5%), PIK3CA (3 cases, 3%) and ERBB2 (4 cases, 5%). In addition to ALK fusions identified by IHC/FISH in four cases, two previously known fusions involving EZR-ROS1 and KIF5B-RET were identified by RNA-Seq as well as a third novel fusion transcript that was formed between exons 1–9 of SND1 and exons 2 to 3′ end of BRAF. This in-frame fusion was observed in 3/89 tested tumors and 2/64 additional never smoker lung adenocarcinoma samples. Ectopic expression of SND1-BRAF in H1299 cells increased phosphorylation levels of MEK/ERK, cell proliferation and spheroid formation compared to parental mock-transfected control. Jointly, our results suggest a potential role of the novel BRAF fusion in lung cancer development and therapy.
Journal Article
ReliefSeq: A Gene-Wise Adaptive-K Nearest-Neighbor Feature Selection Tool for Finding Gene-Gene Interactions and Main Effects in mRNA-Seq Gene Expression Data
by White, Bill C.; Oberg, Ann L.; McKinney, Brett A.
in Algorithms, Artificial Intelligence, Binomial distribution
2013
Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. 
ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to detect both main effects and interaction effects. Software Availability: http://insilico.utulsa.edu/ReliefSeq.php.
Journal Article
Recent segmental duplications in the human genome
by Gu, Zhiping; Schwartz, Stuart; Bailey, Jeffrey A.
in Analysis, Chromosome mapping, Genetic algorithms
2002
Journal Article
Human chromosome 7: DNA sequence and Biology
by Farra, Chantal G.; Kim, Hyung-Goo; Kwasnicka, Dorota
in Analysis, Chromosome mapping, Genetic disorders
2003
Journal Article
Comparative Genomics of the Eukaryotes
by Lemaitre, Bruno; Boguski, Mark S.; Nelson, Catherine R.
in Cells, Chromosome mapping, Drosophila
2000
Journal Article