3,572 result(s) for "overlapping"
Genomic Prediction Enhanced Sparse Testing for Multi-environment Trials
“Sparse testing” refers to reduced multi-environment breeding trials in which not all genotypes of interest are grown in each environment. Using genomic-enabled prediction and a model embracing genotype × environment interaction (GE), the non-observed genotype-in-environment combinations can be predicted. Consequently, the overall costs can be reduced and the testing capacities can be increased. The accuracy of predicting the unobserved data depends on different factors including (1) how many genotypes overlap between environments, (2) in how many environments each genotype is grown, and (3) which prediction method is used. In this research, we studied the predictive ability obtained when using a fixed number of plots and different sparse testing designs. The considered designs included the extreme cases of (1) no overlap of genotypes between environments, and (2) complete overlap of the genotypes between environments. In the latter case, the prediction set fully consists of genotypes that have not been tested at all. Moreover, we gradually go from one extreme to the other considering (3) intermediates between the two previous cases with varying numbers of different or non-overlapping (NO)/overlapping (O) genotypes. The empirical study is built upon two different maize hybrid data sets consisting of different genotypes crossed to two different testers (T1 and T2) and each data set was analyzed separately. For each set, phenotypic records on yield from three different environments are available. Three different prediction models were implemented, two main effects models (M1 and M2), and a model (M3) including GE. The results showed that the genome-based model including GE (M3) captured more phenotypic variation than the models that did not include this component. Also, M3 provided higher prediction accuracy than models M1 and M2 for the different allocation scenarios. 
Reducing the size of the calibration sets decreased the prediction accuracy under all allocation designs, with M3 being the least affected model; however, using the genome-enabled models (i.e., M2 and M3), the predictive ability is recovered when more genotypes are tested across environments. Our results indicate that a substantial part of the testing resources can be saved when using genome-based models including GE for optimizing sparse testing designs.
Down-weighting overlapping genes improves gene set analysis
Background: The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. Results: In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize the genes appearing in few gene sets, versus genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. Conclusions: PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org.
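The scoring step described in this abstract can be illustrated with a minimal Python sketch. It is not the PADOG R package. The abstract only says that weights favor genes appearing in few gene sets, so the specific weight formula below (a value between 1 and 2 that decreases with how many sets contain the gene) is an assumption, as are the function names `gene_weights` and `set_score`. Moderated t-scores are taken as given inputs, and any significance assessment is left out.

```python
from collections import Counter
from math import sqrt


def gene_weights(gene_sets):
    """Assumed weighting: genes found in few sets get weights near 2,
    genes found in many sets get weights near 1."""
    # Count in how many gene sets each gene appears.
    freq = Counter(g for genes in gene_sets.values() for g in set(genes))
    lo, hi = min(freq.values()), max(freq.values())
    if hi == lo:
        # Every gene is equally common, so no down-weighting is possible.
        return {g: 1.0 for g in freq}
    return {g: 1.0 + sqrt((hi - f) / (hi - lo)) for g, f in freq.items()}


def set_score(genes, t_scores, weights):
    """Mean of absolute values of weighted t-scores over one gene set."""
    vals = [abs(t_scores[g] * weights[g]) for g in genes if g in t_scores]
    return sum(vals) / len(vals) if vals else 0.0


# Hypothetical toy data: g2 is shared by all three sets and is down-weighted.
gene_sets = {"A": ["g1", "g2"], "B": ["g2", "g3"], "C": ["g2"]}
t_scores = {"g1": -1.0, "g2": 2.0, "g3": 0.5}
weights = gene_weights(gene_sets)
scores = {name: set_score(genes, t_scores, weights)
          for name, genes in gene_sets.items()}
```

In this toy input the shared gene g2 keeps weight 1 while the set-specific genes g1 and g3 receive weight 2, so their t-scores count for more in the set score.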
An algorithm for overlapping chromosome segmentation based on region selection
Chromosome images are commonly used in karyotype analysis to diagnose chromosomal diseases. However, there are often chromosome adhesion and overlaps in chromosome images, so effective chromosome segmentation is conducive to smooth karyotype analysis. To date, some progress has been made in automatic chromosome segmentation, and existing methods can be used to segment overlapping chromosomes in most cases. However, when two or more overlapping regions are too close to each other in the image of overlapping chromosomes, the existing segmentation methods adjust the non-overlapping regions that do not belong to the overlapping region, resulting in incomplete segmentation of chromatids. Therefore, we use a heuristic algorithm to solve this problem from the point of view of mathematics and geometry to improve the segmentation of overlapping chromosomes. Starting from chromosome images, the existing problems and solutions are explained and displayed in the way of visualized interpretable image features, which helps to better understand the algorithm. Our method achieves 92.86% splicing accuracy and 90.44% overall segmentation accuracy on open datasets. The experimental results show that our method can effectively improve the problem of incorrect chromosome segmentation when two or more overlapping parts of overlapping chromosomes are too close to each other. It can accelerate the development of artificial intelligence in computational pathology and provide patients with more accurate medical services.
A Simple Method to Detect Candidate Overlapping Genes in Viruses Using Single Genome Sequences
Overlapping genes in viruses maximize the coding capacity of their genomes and allow the generation of new genes without major increases in genome size. Despite their importance, the evolution and function of overlapping genes are often not well understood, in part due to difficulties in their detection. In addition, most bioinformatic approaches for the detection of overlapping genes require the comparison of multiple genome sequences that may not be available in metagenomic surveys of virus biodiversity. We introduce a simple new method for identifying candidate functional overlapping genes using single virus genome sequences. Our method uses randomization tests to estimate the expected length of open reading frames and then identifies overlapping open reading frames that significantly exceed this length and are thus predicted to be functional. We applied this method to 2548 reference RNA virus genomes and find that it has both high sensitivity and low false discovery for genes that overlap by at least 50 nucleotides. Notably, this analysis provided evidence for 29 previously undiscovered functional overlapping genes, some of which are coded in the antisense direction suggesting there are limitations in our current understanding of RNA virus replication.
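The general idea of a randomization test on open reading frame length can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' method: the abstract does not say how sequences are randomized, so plain nucleotide shuffling is used here, only forward-strand frames are considered, and the names `longest_orf`, `null_threshold`, and `is_candidate` as well as the default parameters are hypothetical.

```python
import random

STOPS = {"TAA", "TAG", "TGA"}


def longest_orf(seq, frame):
    """Length in codons of the longest stop-free run in one forward frame."""
    best = run = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            run = 0
        else:
            run += 1
            best = max(best, run)
    return best


def null_threshold(seq, frame, n_shuffles=200, quantile=0.95, seed=0):
    """Upper quantile of longest-ORF lengths over shuffled copies of seq.

    Assumption: the null model is a simple nucleotide shuffle, which keeps
    base composition but destroys codon structure.
    """
    rng = random.Random(seed)
    letters = list(seq)
    lengths = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        lengths.append(longest_orf("".join(letters), frame))
    lengths.sort()
    return lengths[min(int(quantile * n_shuffles), n_shuffles - 1)]


def is_candidate(seq, frame, **kw):
    """Flag a frame whose longest ORF exceeds the randomized expectation."""
    return longest_orf(seq, frame) > null_threshold(seq, frame, **kw)
```

A frame is flagged only when its observed longest open reading frame is strictly longer than the chosen quantile of the shuffled distribution, so a sequence that cannot produce stop codons at all is never flagged.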
UCoDe: unified community detection with graph convolutional networks
Community detection finds homogeneous groups of nodes in a graph. Existing approaches either partition the graph into disjoint, non-overlapping communities, or determine only overlapping communities. To date, no method supports the detection of both overlapping and non-overlapping communities. We propose UCoDe, a unified method for community detection in attributed graphs that detects both overlapping and non-overlapping communities by means of a novel contrastive loss that captures node similarity on a macro-scale. Our thorough experimental assessment on real data shows that, regardless of the data distribution, our method is either the top performer or among the top performers in both overlapping and non-overlapping detection without burdensome hyper-parameter tuning.
Generative artificial intelligence
Recent developments in the field of artificial intelligence (AI) have enabled new paradigms of machine processing, shifting from data-driven, discriminative AI tasks toward sophisticated, creative tasks through generative AI. Leveraging deep generative models, generative AI is capable of producing novel and realistic content across a broad spectrum (e.g., texts, images, or programming code) for various domains based on basic user prompts. In this article, we offer a comprehensive overview of the fundamentals of generative AI with its underpinning concepts and prospects. We provide a conceptual introduction to relevant terms and techniques, outline the inherent properties that constitute generative AI, and elaborate on the potentials and challenges. We underline the necessity for researchers and practitioners to comprehend the distinctive characteristics of generative artificial intelligence in order to harness its potential while mitigating its risks and to contribute to a principal understanding.
Algorithmic and mathematical modeling for synthetically controlled overlapping
For most classifiers, overlapping regions, where various classes are difficult to distinguish, affect the classifier’s overall performance in multi-class imbalanced data more than the imbalance itself. In problem-data space, the overlapped samples share similar characteristics, resulting in a complex boundary, making it difficult to separate the samples of classes from each other, causing performance degradation. The research community agreed upon the relationship of the class overlapping issues with the classifier performance, but how much the classifier is affected is still unanswered. There is also a gap in the literature to demonstrate the different levels of class overlapping in multi-class problems. Accordingly, in this paper, four algorithms are implemented to synthetically generate controlled overlapping samples to be used with multiclass datasets using different schemes to show the worst effect of class overlapping. Experiments involve using different state-of-the-art non-parametric classifiers, support vector machines, k-nearest neighbor, and random forest, to classify these multi-class datasets to validate the class overlapping effect on their learning. The models are used to test the suitability, stability, and versatility of the proposed algorithms for the schemes and to highlight the effect of growing overlapping samples in complex multi-class problems having an imbalanced distribution of data and class overlapping issues. The experimental results, using 20 real-world datasets, show the different levels of overlapping data and the effect of each level on the underlying classifiers.
Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies
Background: Recent developments in third-gen long read sequencing and diploid-aware assemblers have resulted in the rapid release of numerous reference-quality assemblies for diploid genomes. However, assembly of highly heterozygous genomes is still problematic when regional heterogeneity is so high that haplotype homology is not recognised during assembly. This results in regional duplication rather than consolidation into allelic variants and can cause issues with downstream analysis, for example variant discovery, or haplotype reconstruction using the diploid assembly with unpaired allelic contigs. Results: A new pipeline, Purge Haplotigs, was developed specifically for third-gen sequencing-based assemblies to automate the reassignment of allelic contigs, and to assist in the manual curation of genome assemblies. The pipeline uses a draft haplotype-fused assembly or a diploid assembly, read alignments, and repeat annotations to identify allelic variants in the primary assembly. The pipeline was tested on a simulated dataset and on four recent diploid (phased) de novo assemblies from third-generation long-read sequencing, and compared with a similar tool. After processing with Purge Haplotigs, haploid assemblies were less duplicated with minimal impact on genome completeness, and diploid assemblies had more pairings of allelic contigs. Conclusions: Purge Haplotigs improves the haploid and diploid representations of third-gen sequencing based genome assemblies by identifying and reassigning allelic contigs. The implementation is fast and scales well with large genomes, and it is less likely to over-purge repetitive or paralogous elements compared to alignment-only based methods. The software is available at https://bitbucket.org/mroachawri/purge_haplotigs under a permissive MIT licence.
Bidirectional promoters generate pervasive transcription in yeast
Genome-wide pervasive transcription has been reported in many eukaryotic organisms revealing a highly interleaved transcriptome organization that involves hundreds of previously unknown non-coding RNAs. These recently identified transcripts either exist stably in cells (stable unannotated transcripts, SUTs) or are rapidly degraded by the RNA surveillance pathway (cryptic unstable transcripts, CUTs). One characteristic of pervasive transcription is the extensive overlap of SUTs and CUTs with previously annotated features, which prompts questions regarding how these transcripts are generated, and whether they exert function. Single-gene studies have shown that transcription of SUTs and CUTs can be functional, through mechanisms involving the generated RNAs or their generation itself. So far, a complete transcriptome architecture including SUTs and CUTs has not been described in any organism. Knowledge about the position and genome-wide arrangement of these transcripts will be instrumental in understanding their function. Here we provide a comprehensive analysis of these transcripts in the context of multiple conditions, a mutant of the exosome machinery and different strain backgrounds of Saccharomyces cerevisiae. We show that both SUTs and CUTs display distinct patterns of distribution at specific locations. Most of the newly identified transcripts initiate from nucleosome-free regions (NFRs) associated with the promoters of other transcripts (mostly protein-coding genes), or from NFRs at the 3' ends of protein-coding genes. Likewise, about half of all coding transcripts initiate from NFRs associated with promoters of other transcripts. These data change our view of how a genome is transcribed, indicating that bidirectionality is an inherent feature of promoters. 
Such an arrangement of divergent and overlapping transcripts may provide a mechanism for local spreading of regulatory signals--that is, coupling the transcriptional regulation of neighbouring genes by means of transcriptional interference or histone modification.
Dynamically evolving novel overlapping gene as a factor in the SARS-CoV-2 pandemic
Understanding the emergence of novel viruses requires an accurate and comprehensive annotation of their genomes. Overlapping genes (OLGs) are common in viruses and have been associated with pandemics but are still widely overlooked. We identify and characterize ORF3d, a novel OLG in SARS-CoV-2 that is also present in Guangxi pangolin-CoVs but not other closely related pangolin-CoVs or bat-CoVs. We then document evidence of ORF3d translation, characterize its protein sequence, and conduct an evolutionary analysis at three levels: between taxa (21 members of Severe acute respiratory syndrome-related coronavirus), between human hosts (3978 SARS-CoV-2 consensus sequences), and within human hosts (401 deeply sequenced SARS-CoV-2 samples). ORF3d has been independently identified and shown to elicit a strong antibody response in COVID-19 patients. However, it has been misclassified as the unrelated gene ORF3b, leading to confusion. Our results liken ORF3d to other accessory genes in emerging viruses and highlight the importance of OLGs.