50 result(s) for "de Almeida, Bernardo P."
Identification of candidate causal variants and target genes at 41 breast cancer risk loci through differential allelic expression analysis
Understanding breast cancer genetic risk relies on identifying causal variants and candidate target genes in risk loci identified by genome-wide association studies (GWAS), which remains challenging. Since most loci fall in active gene regulatory regions, we developed a novel approach facilitated by pinpointing the variants with greater regulatory potential in the disease’s tissue of origin. Through genome-wide differential allelic expression (DAE) analysis, using microarray data from 64 normal breast tissue samples, we mapped the variants associated with DAE (daeQTLs). Then, we intersected these with GWAS data to reveal candidate risk regulatory variants and analysed their cis-acting regulatory potential. Finally, we validated our approach by extensive functional analysis of the 5q14.1 breast cancer risk locus. We observed widespread gene expression regulation by cis-acting variants in breast tissue, with 65% of coding and noncoding expressed genes displaying DAE (daeGenes). We identified over 54 K daeQTLs for 6761 (26%) daeGenes, including 385 daeGenes harbouring variants previously associated with BC risk. We found 1431 daeQTLs mapped to 93 different loci in strong linkage disequilibrium with risk-associated variants (risk-daeQTLs), suggesting a link between risk-causing variants and cis-regulation. There were 122 risk-daeQTL with stronger cis-acting potential in active regulatory regions with protein binding evidence. These variants mapped to 41 risk loci, of which 29 had no previous report of target genes and were candidates for regulating the expression levels of 65 genes. As validation, we identified and functionally characterised five candidate causal variants at the 5q14.1 risk locus targeting the ATG10 and ATP6AP1L genes, likely acting via modulation of alternative transcription and transcription factor binding. 
Our study demonstrates the power of DAE analysis and daeQTL mapping to identify causal regulatory variants and target genes at breast cancer risk loci, including those with complex regulatory landscapes. It additionally provides a genome-wide resource of variants associated with DAE for future functional studies.
Roadmap of DNA methylation in breast cancer identifies novel prognostic biomarkers
Background: Breast cancer is a highly heterogeneous disease resulting in diverse clinical behaviours and therapeutic responses. DNA methylation is a major epigenetic alteration that is commonly perturbed in cancers. The aim of this study is to characterize the relationship between DNA methylation and aberrant gene expression in breast cancer.
Methods: We analysed DNA methylation and gene expression profiles from breast cancer tissue and matched normal tissue in The Cancer Genome Atlas (TCGA). Genome-wide differential methylation analysis and methylation-gene expression correlation were performed. Gene expression changes were subsequently validated in the METABRIC dataset. The Oncoscore tool was used to identify genes that had previously been associated with cancer in the literature. A subset of genes that had not previously been studied in cancer was chosen for further analysis.
Results: We identified 368 CpGs that were differentially methylated between tumor and normal breast tissue (∆β > 0.4). Hypermethylated CpGs were overrepresented in tumor tissue and were found predominantly (56%) in upstream promoter regions. Conversely, hypomethylated CpG sites were found primarily in the gene body (66%). Expression analysis revealed that 209 of the differentially-methylated CpGs were located in 169 genes that were differentially expressed between normal and breast tumor tissue. Methylation-expression correlations were predominantly negative (70%) for promoter CpG sites and positive (74%) for gene body CpG sites. Among these differentially-methylated and differentially-expressed genes, we identified 7 that had not previously been studied in any form of cancer. Three of these, TDRD10, PRAC2 and TMEM132C, contained CpG sites that showed diagnostic and prognostic value in breast cancer, particularly in estrogen-receptor (ER)-positive samples. A pan-cancer analysis confirmed differential expression of these genes together with diagnostic and prognostic value of their respective CpG sites in multiple cancer types.
Conclusion: We have identified 368 DNA methylation changes that characterize breast cancer tumor tissue, of which 209 are associated with genes that are differentially-expressed in the same samples. Novel DNA methylation markers were identified, of which cg12374721 (PRAC2), cg18081940 (TDRD10) and cg04475027 (TMEM132C) show promise as diagnostic and prognostic markers in breast cancer as well as other cancer types.
Developmental and housekeeping transcriptional programs display distinct modes of enhancer-enhancer cooperativity in Drosophila
Genomic enhancers are key transcriptional regulators which, upon the binding of sequence-specific transcription factors, activate their cognate target promoters. Although enhancers have been extensively studied in isolation, a substantial number of genes have more than one simultaneously active enhancer, and it remains unclear how these cooperate to regulate transcription. Using Drosophila melanogaster S2 cells as a model, we assay the activities of more than a thousand individual enhancers and about a million enhancer pairs toward housekeeping and developmental core promoters with STARR-seq. We report that housekeeping and developmental enhancers show distinct modes of enhancer-enhancer cooperativity: while housekeeping enhancers are additive such that their combined activity mirrors the sum of their individual activities, developmental enhancers are super-additive and combine multiplicatively. Super-additivity between developmental enhancers is promiscuous and neither depends on the enhancers’ endogenous genomic contexts nor on specific transcription factor motif signatures. However, it can be further boosted by Twist and Trl motifs and saturates for the highest levels of enhancer activity. These results have important implications for our understanding of gene regulation in complex multi-enhancer developmental loci and genomically clustered housekeeping genes, providing a rationale to interpret the transcriptional impact of non-coding mutations at different loci. High-throughput analyses show that developmental enhancer pairs are super-additive, combining multiplicatively until saturation, while housekeeping enhancers are additive. Super-additivity is promiscuous, but boosted by Trl and Twist motifs.
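As a rough illustration of the two cooperativity modes named in this abstract, the sketch below contrasts an additive and a multiplicative combination of two enhancer activities. The function names, the fold-over-baseline units, and the example values are hypothetical and not taken from the study, and the saturation at high activity that the abstract mentions is not modelled.

```python
def additive(a: float, b: float) -> float:
    """Combined activity modelled as the sum of the two individual activities."""
    return a + b


def multiplicative(a: float, b: float) -> float:
    """Combined activity modelled as the product of the two individual activities."""
    return a * b


def is_super_additive(observed: float, a: float, b: float) -> bool:
    """True when an observed pair activity exceeds the additive expectation."""
    return observed > additive(a, b)


# Hypothetical individual activities, expressed as fold change over baseline.
enhancer_1, enhancer_2 = 4.0, 8.0
pair_additive = additive(enhancer_1, enhancer_2)
pair_multiplicative = multiplicative(enhancer_1, enhancer_2)
print(pair_additive, pair_multiplicative)
```

In these assumed units, a product exceeds a sum only when both activities are well above 1, so whether a multiplicative pair counts as super-additive depends on how activity is scaled.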
Over-elongation of centrioles in cancer promotes centriole amplification and chromosome missegregation
Centrosomes are the major microtubule organising centres of animal cells. Deregulation in their number occurs in cancer and was shown to trigger tumorigenesis in mice. However, the incidence, consequence and origins of this abnormality are poorly understood. Here, we screened the NCI-60 panel of human cancer cell lines to systematically analyse centriole number and structure. Our screen shows that centriole amplification is widespread in cancer cell lines and highly prevalent in aggressive breast carcinomas. Moreover, we identify another recurrent feature of cancer cells: centriole size deregulation. Further experiments demonstrate that severe centriole over-elongation can promote amplification through both centriole fragmentation and ectopic procentriole formation. Furthermore, we show that overly long centrioles form over-active centrosomes that nucleate more microtubules, a known cause of invasiveness, and perturb chromosome segregation. Our screen establishes centriole amplification and size deregulation as recurrent features of cancer cells and identifies novel causes and consequences of those abnormalities. Cancer cells are characterised by abnormalities in the number of centrosomes and this phenotype is linked with tumorigenesis. Here the authors report centriole length deregulation in a subset of cancer cell lines and suggest a link with subsequent alterations in centriole numbers and chromosomal instability.
DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers
Enhancer sequences control gene expression and comprise binding sites (motifs) for different transcription factors (TFs). Despite extensive genetic and computational studies, the relationship between DNA sequence and regulatory activity is poorly understood, and de novo enhancer design has been challenging. Here, we built a deep-learning model, DeepSTARR, to quantitatively predict the activities of thousands of developmental and housekeeping enhancers directly from DNA sequence in Drosophila melanogaster S2 cells. The model learned relevant TF motifs and higher-order syntax rules, including functionally nonequivalent instances of the same TF motif that are determined by motif-flanking sequence and intermotif distances. We validated these rules experimentally and demonstrated that they can be generalized to humans by testing more than 40,000 wildtype and mutant Drosophila and human enhancers. Finally, we designed and functionally validated synthetic enhancers with desired activities de novo. A deep-learning model called DeepSTARR quantitatively predicts enhancer activity on the basis of DNA sequence. The model learns relevant motifs and syntax rules, allowing for the design of synthetic enhancers with specific strengths.
Pan-cancer association of a centrosome amplification gene expression signature with genomic alterations and clinical outcome
Centrosome amplification (CA) is a common feature of human tumours and a promising target for cancer therapy. However, CA's pan-cancer prevalence, molecular role in tumourigenesis and therapeutic value in the clinical setting are still largely unexplored. Here, we used a transcriptomic signature (CA20) to characterise the landscape of CA-associated gene expression in 9,721 tumours from The Cancer Genome Atlas (TCGA). CA20 is upregulated in cancer and associated with distinct clinical and molecular features of breast cancer, consistent with our experimental CA quantification in patient samples. Moreover, we show that CA20 upregulation is positively associated with genomic instability, alteration of specific chromosomal arms and C>T mutations, and we propose novel molecular players associated with CA in cancer. Finally, high CA20 is associated with poor prognosis and, by integrating drug sensitivity with drug perturbation profiles in cell lines, we identify candidate compounds for selectively targeting cancer cells exhibiting transcriptomic evidence for CA.
Decoding a cancer-relevant splicing decision in the RON proto-oncogene using high-throughput mutagenesis
Mutations causing aberrant splicing are frequently implicated in human diseases including cancer. Here, we establish a high-throughput screen of randomly mutated minigenes to decode the cis-regulatory landscape that determines alternative splicing of exon 11 in the proto-oncogene MST1R (RON). Mathematical modelling of splicing kinetics enables us to identify more than 1000 mutations affecting RON exon 11 skipping, which corresponds to the pathological isoform RON∆165. Importantly, the effects correlate with RON alternative splicing in cancer patients bearing the same mutations. Moreover, we highlight heterogeneous nuclear ribonucleoprotein H (HNRNPH) as a key regulator of RON splicing in healthy tissues and cancer. Using iCLIP and synergy analysis, we pinpoint the functionally most relevant HNRNPH binding sites and demonstrate how cooperative HNRNPH binding facilitates a splicing switch of RON exon 11. Our results thereby offer insights into splicing regulation and the impact of mutations on alternative splicing in cancer. Alternative splicing is a critical step in eukaryotic gene expression but its molecular rules are not fully understood. Here, the authors develop a high-throughput mutagenesis approach to comprehensively characterise determinants of alternative splicing for the RON proto-oncogene.
A foundational large language model for edible plant genomes
Significant progress has been made in the field of plant genomics, as demonstrated by the increased use of high-throughput methodologies that enable the characterization of multiple genome-wide molecular phenotypes. These findings have provided valuable insights into plant traits and their underlying genetic mechanisms, particularly in model plant species. Nonetheless, effectively leveraging them to make accurate predictions represents a critical step in crop genomic improvement. We present AgroNT, a foundational large language model trained on genomes from 48 plant species with a predominant focus on crop species. We show that AgroNT can obtain state-of-the-art predictions for regulatory annotations, promoter/terminator strength and tissue-specific gene expression, and can prioritize functional variants. We conduct a large-scale in silico saturation mutagenesis analysis on cassava to evaluate the regulatory impact of over 10 million mutations and provide their predicted effects as a resource for variant characterization. Finally, we propose the use of the diverse datasets compiled here as the Plants Genomic Benchmark (PGB), providing a comprehensive benchmark for deep learning-based methods in plant genomic research. The pre-trained AgroNT model is publicly available on HuggingFace at https://huggingface.co/InstaDeepAI/agro-nucleotide-transformer-1b for future research purposes. A DNA-based large language model, AgroNT, trained on multiple plant genomes, can accurately predict various molecular phenotypes within plant species, including important crops.
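The in silico saturation mutagenesis mentioned in this abstract can be pictured as enumerating every single-nucleotide substitution of a sequence before scoring each variant with a model. The sketch below covers only the enumeration step, under the assumption of an uppercase A/C/G/T input; the function name is hypothetical, and the model scoring used in the study is not reproduced here.

```python
from typing import Iterator, Tuple

NUCLEOTIDES = "ACGT"


def saturation_mutants(seq: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (position, ref, alt, mutated_sequence) for every single-base substitution.

    Assumes `seq` contains only uppercase A, C, G and T.
    """
    for pos, ref in enumerate(seq):
        for alt in NUCLEOTIDES:
            if alt != ref:
                # Replace exactly one base and keep the rest of the sequence unchanged.
                yield pos, ref, alt, seq[:pos] + alt + seq[pos + 1:]


# A sequence of length L over A/C/G/T has 3 * L single-nucleotide variants.
mutants = list(saturation_mutants("ACG"))
print(len(mutants))
```

Each yielded sequence would then be passed to a predictor and compared with the score of the unmutated sequence; that comparison is left out because it depends on the model being used.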
Nucleotide Transformer: building and evaluating robust foundation models for human genomics
The prediction of molecular phenotypes from DNA sequences remains a longstanding challenge in genomics, often driven by limited annotated data and the inability to transfer learnings between tasks. Here, we present an extensive study of foundation models pre-trained on DNA sequences, named Nucleotide Transformer, ranging from 50 million up to 2.5 billion parameters and integrating information from 3,202 human genomes and 850 genomes from diverse species. These transformer models yield context-specific representations of nucleotide sequences, which allow for accurate predictions even in low-data settings. We show that the developed models can be fine-tuned at low cost to solve a variety of genomics applications. Despite no supervision, the models learned to focus attention on key genomic elements and can be used to improve the prioritization of genetic variants. The training and application of foundational models in genomics provides a widely applicable approach for accurate molecular phenotype prediction from DNA sequence. Nucleotide Transformer is a series of genomics foundation models of different parameter sizes and training datasets that can be applied to various downstream tasks by fine-tuning.
Targeted design of synthetic enhancers for selected tissues in the Drosophila embryo
Enhancers control gene expression and have crucial roles in development and homeostasis [1–3]. However, the targeted de novo design of enhancers with tissue-specific activities has remained challenging. Here we combine deep learning and transfer learning to design tissue-specific enhancers for five tissues in the Drosophila melanogaster embryo: the central nervous system, epidermis, gut, muscle and brain. We first train convolutional neural networks using genome-wide single-cell assay for transposase-accessible chromatin with sequencing (ATAC-seq) datasets and then fine-tune the convolutional neural networks with smaller-scale data from in vivo enhancer activity assays, yielding models with 13% to 76% positive predictive value according to cross-validation. We designed and experimentally assessed 40 synthetic enhancers (8 per tissue) in vivo, of which 31 (78%) were active and 27 (68%) functioned in the target tissue (100% for central nervous system and muscle). The strategy of combining genome-wide and small-scale functional datasets by transfer learning is generally applicable and should enable the design of tissue-, cell type- and cell state-specific enhancers in any system. Deep learning and transfer learning were used to design tissue-specific enhancers in the Drosophila embryo that were active and specific, validating this approach to achieve tissue-, cell type- and cell state-specific expression control.