CGRclust: Chaos Game Representation for twin contrastive clustering of unlabelled DNA sequences
by Alipour, Fatemeh; Kari, Lila; Hill, Kathleen A.
in
Algorithms
/ Alignment
/ Alignment-free DNA sequence comparison
/ Animal Genetics and Genomics
/ Animals
/ Artificial neural networks
/ Biomedical and Life Sciences
/ Chaos Game Representation (CGR)
/ Classification
/ Cluster Analysis
/ Clustering
/ Computational biology
/ Computational Biology - methods
/ Datasets
/ Deoxyribonucleic acid
/ DNA
/ DNA sequence clustering
/ Fish
/ Gene sequencing
/ Genetic research
/ Genome, Mitochondrial
/ Genome, Viral
/ Genomes
/ Genomics
/ Identification and classification
/ Image classification
/ Information management
/ Labels
/ Learning
/ Life Sciences
/ Machine learning
/ Methods
/ Microarrays
/ Microbial Genetics and Genomics
/ Mitochondrial DNA
/ Neural networks
/ Neural Networks, Computer
/ Nucleotide sequence
/ Plant Genetics and Genomics
/ Proteomics
/ Representations
/ Sequence Analysis, DNA - methods
/ Supervised learning
/ Taxonomic classification
/ Taxonomy
/ Twin contrastive learning
/ Unsupervised learning
/ Unsupervised Machine Learning
2024
Journal Article
Overview
Background
Traditional supervised learning methods applied to DNA sequence taxonomic classification rely on the labor-intensive and time-consuming step of labelling the primary DNA sequences. Additionally, standard DNA classification/clustering methods involve time-intensive multiple sequence alignments, which impacts their applicability to large genomic datasets or distantly related organisms. These limitations indicate a need for robust, efficient, and scalable unsupervised DNA sequence clustering methods that do not depend on sequence labels or alignment.
Results
This study proposes CGRclust, a novel combination of unsupervised twin contrastive clustering of Chaos Game Representations (CGR) of DNA sequences, with convolutional neural networks (CNNs). To the best of our knowledge, CGRclust is the first method to use unsupervised learning for image classification (herein applied to two-dimensional CGR images) for clustering datasets of DNA sequences. CGRclust overcomes the limitations of traditional sequence classification methods by leveraging unsupervised twin contrastive learning to detect distinctive sequence patterns, without requiring DNA sequence alignment or biological/taxonomic labels. CGRclust accurately clustered twenty-five diverse datasets, with sequence lengths ranging from 664 bp to 100 kbp, including mitochondrial genomes of fish, fungi, and protists, as well as viral whole genome assemblies and synthetic DNA sequences. Compared with three recent clustering methods for DNA sequences (DeLUCS, iDeLUCS, and MeShClust v3.0), CGRclust is the only method that surpasses 81.70% accuracy across all four taxonomic levels tested for mitochondrial DNA genomes of fish. Moreover, CGRclust also consistently demonstrates superior performance across all the viral genomic datasets. The high clustering accuracy of CGRclust on these twenty-five datasets, which vary significantly in terms of sequence length, number of genomes, number of clusters, and level of taxonomy, demonstrates its robustness, scalability, and versatility.
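For readers unfamiliar with the two-dimensional CGR images mentioned above, the following is a minimal sketch of how a DNA sequence can be turned into a CGR count grid. It is not the authors' implementation: the corner assignment for A, C, G, T, the handling of ambiguous symbols, and the function names `cgr_points` and `cgr_image` are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, List, Tuple

# Assumed corner layout of the unit square (one common convention);
# the abstract does not state which layout CGRclust uses.
CORNERS: Dict[str, Tuple[float, float]] = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(seq: str) -> List[Tuple[float, float]]:
    """Return the chaos-game trajectory of a DNA sequence.

    Starting from the centre of the unit square, each nucleotide moves
    the current point halfway towards that nucleotide's corner.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        corner = CORNERS.get(base)
        if corner is None:
            # Assumption: symbols other than A, C, G, T (e.g. N) are skipped.
            continue
        x = (x + corner[0]) / 2
        y = (y + corner[1]) / 2
        points.append((x, y))
    return points


def cgr_image(seq: str, k: int) -> List[List[int]]:
    """Bin the trajectory into a 2^k x 2^k grid of point counts."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    for row in cgr_image("ACGTTGCAACGT", k=2):
        print(row)
```

A grid of this kind can be treated as a single-channel image and passed to a CNN; the resolution parameter `k` here is a hypothetical choice and not a value taken from the paper.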
Conclusion
CGRclust is a novel, scalable, alignment-free DNA sequence clustering method that uses CGR images of DNA sequences and CNNs for twin contrastive clustering of unlabelled primary DNA sequences, achieving superior or comparable accuracy and performance over current approaches. CGRclust demonstrated enhanced reliability, by consistently achieving over 80% accuracy in more than 90% of the datasets analyzed. In particular, CGRclust performed especially well in clustering viral DNA datasets, where it consistently outperformed all competing methods.
Publisher
BioMed Central, BioMed Central Ltd, Springer Nature B.V, BMC
Subject
/ Alignment-free DNA sequence comparison
/ Animal Genetics and Genomics
/ Animals
/ Biomedical and Life Sciences
/ Chaos Game Representation (CGR)
/ Computational Biology - methods
/ Datasets
/ DNA
/ Fish
/ Genomes
/ Genomics
/ Identification and classification
/ Labels
/ Learning
/ Methods
/ Microbial Genetics and Genomics
/ Sequence Analysis, DNA - methods
/ Taxonomy