Asset Details
Linguistically inspired roadmap for building biologically reliable protein language models
by Greiff, Victor; Haug, Dag Trygve Truslew; Sandve, Geir Kjetil; Akbar, Rahmad; Robert, Philippe A.; Swiatczak, Bartlomiej; Vu, Mai Ha
Subjects: 4007/4009; 631/114/2397; 631/114/2410; Data; Datasets; Embedding; Engineering; Extraction; Function; Grammar; Knowledge; Language; Language modeling; Linguistics; Machine learning; Mathematical models; Natural language; Neural networks; Perspective; Proteins; Semantics; Sequences
2023
Journal Article
Overview
Deep neural-network-based language models (LMs) are increasingly applied to large-scale protein sequence data to predict protein function. However, being largely black-box models and thus challenging to interpret, current protein LM approaches do not contribute to a fundamental understanding of sequence–function mappings, hindering rule-based biotherapeutic drug development. We argue that guidance drawn from linguistics, a field specialized in analytical rule extraction from natural language data, can aid with building more interpretable protein LMs that are more likely to learn relevant domain-specific rules. Differences between protein sequence data and linguistic sequence data require the integration of more domain-specific knowledge in protein LMs compared with natural language LMs. Here, we provide a linguistics-based roadmap for protein LM pipeline choices with regard to training data, tokenization, token embedding, sequence embedding and model interpretation. Incorporating linguistic ideas into protein LMs enables the development of next-generation interpretable machine learning models with the potential of uncovering the biological mechanisms underlying sequence–function relationships.
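Among the pipeline choices the abstract lists, tokenization is the most concrete to illustrate. As a minimal sketch (not taken from the paper), two common ways to split a protein sequence into tokens are single amino-acid residues and overlapping k-mers; the sequence and function names below are hypothetical examples:

```python
def tokenize_residues(seq: str) -> list[str]:
    """Tokenize a protein sequence into single amino-acid tokens."""
    return list(seq)


def tokenize_kmers(seq: str, k: int = 3) -> list[str]:
    """Tokenize a protein sequence into overlapping k-mer tokens."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


seq = "MKTAYIAKQR"  # hypothetical example sequence
print(tokenize_residues(seq)[:3])  # ['M', 'K', 'T']
print(tokenize_kmers(seq, k=3)[:3])  # ['MKT', 'KTA', 'TAY']
```

The choice matters because, unlike natural-language text, protein sequences have no whitespace or obvious word boundaries, so the tokenization scheme encodes an assumption about which units carry meaning.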
Language models trained on proteins can help to predict functions from sequences but provide little insight into the underlying mechanisms. Vu and colleagues explain how extracting the underlying rules from a protein language model can make them interpretable and help explain biological mechanisms.