The effectiveness of large language models with RAG for auto-annotating trait and phenotype descriptions
by Kainer, David
in Artificial Intelligence in Biology and Bioinformatics / Embedding / Large language models / Ontology / Phenotypes
2025
Journal Article
Overview
Abstract
Ontologies are highly prevalent in biology and medicine and are always evolving. Annotating biological text, such as observed phenotype descriptions, with ontology terms is a challenging and tedious task. The process of annotation requires a contextual understanding of the input text and of the ontological terms available. While text-mining tools are available to assist, they are largely based on directly matching words and phrases and so lack understanding of the meaning of the query item and of the ontology term labels. Large Language Models (LLMs), however, excel at tasks that require semantic understanding of input text and therefore may provide an improvement for the auto-annotation of text with ontological terms. Here we describe a series of workflows incorporating OpenAI GPT’s capabilities to annotate Arabidopsis thaliana and forest tree phenotypic observations with ontology terms, aiming for results that resemble manually curated annotations. These workflows make use of an LLM to intelligently parse phenotypes into short concepts, followed by finding appropriate ontology terms via embedding vector similarity or via Retrieval-Augmented Generation (RAG). The RAG model is a state-of-the-art approach that augments conversational prompts to the LLM with context-specific data to empower it beyond its pre-trained parameter space. We show that the RAG produces the most accurate automated annotations that are often highly similar or identical to expert-curated annotations.
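The retrieval step described in the abstract (matching a parsed phenotype concept to ontology terms by embedding similarity, then passing the best candidates to the LLM as context) can be illustrated with a minimal sketch. It is not the authors' implementation: the term IDs (`EX:0001` and so on), the three-dimensional vectors, and the helper names `cosine`, `top_k`, and `build_rag_prompt` are all hypothetical. A real workflow would presumably obtain the vectors from an embedding model and send the resulting prompt to an LLM.

```python
import math

# Hypothetical ontology terms: made-up IDs and toy 3-d embedding vectors.
# In a real workflow the vectors would come from an embedding model.
ONTOLOGY = {
    "EX:0001": ("plant height", [0.9, 0.1, 0.0]),
    "EX:0002": ("leaf size", [0.1, 0.9, 0.1]),
    "EX:0003": ("flowering time", [0.0, 0.2, 0.9]),
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, k=2):
    """Return the k ontology terms most similar to the query vector."""
    scored = [
        (cosine(query_vec, vec), term_id, label)
        for term_id, (label, vec) in ONTOLOGY.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:k]


def build_rag_prompt(concept, candidates):
    """Augment the prompt with the retrieved candidate terms as context."""
    lines = [f"- {term_id}: {label}" for _, term_id, label in candidates]
    return (
        f"Phenotype concept: {concept}\n"
        "Candidate ontology terms:\n"
        + "\n".join(lines)
        + "\nChoose the best matching term ID from the candidates."
    )


concept = "late flowering"
concept_vec = [0.05, 0.15, 0.95]  # toy embedding of the concept (assumed)
candidates = top_k(concept_vec)
prompt = build_rag_prompt(concept, candidates)
print(prompt)
```

Using the top-ranked term directly corresponds to the embedding-similarity route; handing the shortlist to the LLM inside the prompt corresponds to the RAG route, in which the model picks among retrieved candidates rather than relying only on its pre-trained knowledge.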
Publisher
Oxford University Press