EVLF-FM: Explainable Vision Language Foundation Model for Medicine
by Elze, Tobias; Zhou, Jun; Hon Lim, Tony Kiat; Darren Shu Jeng Ting; Tan, Marcus; Mehta, Jod; Gutierrez, Laura; Hiok Hong Chan; Nielsen, Christopher S; Cheng, Lionel Tim-Ee; Bai, Yang; Ching Yu Cheng; Lim, Soon Thye; Teo, Zhen Ling; Schmetterer, Leopold; Fukutsu, Kanae; Linh Le Dinh; Yao, Jie; Ang, Marcus; Liu, Nan; Soetikno, Brian T; Ke, Yuhe; Koh, Victor; Klang, Eyal; Chee Leong Cheng; Hussain, Rahat; Aung, Tin; Tan, Iain Beehuat; Tran Nguyen Tuan Anh; Rick Siow Mong Goh; Li, Kelvin Z; Zhou, Yang; Thirunavukarasu, Arun; Chrystie Wan Ning Quek; Li, Zengxiang; Yip, Leonard; Yih Chung Tham; Daniel Shu Wei Ting; Cheng, Haoran; Hong, Ashley; Liu, Yong; Gavin Siew Wei Tan; Tien Yin Wong
in Accuracy / Datasets / Dermatology / Medical imaging / Ophthalmology / Reasoning / Vision
2025
Paper
Overview
Despite the promise of foundation models in medical AI, current systems remain limited: they are modality-specific and lack transparent reasoning processes, which hinders clinical adoption. To address this gap, we present EVLF-FM, a multimodal vision-language foundation model (VLM) designed to unify broad diagnostic capability with fine-grained explainability. The development and testing of EVLF-FM encompassed over 1.3 million samples from 23 global datasets across eleven imaging modalities related to six clinical specialties: dermatology, hepatology, ophthalmology, pathology, pulmonology, and radiology. External validation employed 8,884 independent test samples from 10 additional datasets across five imaging modalities. Technically, EVLF-FM is developed to assist with multi-disease diagnosis and visual question answering, with pixel-level visual grounding and reasoning capabilities. In internal validation for disease diagnostics, EVLF-FM achieved the highest average accuracy (0.858) and F1-score (0.797), outperforming leading generalist and specialist models. In medical visual grounding, EVLF-FM also performed strongly across nine modalities, with an average mIoU of 0.743 and Acc@0.5 of 0.837. External validations further confirmed strong zero-shot and few-shot performance, with competitive F1-scores despite a smaller model size. Through a hybrid training strategy combining supervised and visual reinforcement fine-tuning, EVLF-FM not only achieves state-of-the-art accuracy but also exhibits step-by-step reasoning, aligning outputs with visual evidence. EVLF-FM is an early multi-disease VLM with explainability and reasoning capabilities that could advance adoption of and trust in foundation models for real-world clinical deployment.
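The grounding figures quoted above (mean IoU and Acc@0.5) can be illustrated with a minimal sketch. This is not the authors' evaluation code: it assumes axis-aligned bounding boxes given as (x_min, y_min, x_max, y_max), one predicted box per ground-truth box, and that a prediction counts as correct when its IoU is at least 0.5. The function names `box_iou` and `grounding_metrics` are hypothetical, and the paper may evaluate on masks or use a different matching rule.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def box_iou(pred: Box, truth: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min = max(pred[0], truth[0])
    iy_min = max(pred[1], truth[1])
    ix_max = min(pred[2], truth[2])
    iy_max = min(pred[3], truth[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_truth = (truth[2] - truth[0]) * (truth[3] - truth[1])
    union = area_pred + area_truth - inter
    return inter / union if union > 0 else 0.0


def grounding_metrics(
    preds: Sequence[Box], truths: Sequence[Box], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (mean IoU, accuracy at the given IoU threshold)."""
    if len(preds) != len(truths) or not preds:
        raise ValueError("preds and truths must be non-empty and equal in length")

    ious = [box_iou(p, t) for p, t in zip(preds, truths)]
    mean_iou = sum(ious) / len(ious)
    # Assumption: IoU >= threshold counts as a hit; some benchmarks use a strict >.
    accuracy = sum(iou >= threshold for iou in ious) / len(ious)
    return mean_iou, accuracy


if __name__ == "__main__":
    predictions = [(0, 0, 2, 2), (0, 0, 2, 2)]
    ground_truth = [(0, 0, 2, 2), (1, 1, 3, 3)]
    miou, acc = grounding_metrics(predictions, ground_truth)
    print(f"mIoU={miou:.3f}  Acc@0.5={acc:.3f}")
```

Under these assumptions, mIoU rewards how tightly each predicted region overlaps the annotated one, while Acc@0.5 only counts whether the overlap clears a fixed bar, which is why the two numbers are reported together.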
Publisher
Cornell University Library, arXiv.org