Preliminary evaluation of DeepSeek-R1 and GPT-5.3 in selected PET/CT clinical scenarios: patient preparation, report interpretation, and diagnostic reasoning
by Tianyue Li, Runze Duan, Lu Zheng, Yujing Hu, Yanzhu Bian, Jing Pang, Ziyu Guo
in [18F]FDG PET/CT / artificial intelligence / Chatbot / DeepSeek-R1 / GPT-5.3 / patient communication
2026
Journal Article
Overview
Objective: To evaluate the performance of DeepSeek (R1 version), an open-source large language model, in three core clinical scenarios (answering patients’ common questions, interpreting PET/CT reports with follow-up inquiries, and diagnosing complex cases) and to compare it with GPT-5.3, in order to verify the clinical applicability of DeepSeek-R1 as an alternative AI assistant.
Methods: A total of 39 standardized tasks were assigned to both models, including responding to 15 frequently asked questions about [18F]FDG PET/CT, interpreting 12 anonymized reports of lung cancer and lymphoma (with follow-up inquiries regarding tumor staging or treatment), and providing primary and differential diagnoses for 10 difficult cases. Both models were accessed via their official platforms with default parameters, and all prompts and evaluation criteria were kept identical for cross-model comparison. Two senior nuclear medicine physicians independently rated the model responses using a 4-point standardized scale (assessing appropriateness, helpfulness, inter-trial consistency, and reference validity) and a binary scale for empathy. Cohen’s kappa coefficient was used to evaluate inter-rater agreement, and McNemar’s test was used to compare paired proportions of appropriateness, empathy, and response inconsistency between the two models.
Results: Across the 39 tasks, DeepSeek-R1 achieved 94.9% appropriateness and 100% helpfulness, and 91.7% of its responses to follow-up inquiries about tumor staging or treatment were rated empathetic. However, 7.7% of regenerated responses showed substantial inconsistencies, primarily in tumor staging, and only 37% of cited references were fully valid, with 11.1% being invalid. GPT-5.3 showed equivalent core performance (94.9% appropriateness and 100% helpfulness), a slightly lower substantial inconsistency rate (5.1%), and comparable reference validity (33% fully valid, 7.4% invalid), but a notably lower empathy rate (66.7%) for follow-up inquiries. McNemar’s tests showed identical appropriateness (p = 1.00) and no significant difference in inconsistency (p = 1.00, 95% CI 0.60–14.80) between models. Although DeepSeek-R1 had the higher empathy rate, the difference was not significant (p = 0.25, 95% CI 0.09–0.66). For the same 10 difficult cases, both models reached 10% primary diagnosis accuracy and 60% differential diagnosis accuracy.
Conclusion: DeepSeek-R1 and GPT-5.3 have complementary strengths but similar reference hallucination issues, and neither can replace clinicians. DeepSeek-R1 is a cost-effective auxiliary tool, with future optimization needed for consistency, diagnostic accuracy, and reference validity.
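The two statistics named in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the abstract does not say whether the exact or the chi-square form of McNemar's test was used, and it does not report the discordant-pair counts or the raw ratings. The function names `mcnemar_exact` and `cohen_kappa` and all input values below are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b and c are the discordant pair counts: tasks where only model A
    met the criterion (b) and tasks where only model B met it (c).
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    # Probability of a split at least as lopsided, in one tail
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def cohen_kappa(rater1: list, rater2: list) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater1)
    labels = set(rater1) | set(rater2)
    # Observed agreement
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    p_e = sum((rater1.count(l) / n) * (rater2.count(l) / n) for l in labels)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical discordant counts: 3 tasks favouring one model, 0 the other
print(mcnemar_exact(3, 0))

# Hypothetical binary ratings from two raters on four items
print(cohen_kappa([1, 1, 0, 0], [1, 0, 1, 0]))
```

With only a handful of discordant pairs, as is likely with 12 follow-up inquiries, the exact binomial form is the usual choice because the chi-square approximation is unreliable at such small counts.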
Publisher
Frontiers Media S.A