Asset Details
A Comparison Between GPT-3.5, GPT-4, and GPT-4V: Can the Large Language Model (ChatGPT) Pass the Japanese Board of Orthopaedic Surgery Examination?

Journal Article, 2024

by Imai, Hirotatsu; Kanie, Yuya; Uemura, Keisuke; Fujimori, Takahito; Okada, Seiji; Nakajima, Nozomu; Kita, Kosuke; Furuya, Masayuki

Subjects: Accuracy / Artificial intelligence / Bone surgery / Chatbots / Knowledge / Language / Large language models / Multiple choice / Orthopedics / Performance evaluation / Reproducibility / Statistical analysis
Overview
Introduction: Large language models such as ChatGPT (OpenAI, San Francisco, CA) have evolved rapidly. These models are designed to think and act like humans and possess a broad range of specialized knowledge. GPT-3.5 was reported to perform at a level sufficient to pass the United States Medical Licensing Examination. Its capabilities continue to evolve, and in October 2023, GPT-4V became available as a model capable of image recognition. It is therefore important to know the current performance of these models, as they will soon be incorporated into medical practice. We aimed to evaluate the performance of ChatGPT in the field of orthopedic surgery.

Methods: We used three years of the Japanese Board of Orthopaedic Surgery Examination (JBOSE), conducted in 2021, 2022, and 2023. Questions and their multiple-choice answers were used in their original Japanese form, as was the official examination rubric. We inputted these questions into three versions of ChatGPT: GPT-3.5, GPT-4, and GPT-4V. For image-based questions, we inputted only the textual statements for GPT-3.5 and GPT-4, and both the images and the textual statements for GPT-4V. As the minimum scoring rate required to pass is not officially disclosed, it was estimated from publicly available data.

Results: The estimated minimum scoring rate required to pass was 50.1% (43.7-53.8%). For GPT-4, even when answering all questions, including the image-based ones, the percentage of correct answers was 59% (55-61%), and GPT-4 was able to reach the passing line. When image-based questions were excluded, its score reached 67% (63-73%). For GPT-3.5, the percentage was limited to 30% (28-32%), and this version could not pass the examination. There was a significant difference in performance between GPT-4 and GPT-3.5 (p < 0.001). For image-based questions, the percentage of correct answers was 25% for GPT-3.5, 38% for GPT-4, and 38% for GPT-4V. There was no significant difference in performance on image-based questions between GPT-4 and GPT-4V.

Conclusions: ChatGPT performed well enough to pass the orthopedic specialist examination. With further training data, such as images, ChatGPT is expected to find applications in the field of orthopedics.
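The abstract describes inputting questions into ChatGPT itself, so the following is only an illustrative sketch of how the study's three query conditions (text-only statements for GPT-3.5 and GPT-4, text plus image for GPT-4V) could be reproduced programmatically with the OpenAI Python SDK. The model names, prompt text, and ask_question helper are assumptions, not the authors' procedure.

import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_question(model: str, question: str, image_path: str | None = None) -> str:
    """Submit one multiple-choice exam question, attaching the figure if given."""
    if image_path is None:
        # Text-only condition, as used for GPT-3.5 and GPT-4 in the study.
        content = question
    else:
        # Text-plus-image condition, as used for GPT-4V.
        with open(image_path, "rb") as f:
            b64 = base64.b64encode(f.read()).decode()
        content = [
            {"type": "text", "text": question},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ]
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": content}],
    )
    return resp.choices[0].message.content

# Hypothetical usage: the same question text (in its original Japanese) goes to each model.
text_answer = ask_question("gpt-3.5-turbo", "Exam question and choices...")
image_answer = ask_question("gpt-4-vision-preview", "Exam question and choices...", "figure1.jpg")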
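The abstract reports a significant difference between GPT-4 and GPT-3.5 (p < 0.001) but does not name the statistical test. A minimal sketch, assuming a chi-square test of independence on correct/incorrect counts; the 2x2 table below uses placeholder counts chosen to mirror the reported accuracies (59% vs. 30% on an assumed 300 questions), not the study's data.

from scipy.stats import chi2_contingency

# Placeholder contingency table (rows: GPT-4, GPT-3.5; columns: correct, incorrect).
table = [
    [177, 123],  # GPT-4: 59% of an assumed 300 questions correct
    [90, 210],   # GPT-3.5: 30% of an assumed 300 questions correct
]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.1f}, p = {p:.3g}")  # p falls well below 0.001 for these counts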
Publisher
Springer Nature B.V.; Cureus