63 result(s) for "Miki, Soichiro"
GPT-4 Turbo with Vision fails to outperform text-only GPT-4 Turbo in the Japan Diagnostic Radiology Board Examination
Purpose To assess the performance of GPT-4 Turbo with Vision (GPT-4TV), OpenAI’s latest multimodal large language model, by comparing its ability to process both text and image inputs with that of the text-only GPT-4 Turbo (GPT-4 T) in the context of the Japan Diagnostic Radiology Board Examination (JDRBE). Materials and methods The dataset comprised questions from JDRBE 2021 and 2023. A total of six board-certified diagnostic radiologists discussed the questions and provided ground-truth answers by consulting relevant literature as necessary. The following questions were excluded: those lacking associated images, those with no unanimous agreement on answers, and those including images rejected by the OpenAI application programming interface. The inputs for GPT-4TV included both text and images, whereas those for GPT-4 T were entirely text. Both models were deployed on the dataset, and their performance was compared using McNemar’s exact test. The radiological credibility of the responses was assessed by two diagnostic radiologists through the assignment of legitimacy scores on a five-point Likert scale. These scores were subsequently used to compare model performance using Wilcoxon's signed-rank test. Results The dataset comprised 139 questions. GPT-4TV correctly answered 62 questions (45%), whereas GPT-4 T correctly answered 57 (41%). A statistical analysis found no significant performance difference between the two models (P = 0.44). The GPT-4TV responses received significantly lower legitimacy scores from both radiologists than the GPT-4 T responses. Conclusion No significant enhancement in accuracy was observed when using GPT-4TV with image input compared with using text-only GPT-4 T for JDRBE questions.
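The head-to-head comparison above relies on McNemar's exact test, which looks only at the discordant questions (those one model answered correctly and the other did not). A minimal stdlib-only sketch of that test; the discordant counts used in the usage line are hypothetical, since the paper's abstract does not report them:

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar test.

    b: items model A got right and model B got wrong.
    c: items model B got right and model A got wrong.
    Under H0 the b successes among the b + c discordant items
    follow Binomial(b + c, 0.5).
    """
    n, k = b + c, min(b, c)
    p_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * p_tail)

# Hypothetical discordant counts, not the study's actual data:
p_value = mcnemar_exact(14, 9)
```

With perfectly balanced discordant pairs the two-sided P value is capped at 1.0, reflecting no evidence of a difference.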
Performance changes due to differences in training data for cerebral aneurysm detection in head MR angiography images
Purpose The performance of computer-aided detection (CAD) software depends on the quality and quantity of the dataset used for machine learning. If the data characteristics in development and in practical use differ, the performance of CAD software degrades. In this study, we investigated changes in detection performance due to differences in training data for cerebral aneurysm detection software in head magnetic resonance angiography (MRA) images. Materials and methods We utilized three types of CAD software for cerebral aneurysm detection in MRA images, based on 3D local intensity structure analysis, graph-based features, and a convolutional neural network. For each type of CAD software, we compared three training patterns: two trained on single-site data and one trained on multisite data. We also carried out internal and external evaluations. Results With training on single-site data, the performance of the CAD software fluctuated largely and unpredictably when the training dataset was changed. Training on multisite data did not show the lowest performance among the three training patterns for any CAD software or dataset. Conclusion Training cerebral aneurysm detection software on data collected from multiple sites is desirable to ensure stable performance.
Automatic detection of actionable radiology reports using bidirectional encoder representations from transformers
Background It is essential for radiologists to communicate actionable findings to referring clinicians reliably. Natural language processing (NLP) has been shown to help identify free-text radiology reports containing actionable findings. However, the application of recent deep learning techniques to radiology reports, which can improve detection performance, has not been thoroughly examined. Moreover, the free text that clinicians enter in the ordering form (order information) has seldom been used to identify actionable reports. This study aims to evaluate the benefits of two new approaches: (1) bidirectional encoder representations from transformers (BERT), a recent deep learning architecture in NLP, and (2) using order information in addition to radiology reports. Methods We performed a binary classification to distinguish actionable reports (i.e., radiology reports tagged as actionable in actual radiological practice) from non-actionable ones (those without an actionable tag). A total of 90,923 Japanese radiology reports from our hospital were used, of which 788 (0.87%) were actionable. We evaluated four methods: statistical machine learning with logistic regression (LR) and with a gradient boosting decision tree (GBDT), and deep learning with a bidirectional long short-term memory (LSTM) model and a publicly available Japanese BERT model. Each method was used with two different inputs: radiology reports alone, and pairs of order information and radiology reports. Thus, eight experiments were conducted to examine the performance. Results Without order information, BERT achieved the highest area under the precision-recall curve (AUPRC) of 0.5138, a statistically significant improvement over LR, GBDT, and LSTM, and the highest area under the receiver operating characteristic curve (AUROC) of 0.9516. Simply coupling the order information with the radiology reports slightly increased the AUPRC of BERT but did not lead to a statistically significant improvement; this may be due to the complexity of the clinical decisions made by radiologists. Conclusions BERT appears useful for detecting actionable reports. More sophisticated methods are required to use order information effectively.
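With only 0.87% of reports positive, AUPRC is the more informative metric in this study because it is insensitive to the overwhelming number of true negatives. A stdlib-only sketch of average precision, a common AUPRC estimator, computed from classifier scores; the labels and scores below are illustrative toy data, not the study's:

```python
def average_precision(y_true, scores):
    """Average precision: mean of precision@k over the ranks k at which
    a positive item is retrieved, ranking items by descending score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, ap = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i]:
            hits += 1
            ap += hits / rank  # precision at this rank
    return ap / sum(y_true)

# Illustrative toy data (1 = actionable report):
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

A random classifier's average precision approaches the positive prevalence (here 0.0087), which is why the reported 0.5138 represents a large gain over chance.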
Capability of GPT-4V(ision) in the Japanese National Medical Licensing Examination: Evaluation Study
Previous research applying large language models (LLMs) to medicine has focused on text-based information. Recently, multimodal variants of LLMs have acquired the capability of recognizing images. We aimed to evaluate the image recognition capability of generative pretrained transformer (GPT)-4V, a recent multimodal LLM developed by OpenAI, in the medical field by testing how visual information affects its performance in answering questions from the 117th Japanese National Medical Licensing Examination. We focused on 108 questions that included 1 or more images and presented GPT-4V with the same questions under two conditions: (1) with both the question text and associated images and (2) with the question text only. We then compared the difference in accuracy between the 2 conditions using the exact McNemar test. Among the 108 questions with images, GPT-4V's accuracy was 68% (73/108) when presented with images and 72% (78/108) without images (P=.36). For the 2 question categories, clinical and general, the accuracies with and without images were 71% (70/98) versus 78% (76/98; P=.21) and 30% (3/10) versus 20% (2/10; P≥.99), respectively. The additional information from the images did not significantly improve the performance of GPT-4V on the Japanese National Medical Licensing Examination.
Deep generative abnormal lesion emphasization validated by nine radiologists and 1000 chest X-rays with lung nodules
A general-purpose method of emphasizing abnormal lesions in chest radiographs, named EGGPALE (Extrapolative, Generative and General-Purpose Abnormal Lesion Emphasizer), is presented. The proposed EGGPALE method is composed of a flow-based generative model and L-infinity-distance-based extrapolation in a latent space. The flow-based model is trained using only normal chest radiographs, and an invertible mapping function from the image space to the latent space is determined. In the latent space, a given unseen image is extrapolated so that the image point moves away from the normal chest X-ray hyperplane. Finally, the moved point is mapped back to the image space, and the corresponding emphasized image is created. The proposed method was evaluated in an image interpretation experiment with nine radiologists and 1,000 chest radiographs, in which suspected positive lung cancer cases and negative cases were validated by computed tomography examinations. The sensitivity on EGGPALE-processed images showed a +0.0559 average improvement over that on the original images, with a -0.0192 deterioration in average specificity. The area under the receiver operating characteristic curve of the ensemble of nine radiologists showed a statistically significant improvement. From these results, the feasibility of EGGPALE for enhancing abnormal lesions was validated. Our code is available at https://github.com/utrad-ical/Eggpale.
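Sensitivity and specificity, the two quantities the reader study above trades off, are computed from per-case ground truth and reader decisions. A minimal stdlib sketch with illustrative toy data (not the study's actual readings):

```python
def sensitivity_specificity(y_true, y_pred):
    """y_true: 1 = lesion present (CT-validated); y_pred: 1 = reader
    called the case positive. Returns (sensitivity, specificity)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    return tp / (tp + fn), tn / (tn + fp)

# Illustrative toy data: 3 positive cases, 2 negative cases.
sens, spec = sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

In the study's terms, lesion emphasis raised the first value at a small cost to the second, averaged over nine readers.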
Identity Diffuser: Preserving Abnormal Region of Interests While Diffusing Identity
To release medical images that can be freely used in downstream processes while maintaining their utility, it is necessary to remove personal features from the images while preserving the lesion structures. Unlike previous studies that focused on removing lesion structures while preserving the individuality of medical images, this study proposes and validates a new framework that maintains the lesion structures while diffusing individual characteristics. In this framework, we apply local differential privacy techniques to provide theoretical guarantees of privacy protection. Additionally, to enhance the utility of protected medical images, we perform denoising using a diffusion model on the noise-contaminated medical images. Numerous chest X-rays generated by the proposed method were evaluated by physicians, revealing a trade-off between the level of privacy protection and utility. In other words, it was confirmed that increasing the level of personal information protection tends to result in relatively lower utility. This study potentially enables the release of certain types of medical images that were previously difficult to share.
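The local differential privacy step described above is commonly realized with the Laplace mechanism: adding Laplace-distributed noise whose scale is set by the privacy budget ε. The abstract does not specify the mechanism, sensitivity, or ε actually used, so everything below is an illustrative stdlib-only sketch of the general idea:

```python
import random

def laplace_perturb(pixels, epsilon, sensitivity=1.0):
    """Add Laplace(0, sensitivity/epsilon) noise to each pixel value.

    Smaller epsilon gives a stronger privacy guarantee and a noisier
    image, matching the privacy/utility trade-off reported above;
    the denoising diffusion model would then be applied to recover
    utility from the noise-contaminated image.
    """
    b = sensitivity / epsilon
    # A Laplace(0, b) draw is the difference of two Exponential(1/b) draws.
    return [p + random.expovariate(1 / b) - random.expovariate(1 / b)
            for p in pixels]
```

With a very large ε (weak privacy) the perturbed pixels stay close to the originals; shrinking ε drives them toward pure noise.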
Computer-aided detection of cerebral aneurysms with magnetic resonance angiography: usefulness of volume rendering to display lesion candidates
Purpose The clinical usefulness of computer-aided detection of cerebral aneurysms has been investigated using different methods to present lesion candidates, but suboptimal methods may have limited its usefulness. We compared three presentation methods to determine which benefits radiologists the most by enabling them to detect more aneurysms. Materials and methods We conducted a multireader, multicase observer performance study involving six radiologists and 470 lesion candidates output by a computer-aided detection program, and compared the following three presentation methods using receiver operating characteristic analysis: (1) a lesion candidate is encircled on axial slices, (2) a lesion candidate is overlaid on a volume-rendered image, and (3) a combination of (1) and (2). The response time was also compared. Results Compared with axial slices, radiologists showed significantly better detection performance when presented with volume-rendered images. There was no significant difference in response time between the two methods. The combined method was associated with a significantly longer response time but had no added merit in terms of diagnostic accuracy. Conclusion Even with the aid of computer-aided detection, radiologists overlook many aneurysms if the presentation method is not optimal. Overlaying colored lesion candidates on volume-rendered images can help them detect more aneurysms.
Performance changes due to differences among annotating radiologists for training data in computerized lesion detection
Purpose The quality and bias of annotations by annotators (e.g., radiologists) affect the performance of computer-aided detection (CAD) software trained by machine learning. We hypothesized that differences in years of experience in image interpretation among radiologists contribute to annotation variability. In this study, we focused on how the performance of CAD software changes with retraining on cases annotated by radiologists with varying experience. Methods We used two types of CAD software, for lung nodule detection in chest computed tomography images and for cerebral aneurysm detection in magnetic resonance angiography images. Twelve radiologists with different years of experience independently annotated the lesions, and the performance changes were investigated by repeating the retraining of the CAD software twice, with the addition of cases annotated by each radiologist. Additionally, we investigated the effects of retraining using integrated annotations from multiple radiologists. Results The performance of the CAD software after retraining differed among annotating radiologists. In some cases, the performance was degraded compared with that of the initial software. Retraining using integrated annotations showed different performance trends depending on the target CAD software; notably, in cerebral aneurysm detection, the performance decreased compared with using annotations from a single radiologist. Conclusions Although the performance of the CAD software after retraining varied among the annotating radiologists, no direct correlation with their experience was found. The performance trends differed according to the type of CAD software when integrated annotations from multiple radiologists were used.
Unsupervised Deep Anomaly Detection in Chest Radiographs
The purposes of this study are to propose an unsupervised anomaly detection method based on a deep neural network (DNN) model, which requires only normal images for training, and to evaluate its performance with a large chest radiograph dataset. We used the auto-encoding generative adversarial network (α-GAN) framework, which is a combination of a GAN and a variational autoencoder, as a DNN model. A total of 29,684 frontal chest radiographs from the Radiological Society of North America Pneumonia Detection Challenge dataset were used for this study (16,880 male and 12,804 female patients; average age, 47.0 years). All these images were labeled as “Normal,” “No Opacity/Not Normal,” or “Opacity” by board-certified radiologists. About 70% (6,853/9,790) of the Normal images were randomly sampled as the training dataset, and the rest were randomly split into the validation and test datasets in a ratio of 1:2 (7,610 and 15,221). Our anomaly detection system could correctly visualize various lesions including a lung mass, cardiomegaly, pleural effusion, bilateral hilar lymphadenopathy, and even dextrocardia. Our system detected the abnormal images with an area under the receiver operating characteristic curve (AUROC) of 0.752. The AUROCs for the abnormal labels Opacity and No Opacity/Not Normal were 0.838 and 0.704, respectively. Our DNN-based unsupervised anomaly detection method could successfully detect various diseases or anomalies in chest radiographs by training with only the normal images.
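The AUROC values reported above can be computed directly from per-image anomaly scores via the rank (Mann-Whitney) formulation: the probability that a randomly chosen abnormal image receives a higher anomaly score than a randomly chosen normal one. A stdlib-only sketch with illustrative toy scores (not the study's model outputs):

```python
def auroc(y_true, scores):
    """AUROC as the probability that a positive outscores a negative,
    counting ties as half a win (the Mann-Whitney U formulation)."""
    pos = [s for y, s in zip(y_true, scores) if y]
    neg = [s for y, s in zip(y_true, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative toy data (1 = abnormal image, score = anomaly score):
score = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUROC of 0.5 corresponds to chance-level ranking; the study's 0.752 overall (and 0.838 for the Opacity label) indicates the anomaly scores rank abnormal images well above normal ones.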