Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
80 result(s) for "Halpern, Allan"
Human–computer collaboration for skin cancer recognition
2020
The rapid increase in telemedicine coupled with recent advances in diagnostic artificial intelligence (AI) create the imperative to consider the opportunities and risks of inserting AI-based support into new paradigms of care. Here we build on recent achievements in the accuracy of image-based AI for skin cancer diagnosis to address the effects of varied representations of AI-based support across different levels of clinical expertise and multiple clinical workflows. We find that good quality AI-based support of clinical decision-making improves diagnostic accuracy over that of either AI or physicians alone, and that the least experienced clinicians gain the most from AI-based support. We further find that AI-based multiclass probabilities outperformed content-based image retrieval (CBIR) representations of AI in the mobile technology environment, and AI-based support had utility in simulations of second opinions and of telemedicine triage. In addition to demonstrating the potential benefits associated with good quality AI in the hands of non-expert clinicians, we find that faulty AI can mislead the entire spectrum of clinicians, including experts. Lastly, we show that insights derived from AI class-activation maps can inform improvements in human diagnosis. Together, our approach and findings offer a framework for future studies across the spectrum of image-based diagnostics to improve human–computer collaboration in clinical practice.
A systematic evaluation of the value of AI-based decision support in skin tumor diagnosis demonstrates the superiority of human–computer collaboration over each individual approach and supports the potential of automated approaches in diagnostic medicine.
Journal Article
BCN20000: Dermoscopic Lesions in the Wild
by Podlipnik, Sebastian; Codella, Noel C. F.; Hernández-Pérez, Carlos
in 631/67, 692/699/67/1813/1634, Artificial Intelligence
2024
Advancements in dermatological artificial intelligence research require high-quality and comprehensive datasets that mirror real-world clinical scenarios. We introduce a collection of 18,946 dermoscopic images spanning 2010 to 2016, collated at the Hospital Clínic in Barcelona, Spain. The BCN20000 dataset aims to address the problem of unconstrained classification of dermoscopic images of skin cancer, including lesions in hard-to-diagnose locations such as the nails and mucosa, large lesions which do not fit in the aperture of the dermoscopy device, and hypo-pigmented lesions. Our dataset covers eight key diagnostic categories in dermoscopy, providing a diverse range of lesions for artificial intelligence model training. Furthermore, a ninth out-of-distribution (OOD) class is also present in the test set, comprising lesions which could not be definitively classified as any of the others. By providing a comprehensive collection of varied images, BCN20000 helps bridge the gap between the training data for machine learning models and the day-to-day practice of medical practitioners. Additionally, we present a set of baseline classifiers based on state-of-the-art neural networks, which can be extended by other researchers for further experimentation.
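The baseline classifiers are only described at a high level in this record. As a rough sketch of how such a baseline might be set up, the snippet below fine-tunes a standard pretrained network for eight diagnostic categories; the backbone (ResNet-50), preprocessing, and optimizer are illustrative assumptions, not the authors' published configuration.

```python
# Sketch: fine-tuning a standard pretrained CNN as a dermoscopy baseline.
# The backbone (ResNet-50), input size, and 8-class head are illustrative
# assumptions; they are not the exact configuration used for BCN20000.
import torch
import torch.nn as nn
from torchvision import models, transforms

NUM_CLASSES = 8  # eight diagnostic categories in the training set

# Standard ImageNet preprocessing for dermoscopic images (assumption).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Replace the ImageNet classification head with an 8-way head.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
```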
Journal Article
A patient-centric dataset of images and metadata for identifying melanomas using clinical context
by Kurtansky, Nicholas; Codella, Noel; Malvehy, Josep
in 692/1807/1812, 692/699/67/1813, Artificial Intelligence
2021
Prior skin image datasets have not addressed patient-level information obtained from multiple skin lesions from the same patient. Though artificial intelligence classification algorithms have achieved expert-level performance in controlled studies examining single images, in practice dermatologists base their judgment holistically on multiple lesions from the same patient. The 2020 SIIM-ISIC Melanoma Classification challenge dataset described herein was constructed to address this discrepancy between prior challenges and clinical practice, providing for each image in the dataset an identifier that allows lesions from the same patient to be mapped to one another. This patient-level contextual information is frequently used by clinicians to diagnose melanoma and is especially useful in ruling out false positives in patients with many atypical nevi. The dataset represents 2,056 patients (20.8% with at least one melanoma, 79.2% with zero melanomas) from three continents with an average of 16 lesions per patient, consisting of 33,126 dermoscopic images and 584 (1.8%) histopathologically confirmed melanomas compared with benign melanoma mimickers.
Measurement(s): melanoma • Skin Lesion
Technology Type(s): Dermoscopy • digital curation
Factor Type(s): approximate age • sex • anatomic site
Sample Characteristic - Organism: Homo sapiens
Machine-accessible metadata file describing the reported data: https://doi.org/10.6084/m9.figshare.13070345
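A minimal sketch of how the patient identifier described above can be used to recover patient-level context from the challenge metadata; the file name and column names (patient_id, target) are assumptions based on the public ISIC 2020 challenge CSV and should be verified against the actual download.

```python
# Sketch: grouping ISIC 2020 challenge metadata by patient to recover
# patient-level context. The file name and column names (patient_id,
# target) are assumed from the public challenge CSV, not guaranteed.
import pandas as pd

meta = pd.read_csv("ISIC_2020_Training_GroundTruth.csv")  # hypothetical local path

per_patient = (
    meta.groupby("patient_id")
        .agg(n_lesions=("target", "size"),
             n_melanomas=("target", "sum"))
)

print(f"{len(per_patient)} patients, "
      f"{(per_patient.n_melanomas > 0).mean():.1%} with at least one melanoma, "
      f"average {per_patient.n_lesions.mean():.1f} lesions per patient")
```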
Journal Article
Prospective validation of dermoscopy-based open-source artificial intelligence for melanoma diagnosis (PROVE-AI study)
by Weber, Jochen; DeFazio, Jennifer; Dusza, Stephen W
in Accuracy, Algorithms, Artificial intelligence
2023
The use of artificial intelligence (AI) has the potential to improve the assessment of lesions suspicious of melanoma, but few clinical studies have been conducted. We validated the accuracy of an open-source, non-commercial AI algorithm for melanoma diagnosis and assessed its potential impact on dermatologist decision-making. We conducted a prospective, observational clinical study to assess the diagnostic accuracy of the AI algorithm (ADAE) in predicting melanoma from dermoscopy skin lesion images. The primary aim was to assess the reliability of ADAE’s sensitivity at a predefined threshold of 95%. Patients who had consented for a skin biopsy to exclude melanoma were eligible. Dermatologists also estimated the probability of melanoma and indicated management choices before and after real-time exposure to ADAE scores. All lesions underwent biopsy. Four hundred thirty-five participants were enrolled and contributed 603 lesions (95 melanomas). Participants had a mean age of 59 years, 54% were female, and 96% were White individuals. At the predetermined 95% sensitivity threshold, ADAE had a sensitivity of 96.8% (95% CI: 91.1–98.9%) and specificity of 37.4% (95% CI: 33.3–41.7%). The dermatologists’ ability to assess melanoma risk significantly improved after ADAE exposure (AUC 0.7798 vs. 0.8161, p = 0.042). Post-ADAE dermatologist decisions also had equivalent or higher net benefit compared to biopsying all lesions. We validated the accuracy of an open-source melanoma AI algorithm and showed its theoretical potential for improving dermatology experts’ ability to evaluate lesions suspicious of melanoma. Larger randomized trials are needed to fully evaluate the potential of adopting this AI algorithm into clinical workflows.
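The reported operating point (96.8% sensitivity, 37.4% specificity at a threshold preselected for 95% sensitivity) follows directly from a confusion matrix at a fixed threshold. A minimal sketch of that calculation, using placeholder labels and scores rather than the study data:

```python
# Sketch: sensitivity and specificity of a melanoma score at a fixed
# operating threshold. Labels, scores, and the threshold value are
# placeholders, not the PROVE-AI study data.
import numpy as np

y_true = np.array([1, 0, 0, 1, 0, 1, 0, 0])   # 1 = melanoma on biopsy
scores = np.array([0.9, 0.4, 0.7, 0.8, 0.2, 0.6, 0.5, 0.1])
threshold = 0.3  # chosen in advance to target a high sensitivity

y_pred = scores >= threshold
tp = np.sum((y_pred == 1) & (y_true == 1))
fn = np.sum((y_pred == 0) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))
fp = np.sum((y_pred == 1) & (y_true == 0))

sensitivity = tp / (tp + fn)   # fraction of melanomas flagged at the threshold
specificity = tn / (tn + fp)   # fraction of benign lesions correctly below it
print(f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```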
Journal Article
In vivo tumor immune microenvironment phenotypes correlate with inflammation and vasculature to predict immunotherapy response
2022
Response to immunotherapies can be variable and unpredictable. Pathology-based phenotyping of tumors into ‘hot’ and ‘cold’ is static, relying solely on T-cell infiltration in single-time, single-site biopsies, resulting in suboptimal treatment response prediction. Dynamic vascular events (tumor angiogenesis, leukocyte trafficking) within the tumor immune microenvironment (TiME) also influence anti-tumor immunity and treatment response. Here, we report dynamic cellular-level TiME phenotyping in vivo that combines inflammation profiles with vascular features through non-invasive reflectance confocal microscopic imaging. In skin cancer patients, we demonstrate three main TiME phenotypes that correlate with gene and protein expression, and with response to toll-like receptor agonist immunotherapy. Notably, phenotypes with high inflammation associate with immunostimulatory signatures and those with high vasculature with angiogenic and endothelial anergy signatures. Moreover, phenotypes with high inflammation and low vasculature demonstrate the best treatment response. This non-invasive in vivo phenotyping approach integrating dynamic vasculature with inflammation serves as a reliable predictor of response to topical immunotherapy in patients.
Standard assessment of immune infiltration of biopsies is not sufficient to accurately predict response to immunotherapy. Here, the authors show that reflectance confocal microscopy can be used to quantify dynamic vasculature and inflammatory features to better predict treatment response in skin cancers.
Journal Article
Transforming Dermatologic Imaging for the Digital Era: Metadata and Standards
by Clunie, David; Curiel-Lewandrowski, Clara; Halpern, Allan C
in Dermatology, Digital imaging, Disease control
2018
Imaging is increasingly being used in dermatology for documentation, diagnosis, and management of cutaneous disease. The lack of standards for dermatologic imaging is an impediment to clinical uptake. Standardization can occur in image acquisition, terminology, interoperability, and metadata. This paper presents the International Skin Imaging Collaboration position on standardization of metadata for dermatologic imaging. Metadata is essential to ensure that dermatologic images are properly managed and interpreted. There are two standards-based approaches to recording and storing metadata in dermatologic imaging. The first uses standard consumer image file formats, and the second uses the file format and metadata model developed for the Digital Imaging and Communications in Medicine (DICOM) standard. DICOM appears to provide an advantage over consumer image file formats for metadata, as it includes all the patient, study, and technical metadata necessary to use images clinically. Consumer image file formats, by contrast, include only technical metadata and must be used in conjunction with another actor, for example an electronic medical record, to supply the patient and study metadata. The use of DICOM may have some ancillary benefits in dermatologic imaging, including leveraging DICOM network and workflow services, interoperability of images and metadata, leveraging existing enterprise imaging infrastructure, greater patient safety, and better compliance with legislative requirements for image retention.
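As an illustration of the DICOM approach described above, the sketch below reads patient-, study-, and image-level metadata from a DICOM file with the pydicom library; the file path is hypothetical, and real dermatology objects may populate a different subset of attributes.

```python
# Sketch: reading patient, study, and technical metadata from a DICOM
# file with pydicom. The path is hypothetical; attribute availability
# depends on how the image was acquired and encoded.
import pydicom

ds = pydicom.dcmread("lesion_001.dcm")  # hypothetical dermatology DICOM object

# Patient- and study-level metadata carried inside the file itself,
# which consumer formats (JPEG/PNG) would have to get from an EMR.
print("Patient ID:", ds.get("PatientID", "<missing>"))
print("Study date:", ds.get("StudyDate", "<missing>"))
print("Modality:  ", ds.get("Modality", "<missing>"))
print("Body part: ", ds.get("BodyPartExamined", "<missing>"))

# Technical metadata comparable to what a consumer image format stores.
print("Rows x Cols:", ds.get("Rows"), "x", ds.get("Columns"))
```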
Journal Article
Evaluating skin tone scales for dermatologic dataset labeling: a prospective-comparative study
2025
Skin tone affects artificial intelligence (AI) performance in dermatology. While labeling datasets for skin tone could improve algorithm generalizability for detecting dermatologic malignancies, large-scale validation of skin tone assessments is lacking. This prospective observational study assessed the reliability of subjective tools (Fitzpatrick Skin Type [FST], Monk Skin Tone [MST], Pantone SkinTone Guide) and an objective colorimeter in both in-person and photography-based settings to evaluate their utility for labeling dermoscopic datasets. Colorimetry (the gold standard for color measurement) demonstrated high precision with in-person measurements. Of the subjective scales, MST demonstrated slightly tighter clustering in the color space and high repeatability for in-person and photography-based assessments (the latter varied by lighting). Dermoscopic image-extracted color values correlated poorly with colorimetry values. For subjective ratings, MST more effectively captured differences in AI melanoma classification scores than FST. These findings underscore that FST is not a proxy for skin tone; an important role remains for skin tone assessment to improve AI performance.
Journal Article
Improving dataset transparency in dermatologic Artificial Intelligence using a dataset nutrition label
2025
Biased and poorly documented dermatology datasets pose risks to the development of safe and generalizable artificial intelligence (AI) tools. We created a Dataset Nutrition Label (DNL) for multiple dermatology datasets to support transparent and responsible data use. The DNL offers a structured, digestible summary of key attributes, including metadata, limitations, and risks, enabling data users to better assess suitability and proactively address potential sources of bias in datasets.
Journal Article
Agreement Between Experts and an Untrained Crowd for Identifying Dermoscopic Features Using a Gamified App: Reader Feasibility Study
2023
Dermoscopy is commonly used for the evaluation of pigmented lesions, but agreement between experts for identification of dermoscopic structures is known to be relatively poor. Expert labeling of medical data is a bottleneck in the development of machine learning (ML) tools, and crowdsourcing has been demonstrated as a cost- and time-efficient method for the annotation of medical images.
The aim of this study is to demonstrate that crowdsourcing can be used to label basic dermoscopic structures from images of pigmented lesions with similar reliability to a group of experts.
First, we obtained labels of 248 images of melanocytic lesions with 31 dermoscopic "subfeatures" labeled by 20 dermoscopy experts. These were then collapsed into 6 dermoscopic "superfeatures" based on structural similarity, due to low interrater reliability (IRR): dots, globules, lines, network structures, regression structures, and vessels. These images were then used as the gold standard for the crowd study. The commercial platform DiagnosUs was used to obtain annotations from a nonexpert crowd for the presence or absence of the 6 superfeatures in each of the 248 images. We replicated this methodology with a group of 7 dermatologists to allow direct comparison with the nonexpert crowd. The Cohen κ value was used to measure agreement across raters.
In total, we obtained 139,731 ratings of the 6 dermoscopic superfeatures from the crowd. There was relatively lower agreement for the identification of dots and globules (the median κ values were 0.526 and 0.395, respectively), whereas network structures and vessels showed the highest agreement (the median κ values were 0.581 and 0.798, respectively). This pattern was also seen among the expert raters, who had median κ values of 0.483 and 0.517 for dots and globules, respectively, and 0.758 and 0.790 for network structures and vessels. The median κ values between nonexperts and thresholded average-expert readers were 0.709 for dots, 0.719 for globules, 0.714 for lines, 0.838 for network structures, 0.818 for regression structures, and 0.728 for vessels.
This study confirmed that IRR for different dermoscopic features varied among a group of experts; a similar pattern was observed in a nonexpert crowd. There was good or excellent agreement for each of the 6 superfeatures between the crowd and the experts, highlighting the similar reliability of the crowd for labeling dermoscopic images. This confirms the feasibility and dependability of using crowdsourcing as a scalable solution to annotate large sets of dermoscopic images, with several potential clinical and educational applications, including the development of novel, explainable ML tools.
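Agreement in this study is measured with Cohen's κ. A minimal sketch of that computation for a single superfeature, using scikit-learn and made-up presence/absence labels rather than the study ratings:

```python
# Sketch: Cohen's kappa between two raters for one dermoscopic feature
# (presence/absence across images). The labels are made up, not study data.
from sklearn.metrics import cohen_kappa_score

# 1 = feature present, 0 = absent, one entry per image
expert_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
crowd_labels  = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]

kappa = cohen_kappa_score(expert_labels, crowd_labels)
print(f"Cohen's kappa = {kappa:.3f}")  # 1.0 = perfect agreement, 0 = chance level
```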
Journal Article
The degradation of performance of a state-of-the-art skin image classifier when applied to patient-driven internet search
by Park, Gyeong Hun; Han, Seung Seog; Navarrete-Dechent, Cristian
in 631/114/1305, 692/4028/67/1813, 692/699/4033
2022
Model Dermatology (https://modelderm.com; Build2021) is a publicly testable neural network that can classify 184 skin disorders. We aimed to investigate whether our algorithm can classify clinical images from an Internet community along with tertiary care center datasets. Consecutive images from an Internet skin cancer community (‘RD’ dataset, 1,282 images posted between 25 January 2020 and 30 July 2021; https://reddit.com/r/melanoma) were analyzed retrospectively, along with hospital datasets (Edinburgh dataset, 1,300 images; SNU dataset, 2,101 images; TeleDerm dataset, 340 consecutive images). The algorithm’s performance was equivalent to that of dermatologists on the curated clinical datasets (Edinburgh and SNU datasets). However, its performance deteriorated on the RD and TeleDerm datasets because of insufficient image quality and the presence of out-of-distribution disorders, respectively. For the RD dataset, the algorithm’s Top-1/3 accuracy (39.2%/67.2%) and AUC (0.800) were equivalent to those of general physicians (36.8%/52.9%), and it was more accurate than laypersons using random Internet searches (19.2%/24.4%). The Top-1/3 accuracy was affected by inadequate image quality (adequate = 43.2%/71.3% versus inadequate = 32.9%/60.8%), whereas participant performance did not deteriorate (adequate = 35.8%/52.7% vs. inadequate = 38.4%/53.3%). Overall, algorithm performance was significantly affected by the change of setting, which implies that AI algorithms that reach dermatologist-level performance in an in-distribution setting may not show the same level of performance in out-of-distribution settings.
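The Top-1/3 accuracies quoted above are standard multi-class metrics: a prediction counts as correct if the true diagnosis appears among the model's one or three highest-scoring classes. A minimal sketch with placeholder scores, not Model Dermatology outputs:

```python
# Sketch: Top-k accuracy for a multi-class skin disorder classifier.
# Scores and labels are placeholders, not Model Dermatology outputs.
import numpy as np

def top_k_accuracy(scores: np.ndarray, labels: np.ndarray, k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    top_k = np.argsort(scores, axis=1)[:, -k:]          # indices of the k best classes
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# 4 samples x 5 classes of placeholder class scores
scores = np.array([[0.1, 0.5, 0.2, 0.1, 0.1],
                   [0.3, 0.1, 0.4, 0.1, 0.1],
                   [0.2, 0.2, 0.2, 0.3, 0.1],
                   [0.6, 0.1, 0.1, 0.1, 0.1]])
labels = np.array([1, 0, 3, 2])

print("Top-1:", top_k_accuracy(scores, labels, 1))  # 0.5
print("Top-3:", top_k_accuracy(scores, labels, 3))  # 0.75
```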
Journal Article