Catalogue Search | MBRL

Overcoming the Interobserver Variability in Lung Adenocarcinoma Subtyping: A Clustering Approach to Establish a Ground Truth for Downstream Applications

by Matsumoto, K. , Brcic, L. , Cavazza, A. in Adenocarcinoma , Adenocarcinoma of Lung , Care and treatment

2023

The accurate identification of different lung adenocarcinoma histologic subtypes is important for determining prognosis but can be challenging because of overlaps in the diagnostic features, leading to considerable interobserver variability. This study aimed to provide an overview of the diagnostic agreement for lung adenocarcinoma subtypes among pathologists and to create a ground truth using a clustering approach for downstream computational applications. Three sets of lung adenocarcinoma histologic images with different evaluation levels (small patches, areas with relatively uniform histology, and whole slide images) were reviewed by 17 international expert lung pathologists and 1 pathologist in training. Each image was classified into one or several lung adenocarcinoma subtypes. Among the 4702 patches of the first set, 1742 (37%) had an overall consensus among all pathologists. The overall Fleiss κ score for the agreement of all subtypes was 0.58. Using cluster analysis, pathologists were hierarchically grouped into 2 clusters, with κ scores of 0.588 and 0.563 in clusters 1 and 2, respectively. Similar results were obtained for the second and third sets, with fair-to-moderate agreements. Patches from the first 2 sets that obtained the consensus of the 18 pathologists were retrieved to form consensus patches and were regarded as the ground truth of lung adenocarcinoma subtypes. Our observations highlight discrepancies among experts when assessing lung adenocarcinoma subtypes. However, a subsequent number of consensus patches could be retrieved from each cluster, which can be used as ground truth for the downstream computational pathology applications, with minimal influence from interobserver variability.
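For readers unfamiliar with the Fleiss κ statistic quoted in this abstract, the following is a minimal sketch of the standard textbook calculation. It assumes each rater assigns exactly one category per item (the study allowed several subtypes per image, so this is not the authors' exact procedure), and the function name `fleiss_kappa` and the example counts are hypothetical.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (hypothetical helper)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])

    # Share of all ratings that fell into each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]

    # Observed agreement per item: fraction of rater pairs that agree.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]

    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Illustrative data only: 4 items, 3 raters, 2 categories.
example = [[3, 0], [2, 1], [0, 3], [1, 2]]
print(round(fleiss_kappa(example), 3))
```

A value of 1 indicates complete agreement, and values near 0 indicate agreement no better than chance.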

Journal Article

Anastomotic perfusion assessment with indocyanine green in robot-assisted low-anterior resection, a multicenter study of interobserver variation

by Andersen, Per V , Ellebaek, Mark B , Dohrn, Niclas in Colorectal cancer , Colorectal surgery , Endoscopy

2023

Background Securing sufficient blood perfusion to the anastomotic area after low-anterior resection is a crucial factor in preventing anastomotic leakage (AL). Intra-operative indocyanine green fluorescent imaging (ICG-FI) has been suggested as a tool to assess perfusion. However, knowledge of inter-observer variation among surgeons in the interpretation of ICG-FI is sparse. Our primary objective was to evaluate inter-observer variation among surgeons in the interpretation of bowel blood-perfusion assessed visually by ICG-FI. Our secondary objective was to compare the results both from the visual assessment of ICG and from computer-based quantitative analyses of ICG-FI between patients with and without the development of AL. Method A multicenter study, including patients undergoing robot-assisted low anterior resection with stapled anastomosis. ICG-FI was evaluated visually by the surgeon intra-operatively. Postoperatively, recorded videos were anonymized and exchanged between centers for inter-observer evaluation. Time to visibility (TTV), time to maximum visibility (TMV), and time to wash-out (TWO) were visually assessed. In addition, the ICG-FI video-recordings were analyzed using validated pixel analysis software to quantify blood perfusion. Results Fifty-five patients were included, and five developed clinical AL. Bland–Altman plots (BA plots) demonstrated wide inter-observer variation for visually assessed fluorescence on all parameters (TTV, TMV, and TWO). Comparing the leak group with the no-leak group, we found no significant differences for TTV: hazard ratio (HR) = 0.82 (CI 0.32; 2.08), TMV: HR = 0.62 (CI 0.24; 1.59), or TWO: HR = 1.11 (CI 0.40; 3.11).
In the quantitative pixel analysis, a lower slope of the fluorescence time-curve was found in patients with a subsequent leak: median 0.08 (0.07; 0.10) compared with non-leak patients: median 0.13 (0.10; 0.17) (p = 0.04). Conclusion The surgeon’s visual assessment of the ICG-FI demonstrated wide inter-observer variation, and there were no differences between patients with and without AL. However, quantitative pixel analysis showed a significant difference between groups. Trial Registration ClinicalTrials.gov Identifier: NCT04766060.
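As background to the Bland–Altman plots mentioned in this abstract, here is a minimal sketch of how the bias and 95% limits of agreement between two observers are conventionally computed. The function name `bland_altman_limits` and the timing values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(observer_a, observer_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits are bias +/- 1.96 * SD of the paired differences, the
    conventional 95% limits of agreement (hypothetical helper).
    """
    if len(observer_a) != len(observer_b) or len(observer_a) < 2:
        raise ValueError("need two equally long series with >= 2 pairs")
    diffs = [a - b for a, b in zip(observer_a, observer_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation
    return bias, bias - spread, bias + spread


# Illustrative times to visibility in seconds (made-up values).
surgeon_1 = [10, 12, 14, 16]
surgeon_2 = [8, 12, 15, 13]
bias, lower, upper = bland_altman_limits(surgeon_1, surgeon_2)
print(f"bias={bias:.2f}s, limits of agreement=({lower:.2f}s, {upper:.2f}s)")
```

Wide limits relative to the clinically relevant time scale are what such plots reveal as high inter-observer variation.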

Journal Article

How variation among field assessments can affect biodiversity offset outcomes

by Caves, Karen , Dorrough, Josh W. , Gorrod, Emma in Assessments , Biodiversity , Biodiversity loss

2025

Biodiversity offsetting aims to balance biodiversity loss at development sites with gains at offset sites. Measurement of loss and gain relies on transparent and repeatable estimates of biodiversity values. However, these estimates are often derived from field assessments by people who differ in their interpretation and measurement of biodiversity, either randomly or systematically. Variation among people during field assessments may therefore impact offset outcomes and contribute to uncertainty around the effectiveness of biodiversity offset schemes. Here, we describe variation in loss, gain, and offset outcomes using concurrent assessments by five assessors on eight sites using a multi‐metric biodiversity valuation method from New South Wales, Australia. We found variation among assessors was high for field estimates but substantially decreased for current biodiversity valuations. However, variation increased for the prediction of future biodiversity gains, in the calculation of the required offset area, and contributed an average of 19% variation in development credits (biodiversity loss) and 34% variation in offset credits (biodiversity gain). Evidence of systematic bias among observers for some attributes added further uncertainty to offset outcomes. Our study reveals the need for improved assessor training and field methods to improve assessment consistency, transparency, and reduce offset outcome variability. We used concurrent assessments from five offsets assessors to quantify inter‐observer variation and its effect on biodiversity offset outcomes. Variation among assessors was high during field estimates of biodiversity surrogate values, and this led to uncertain offset outcomes. Global offset schemes will need to reassess their measurement methodology, definitions, and training to improve biodiversity outcomes.

Journal Article

Whole-body MRI versus an [18F]FDG-PET/CT-based reference standard for early response assessment and restaging of paediatric Hodgkin’s lymphoma: a prospective multicentre study

by Sábado, Constantino , de Keizer, Bart , Tolboom, Nelleke in Adolescent , Agreements , Cancer and Oncology

2021

Objectives To compare WB-MRI with an [18F]FDG-PET/CT-based reference for early response assessment and restaging in children with Hodgkin’s lymphoma (HL). Methods Fifty-one children (ages 10–17) with HL were included in this prospective, multicentre study. All participants underwent WB-MRI and [18F]FDG-PET/CT at early response assessment. Thirteen of the 51 patients also underwent both WB-MRI and [18F]FDG-PET/CT at restaging. Two radiologists independently evaluated all WB-MR images in two separate readings: without and with DWI. The [18F]FDG-PET/CT examinations were evaluated by a nuclear medicine physician. An expert panel assessed all discrepancies between WB-MRI and [18F]FDG-PET/CT to derive the [18F]FDG-PET/CT-based reference standard. Inter-observer agreement for WB-MRI was calculated using kappa statistics. Concordance, PPV, NPV, sensitivity and specificity for a correct assessment of the response between WB-MRI and the reference standard were calculated for both nodal and extra-nodal disease presence and total response evaluation. Results Inter-observer agreement of WB-MRI including DWI between both readers was moderate (κ 0.46–0.60). For early response assessment, WB-MRI including DWI agreed with the reference standard in 33/51 patients (65%, 95% CI 51–77%) versus 15/51 (29%, 95% CI 19–43%) for WB-MRI without DWI. For restaging, WB-MRI including DWI agreed with the reference standard in 9/13 patients (69%, 95% CI 42–87%) versus 5/13 patients (38%, 95% CI 18–64%) for WB-MRI without DWI. Conclusions The addition of DWI to the WB-MRI protocol in early response assessment and restaging of paediatric HL improved agreement with the [18F]FDG-PET/CT-based reference standard. However, WB-MRI remained discordant in 30% of the patients compared to standard imaging for assessing residual disease presence.
Key Points • Inter-observer agreement of WB-MRI including DWI between both readers was moderate for (early) response assessment of paediatric Hodgkin’s lymphoma. • The addition of DWI to the WB-MRI protocol in early response assessment and restaging of paediatric Hodgkin’s lymphoma improved agreement with the [18F]FDG-PET/CT-based reference standard. • WB-MRI including DWI agreed with the reference standard in respectively 65% and 69% of the patients for early response assessment and restaging.
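The agreement proportions above are reported with 95% confidence intervals. The abstract does not say which interval method was used; the sketch below assumes the Wilson score interval purely for illustration, and the function name `wilson_interval` is hypothetical.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a binomial proportion.

    z defaults to the two-sided 95% normal quantile. This is an assumed
    method, not necessarily the one used in the study.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Agreement counts quoted in the abstract: 9/13 and 5/13 at restaging.
for agreed, n in [(9, 13), (5, 13)]:
    low, high = wilson_interval(agreed, n)
    print(f"{agreed}/{n}: {100 * low:.0f}% to {100 * high:.0f}%")
```

With only 13 patients at restaging the interval spans several tens of percentage points, which is why the restaging estimates are much less precise than those based on 51 patients.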

Journal Article

Non-tuberculous mycobacterial lung disease: diagnosis based on computed tomography of the chest

by Yim, Jae-Joon , Lee, Hyun-Ju , Lee, Chang Hyun in Accuracy , Aged , Agreements

2016

Objectives To elucidate the accuracy and inter-observer agreement of non-tuberculous mycobacterial lung disease (NTM-LD) diagnosis based on chest CT findings. Methods Two chest radiologists and two pulmonologists interpreted chest CTs of 66 patients with NTM-LD, 33 with pulmonary tuberculosis and 33 with non-cystic fibrosis bronchiectasis. These observers selected one of these diagnoses for each case without knowing any clinical information except age and sex. Sensitivity and specificity were calculated according to degree of observer confidence. Inter-observer agreement was assessed using Fleiss’ κ values. Multiple logistic regression was performed to elucidate which radiological features led to the correct diagnosis. Results The sensitivity of NTM-LD diagnosis was 56.4 % (95 % CI 47.9–64.7) and specificity 80.3 % (73.1–86.0). The specificity of NTM-LD diagnosis increased with confidence: 44.4 % (20.5–71.3) for possible, 77.4 % (67.4–85.0) for probable, 95.2 % (87.2–98.2) for definite (P < 0.001) diagnoses. Inter-observer agreement for NTM-LD diagnosis was moderate (κ = 0.453). Tree-in-bud pattern (adjusted odds ratio [aOR] 6.24, P < 0.001), consolidation (aOR 1.92, P = 0.036) and atelectasis (aOR 3.73, P < 0.001) were associated with correct NTM-LD diagnoses, whereas presence of pleural effusion (aOR 0.05, P < 0.001) led to false diagnoses. Conclusions NTM-LD diagnosis based on chest CT findings is specific but not sensitive. Key Points • Diagnosis of NTM-LD based on radiological findings showed high specificity. • Sensitivity of NTM-LD diagnosis was around 50 %. • Inter-observer agreement was moderate. • Identification of tree-in-bud pattern, consolidation and atelectasis led to correct diagnoses.

Journal Article

Inter-observer variation between pathologists in diffuse parenchymal lung disease

by Nicholson, A G , Addis, B J , Gibbs, A R in Agreements , Biological and medical sciences , Biopsy

2004

Background: There have been few inter-observer studies of diffuse parenchymal lung disease (DPLD), but the recent ATS/ERS consensus classification provides a basis for such a study. Methods: A method for categorising numerically the percentage likelihood of these differential diagnoses was developed, and the diagnostic confidence of pathologists using this classification and the reproducibility of their diagnoses were assessed. Results: The overall kappa coefficient of agreement for the first choice diagnosis was 0.38 (n = 133 biopsies), increasing to 0.43 for patients (n = 83) with multiple biopsies. Weighted kappa coefficients of agreement, quantifying the level of probability of individual diagnoses, were moderate to good (mean 0.58, range 0.40–0.75). However, in 18% of biopsy specimens the diagnosis was given with low confidence. Over 50% of inter-observer variation related to the diagnosis of non-specific interstitial pneumonia and, in particular, its distinction from usual interstitial pneumonia. Conclusion: These results show that the ATS/ERS classification can be applied reproducibly by pathologists who evaluate DPLD routinely, and support the practice of taking multiple biopsy specimens.

Journal Article

Should the meniscal height be considered for preoperative sizing in meniscal transplantation?

by Severino, Nilson Roberto , Netto, Alfredo dos Santos , Silva, Julio Cesar de Almeida e in Adolescent , Adult , Anthropometry

2018

Purpose and hypothesis In preoperative sizing for meniscal transplantation, most authors take into consideration the length and width of the original meniscus, but not its height. This study aimed at evaluating (1) whether the meniscal height is associated with the meniscal length and width, (2) whether the heights of the meniscal segments are associated with the individual’s anthropometric data, (3) whether the heights of the meniscal segments are associated with each other in the same meniscus, and (4) the degree of symmetry of the meniscal dimensions between the right and left knees. Methods In this cross-sectional, observational study, two independent radiologists measured the meniscal length, width and height in knee magnetic resonance imaging scans obtained from 25 patients with patello-femoral pain syndrome. Reproducibility of measurements was calculated with intraclass correlation coefficients. Associations between the anthropometric data and the meniscal measurements, the meniscal length and width versus height, and the heights of the meniscal segments in the same meniscus were examined with Pearson’s correlation. Results Inter-observer reliability was excellent (>0.8) for length and height and good (0.6–0.8) for width measurements. There was also excellent agreement (>0.8) for the length and width of the menisci in the right and left knees. The heights of the horns of the lateral meniscus showed good agreement (0.6–0.8), while the heights of the other meniscal segments had excellent agreement between the sides (>0.8). There were significant associations with generally low ( r < 0.5) correlation between the heights of the meniscal segments and the lengths and widths of the menisci, between the meniscal height and anthropometric data, and between the heights of the meniscal segments in the same meniscus. Correlations between anthropometric data and meniscal length and width were generally high ( r > 0.7). 
Conclusions There was excellent agreement between the meniscal dimensions of the right and left knees, and a weak association between the meniscal height with the meniscal width and length, between the height of the menisci with anthropometric data and between the heights of the segments in the same meniscus. The height of the meniscal segments may be a new variable in preoperative meniscal measurement.

Journal Article

Validation of Varian’s SmartAdapt® deformable image registration algorithm for clinical application

by Louwe, Robert J W , Evans, Jamie , Hamilton, David A in Algorithms , Atrophy , Biomedical and Life Sciences

2015

Background Re-contouring of structures on consecutive planning computed tomography (CT) images for patients that exhibit anatomical changes is elaborate and may negatively impact the turn-around time if this is required for many patients. This study was therefore initiated to validate the accuracy and usefulness of automatic contour propagation for head and neck cancer patients using SmartAdapt® which is the deformable image registration (DIR) application in Varian’s Eclipse™ treatment planning system. Methods CT images of eight head and neck cancer patients with multiple planning CTs were registered using SmartAdapt®. The contoured structures of target volumes and OARs of the primary planning CT were deformed accordingly and subsequently compared with a reference structure set being either: 1) a structure set independently contoured by the treating Radiation Oncologist (RO), or 2) the DIR-generated structure set after being reviewed and modified by the RO. Results Application of DIR offered a considerable time saving for ROs in delineation of structures on CTs that were acquired mid-treatment. Quantitative analysis showed that 84% of the volume of the DIR-generated structures overlapped with the independently re-contoured structures, while 94% of the volume overlapped with the DIR-generated structures after review by the RO. This apparent intra-observer variation was further investigated resulting in the identification of several causes. Qualitative analysis showed that 92% of the DIR-generated structures either need no or only minor modification during RO reviews. Conclusions SmartAdapt is a powerful tool with sufficient accuracy that saves considerable time in re-contouring structures on re-CTs. However, careful review of the DIR-generated structures is mandatory, in particular in areas where tumour regression plays a role.

Journal Article

Meniscus sizing using three-dimensional models of the ipsilateral tibia plateau based on CT scans – an experimental study of a new sizing approach

by Fürnstahl, Philipp , Sutter, Reto , Beeler, Silvan in Inter‐observer variation , Knee , Magnetic resonance

2020

Purpose Selection of a meniscus allograft with a similar three-dimensional (3D) size is essential for good clinical results in meniscus allograft surgery. Direct meniscus sizing by MRI scan is not possible in total meniscectomy and indirect sizing by conventional radiography is often inaccurate. The purpose of this study was to develop a new indirect sizing method, based on the 3D shape of the ipsilateral tibia plateau, which is independent of the meniscus condition. Methods MRI and CT scans of fifty healthy knee joints were used to create 3D surface models of both menisci (MRI) and tibia plateau (CT). 3D bone models of the proximal 10 mm of the entire and half tibia plateau (with / without intercondylar area) were created in a standardized fashion. For each meniscus, the best fitting “allograft” couple out of all other 49 menisci was assessed by the surface distance of the 3D meniscus (best available allograft), of the 3D tibia plateau (3D-CT) and by the radiographic method of Pollard (2D-RX). Results 3D-CT sizing was significantly better when using only the half tibia plateau without the intercondylar area (p < 0.001). However, neither 3D-CT nor 2D-RX sizing could select the best available allograft. Compared to 2D-RX, 3D-CT sizing was significantly better for the medial, but not for the lateral meniscus. Conclusions Automatized, indirect meniscus sizing using the 3D bone models of the tibia plateau is feasible and more precise than the previously described 2D-RX method. However, further technical improvement is needed to always select the best available allograft.

Journal Article

New endoscopic classification of cascade stomach, a risk factor for reflux esophagitis

by Shimoyama, Yasuyuki , Sagawa, Toshihiko , Mizuide, Masafumi in Abdominal Surgery , Adult , Aged

2017

Background We recently demonstrated that cascade stomach detected by barium studies was correlated with upper gastrointestinal symptoms. We developed a new endoscopic classification of cascade stomach and examined its relationship with reflux esophagitis. Methods Study 1: the classification (grades 0–3) was based on detecting a ridge that runs from the cardia toward the anterior wall crossing the greater curvature. Inter-observer variation was evaluated by kappa statistics when ten experienced endoscopists used this classification three times each. Study 2: in 710 consecutive subjects (500 men and 210 women) undergoing endoscopic screening, the grade of cascade stomach and incidence of reflux esophagitis were compared. Results In study 1, the kappa values at the third assessment were 0.85, 0.58, 0.50, and 0.78 for each grade, respectively, while overall agreement was 0.68. In study 2, the incidence of reflux esophagitis in men was 20 % in grade 0, 17 % in grade 1, 25 % in grade 2, and 30 % in grade 3, showing significant differences. Among women, the incidence of reflux esophagitis in each grade was 9, 3, 6, and 35 %, respectively, also showing significant differences. Multivariate analysis showed that independent risk factors for reflux esophagitis were cascade stomach (odds ratio = 2.20), body mass index, and hiatus hernia in men, as well as cascade stomach (odds ratio = 9.01) and smoking tobacco in women. Conclusions This endoscopic classification of cascade stomach showed acceptable inter-observer variation. Cascade stomach is a risk factor for reflux esophagitis.

Journal Article
