Catalogue Search | MBRL
Explore the vast range of titles available.
87 result(s) for "Downing, Steven"
A call for the utilization of consensus standards in the surgical education literature
by Downing, Steven M.; Kasten, Steven J.; Korndorffer, James R.
in Biological and medical sciences; Clinical Competence; Computer Simulation
2010
Assessment methods and theory continue to evolve in the general education literature. Nowhere is this more evident than in the framework of validity methods and concepts. The consensus standards of the American Educational Research Association, American Psychological Association, and National Council on Measurement in Education have changed from “types of validity” (criterion, construct, and content) and “valid instruments,” last used in 1974, to a concept of identifying evidence for the validity of results and the use of those results. The purpose of this study was to evaluate the surgical education literature for the adoption of the current consensus standards.
As a representative sample of the surgical educational literature, the validation effort in laparoscopic simulator education was chosen. A MEDLINE search using the terms validity.tw and laparoscop$.tw between 1996 and 2008 (September week 1) yielded 192 citations. All titles and abstracts were reviewed, resulting in 47 studies appropriate for in-depth analysis.
Validation studies have evaluated 21 different simulators. Twenty-three percent of the studies adhere, in part, to the new consensus standards for validity. One hundred percent use the old framework of validity types, including 75% using construct validity, 38% using face validity, and 11% using content validity.
Widespread use of the currently accepted (post-1999) framework for validity is lacking in the surgical education literature. Surgical educators must remain current and begin to investigate our assessments within the contemporary framework of validity to avoid improper judgments of performance.
Journal Article
Effective home laparoscopic simulation training: a preliminary evaluation of an improved training paradigm
by Downing, Steven M.; Bellows, Charles F.; Harris, Ilene B.
in Adult; Analysis of Variance; Biological and medical sciences
2012
Laparoscopic simulation training has proven to be effective in developing skills but requires expensive equipment, is a challenge to integrate into a work-hour restricted surgical residency, and may use nonoptimal practice schedules. The purpose of this study was to evaluate the efficacy of laparoscopic skills training at home using inexpensive trainer boxes.
Residents (n = 20, postgraduate years 1–5) enrolled in an institutional review board–approved laparoscopic skills training protocol. An instructional video was reviewed, and baseline testing was performed using the fundamentals of laparoscopic surgery (FLS) peg transfer and suturing tasks. Participants were randomized to home training with inexpensive, self-contained trainer boxes or to simulation center training using standard video trainers. Discretionary, goal-directed training of at least 1 hour per week was encouraged. A posttest and retention test were performed. Intragroup and intergroup comparisons as well as the relationship between the suture score and the total training sessions, the time in training, and attempts were studied.
Intragroup comparisons showed significant improvement from baseline to the posttest and the retention test. No differences were shown between the groups. The home-trained group practiced more, and the number of sessions correlated with suture retention score (r² = .54, P < .039).
Home training results in laparoscopic skill acquisition and retention. Home training was performed in a more distributed manner and trended toward improved skill retention.
Journal Article
The Effects of Violating Standard Item Writing Principles on Tests and Students: The Consequences of Using Flawed Test Items on Achievement Examinations in Medical Education
The purpose of this research was to study the effects of violations of standard multiple-choice item writing principles on test characteristics, student scores, and pass-fail outcomes. Four basic science examinations, administered to year-one and year-two medical students, were randomly selected for study. Test items were classified as either standard or flawed by three independent raters, blinded to all item performance data. Flawed test questions violated one or more standard principles of effective item writing. Thirty-six to sixty-five percent of the items on the four tests were flawed. Flawed items were 0-15 percentage points more difficult than standard items measuring the same construct. Over all four examinations, 646 (53%) students passed the standard items while 575 (47%) passed the flawed items. The median passing rate difference between flawed and standard items was 3.5 percentage points, but ranged from -1 to 35 percentage points. Item flaws had little effect on test score reliability or other psychometric quality indices. Results showed that flawed multiple-choice test items, which violate well established and evidence-based principles of effective item writing, disadvantage some medical students. Item flaws introduce the systematic error of construct-irrelevant variance to assessments, thereby reducing the validity evidence for examinations and penalizing some examinees.
Journal Article
Assessment in Health Professions Education
2009, 2008
Assessment in Health Professions Education provides comprehensive guidance for persons engaged in the teaching and testing of the health professions – medicine, dentistry, nursing, pharmacy and allied fields. Part I of the book provides a user-friendly introduction to assessment fundamentals and their theoretical underpinnings; Part II describes specific assessment methods used in the health professions, with a focus on best practices, assessment challenges, and practical guidelines for the effective implementation of successful assessment programs. Key features:
Comprehensive – the first text to provide broad, single-source coverage of all aspects of assessment in the health professions.
Accessible – while scholarly and evidence-based, the book is geared towards health professions educators who are not measurement specialists.
Thematic – assessment validity is an organizing theme and provides a conceptual framework throughout the book.
List of Figures
List of Tables
Preface, Steven M. Downing and Rachel Yudkowsky
Chapter-Specific Acknowledgments
1 Introduction to Assessment, Steven M. Downing and Rachel Yudkowsky
2 Validity, Steven M. Downing and Thomas M. Haladyna
3 Reliability, Rick D. Axelson and Clarence D. Kreiter
4 Generalizability Theory, Clarence D. Kreiter
5 Statistics of Testing, Steven M. Downing
6 Standard Setting, Rachel Yudkowsky, Steven M. Downing, and Ara Tekian
7 Written Tests: Constructed-Response and Selected-Response Formats, Steven M. Downing
8 Observational Assessment, William C. McGaghie, John Butter, and Marsha Kaye
9 Performance Tests, Rachel Yudkowsky
10 Simulations in Assessment, William C. McGaghie and S. Bary Issenberg
11 Oral Examinations, Ara Tekian and Rachel Yudkowsky
12 Assessment Portfolios, Ara Tekian and Rachel Yudkowsky
List of Contributors
Index
Teaching of evidence-based medicine to medical students in Mexico: a randomized controlled trial
by Schwartz, Alan; Kieffer-Escobar, Luis F; Sánchez-Mendiola, Melchor
in Aerospace medicine; Aerospace Medicine - education; Assessment and evaluation of admissions
2012
Background
Evidence-Based Medicine (EBM) is an important competency for the healthcare professional. Experimental evidence of EBM educational interventions from rigorous research studies is limited. The main objective of this study was to assess EBM learning (knowledge, attitudes and self-reported skills) in undergraduate medical students with a randomized controlled trial.
Methods
The educational intervention was a one-semester EBM course in the 5th year of a public medical school in Mexico. The study design was an experimental parallel-group randomized controlled trial for the main outcome measures in the 5th-year class (M5 EBM vs. M5 non-EBM groups), and quasi-experimental with static-group comparisons for the 4th-year (M4, not yet exposed) and 6th-year (M6, exposed 6 months to a year earlier) groups. EBM attitudes, knowledge and self-reported skills were measured using Taylor's questionnaire and a summative exam comprising a 100-item multiple-choice question (MCQ) test.
Results
289 medical students were assessed: M5 EBM=48, M5 non-EBM=47, M4=87, and M6=107. There was a higher reported use of the Cochrane Library and secondary journals in the intervention group (M5 vs. M5 non-EBM). Critical appraisal skills and attitude scores were higher in the intervention group (M5) and in the group of students exposed to EBM instruction during the previous year (M6). The knowledge level was higher after the intervention in the M5 EBM group compared to the M5 non-EBM group (p < 0.001, Cohen's d = 0.88 with Taylor's instrument and 3.54 with the 100-item MCQ test). M6 students that received the intervention in the previous year had a knowledge score higher than the M4 and M5 non-EBM groups, but lower than the M5 EBM group.
Conclusions
Formal medical student training in EBM produced higher scores in attitudes, knowledge and self-reported critical appraisal skills compared with a randomized control group. Data from the concurrent groups add validity evidence to the study, but rigorous follow-up needs to be done to document retention of EBM abilities.
Journal Article
Development and preliminary psychometric properties of a well-being index for medical students
by Szydlo, Daniel W; Sloan, Jeff A; Dyrbye, Liselotte N
in Demographic aspects; Education; Health aspects
2010
Background
Psychological distress is common among medical students but manifests in a variety of forms. Currently, no brief, practical tool exists to simultaneously evaluate these domains of distress among medical students. The authors describe the development of a subject-reported assessment (Medical Student Well-Being Index, MSWBI) intended to screen for medical student distress across a variety of domains and examine its preliminary psychometric properties.
Methods
Relevant domains of distress were identified, items generated, and a screening instrument formed using a process of literature review, nominal group technique, input from deans and medical students, and correlation analysis from previously administered assessments. Eleven experts judged the clarity, relevance, and representativeness of the items. A Content Validity Index (CVI) was calculated. Interrater agreement was assessed using pair-wise percent agreement adjusted for chance agreement. Data from 2248 medical students who completed the MSWBI along with validated full-length instruments assessing domains of interest was used to calculate reliability and explore internal structure validity.
Results
Burnout (emotional exhaustion and depersonalization), depression, mental quality of life (QOL), physical QOL, stress, and fatigue were the domains identified for inclusion in the MSWBI. Six of 7 items received item CVI-relevance and CVI-representativeness of ≥0.82. Overall scale CVI-relevance and CVI-representativeness were 0.94 and 0.91, respectively. Overall pair-wise percent agreement between raters was ≥85% for clarity, relevance, and representativeness. Cronbach's alpha was 0.68. Item-by-item pair-wise percent agreements and Phi were low, suggesting little overlap between items. The majority of MSWBI items had ≥74% sensitivity and specificity for detecting distress within the intended domain.
Conclusions
The results of this study provide evidence of reliability and content-related validity of the MSWBI. Further research is needed to assess remaining psychometric properties and establish scores for which intervention is warranted.
Journal Article
Validation of a Colonoscopy Simulation Model for Skills Assessment
by Schwartz, Alan J; Sedlack, Robert E; Baron, Todd H
in Animals; Biological and medical sciences; Cattle
2007
The purpose is to provide initial validation of a novel simulation model's fidelity and ability to assess competence in colonoscopy skills.
In a prospective, cross-sectional design, each of 39 endoscopists (13 staff, 13 second year fellows, and 13 novices) performed a colonoscopy on a novel bovine simulation model. Staff endoscopists also completed a survey examining different aspects of the model's realism as compared to human colonoscopy. The groups' simulation performances were compared. Additionally, individual performances were correlated to patient-based performance data.
Median model realism evaluation scores were favorable for nearly all parameters evaluated with mucosa appearance, endoscopic view, and paradoxical motion parameters receiving the highest scores. During simulation procedures, each group outperformed the less experienced groups in all parameters evaluated. Specifically, median cecal intubation times were: staff 226 s (IQR [interquartile range] 179-273), fellows 340 s (282-568), and novices 1,027 s (970-1,122) (P < 0.05). Median total procedure times on the model were: staff 468 s (416-501), fellows 527 s (459-824), and novices 1,350 s (1,318-1,428) (P < 0.05). Finally, individual cecal intubation times on the simulation model had a very high correlation to their respective patient-based times (r = 0.764).
Overall, this model possesses a favorable degree of realism and is able to easily differentiate users based on their level of colonoscopy experience. More impressive, however, is the strong correlation between individuals' simulated intubation times and actual patient-based colonoscopy data. In light of these findings, we speculate that this model has potential to be an effective tool for assessment of colonoscopic competence.
Journal Article
Quality control of an OSCE using generalizability theory and many-faceted Rasch measurement
by Myford, Carol M.; Downing, Steven M.; Iramaneerat, Cherdsak
in Adult; Chi-Square Distribution; Clinical Competence
2008
An Objective Structured Clinical Examination (OSCE) is an effective method for evaluating competencies. However, scores obtained from an OSCE are vulnerable to many potential measurement errors that cases, items, or standardized patients (SPs) can introduce. Monitoring these sources of errors is an important quality control mechanism to ensure valid interpretations of the scores. We describe how one can use generalizability theory (GT) and many-faceted Rasch measurement (MFRM) approaches in quality control monitoring of an OSCE. We examined the communication skills OSCE of 79 residents from one Midwestern university in the United States. Each resident performed six communication tasks with SPs, who rated the performance of each resident using 18 5-category rating scale items. We analyzed their ratings with generalizability and MFRM studies. The generalizability study revealed that the largest source of error variance besides the residual error variance was SPs/cases. The MFRM study identified specific SPs/cases and items that introduced measurement errors and suggested the nature of the errors. SPs/cases were significantly different in their levels of severity/difficulty. Two SPs gave inconsistent ratings, which suggested problems related to the ways they portrayed the case, their understanding of the rating scale, and/or the case content. SPs interpreted two of the items inconsistently, and the rating scales for two items did not function as 5-category scales. We concluded that generalizability and MFRM analyses provided useful complementary information for monitoring and improving the quality of an OSCE.
Journal Article
Objective structured clinical examinations as an assessment method in residency training: Practical considerations
by Downing, Steven M.; Hijazi, Mohammed
in Clinical Competence; Education, Medical, Graduate - standards; Humans
2008
Assessment of competence using traditional clinical examinations is limited by low reliability and validity. With its improved reliability and validity, the objective structured clinical examination (OSCE) is a more accurate means of assessing the clinical competence of residents.
Journal Article
Threats to the Validity of Locally Developed Multiple-Choice Tests in Medical Education: Construct-Irrelevant Variance and Construct Underrepresentation
2002
Construct-irrelevant variance (CIV) - the erroneous inflation or deflation of test scores due to certain types of uncontrolled or systematic measurement error - and construct underrepresentation (CUR) - the under-sampling of the achievement domain - are discussed as threats to the meaningful interpretation of scores from objective tests developed for local medical education use. Several sources of CIV and CUR are discussed and remedies are suggested. Test score inflation or deflation, due to the systematic measurement error introduced by CIV, may result from poorly crafted test questions, insecure test questions and other types of test irregularities, testwiseness, guessing, and test item bias. Using indefensible passing standards can interact with test scores to produce CIV. Sources of content underrepresentation are associated with tests that are too short to support legitimate inferences to the domain and which are composed of trivial questions written at low levels of the cognitive domain. "Teaching to the test" is another frequent contributor to CUR in examinations used in medical education. Most sources of CIV and CUR can be controlled or eliminated from the tests used at all levels of medical education, given proper training and support of the faculty who create these important examinations.
Journal Article