883 result(s) for "Computerized testing"
Test-takers’ adaptability to computerized language testing in China: taking Test for English Majors as an example
As information and communication technologies develop in China, language tests are shifting from conventional paper-and-pencil testing to computerized testing. This study investigates Chinese test-takers' adaptability to computerized language exams, including their performance across testing modes and their perception of the transition to computerized language tests. The study was conducted at three universities in China with 206 English language majors as participants. The research was designed as a mixed-methods study: participants' performance on the exams served as quantitative data, and their perceptions of the exams as qualitative data. Quantitative data were gathered through two rounds of language tests and a post-test questionnaire, while qualitative data were collected through interviews. No statistically significant difference was found in test-takers' scores between paper-and-pencil and computerized language testing, indicating that candidates in China were generally able to adapt to computer-based language tests. Scores from the language tests and the questionnaire showed that Chinese candidates adapted best to computerized listening tests, followed by reading and writing tests. Based on these findings, the challenges that the change of testing mode poses to Chinese test-takers are summarized, and suggestions are provided for teachers, language learners, and language test administrators in China. Although this work examines students' adaptability to computerized language testing specifically in China, the findings can offer insights for countries with similar contexts.
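The mode comparison at the heart of this study is a within-subject score comparison across two test administrations. A minimal sketch of how such a comparison might be run, assuming a simple paired t-test and using invented placeholder scores rather than the study's data:

```python
from scipy import stats
import numpy as np

# Hypothetical paired scores: each position is one test-taker's score on
# the paper-and-pencil and the computerized version of the test.
paper = np.array([72, 65, 80, 58, 91, 77])
computer = np.array([70, 68, 79, 60, 90, 78])

# Paired t-test on the two modes; a large p-value is consistent with the
# study's finding of no statistically significant mode difference.
t_stat, p_value = stats.ttest_rel(paper, computer)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```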
The Four-Parameter Logistic Item Response Theory Model As a Robust Method of Estimating Ability Despite Aberrant Responses
In computerized adaptive testing (CAT), aberrant responses such as careless errors and lucky guesses may introduce significant bias into ability estimation during the dynamic administration of test items. We investigated the robustness of the four-parameter logistic item response theory (4PL IRT; Barton & Lord, 1981) model in comparison with the three-parameter logistic (3PL) IRT model (Birnbaum, 1968), applying additional precision and efficiency measures to evaluate the 4PL model. We measured the precision of CAT in terms of estimation bias and the mean absolute difference (MAD) between estimated and actual abilities; an improvement in administrative efficiency is reflected in fewer items being required to satisfy the stopping rule. Our results indicate that the 4PL IRT model provides more efficient and robust ability estimation than the 3PL model.
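The abstract does not restate the model, but the 4PL extension is compact enough to show: relative to the 3PL model, it adds an upper asymptote d < 1 that caps the probability of a correct response, so a careless error by a high-ability examinee is less surprising and distorts the ability estimate less. A minimal sketch with illustrative parameter values (not taken from the study):

```python
import numpy as np

def irt_4pl(theta, a, b, c, d):
    """Probability of a correct response under the 4PL IRT model.

    theta: examinee ability; a: discrimination; b: difficulty;
    c: lower asymptote (guessing); d: upper asymptote (slipping).
    Setting d = 1 recovers the 3PL model.
    """
    return c + (d - c) / (1.0 + np.exp(-a * (theta - b)))

# Even a very able examinee (theta = 3) tops out near d = 0.98 rather
# than 1.0, which is how the model absorbs occasional careless errors.
print(irt_4pl(theta=3.0, a=1.5, b=0.0, c=0.2, d=0.98))
```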
A narrative review of data collection and analysis guidelines for comparative effectiveness research in chronic pain using patient-reported outcomes and electronic health records
Chronic pain is a widespread and complex set of conditions that are often difficult and expensive to treat. Comparative effectiveness research (CER) is an evolving research method that is useful in determining which treatments are most effective for medical conditions such as chronic pain. An underutilized mechanism for conducting CER in pain medicine involves combining patient-reported outcomes (PROs) with electronic health records (EHRs). Patient-reported pain and mental and physical health outcomes are increasingly collected during clinic visits, and these data can be linked to EHR data that are relevant to the treatment of a patient's pain, such as diagnoses, medications ordered, and medical comorbidities. When aggregated, this information forms a data repository that can be used for high-quality CER. This review provides a blueprint for conducting CER using PROs combined with EHRs. As an example, the University of Pittsburgh's patient outcomes repository for treatment is described. This system includes PROs collected via the Collaborative Health Outcomes Information Registry software and cross-linked data from the University of Pittsburgh Medical Center EHR. The requirements, best practice guidelines, statistical considerations, and caveats for performing CER with this type of data repository are also discussed.
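The repository design described above rests on one mechanical step: cross-linking clinic-collected PROs to EHR records on a shared patient identifier. A schematic sketch of that linkage, with hypothetical column names and values standing in for real repository fields:

```python
import pandas as pd

# Hypothetical PRO table: pain scores captured at clinic visits.
pros = pd.DataFrame({
    "patient_id": [101, 101, 102],
    "visit_date": pd.to_datetime(["2023-01-05", "2023-03-02", "2023-02-10"]),
    "pain_score": [7, 5, 4],
})

# Hypothetical EHR extract: diagnoses and medications per patient.
ehr = pd.DataFrame({
    "patient_id": [101, 102],
    "diagnosis": ["chronic low back pain", "fibromyalgia"],
    "medication": ["duloxetine", "pregabalin"],
})

# Linking on the shared identifier yields the kind of aggregated
# repository the review describes, ready for CER analyses.
repository = pros.merge(ehr, on="patient_id", how="left")
print(repository)
```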
Effect of aging on visual attention: Evidence from the Attention Network Test
We investigated the effects of aging on attentional functions using the Attention Network Test (ANT), which enables simultaneous testing of the alerting, orienting, and executive networks and their interactions. Participants were 38 young adults (mean age = 21.35 years) and 36 older adults (mean age = 71.17 years). Although the older adults exhibited a slower overall response, the three attentional functions showed different modulation according to age group and the trial block being completed. Older adults exhibited significant impairment in the alerting function, regardless of whether they were completing the first or second block of trials, whereas their executive function decreased significantly only in Block 2 owing to cognitive fatigue. Both age groups performed similarly for the orienting function. Future researchers should seek to further clarify the specificity of attention function with people aged over 70 years to address their attention disturbance.
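The ANT derives its three network scores from reaction-time contrasts between cue and flanker conditions; this is standard ANT scoring rather than anything specific to this study, and the millisecond values below are invented for illustration:

```python
# Mean reaction times (ms) per condition for one hypothetical participant.
rt = {
    "no_cue": 610, "double_cue": 580,       # alerting contrast
    "center_cue": 590, "spatial_cue": 555,  # orienting contrast
    "congruent": 560, "incongruent": 655,   # executive (conflict) contrast
}

alerting = rt["no_cue"] - rt["double_cue"]        # benefit of a temporal warning
orienting = rt["center_cue"] - rt["spatial_cue"]  # benefit of a spatial cue
executive = rt["incongruent"] - rt["congruent"]   # cost of resolving conflict

print(f"alerting = {alerting} ms, orienting = {orienting} ms, "
      f"executive = {executive} ms")
```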
Feasibility of PROMIS using computerized adaptive testing during inpatient rehabilitation
Background: Patient-reported outcomes have taken on increased significance in clinical settings. We aimed to evaluate the feasibility of administering patient-reported outcome measures by computerized adaptive testing (CAT) using a tablet computer with rehabilitation inpatients, to assess workload demands on staff, and to estimate the extent to which rehabilitation inpatients have elevated T-scores on six Patient-Reported Outcomes Measurement Information System® (PROMIS®) measures. Methods: Patients (N = 108) with stroke, spinal cord injury, traumatic brain injury, and other neurological disorders participated in this study. PROMIS computerized adaptive tests were administered via a web-based platform. Summary scores were calculated for six measures: Pain Interference, Sleep Disruption, Anxiety, Depression, Illness Impact Positive, and Illness Impact Negative. We calculated the percentage of patients with T-scores 2 standard deviations or more above the mean. Results: During the first phase, we collected data from 19 of 49 patients; of the remainder, 61% were unavailable or had cognitive or expressive language impairments. In the second phase, 40 of 59 patients completed the assessment. Mean PROMIS T-scores were in the low 50s, indicating average symptom levels, but 19–31% of patients had elevated T-scores indicating a need for clinical action. Conclusions: The study demonstrated that PROMIS assessment via CAT administration in an inpatient rehabilitation setting is feasible when a research staff member is present to help complete the assessment.
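PROMIS measures are reported on a T-score metric (population mean 50, SD 10), so the paper's "elevated" criterion of 2 standard deviations or more above the mean corresponds to T >= 70. A sketch of the percent-elevated calculation with hypothetical scores:

```python
import numpy as np

# Hypothetical PROMIS T-scores for one measure across a patient sample.
t_scores = np.array([48, 52, 71, 66, 74, 50, 58, 70, 45, 62])

# T-scores have mean 50 and SD 10, so "2 SD or greater above the mean"
# is T >= 70, a conventional severe-symptom cutoff.
elevated = t_scores >= 70
print(f"{100 * elevated.mean():.0f}% of patients have elevated T-scores")
```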
Prospective, Head-to-Head Study of Three Computerized Neurocognitive Assessment Tools (CNTs): Reliability and Validity for the Assessment of Sport-Related Concussion
Limited data exist comparing the performance of computerized neurocognitive tests (CNTs) for assessing sport-related concussion. We evaluated the reliability and validity of three CNTs (ANAM, Axon Sports/Cogstate Sport, and ImPACT) in a common sample. High school and collegiate athletes completed two CNTs each at baseline. Concussed (n=165) and matched non-injured control (n=166) subjects repeated testing within 24 hr and at 8, 15, and 45 days post-injury. Roughly a quarter of each CNT's indices had stability coefficients (M = 198-day interval) over .70. Group differences in performance were mostly moderate to large at 24 hr and small by day 8. The sensitivity of reliable change indices (RCIs) was best at 24 hr (67.8%, 60.3%, and 47.6% with one or more significant RCIs for ImPACT, Axon, and ANAM, respectively) but diminished to near the false-positive rates thereafter. Across time, the CNTs' sensitivities were highest in athletes who became asymptomatic within 1 day before neurocognitive testing but were similar to the tests' false-positive rates when athletes who became asymptomatic several days earlier were included. Test–retest reliability was similar among the three CNTs and below optimal standards for clinical use on many subtests. Analyses of group effect sizes, discrimination, and sensitivity and specificity suggested that the CNTs may add incrementally (beyond symptom scores) to the identification of clinical impairment within 24 hr of injury or within a short period after symptom resolution, but do not add significant value over symptom assessment later. The rapid clinical recovery course from concussion and modest stability probably jointly contribute to the limited signal-detection capabilities of neurocognitive tests outside a brief post-injury window. (JINS, 2016, 22, 24–37)
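A reliable change index flags retest scores that fall outside the change expected from measurement error and practice alone. The abstract does not give the exact formulation used, so the sketch below assumes the common Jacobson-Truax form with a mean practice-effect correction; all numbers are hypothetical:

```python
import math

def reliable_change_index(baseline, retest, sd_baseline, retest_r,
                          practice_effect=0.0):
    """RCI in the Jacobson-Truax form, corrected for mean practice effects.

    sd_baseline: SD of baseline scores in the reference sample.
    retest_r: test-retest reliability coefficient.
    An |RCI| beyond roughly 1.645 is a common 90% two-tailed criterion.
    """
    s_diff = sd_baseline * math.sqrt(2 * (1 - retest_r))
    return (retest - baseline - practice_effect) / s_diff

# Hypothetical athlete: baseline 95, post-injury 82, test SD 10, modest
# stability (r = .70), controls improving 2 points on retest.
print(reliable_change_index(95, 82, sd_baseline=10, retest_r=0.70,
                            practice_effect=2.0))  # about -1.94: reliable decline
```

Note how modest stability inflates s_diff: with r = .70, a 13-point drop is only just reliably detectable, which illustrates the abstract's point that low test-retest reliability limits signal detection.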
The validity of computerized Montreal cognitive assessment among aging people living with HIV: A pilot study
Background: As the population of aging people living with HIV (PWH) increases, many face neurocognitive problems. Cognitive assessment plays a crucial role as the initial step in the cognitive care of this population. We aimed to examine the validity of a tablet-based cognitive assessment tool against the traditional paper-based version among aging Thai PWH. Methods: PWH aged ≥ 50 years underwent cognitive assessment using the Thai-validated Montreal Cognitive Assessment (MoCA). Participants were randomly assigned to receive either the paper-based MoCA or the tablet-based MoCA (eMoCA) first; two weeks later, they returned to complete the alternate version. Pearson correlation was used to determine the strength of the relationship between the paper-based MoCA and eMoCA scores. Concordance correlation coefficients (CCC) were calculated, and a Bland-Altman plot was employed to determine the level of agreement between the two testing methods. Additionally, MoCA scores were compared between individuals with and without prior touchscreen tablet experience. Results: Among the 46 participants included in the analysis, 12 (26.1%) had experience using a touchscreen tablet. The score discrepancy between the two MoCA versions ranged from -8 to 6, with a mean (SD) difference of -1.33 (3.22). The Pearson correlation coefficient between the paper-based MoCA and the eMoCA was r = 0.54 (p = 0.001), with a concordance correlation coefficient of 0.47. The Bland-Altman plot showed 95% limits of agreement between -7.63 and 4.98. Among participants with prior touchscreen tablet experience, scores on the paper-based MoCA and the eMoCA were comparable; however, those without prior touchscreen experience scored significantly lower on the eMoCA than on the paper-based MoCA (mean difference -1.56, 95% CI -2.72 to -0.40). Conclusions: The eMoCA demonstrated moderate correlation with the paper-based MoCA, with prior touchscreen tablet experience significantly affecting the validity of the MoCA scores between the two versions. Clinicians should consider an individual's level of touchscreen experience before selecting the administration modality.
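Both agreement statistics reported here have closed forms: Lin's concordance correlation coefficient penalizes location and scale differences between methods on top of imperfect correlation, and the Bland-Altman limits of agreement are the mean difference ± 1.96 SD of the differences. A sketch with hypothetical paired scores (not the study's data):

```python
import numpy as np

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()              # population variances
    cov = ((x - mx) * (y - my)).mean()     # population covariance
    return 2 * cov / (vx + vy + (mx - my) ** 2)

# Hypothetical paired scores: paper-based MoCA vs tablet-based eMoCA.
paper = np.array([26, 24, 28, 22, 25, 27, 23, 29])
emoca = np.array([24, 25, 26, 20, 24, 27, 21, 28])

diff = emoca - paper
lo = diff.mean() - 1.96 * diff.std(ddof=1)
hi = diff.mean() + 1.96 * diff.std(ddof=1)
print(f"CCC = {lin_ccc(paper, emoca):.2f}")
print(f"Bland-Altman 95% limits of agreement: {lo:.2f} to {hi:.2f}")
```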
Monthly At-Home Computerized Cognitive Testing to Detect Diminished Practice Effects in Preclinical Alzheimer's Disease
Introduction: We investigated whether monthly assessments of a computerized cognitive composite (C3) could aid in detecting differences in practice effects (PE) in clinically unimpaired (CU) older adults, and whether diminished PE were associated with Alzheimer's disease (AD) biomarkers and annual cognitive decline. Materials and Methods: N = 114 CU participants (age 77.6 ± 5.0, 61% female, MMSE 29 ± 1.2) from the Harvard Aging Brain Study completed the self-administered C3 monthly, at home, on an iPad for over one year. At baseline, participants underwent in-clinic Preclinical Alzheimer's Cognitive Composite-5 (PACC5) testing, and a subsample (n = 72, age 77.8 ± 4.9, 59% female, MMSE 29 ± 1.3) had one-year follow-up in-clinic PACC5 testing available. Participants had undergone PiB-PET imaging (0.99 ± 1.6 years before the at-home baseline) and flortaucipir PET imaging (n = 105, 0.62 ± 1.1 years before the at-home baseline). Linear mixed models were used to investigate change over months on the C3, adjusting for age, sex, and years of education, and to extract individual covariate-adjusted slopes over the first three months. We investigated the association of 3-month C3 slopes with global amyloid burden and with tau deposition in eight predefined regions of interest, and conducted Receiver Operating Characteristic (ROC) analyses to examine how accurately 3-month C3 slopes could identify individuals who showed >0.10 SD annual decline on the PACC5. Results: Overall, individuals improved on all C3 measures over 12 months (β = 0.23, 95% CI [0.21, 0.25], p < .001), but improvement over the first 3 months was greatest (β = 0.68, 95% CI [0.59, 0.77], p < .001), suggesting stronger PE over initial repeated exposures. However, lower PE over 3 months were associated with more global amyloid burden (r = -.20, 95% CI [-0.38, -0.01], p = .049) and with tau deposition in the entorhinal cortex (r = -.38, 95% CI [-0.54, -0.19], p < .001) and inferior temporal lobe (r = -.23, 95% CI [-0.41, -0.02], p = .03). Three-month C3 slopes exhibited good discriminative ability to identify PACC5 decliners (AUC 0.91, 95% CI [0.84, 0.98]), better than baseline C3 (p < .001) and baseline PACC5 scores (p = .02). Conclusion: While PE are commonly observed among CU adults, diminished PE over monthly cognitive testing are associated with greater AD biomarker burden and cognitive decline. Our findings imply that unsupervised computerized testing using monthly retest paradigms can rapidly detect diminished PE indicative of future cognitive decline in preclinical AD.
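The analysis pipeline here, fitting a linear mixed model with per-person random slopes, extracting covariate-adjusted 3-month slopes, and then asking how well those slopes discriminate decliners via ROC analysis, can be sketched end to end. Everything below is synthetic: the column names, the data-generating process, and the decline labels are stand-ins, not the study's data or code:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, months = 60, 4  # participants; baseline plus the first three months

# Synthetic long-format data: repeated C3 scores with person-specific
# practice-effect slopes.
true_slope = rng.normal(0.2, 0.1, n)
df = pd.DataFrame({
    "pid": np.repeat(np.arange(n), months),
    "month": np.tile(np.arange(months), n),
    "age": np.repeat(rng.normal(77, 5, n), months),
})
df["c3"] = np.repeat(true_slope, months) * df["month"] + rng.normal(0, 0.1, len(df))

# Random-intercept, random-slope model adjusted for age; each person's
# slope is the fixed month effect plus their random slope deviation.
fit = smf.mixedlm("c3 ~ month + age", df, groups=df["pid"],
                  re_formula="~month").fit()
slopes = fit.fe_params["month"] + np.array(
    [fit.random_effects[i]["month"] for i in range(n)])

# Hypothetical decline labels: flatter practice effects should flag decliners.
decliner = (true_slope < 0.12).astype(int)
print("AUC:", roc_auc_score(decliner, -slopes))  # lower slope -> higher risk
```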
Examining the subjective fairness of at-home and online tests: Taking Duolingo English Test as an example
The Duolingo English Test (DET), a language proficiency test offered online, is becoming increasingly prevalent worldwide. A review of the existing literature indicates that the DET has been insufficiently examined, particularly with respect to fairness. Moreover, empirical test fairness research has mainly focused on the objective aspect of fairness and may have overlooked the importance of its subjective aspect, and fairness research on at-home tests lags far behind that on in-person tests. The current study therefore investigated DET fairness from test-takers' perspectives. A DET Fairness Questionnaire based on Kunnan's (2004) Test Fairness Framework (comprising validity, absence of bias, access, administration, and social consequences) was developed. Data were collected from 1,012 Chinese university students and processed through descriptive and factor analyses. The descriptive analyses revealed that test-takers perceived the DET to be fair overall; specifically, they perceived that they had equal access to the test but did not perceive the test as valid. The factor analyses showed that test-takers' perceptions of DET fairness (especially perceived validity and access) had a significant effect on their test performance. These findings suggest that subjective test fairness is an essential component that cannot be neglected in appraising an assessment, as it influences test-takers' performance, and that DET developers should strive to enhance the validity of the DET to provide a fairer testing environment for test-takers.
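The questionnaire analysis follows a standard exploratory factor analysis workflow: item-level Likert responses are reduced to a small number of latent fairness dimensions. A schematic sketch with simulated item data; the two-factor structure below is an illustrative stand-in for facets such as perceived validity and access, not the study's actual solution:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)

# Simulated responses: 200 respondents x 10 questionnaire items generated
# from two latent factors plus noise.
latent = rng.normal(size=(200, 2))
loadings = rng.uniform(0.4, 0.9, size=(2, 10))
items = latent @ loadings + rng.normal(scale=0.5, size=(200, 10))

# Exploratory factor analysis with varimax rotation; respondents' factor
# scores could then be related to test performance, as the study does.
fa = FactorAnalysis(n_components=2, rotation="varimax")
scores = fa.fit_transform(items)
print("Rotated loadings (factors x items):")
print(np.round(fa.components_, 2))
```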
Exploration of designing an automatic classifier for questions containing code snippets—A case study of Oracle SQL certification exam questions
This study uses Oracle SQL certification exam questions to explore the design of automatic classifiers for exam questions containing code snippets. Question classification assigns each question a class label drawn from the exam topics; with this classification, questions can be selected from the test bank according to the testing scope to assemble a more suitable test paper. Classifying questions containing code snippets is more challenging than classifying questions with general text descriptions. We use factorial experiments to identify the effects of the feature representation scheme and the machine learning method on the performance of the question classifiers. Our experimental results showed that the classifier with the TF-IDF scheme and a Logistic Regression model performed best on the weighted macro-average AUC and F1 performance indices, while the classifier with TF-IDF and a Support Vector Machine performed best on weighted macro-average Precision. Moreover, the feature representation scheme was the main factor affecting classifier performance, followed by the machine learning method, across all performance indices.
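The best-performing configuration reported above, TF-IDF features feeding a Logistic Regression classifier, maps directly onto a standard scikit-learn pipeline. A minimal sketch; the example questions and topic labels are invented, not drawn from the study's test bank:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical exam questions containing SQL snippets, labeled by topic.
questions = [
    "Which rows does SELECT name FROM emp WHERE dept_id IS NULL return?",
    "What does GROUP BY dept_id HAVING COUNT(*) > 5 filter?",
    "Explain the result of emp JOIN dept ON emp.dept_id = dept.id",
    "When does INSERT INTO emp (id, name) VALUES (1, 'A') fail?",
]
topics = ["restricting-rows", "aggregation", "joins", "dml"]

# TF-IDF features + Logistic Regression: the combination the study found
# best on weighted macro-average AUC and F1.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(questions, topics)
print(clf.predict(["What does SELECT COUNT(*) FROM emp GROUP BY dept_id return?"]))
```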