Catalogue Search | MBRL

Model-selection tests for conditional moment restriction models

by Shi, Xiaoxia , Hsu, Yu-Chin in Asymptotic size , Averages , Candidates

2017

We propose a Vuong-type model-selection test for models defined by conditional moment restrictions. The moment restrictions that define the models can be standard equality restrictions that point-identify the model parameters, or moment equality or inequality restrictions that partially identify the model parameters. The test uses a new average generalized empirical likelihood criterion function designed to incorporate full restriction of the conditional model. We also introduce a new adjustment to the test statistic that makes it asymptotically pivotal whether the candidate models are nested or non-nested. The test uses simple standard normal critical values and is shown to be asymptotically similar, to be consistent against all fixed alternatives, and to have non-trivial power against n^{-1/2}-local alternatives. Monte Carlo simulations demonstrate that the finite sample performance of the test is in accordance with the theoretical prediction.

Journal Article

Contrasting test selection, prioritization, and batch testing at scale

by Adams, Bram , Rigby, Peter C. , Fallahzadeh, Emad in Compilers , Computer Science , Interpreters

2025

The effectiveness of software testing is crucial for successful software releases, and various test optimization techniques aim to enhance this process by reducing the number of test executions or prioritizing potential test failures. Although different families of techniques exist, each with its own evaluation criteria, few studies have compared these different lines of research. This study addresses this gap by empirically comparing Yaraghi et al.’s test prioritization approach, Zhu et al.’s cross-build test prioritization and its equivalent test selection technique, and our BatchAll test batching algorithm. To evaluate these test optimization approaches, we empirically analyze millions of test results from Google Chrome, along with pre- and post-commit test outcomes for a Google project, as well as the JMRI Travis CI dataset. Findings reveal that test selection can reduce actual median feedback time by up to 96% with the same number of machines but may miss up to 55% of failures. In contrast, batching achieves up to a 99% reduction in feedback time without missing any failures. Test selection cuts machine usage by up to 66%, while batching achieves up to an 88% reduction. For failure detection, the test selection is up to 62 minutes faster than the baseline, and the batching algorithm achieves up to a 63-minute median improvement without missing failures. Regarding test execution time, test selection saves up to 66%, whereas batching’s saving can reach up to 98%, although its performance varies based on the machines used. The studied test prioritization algorithms significantly underperform compared to the test selection and batching algorithms. In conclusion, this study provides practical recommendations for selecting appropriate test optimization algorithms based on the testing environment and failure loss tolerance.

Journal Article

Different Tests, Different Answers: The Stability of Teacher Value-Added Estimates Across Outcome Measures

by Papay, John P. in Academic Achievement , Achievement Tests , Correlation

2011

Recently, educational researchers and practitioners have turned to value-added models to evaluate teacher performance. Although value-added estimates depend on the assessment used to measure student achievement, the importance of outcome selection has received scant attention in the literature. Using data from a large, urban school district, I examine whether value-added estimates from three separate reading achievement tests provide similar answers about teacher performance. I find moderate-sized rank correlations, ranging from 0.15 to 0.58, between the estimates derived from different tests. Although the tests vary to some degree in content, scaling, and sample of students, these factors do not explain the differences in teacher effects. Instead, test timing and measurement error contribute substantially to the instability of value-added estimates across tests.

Journal Article

Comparative study of machine learning test case prioritization for continuous integration testing

by Marijan, Dusica in Classifiers , Comparative studies , Fault diagnosis

2023

There is a growing body of research indicating the potential of machine learning to tackle complex software testing challenges. One such challenge pertains to continuous integration testing, which is highly time-constrained, and generates a large amount of data coming from iterative code commits and test runs. In such a setting, we can use plentiful test data for training machine learning predictors to identify test cases able to speed up the detection of regression bugs introduced during code integration. However, different machine learning models can have different fault prediction performance depending on the context and the parameters of continuous integration testing, for example, variable time budget available for continuous integration cycles, or the size of test execution history used for learning to prioritize failing test cases. Existing studies on test case prioritization rarely study both of these factors, which are essential for the continuous integration practice. In this study, we perform a comprehensive comparison of the fault prediction performance of machine learning approaches that have shown the best performance on test case prioritization tasks in the literature. We evaluate the accuracy of the classifiers in predicting fault-detecting tests for different values of the continuous integration time budget and with different lengths of test history used for training the classifiers. In evaluation, we use real-world and augmented industrial datasets from a continuous integration practice. The results show that different machine learning models have different performance for different size of test history used for model training and for different time budgets available for test case execution. Our results imply that machine learning approaches for test prioritization in continuous integration testing should be carefully configured to achieve optimal performance.

Journal Article

Assessing validity of agility and change of direction tests in soccer players: A factor analysis

by Salazar-Rojas, Walter , Pino-Ortega, José , Carvajal-Espinosa, Rafael in Athletes , Body Composition , Cognition & reasoning

2026

Introduction: This study investigated the construct validity of commonly used change of direction and agility tests in elite soccer players through exploratory and confirmatory factor analyses. Change of direction tasks involve preplanned movement patterns, whereas agility tasks require reactive responses to external stimuli, indicating greater perceptual-cognitive demand and a more complex integration of motor and decision-making processes. Despite this conceptual distinction, both constructs are frequently assessed using similar field-based tests, raising questions about their discriminant and construct validity. Methods: Sixty-six high-performance soccer players completed a comprehensive testing battery consisting of four change of direction tests, four agility tests, and three linear sprint tests (5, 10, and 15 m). All tests were administered under standardized conditions. Data were analyzed using exploratory factor analysis to identify underlying latent structures, followed by confirmatory factor analysis to evaluate model fit and construct coherence. Results: Exploratory factor analysis identified four factors: Speed, change of direction, T-test, and agility. Speed and change of direction tests demonstrated strong and distinct factor loadings, supporting their validity as independent physical performance constructs. In contrast, agility tests did not consistently load onto a single factor, with only the Y-test demonstrating a meaningful loading. Confirmatory factor analysis supported these findings, indicating good model fit for speed and change of direction constructs, but poor fit for agility. Discussion: These findings suggest that commonly used agility tests may not adequately capture agility as a unified construct. The results highlight the need for more ecologically valid and cognitively demanding assessments that better reflect perception-action coupling. 
From an applied perspective, these limitations may affect test selection and neuromuscular monitoring strategies within soccer injury risk management frameworks. Overall, this study reinforces the importance of construct validation in performance testing and calls for a critical reassessment of current agility assessment tools in elite soccer.

Journal Article

La escuela activa y su papel en promover la salud de los niños en edad escolar más jóvenes (Active school and its role in promoting health of younger school-age children)

by Bendíková, Elena , Sagat, Peter in Children & youth , Exercise , Health Related Fitness

2024

Background: A recent study has indicated a decline in the level of health-related fitness among school-age children. Active school offers and pursues various strategies to encourage children to engage in physical activity during the school day, with the objective of improving their behavioural, physical performance and literacy outcomes, among other benefits. The objective of the research was to examine the impact of a comprehensive movement programme implemented within a physically active school on the selected factors of general physical performance and posture as a manifestation of fitness in non-exercising pupils. Methods: A total of 25 school-age children, aged between six and seven years, participated in the study on a voluntary basis. The participants were divided into two groups, with six girls and seven boys in each. The pupils’ physical fitness was evaluated through the administration of selected standardized tests, including the 4 x 10 m shuttle run, sit-ups in 60 seconds, standing long jump, bent-arm hangs, and a 20-metre multistage endurance shuttle run. The pupils’ posture was assessed and classified using the standardised Klein and Thomas method, modified by Mayer. The exercise programme was conducted over a six-month period. The intervention was implemented on two occasions per day, once during a break and once during a lesson in the form of an exercise break, five times per week, and once per week through an after-school activity for 45 minutes. Results: In the study sample, there was a significant improvement in scores between the entry and exit points for both girls and boys in each test. A comparison of the sexes revealed that boys exhibited superior (p < 0.01) values in all the selected tests compared to girls. With regard to overall posture, we observed an improvement in both genders (p < 0.01), with no gender-based differences.
Conclusion: The results of this study confirm that the chosen tactic of progression of physical activity, the programme for non-exercising/inactive students of younger school age, was an effective way of improving the physical, health-related fitness and overall posture of the participants, as well as motivating and creating interest in physical activities.

Journal Article

Test Points Selection for Analog Fault Dictionary Techniques

by Long, Bing , Yang, Chenglin , Tian, Shulin in Algorithms , Analog , CAE) and Design

2009

The test points selection problem for analog fault dictionaries has been studied extensively in the literature. Various test points selection strategies and criteria for the Integer-Coded fault wise table are described and compared in this paper. Firstly, the construction method of the Integer-Coded fault wise table for an analog fault dictionary is described. Secondly, theory and algorithms associated with these strategies and criteria are reviewed. Thirdly, the time complexity and solution accuracy of existing algorithms are analyzed and compared. Then, a more accurate test points selection strategy is proposed based on the existing strategies. Finally, statistical experiments are carried out, and the accuracy and efficiency of different strategies and criteria are compared in a set of comparative tables and figures. The theoretical analysis and statistical experimental results given in this paper can provide guidance for coding an efficient and accurate test points selection algorithm.

Journal Article

The Brain-to-Brain Loop Concept for Laboratory Testing 40 Years After Its Introduction

by Laposata, Michael , Plebani, Mario , Lundberg, George D. in Biological and medical sciences , Clinical Laboratory Techniques - standards , Diagnostic Errors - prevention & control

2011

Forty years ago, Lundberg introduced the concept of the brain-to-brain loop for laboratory testing. In this concept, in the brain of the physician caring for the patient, the first step involves the selection of laboratory tests and the final step is the transmission of the test result to the ordering physician. There are many intermediary steps, some of which are preanalytic, ie, before performance of the test; some are analytic and relate to the actual performance of the test; and others are postanalytic and involve transmission of test results into the medical record. The introduction of this concept led to a system to identify and classify errors associated with laboratory test performance. Errors have since been considered as preanalytic, analytic, and postanalytic. During the past 4 decades, changes in medical practice have significantly altered the brain-to-brain loop for laboratory testing. This review describes the changes and their implications for analysis of errors associated with laboratory testing.

Journal Article

Optimization of the performance test length for the Sella Italiano stallions

by Silvestrelli, Maurizio , Sarti, Francesca Maria , Pieramati, Camillo in Saddle horse, Length of test, Station test, Selection

2013

The Sella Italiano stallions are selected by a 100-day performance test, but many breeders' organisations use station tests of shorter length. In twelve editions, 17,394 ten-point scores both for character and gaits, and 6291 scores for jumping were assigned to 314 candidate stallions. By means of the comparison of univariate EBVs, calculated from the complete dataset or from the reduced subsets, a dataset lasting only from the 33rd day to the 78th day of the training period was selected: it included less than 70% of the original data. The subset was validated by 3-trait AM BLUP: there were no evident effects on the estimates of variance component or ratio, and the aggregated selection index for the three traits showed a 99% rank correlation with the official index. These findings demonstrate that the training period could be reduced without affecting the genetic progress in the Sella Italiano breed.

Journal Article

Reducing misdiagnoses and cognitive errors using virtual patients and automated feedback in a clinical reasoning curriculum

by Lee, Chel Hee , Smith, David , Rogers, Suzanne in Accuracy , Algorithms , Allied Health Occupations Education

2026

Introduction: Diagnostic errors remain prevalent across all specialties, driven largely by deficits in clinical reasoning (CR). Although CR is a core competency, most medical schools lack structured pre-clerkship CR training. Virtual patients (VPs) with automated feedback offer scalable, simulation-based training to improve diagnostic skills and reduce faculty workload. The aim of this study was to assess whether a CR curriculum using VPs with automated scoring and deliberate practice improves diagnostic accuracy and CR. Methods: We conducted a multi-site observational study across five North American medical schools. First- and second-year students completed up to 20 diagnostic VP cases on TeachingMedicine.com, each with automated scoring to inform individualized feedback. We analyzed 1.55 million datapoints from 12,400 cases completed by 1,066 students to assess differences in CR performance between correctly and incorrectly diagnosed cases, associations between CR components and diagnostic accuracy, and learning gains over time. Results: Misdiagnoses occurred in 20.1% of cases. Correct diagnoses were associated with higher diagnostic justification (DxJ) scores (+50%), better test ordering (+51%), and fewer cognitive errors (–89%). Multivariate analysis identified DxJ and cognitive errors as the strongest predictors of diagnostic accuracy. With repeated practice, students improved DxJ by 72%, test ordering by 40%, and reduced misdiagnoses threefold and cognitive errors by half, with no plateau observed after 20 cases. By end of pre-clerkship, first-year students who completed 20 cases outperformed second-year students who completed 10 in all CR metrics. All results were statistically significant with p < 0.0001. Conclusions: This curriculum shows that CR skills are highly trainable through deliberate practice. Improved DxJ and reduced cognitive errors are strongly associated with lower misdiagnosis rates.
In contrast to a common misperception, training CR diagnostic skills is successful when started in the beginning of 1st year medical school prior to students’ acquisition of significant medical knowledge.

Journal Article
