Catalogue Search | MBRL
Explore the vast range of titles available.
113,431 result(s) for "scoring"
Your credit score : how to improve the 3-digit number that shapes your financial future
"Improve your credit score, for real, with the #1 best-selling guide you can trust! Today, a good credit score is essential for getting credit, getting a job, even getting car insurance or a cellphone. Now, best selling journalist Liz Pulliam Weston has thoroughly updated her top-selling guide to credit scores, with crucial new information for protecting (or rebuilding) yours. Weston thoroughly covers brand-new laws and rules surrounding credit scoring - including some surprising good news and some frightening new risks." -- Publisher annotation.
Machine Learning and Hebrew NLP for Automated Assessment of Open-Ended Questions in Biology
by Ariely, Moriah; Nazaretsky, Tanya; Alexandron, Giora
in Algorithms, Artificial Intelligence, Automation
2023
Machine learning algorithms that automatically score scientific explanations can be used to measure students’ conceptual understanding, identify gaps in their reasoning, and provide them with timely and individualized feedback. This paper presents the results of a study that uses Hebrew NLP to automatically score student explanations in Biology according to fine-grained analytic grading rubrics that were developed for formative assessment. The experimental results show that our algorithms achieve a high level of agreement with human experts, on par with previous work on automated assessment of scientific explanations in English, and that ~500 examples are typically enough to build reliable scoring models. The main contribution is twofold. First, we present a conceptual framework for constructing analytic grading rubrics for scientific explanations, which are composed of dichotomous categories that generalize across items. These categories are designed to support automated guidance, but can also be used to provide a composite score. Second, we apply this approach in a new context – Hebrew, which belongs to a group of languages known as Morphologically-Rich. In languages of this group, among them also Arabic and Turkish, each input token may consist of multiple lexical and functional units, making them particularly challenging for NLP. This is the first study on automatic assessment of scientific explanations (and more generally, of open-ended questions) in Hebrew, and among the first to do so in Morphologically-Rich Languages.
Journal Article
Credit intelligence : boosting your credit smarts
The authors "provide you with a roadmap to credit intelligence by sharing their shopping adventures and lessons learned about credit as 'Olympic level' shoppers who have fallen into and pulled each other out of many of the traps and pitfalls surrounding the use of credit and the behavioral buying manipulations by retailers"--Amazon.com.
From the automated assessment of student essay content to highly informative feedback
by Gombert, Sebastian; Di Mitri, Daniele; Frey, Andreas
in Artificial Intelligence, Automation, Case studies
2024
Various studies empirically proved the value of highly informative feedback for enhancing learner success. However, digital educational technology has yet to catch up as automated feedback is often provided shallowly. This paper presents a case study on implementing a pipeline that provides German-speaking university students enrolled in an introductory-level educational psychology lecture with content-specific feedback for a lecture assignment. In the assignment, students have to discuss the usefulness and educational grounding (i.e., connection to working memory, metacognition or motivation) of ten learning tips presented in a video within essays. Through our system, students received feedback on the correctness of their solutions and content areas they needed to improve. For this purpose, we implemented a natural language processing pipeline with two steps: (1) segmenting the essays and (2) predicting codes from the resulting segments used to generate feedback texts. As training data for the model in each processing step, we used 689 manually labelled essays submitted by the previous student cohort. We then evaluated approaches based on GBERT, T5, and bag-of-words baselines for scoring them. Both pipeline steps, especially the transformer-based models, demonstrated high performance. In the final step, we evaluated the feedback using a randomised controlled trial. The control group received feedback as usual (essential feedback), while the treatment group received highly informative feedback based on the natural language processing pipeline. We then used a six items long survey to test the perception of feedback. We conducted an ordinary least squares analysis to model these items as dependent variables, which showed that highly informative feedback had positive effects on helpfulness and reflection. (DIPF/Orig.).
Journal Article
Automated Essay Scoring and the Deep Learning Black Box: How Are Rubric Scores Determined?
by Kumar, Vivekanandan S.; Boulanger, David
in A festschrift in honour of Jim Greer, Agreements, Algorithms
2021
This article investigates the feasibility of using automated scoring methods to evaluate the quality of student-written essays. In 2012, Kaggle hosted an Automated Student Assessment Prize contest to find effective solutions to automated testing and grading. This article: a) analyzes the datasets from the contest – which contained hand-graded essays – to measure their suitability for developing competent automated grading tools; b) evaluates the potential for deep learning in automated essay scoring (AES) to produce sophisticated testing and grading algorithms; c) advocates for thorough and transparent performance reports on AES research, which will facilitate fairer comparisons among various AES systems and permit study replication; d) uses both deep neural networks and state-of-the-art NLP tools to predict finer-grained rubric scores, to illustrate how rubric scores are determined from a linguistic perspective, and to uncover important features of an effective rubric scoring model. This study’s findings first highlight the level of agreement that exists between two human raters for each rubric as captured in the investigated essay dataset, that is, 0.60 on average as measured by the quadratic weighted kappa (QWK). Only one related study has been found in the literature which also performed rubric score predictions through models trained on the same dataset. At best, the predictive models had an average agreement level (QWK) of 0.53 with the human raters, below the level of agreement among human raters. In contrast, this research’s findings report an average agreement level per rubric with the two human raters’ resolved scores of 0.72 (QWK), well beyond the agreement level between the two human raters. 
Further, the AES system proposed in this article predicts holistic essay scores through its predicted rubric scores and produces a QWK of 0.78, a competitive performance according to recent literature where cutting-edge AES tools generate agreement levels between 0.77 and 0.81, results computed as per the same procedure as in this article. This study’s AES system goes one step further toward interpretability and the provision of high-level explanations to justify the predicted holistic and rubric scores. It contends that predicting rubric scores is essential to automated essay scoring, because it reveals the reasoning behind AIED-based AES systems. Will building AIED accountability improve the trustworthiness of the formative feedback generated by AES? Will AIED-empowered AES systems thoroughly mimic, or even outperform, a competent human rater? Will such machine-grading systems be subjected to verification by human raters, thus paving the way for a human-in-the-loop assessment mechanism? Will trust in new generations of AES systems be improved with the addition of models that explain the inner workings of a deep learning black box? This study seeks to expand these horizons of AES to make the technique practical, explainable, and trustable.
Journal Article
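The abstract above reports agreement as quadratic weighted kappa (QWK). As a rough illustration of how that statistic is usually computed, here is a minimal sketch in plain Python. The function name, the integer rating scale, and the sample ratings are assumptions made for illustration; this is not the authors' code or data.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Agreement between two lists of integer ratings, penalizing
    disagreements by the squared distance between the two ratings."""
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("need at least two rating categories")
    total = len(rater_a)

    # Observed confusion matrix: rows are rater A, columns are rater B.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    # Marginal histograms give the agreement expected by chance.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            weight = ((i - j) ** 2) / ((n - 1) ** 2)
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * hist_a[i] * hist_b[j] / total

    if weighted_expected == 0:
        # Both raters used one identical category throughout.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings on a 1-4 rubric scale.
perfect = quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4], 1, 4)
opposite = quadratic_weighted_kappa([1, 1, 2, 2], [2, 2, 1, 1], 1, 2)
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.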
Después de marcar el primer gol, ¿el equipo es más vulnerable a sufrir el empate poco después? Un análisis del Campeonato Brasileño de fútbol Serie A entre 2011 y 2021
by Suhey Salim Ferreira dos Santos; Leandro Batista Cordeiro; Jonatas Ferreira da Silva Santos
in Scoring, Soccer
2023
The aim of the present study was to investigate whether, after scoring a goal, a team becomes more vulnerable to conceding a goal soon after. A total of 518 results were collected and included in the study data. The chi-square test of independence was used to analyze the data. Adjusted residuals were examined, and all values outside the range of -1.96 to 1.96 were considered. All analyses were performed using α = 5%. The equalizing goal occurred in the first quarter in 109 matches (21.0%), in the second quarter in 156 matches (30.1%), in the third quarter in 127 matches (24.5%), and in the fourth quarter in 126 matches (24.3%), totaling 518 (100%) matches, but there was no association between season and the moment of the equalizing goal (χ² (30) = 28.196, p>0.05; Cramer’s V: 0.135). The data collected showed that the equalizing goal can occur at different moments of the game, not only right after the first goal is scored.
Journal Article
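The figures quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the quarter percentages and Cramér's V from the reported chi-square statistic. The table shape of 11 seasons by 4 quarters is an assumption: it is inferred from the 2011 to 2021 period and is consistent with the reported 30 degrees of freedom, but the abstract does not state it. The helper name is hypothetical.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V effect size for an n_rows x n_cols contingency table."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))


# Match counts per quarter, as reported in the abstract.
quarter_counts = {"first": 109, "second": 156, "third": 127, "fourth": 126}
total = sum(quarter_counts.values())
shares = {q: round(100 * c / total, 1) for q, c in quarter_counts.items()}

# Assumed table shape: 11 seasons x 4 quarters, so df = (11 - 1) * (4 - 1) = 30.
degrees_of_freedom = (11 - 1) * (4 - 1)
v = cramers_v(28.196, total, n_rows=11, n_cols=4)

print(total, shares, degrees_of_freedom, round(v, 3))
```

Under that assumption the recomputed value rounds to the 0.135 given in the abstract.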
Qualitative data : an introduction to coding and analysis
by Silverstein, Louise B; Auerbach, Carl
in Methodology, PSYCHOLOGY, Psychology -- Research -- Methodology
2003
Qualitative Data is meant for the novice researcher who needs guidance on what specifically to do when faced with a sea of information. It takes readers through the qualitative research process, beginning with an examination of the basic philosophy of qualitative research, and ending with planning and carrying out a qualitative research study. It provides an explicit, step-by-step procedure that will take the researcher from the raw text of interview data through data analysis and theory construction to the creation of a publishable work.
The volume provides actual examples based on the authors' own work, including two published pieces in the appendix, so that readers can follow examples for each step of the process, from the project's inception to its finished product. The volume also includes an appendix explaining how to implement these data analysis procedures using NVIVO, a qualitative data analysis program.
The Visual Scoring of Sleep in Infants 0 to 2 Months of Age
by Grigg-Damberger, Madeleine M.
in Electroencephalography, Electroencephalography - statistics & numerical data, Eye movements
2016
In March 2014, the American Academy of Sleep Medicine (AASM) Board of Directors requested the Scoring Manual Editorial Board develop rules, terminology, and technical specifications for scoring sleep/wake states in full-term infants from birth to 2 mo of age, cognizant of the 1971 Anders, Emde, and Parmelee Manual for Scoring Sleep in Newborns. On July 1, 2015, the AASM published rules for scoring sleep in infants, ages 0–2 mo. This evidence-based review summarizes the background information provided to the Scoring Manual Editorial Board to write these rules. The Anders Manual only provided criteria for coding physiological and behavioral state characteristics in polysomnograms (PSG) of infants, leaving specific sleep scoring criteria to the individual investigator. Other infant scoring criteria have been published, none widely accepted or used.
The AASM Scoring Manual infant scoring criteria incorporate modern concepts, digital PSG recording techniques, practicalities, and compromises. Important tenets are: (1) sleep/wake should be scored in 30-sec epochs as either wakefulness (W), rapid eye movement, REM (R), nonrapid eye movement, NREM (N) and transitional (T) sleep; (2) an electroencephalographic (EEG) montage that permits adequate display of young infant EEG is: F3-M2, F4-M1, C3-M2, C4-M1, O1-M2, O2-M1; additionally, recording C3-Cz, Cz-C4 help detect early and asynchronous sleep spindles; (3) sleep onsets are more often R sleep until 2–3 mo postterm; (4) drowsiness is best characterized by visual observation (supplemented by later video review); (5) wide open eyes is the most crucial determinant of W; (6) regularity (or irregularity) of respiration is the single most useful PSG characteristic for scoring sleep stages at this age; (7) trace alternant (TA) is the only relatively distinctive EEG pattern, characteristic of N sleep, and usually disappears by 1 mo postterm replaced by high voltage slow (HVS); (8) sleep spindles first appear 44–48 w conceptional age (CA) and when present prompt scoring N; (9) score EEG activity in an epoch as “continuous” or “discontinuous” for inter-scorer reliability; (10) score R if four or more of the following conditions are present, including irregular respiration and rapid eye movement(s): (a) low chin EMG (for the majority of the epoch); (b) eyes closed with at least one rapid eye movement (concurrent with low chin tone); (c) irregular respiration; (d) mouthing, sucking, twitches, or brief head movements; and (e) EEG exhibits a continuous pattern without sleep spindles; (11) because rapid eye movements may not be seen on every page, epochs following an epoch of definite R in the absence of rapid eye movements may be scored if the EEG is continuous without TA or sleep spindles, chin muscle tone low for the majority of the epoch; and there is no intervening arousal; 
(12) Score N if four or more of the following conditions are present, including regular respiration, for the majority of the epoch: (a) eyes are closed with no eye movements; (b) chin EMG tone present; (c) regular respiration; and (d) EEG patterns of either TA, HVS, or sleep spindles are present; and (13) score T sleep if an epoch contains two or more discordant PSG state characteristics (either three NREM and two REM characteristics or two NREM and three REM characteristics).
These criteria for ages 0–2 mo represent far more than baby steps. Like all the other AASM Manual rules and specifications none are fixed in stone, all open for debate, discussion and revision with the fundamental goal to provide standards for comparison of methods and results.
Commentary:
A commentary on this article appears in this issue on page 291.
Citation:
Grigg-Damberger MM. The visual scoring of sleep in infants 0 to 2 months of age. J Clin Sleep Med 2016;12(3):429–445.
Journal Article
HIGHER ORDER ELICITABILITY AND OSBAND'S PRINCIPLE
by Fissler, Tobias; Ziegel, Johanna F.
in Comparative analysis, Distribution functions, Financial risk
2016
A statistical functional, such as the mean or the median, is called elicitable if there is a scoring function or loss function such that the correct forecast of the functional is the unique minimizer of the expected score. Such scoring functions are called strictly consistent for the functional. The elicitability of a functional opens the possibility to compare competing forecasts and to rank them in terms of their realized scores. In this paper, we explore the notion of elicitability for multi-dimensional functionals and give both necessary and sufficient conditions for strictly consistent scoring functions. We cover the case of functionals with elicitable components, but we also show that one-dimensional functionals that are not elicitable can be a component of a higher order elicitable functional. In the case of the variance, this is a known result. However, an important result of this paper is that spectral risk measures with a spectral measure with finite support are jointly elicitable if one adds the "correct" quantiles. A direct consequence of applied interest is that the pair (Value at Risk, Expected Shortfall) is jointly elicitable under mild conditions that are usually fulfilled in risk management applications.
Journal Article
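The definition of elicitability in the abstract above can be illustrated with the two functionals it names. The sketch below uses two standard textbook facts: squared error is strictly consistent for the mean, and absolute error is consistent for the median. It is a toy check on a made-up sample with a hypothetical helper name, and it does not reproduce anything from the paper itself.

```python
import statistics


def expected_score(score, forecast, outcomes):
    """Average score of a point forecast over the observed outcomes."""
    return sum(score(forecast, y) for y in outcomes) / len(outcomes)


def squared_error(x, y):
    return (x - y) ** 2


def absolute_error(x, y):
    return abs(x - y)


# Made-up sample, skewed so that the mean and the median differ.
outcomes = [1.0, 2.0, 2.0, 3.0, 10.0]

# Candidate forecasts on a grid from 0.0 to 10.0 in steps of 0.1.
candidates = [i / 10 for i in range(101)]

best_squared = min(
    candidates, key=lambda x: expected_score(squared_error, x, outcomes)
)
best_absolute = min(
    candidates, key=lambda x: expected_score(absolute_error, x, outcomes)
)

print(best_squared, statistics.mean(outcomes))     # minimizer vs. sample mean
print(best_absolute, statistics.median(outcomes))  # minimizer vs. sample median
```

Each scoring function singles out a different functional of the same sample, which is why competing forecasts should be ranked with a score that is consistent for the quantity actually being forecast.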
Automated Scoring of Chinese Grades 7–9 Students’ Competence in Interpreting and Arguing from Evidence
2021
Assessing scientific argumentation is one of the main challenges in science education. Constructed-response (CR) items can be used to measure the coherence of student ideas and inform science instruction on argumentation. Published research on automated scoring of CR items has been conducted mostly in English writing, rarely in other languages. The objective of this study is to investigate issues related to the automated scoring of Chinese written responses. LightSIDE was used to score students’ written responses in Chinese. The sample of this study was from Beijing (grades 7–9) consisting of 4000 students. Items for assessing interpreting data and making claims under an ecological topic developed by the Stanford NGSS Assessment Project were translated into Chinese and used to assess student competence of interpreting data and making claims. The results show that: (1) at least 800 human-scored student responses were needed as the training sample size to accurately build scoring models. When doubling the training sample size, the accuracy in kappa increased only slightly by 0.03–0.04; (2) there was a nearly perfect agreement between human scoring and computer-automated scoring based on both holistic scores and analytic scores, although analytic scores produced slightly better accuracy than holistic scores; (3) automated scoring accuracy did not differ substantially by student response length, although shorter text length produced slightly higher human-machine agreement. The above findings suggest that automated scoring of Chinese writings produced a similar level of accuracy compared with that of English writings reported in the literature, although there are specific considerations, e.g., training data set size, scoring rubric, and text lengths, to be considered using automated scoring of student written responses in Chinese.
Journal Article