Catalogue Search | MBRL

A new model of decision processing in instrumental learning tasks

by Heathcote, Andrew , Trutti, Anne C , Forstmann, Birte U in Adult , Behavior , Cognition & reasoning

2021

Learning and decision-making are interactive processes, yet cognitive modeling of error-driven learning and decision-making has largely evolved separately. Recently, evidence accumulation models (EAMs) of decision-making and reinforcement learning (RL) models of error-driven learning have been combined into joint RL-EAMs that can in principle address these interactions. However, we show that the most commonly used combination, based on the diffusion decision model (DDM) for binary choice, consistently fails to capture crucial aspects of response times observed during reinforcement learning. We propose a new RL-EAM based on an advantage racing diffusion (ARD) framework for choices among two or more options that not only addresses this problem but captures stimulus difficulty, speed-accuracy trade-off, and stimulus-response-mapping reversal effects. The RL-ARD avoids fundamental limitations imposed by the DDM on addressing effects of absolute values of choices, as well as extensions beyond binary choice, and provides a computationally tractable basis for wider applications.

Journal Article

Clustering ensemble selection considering quality and diversity

by Rezaie, Vahideh , Samad Nejatian , Bagherifard, Karamolah in Accumulation , Algorithms , Clustering

2019

It is highly likely that a partition judged as poor by a stability measure nonetheless contains one or more high-quality clusters, which are then neglected entirely. Inspired by the evaluation of partitions, researchers have therefore turned to defining measures for the evaluation of clusters. Many stability measures, such as Normalized Mutual Information (NMI), have been proposed to validate a partition. The measures defined so far are based on NMI. This paper discusses the drawback of the commonly used approach and proposes a criterion, called Edited Normalized Mutual Information (ENMI), to assess the association between a cluster and a partition. The ENMI criterion compensates for the drawback of the common NMI measure. A clustering ensemble method based on aggregating a subset of primary clusters is also proposed. The proposed method uses the average ENMI as a fitness measure to select a number of clusters. The clusters that satisfy a predefined threshold on this measure are selected to participate in the final ensemble. To combine the chosen clusters, a set of consensus functions is employed. One class of these is the co-association-based consensus functions. Since the Evidence Accumulation Clustering (EAC) method cannot derive the co-association matrix from a subset of clusters, Extended EAC (EEAC) is employed to construct the co-association matrix from the chosen subset of clusters. The second class is based on hypergraph partitioning algorithms. The third class treats the chosen clusters as a new feature space and uses a simple clustering algorithm to extract the consensus partitioning. The empirical studies show that the proposed method outperforms other well-known ensembles.
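As a rough illustration of the baseline measure this abstract builds on, the sketch below computes standard Normalized Mutual Information between two partitions given as label lists. It is not the ENMI criterion, which the abstract does not define. The function names and the choice of normalization (twice the mutual information divided by the sum of the two entropies) are assumptions; other normalizations are also in use.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a partition given as a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two partitions of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(a, b):
    """NMI with the assumed normalization 2 * I(a; b) / (H(a) + H(b))."""
    h = entropy(a) + entropy(b)
    if h == 0:
        return 1.0  # both partitions put every item in a single cluster
    return 2 * mutual_information(a, b) / h


# Same grouping under different label names versus an unrelated grouping.
same = nmi([0, 0, 1, 1], [1, 1, 0, 0])
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

Because NMI scores a whole partition, a partition with a low score can still contain individual clusters that agree well with the rest of the ensemble, which is the situation the abstract describes.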

Journal Article

Are there jumps in evidence accumulation, and what, if anything, do they reflect psychologically? An analysis of Lévy Flights models of decision-making

by Rad, Jamal Amani , Sewell, David K. , Rasanan, Amir Hosein Hadian in Behavioral Science and Psychology , Cognitive Psychology , Decision Making

2024

According to existing theories of simple decision-making, decisions are initiated by continuously sampling and accumulating perceptual evidence until a threshold value has been reached. Many models, such as the diffusion decision model, assume a noisy accumulation process, described mathematically as a stochastic Wiener process with Gaussian distributed noise. Recently, an alternative account of decision-making has been proposed in the Lévy Flights (LF) model, in which accumulation noise is characterized by a heavy-tailed power-law distribution, controlled by a parameter, α. The LF model produces sudden large “jumps” in evidence accumulation that are not produced by the standard Wiener diffusion model, which some have argued provide better fits to data. It remains unclear, however, whether jumps in evidence accumulation have any real psychological meaning. Here, we investigate the conjecture by Voss et al. (Psychonomic Bulletin & Review, 26(3), 813–832, 2019) that jumps might reflect sudden shifts in the source of evidence people rely on to make decisions. We reason that if jumps are psychologically real, we should observe systematic reductions in jumps as people become more practiced with a task (i.e., as people converge on a stable decision strategy with experience). We fitted five versions of the LF model to behavioral data from a study by Evans and Brown (Psychonomic Bulletin & Review, 24(2), 597–606, 2017), using a five-layer deep inference neural network for parameter estimation. The analysis revealed systematic reductions in jumps as a function of practice, such that the LF model more closely approximated the standard Wiener model over time. This trend could not be attributed to other sources of parameter variability, speaking against the possibility of trade-offs with other model parameters. Our analysis suggests that jumps in the LF model might be capturing strategy instability exhibited by relatively inexperienced observers early on in task performance. We conclude that further investigation of a potential psychological interpretation of jumps in evidence accumulation is warranted.
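To make the contrast between Gaussian and heavy-tailed accumulation noise concrete, here is a minimal simulation sketch. It is not the authors' fitting procedure. It assumes symmetric α-stable noise drawn with the Chambers-Mallows-Stuck method, increments scaled by dt^(1/α), and two symmetric thresholds; the function and parameter names are hypothetical. With α = 2 the noise is Gaussian (variance 2 in this parameterization), and smaller α gives heavier tails and therefore occasional large jumps.

```python
import math
import random


def symmetric_stable(alpha, rng):
    """One symmetric alpha-stable draw (Chambers-Mallows-Stuck), 0 < alpha <= 2."""
    u = rng.uniform(-math.pi / 2, math.pi / 2)
    w = max(rng.expovariate(1.0), 1e-300)  # guard against an exact zero
    if alpha == 1.0:
        return math.tan(u)  # Cauchy special case
    return (math.sin(alpha * u) / math.cos(u) ** (1 / alpha)
            * (math.cos(u - alpha * u) / w) ** ((1 - alpha) / alpha))


def first_passage(drift, threshold, alpha, rng, dt=0.001, max_time=5.0):
    """Simulate one trial; return (choice, time), or (None, max_time) on timeout."""
    x, t = 0.0, 0.0
    scale = dt ** (1 / alpha)  # assumed scaling of stable increments with dt
    while t < max_time:
        x += drift * dt + scale * symmetric_stable(alpha, rng)
        t += dt
        if abs(x) >= threshold:
            return (1 if x > 0 else 0), t
    return None, max_time


rng = random.Random(0)
gaussian_like = [symmetric_stable(2.0, rng) for _ in range(20000)]
heavy_tailed = [symmetric_stable(1.2, rng) for _ in range(20000)]
# The largest single noise draw is far bigger under the heavy-tailed setting.
largest_gaussian = max(abs(x) for x in gaussian_like)
largest_heavy = max(abs(x) for x in heavy_tailed)
```

In this sketch, a fitted α moving toward 2 with practice would correspond to fewer and smaller jumps, which is the pattern the abstract reports.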

Journal Article

Gender and anxiety reveal distinct computational sources of underconfidence

by Fleming, Stephen M. , Katyal, Sucharit in Accumulation , Accuracy , Adolescent

2026

Confidence exhibits systematic individual differences across mental health, gender, and age. However, it remains unknown whether these distinct sources of metacognitive bias have common or distinct computational origins. To address this question, we developed a novel dynamic computational model of metacognition to study the temporal evolution of underconfidence associated with individual differences in transdiagnostic anxiety symptoms and gender in samples of online participants (total = 1,447). We found that underconfidence associated with anxiety symptoms became more prominent the longer individuals took to make metacognitive judgments - suggesting that it is exacerbated by additional time for introspection. In contrast, gender-related underconfidence decreased with greater metacognitive judgment time - suggesting that additional time for introspection is able to remediate prepotent biases. Our computational model of confidence explained these effects - while both gender and anxiety symptoms involved shifts in confidence criteria, only anxiety symptoms involved a temporal accumulation of negatively biased evidence about one's ability. Our study reveals multiple computational pathways to the formation of underconfidence, in turn highlighting specific potential mechanisms for its remediation.

Journal Article

The Tweedledum and Tweedledee of dynamic decisions: Discriminating between diffusion decision and accumulator models

by Kvam, Peter D. in Behavioral Science and Psychology , Cognitive Psychology , Decision Making

2025

Theories of dynamic decision-making are typically built on evidence accumulation, which is modeled using racing accumulators or diffusion models that track a shifting balance of support over time. However, these two types of models are only two special cases of a more general evidence accumulation process where options correspond to directions in an accumulation space. Using this generalized evidence accumulation approach as a starting point, I identify four ways to discriminate between absolute-evidence and relative-evidence models. First, an experimenter can look at the information that decision-makers considered to identify whether there is a filtering of near-zero evidence samples, which is characteristic of a relative-evidence decision rule (e.g., diffusion decision model). Second, an experimenter can disentangle different components of drift rates by manipulating the discriminability of the two response options relative to the stimulus to delineate the balance of evidence from the total amount of evidence. Third, a modeler can use machine learning to classify a set of data according to its generative model. Finally, machine learning can also be used to directly estimate the geometric relationships between choice options. I illustrate these different approaches by applying them to data from an orientation-discrimination task, showing converging conclusions across all four methods in favor of accumulator-based representations of evidence during choice. These tools can clearly delineate absolute-evidence and relative-evidence models, and should be useful for comparing many other types of decision theories.

Journal Article

People are at least as good at optimizing reward rate under equivalent fixed-trial compared to fixed-time conditions

by Brown, Scott D. , Taylor, Grant J. , Evans, Nathan J. in Adult , Behavioral Science and Psychology , Brief Report

2025

Finding an optimal decision-making strategy requires a careful balance between the competing demands of accuracy and urgency. In experimental settings, researchers are typically interested in whether people can optimise this trade-off, usually operationalised as reward rate (RR), with evidence accumulation models serving as the key framework to determine whether people are performing optimally. However, recent studies have suggested that inferences about optimality can be highly dependent on the task design, meaning that inferences about whether people can achieve optimality may not generalise across contexts. Here, we investigate one typically overlooked design factor: whether participants spend a fixed amount of time on each block (fixed time) or have a fixed number of trials in each block (fixed trials). While fixed-time designs are typically thought to be the most appropriate for optimality studies, as to maximise the number of correct responses participants must optimise RR, our Experiments 1 and 2 indicate that people are at least as good at optimising reward rate under fixed-trial designs as fixed-time designs. However, Experiment 3 provides some evidence that fixed-trial designs with no instructions may not be at least as good as fixed-time designs with very specific instructions. Importantly, these findings challenge the idea that fixed-time designs are the most appropriate for reward rate optimality studies, and further emphasise the importance of carefully considering study design factors when making inferences about optimality in decision-making.
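The accuracy-urgency trade-off behind reward rate can be sketched numerically. The example below is an illustration under stated assumptions, not the analysis used in the study: it assumes an unbiased diffusion model with boundaries at plus and minus `threshold`, for which accuracy and mean decision time have well-known closed forms, and it defines reward rate as expected correct responses per second with no penalty for errors. All function and parameter names are hypothetical.

```python
import math


def ddm_accuracy(drift, threshold, noise=1.0):
    """P(correct) for an unbiased diffusion with boundaries at +/- threshold."""
    return 1.0 / (1.0 + math.exp(-2.0 * drift * threshold / noise ** 2))


def ddm_mean_decision_time(drift, threshold, noise=1.0):
    """Mean decision time (seconds) for the same unbiased diffusion."""
    return (threshold / drift) * math.tanh(drift * threshold / noise ** 2)


def reward_rate(drift, threshold, other_time, noise=1.0):
    """Expected correct responses per second.

    other_time lumps together non-decision time and the interval between trials.
    """
    accuracy = ddm_accuracy(drift, threshold, noise)
    decision_time = ddm_mean_decision_time(drift, threshold, noise)
    return accuracy / (decision_time + other_time)


def best_threshold(drift, other_time, grid):
    """Grid search for the threshold that maximises reward rate."""
    return max(grid, key=lambda a: reward_rate(drift, a, other_time))


grid = [0.05 * k for k in range(1, 61)]  # thresholds from 0.05 to 3.0
optimal = best_threshold(drift=1.0, other_time=1.0, grid=grid)
```

Very low thresholds give fast but near-chance responses and very high thresholds give accurate but slow ones, so reward rate peaks at an intermediate threshold. In a fixed-time block, the expected number of correct responses is roughly this rate multiplied by the block duration.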

Journal Article

A cognitive model of response omissions in distraction paradigms

by Heathcote, Andrew , Matzke, Dora , Castro, Spencer C. in Accumulation , Behavioral Science and Psychology , Cognition

2022

The effects of distraction on responses manifest in three ways: prolonged reaction times, and increased error and response omission rates. However, the latter effect is often ignored or assumed to be due to a separate cognitive process. We investigated omissions occurring in two paradigms that manipulated distraction. One required simple stimulus detection of younger participants; the second required choice responses and was completed by both younger and older participants. We fit data from these paradigms with a model that identifies three causes of omissions. Two are related to the process of accumulating the evidence on which a response is based: intrinsic omissions (due to between-trial variation in accumulation rates making it impossible to ever reach the evidence threshold) and design omissions (due to response windows that cause slow responses not to be recorded). A third, contaminant omissions, allows for a cause unrelated to the response process. In both data sets systematic differences in omission rates across conditions were accounted for by task-related omissions. Intrinsic omissions played a lesser role than design omissions, even though the presence of design omissions was not evident in descriptive analyses of the data. The model provided an accurate account of all aspects of the detection data and the choice-response data, but slightly underestimated overall omissions in the choice paradigm, particularly in older participants, suggesting that further investigation of contaminant omission effects is needed.

Journal Article

Understanding neural signals of post-decisional performance monitoring: An integrative review

by Murphy, Peter R , Desender, Kobe , Ridderinkhof, K Richard in Accuracy , Behavior , Brain research

2021

Performance monitoring is a key cognitive function, allowing one to detect mistakes and adapt future behavior. Post-decisional neural signals have been identified that are sensitive to decision accuracy, decision confidence and subsequent adaptation. Here, we review recent work that supports an understanding of late error/confidence signals in terms of the computational process of post-decisional evidence accumulation. We argue that the error positivity, a positive-going centro-parietal potential measured through scalp electrophysiology, reflects the post-decisional evidence accumulation process itself, which follows a boundary crossing event corresponding to initial decision commitment. This proposal provides a powerful explanation for both the morphological characteristics of the signal and its relation to various expressions of performance monitoring. Moreover, it suggests that the error positivity, a signal with thus far unique properties in cognitive neuroscience, can be leveraged to furnish key new insights into the inputs to, adaptation of, and consequences of the post-decisional accumulation process.

Journal Article

Economic irrationality is optimal during noisy decision making

by Summerfield, Christopher , Moran, Rani , Moreland, James in Adolescent , Adult , Biological Sciences

2016

According to normative theories, reward-maximizing agents should have consistent preferences. Thus, when faced with alternatives A, B, and C, an individual preferring A to B and B to C should prefer A to C. However, it has been widely argued that humans can incur losses by violating this axiom of transitivity, despite strong evolutionary pressure for reward-maximizing choices. Here, adopting a biologically plausible computational framework, we show that intransitive (and thus economically irrational) choices paradoxically improve accuracy (and subsequent economic rewards) when decision formation is corrupted by internal neural noise. Over three experiments, we show that humans accumulate evidence over time using a “selective integration” policy that discards information about alternatives with momentarily lower value. This policy predicts violations of the axiom of transitivity when three equally valued alternatives differ circularly in their number of winning samples. We confirm this prediction in a fourth experiment reporting significant violations of weak stochastic transitivity in human observers. Crucially, we show that relying on selective integration protects choices against “late” noise that otherwise corrupts decision formation beyond the sensory stage. Indeed, we report that individuals with higher late noise relied more strongly on selective integration. These findings suggest that violations of rational choice theory reflect adaptive computations that have evolved in response to irreducible noise during neural information processing.
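The circular pattern described in this abstract can be reproduced with a small deterministic sketch. This is an illustration rather than the authors' model: it assumes that on each sample the momentarily lower-valued alternative is multiplied by `1 - w` before being accumulated, and it leaves out the noise terms that the paper's argument depends on. The function name, the weight `w`, and the sample values are hypothetical.

```python
def selective_integration(xs, ys, w):
    """Accumulate two sample streams, discounting the momentary loser by w."""
    total_x = total_y = 0.0
    for x, y in zip(xs, ys):
        if x > y:
            y *= 1 - w  # y loses this sample, so it is down-weighted
        elif y > x:
            x *= 1 - w  # x loses this sample, so it is down-weighted
        total_x += x
        total_y += y
    return total_x, total_y


# Three alternatives with equal total value (each sums to 6) whose samples
# differ circularly in how often each one wins a pairwise comparison.
A = [1, 2, 3]
B = [3, 1, 2]
C = [2, 3, 1]

a_vs_b = selective_integration(A, B, w=0.5)  # A wins 2 of 3 samples against B
b_vs_c = selective_integration(B, C, w=0.5)  # B wins 2 of 3 samples against C
c_vs_a = selective_integration(C, A, w=0.5)  # C wins 2 of 3 samples against A
no_selection = selective_integration(A, B, w=0.0)  # plain summation, a tie
```

With `w = 0.5` each comparison ends at `(5.5, 4.5)`, so A is preferred to B, B to C, and C to A, which is an intransitive cycle. With `w = 0` every pair ties at `(6.0, 6.0)`.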

Journal Article

Motor cortical signals reflecting decision making and action preparation

by Ullsperger, Markus , Rogge, Julia , Jocham, Gerhard in Beta power lateralization , Cognitive ability , Decision making

2022

Decision making often requires accumulating evidence in favour of a particular option. When choices are expressed with a motor response, these actions are preceded by reductions in the power of oscillations in the alpha and beta range in motor cortices. For unimanual movements, these reductions are greater over the hemisphere contralateral to the response side. Such lateralizations are hypothesized to be an online index of the neural state of decisions as they develop over the course of processing. In contrast, the lateralized readiness potential (LRP) is considered to selectively activate a response and appears shortly before the motor output. We investigated to what extent these neural signals reflect integration of decision evidence or more motor-related action preparation. Using two different experiments, we found that lateralization of alpha and beta power (APL and BPL, respectively) rapidly emerged after stimulus presentation, even when making an overt response was not yet possible. In contrast, we show that even after prolonged stimulus presentation, no LRP was present. Instead, the LRP emerged only after an imperative cue prompting participants to indicate their choice. Furthermore, we could show that variations in sensory evidence strength modulate APL and BPL onset times, suggesting that integration of evidence is represented in these motor cortical signals. We conclude that APL and BPL reflect higher cognitive processes rather than pure action preparation, whereas LRP is more closely tied to motor performance. APL and BPL potentially encode decision information in motor areas serving the later preparation of overt decision output.

Journal Article
