1,241 results for "Falsification"
Application of the Instrumental Inequalities to a Mendelian Randomization Study With Multiple Proposed Instruments
BACKGROUND: Investigators often support the validity of Mendelian randomization (MR) studies, an instrumental variable approach proposing genetic variants as instruments, via subject matter knowledge. However, the instrumental variable model implies certain inequalities, offering an empirical method of falsifying (but not verifying) the underlying assumptions. While these inequalities are said to detect only extreme assumption violations in practice, to our knowledge they have not been used in settings with multiple proposed instruments. METHODS: We applied the instrumental inequalities to an MR analysis of the effect of maternal pregnancy Vitamin D on offspring psychiatric outcomes, proposing four independent maternal genetic variants as instruments. We assessed whether the proposed instruments satisfied the instrumental inequalities separately and jointly, and we explored the instrumental inequalities' properties via simulations. RESULTS: The instrumental inequalities were satisfied (i.e., we did not falsify the MR model) when considering each variant separately. However, the inequalities were violated when considering the four variants jointly and for some combinations of two or three variants (2 of 36 two-variant combinations and 18 of 24 three-variant combinations). In simulations, the inequalities detected structural biases more often when assessing proposed instruments jointly, while falsification in the absence of structural bias remained rare. CONCLUSIONS: The instrumental inequalities detected violations of the MR assumptions for genetic variants jointly proposed as instruments in our study, though the instrumental inequalities were satisfied when considering each proposed instrument separately. We discuss how investigators can assess the instrumental inequalities to eliminate clearly invalid analyses in settings with many proposed instruments, and we provide appropriate code.
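For discrete variables, the instrumental inequalities referenced above take the form Pearl derived, and checking them reduces to a few array operations. The following is a minimal sketch assuming a binary instrument, treatment, and outcome with an estimated conditional joint distribution; the function name and toy numbers are illustrative, not from the study, and joint assessment of several variants amounts to treating their combination as a single multi-valued instrument.

    import numpy as np

    def instrumental_inequality_holds(p_yx_given_z):
        """Check Pearl's instrumental inequality for one proposed instrument.

        p_yx_given_z: array of shape (n_z, n_x, n_y) giving P(Y=y, X=x | Z=z),
        so each slice p_yx_given_z[z] sums to 1. The IV model implies, for
        every treatment level x:
            sum_y max_z P(Y=y, X=x | Z=z) <= 1.
        A violation falsifies (but satisfaction never verifies) the model.
        """
        max_over_z = p_yx_given_z.max(axis=0)   # shape (n_x, n_y)
        bounds = max_over_z.sum(axis=1)         # one bound per treatment level
        return bool((bounds <= 1.0).all()), bounds

    # Toy example (illustrative numbers): binary Z, X, Y.
    # p[z, x, y] = P(Y=y, X=x | Z=z); each p[z] sums to 1.
    p = np.array([
        [[0.40, 0.10],    # Z=0: (X=0,Y=0), (X=0,Y=1)
         [0.30, 0.20]],   #      (X=1,Y=0), (X=1,Y=1)
        [[0.10, 0.20],    # Z=1
         [0.20, 0.50]],
    ])
    ok, bounds = instrumental_inequality_holds(p)
    print(ok, bounds)   # True, [0.6 0.8] -> not falsified here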
When Is Parallel Trends Sensitive to Functional Form?
This paper assesses when the validity of difference-in-differences depends on functional form. We provide a novel characterization: the parallel trends assumption holds under all strictly monotonic transformations of the outcome if and only if a stronger “parallel trends”-type condition holds for the cumulative distribution function of untreated potential outcomes. This condition for parallel trends to be insensitive to functional form is satisfied if and essentially only if the population can be partitioned into a subgroup for which treatment is effectively randomly assigned and a remaining subgroup for which the distribution of untreated potential outcomes is stable over time. These conditions have testable implications, and we introduce falsification tests for the null that parallel trends is insensitive to functional form.
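In standard potential-outcomes notation, the characterization described above can be written as follows (a plausible rendering of the condition, not a quotation of the paper), with Y_t(0) the untreated potential outcome in period t and D the treated-group indicator:

    % Ordinary parallel trends (in means):
    %   E[Y_1(0) - Y_0(0) | D = 1] = E[Y_1(0) - Y_0(0) | D = 0].
    % The stronger CDF-level condition under which parallel trends holds for
    % every strictly monotonic transformation g of the outcome:
    \[
      F_{Y_1(0) \mid D=1}(y) - F_{Y_0(0) \mid D=1}(y)
        \;=\;
      F_{Y_1(0) \mid D=0}(y) - F_{Y_0(0) \mid D=0}(y)
      \qquad \text{for all } y.
    \]
    % Equivalently:
    %   E[g(Y_1(0)) - g(Y_0(0)) | D=1] = E[g(Y_1(0)) - g(Y_0(0)) | D=0]
    % for every strictly monotonic g.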
Material Proof of the Existence of an Epigraphic Forgery in Carmona, Seville
We present an epigraphic piece as material proof of the existence of a forgery in Carmona (Seville), first made known in the 18th century by Cándido María Trigueros; it then disappeared until the 21st century, when it resurfaced upon being donated to the local museum. The controversy around it stemmed from the interpretation given in the 19th century to Trigueros's annotations, which led to its inclusion in Hübner's catalogue as CIL II 129*; we analyse this entry in detail.
A survey of safety and trustworthiness of large language models through the lens of verification and validation
Large language models (LLMs) have ignited a new wave of enthusiasm for AI through their ability to engage end-users in human-level conversations with detailed and articulate answers across many knowledge domains. In response to their fast adoption in many industrial applications, this survey concerns their safety and trustworthiness. First, we review known vulnerabilities and limitations of LLMs, categorising them into inherent issues, attacks, and unintended bugs. We then consider whether and how Verification and Validation (V&V) techniques, which have been widely developed for traditional software and for deep learning models such as convolutional neural networks as independent processes that check implementations against specifications, can be integrated and further extended throughout the lifecycle of LLMs to provide rigorous analysis of the safety and trustworthiness of LLMs and their applications. Specifically, we consider four complementary techniques: falsification and evaluation, verification, runtime monitoring, and regulations and ethical use. In total, 370+ references are considered to support a quick understanding of the safety and trustworthiness issues from the perspective of V&V. While intensive research has been conducted to identify the safety and trustworthiness issues, rigorous yet practical methods are called for to ensure the alignment of LLMs with safety and trustworthiness requirements.
Solid support or secret dissent? A list experiment on preference falsification during the Russian war against Ukraine
Do individuals reveal their true preferences when asked for their support for an ongoing war? This research note presents the results of a list experiment implemented in the midst of the Russian invasion of Ukraine. Our experiment allows us to estimate the extent of preference falsification with regard to support for the war by comparing the experimental results with a direct question. Our data come from an online sample of 3,000 Russians. Results show high levels of support for the war and significant levels of preference falsification: when asked directly, 71% of respondents support the war, while this share drops to 61% when using the list experiment. Preference falsification is particularly pronounced among individuals using TV as a main source of news. Our results imply that war leaders can pursue peace without fearing a large popular backlash, but also show that high levels of support for war can be sustained even once the brutality of the war has become clear.
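In its simplest form, the list-experiment estimate mentioned above is a difference in means: control respondents count how many of J baseline items apply to them, while treatment respondents see the same list plus the sensitive item, so the gap in mean counts estimates the sensitive item's prevalence. A minimal sketch with simulated data (all numbers illustrative, not the study's):

    import numpy as np

    def list_experiment_estimate(counts_treatment, counts_control):
        """Difference-in-means estimator for a list experiment.

        Under the usual no-design-effect and no-liars assumptions, the mean
        difference between the treatment counts (J baseline items + sensitive
        item) and control counts (J baseline items) estimates the share of
        respondents for whom the sensitive item holds.
        """
        t = np.asarray(counts_treatment, dtype=float)
        c = np.asarray(counts_control, dtype=float)
        est = t.mean() - c.mean()
        se = np.sqrt(t.var(ddof=1) / len(t) + c.var(ddof=1) / len(c))
        return est, se

    # Simulated data: J = 3 baseline items, true sensitive-item share 0.61.
    rng = np.random.default_rng(0)
    control = rng.binomial(3, 0.5, size=1500)
    treat = rng.binomial(3, 0.5, size=1500) + rng.binomial(1, 0.61, size=1500)
    est, se = list_experiment_estimate(treat, control)
    print(f"estimated support: {est:.3f} (SE {se:.3f})")

The gap between this estimate and a direct question (here, 71% direct vs. 61% list) is the measure of preference falsification.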
Do Chinese Citizens Conceal Opposition to the CCP in Surveys? Evidence from Two Experiments
Most public opinion research in China uses direct questions to measure support for the Chinese Communist Party (CCP) and government policies. These direct question surveys routinely find that over 90 per cent of Chinese citizens support the government. From this, scholars conclude that the CCP enjoys genuine legitimacy. In this paper, we present results from two survey experiments in contemporary China that make clear that citizens conceal their opposition to the CCP for fear of repression. When respondents are asked directly, we find, like other scholars, approval ratings for the CCP that exceed 90 per cent. When respondents are asked in the form of list experiments, which confer a greater sense of anonymity, CCP support hovers between 50 per cent and 70 per cent. This represents an upper bound, however, since list experiments may not fully mitigate incentives for preference falsification. The list experiments also suggest that fear of government repression discourages some 40 per cent of Chinese citizens from participating in anti-regime protests. Most broadly, this paper suggests that scholars should stop using direct question surveys to measure political opinions in China.
In Search of Self-Censorship
Item nonresponse rates across regime assessment questions and nonsensitive items are used to create a self-censorship index, which can be compared across countries, over time, and across population subgroups. For many authoritarian systems, citizens do not display higher rates of item nonresponse on regime assessment questions than their counterparts in democracies. This result suggests such questions may not be that sensitive in many places, which in turn raises doubts that authoritarian citizens are widely feigning positive attitudes towards regimes they secretly despise. Higher levels of self-censorship are found under regimes without electoral competition for the executive.
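The index construction described is straightforward to sketch: compare respondents' nonresponse rates on regime assessment items with their rates on nonsensitive items. A hypothetical example, assuming a survey table where missing values mark item nonresponse (column names invented, not from the paper):

    import pandas as pd

    def self_censorship_index(df, regime_items, nonsensitive_items):
        """Difference in item-nonresponse rates: regime questions minus
        nonsensitive questions. Positive values suggest respondents skip
        regime assessments disproportionately, consistent with
        self-censorship."""
        regime_nr = df[regime_items].isna().mean(axis=1)
        baseline_nr = df[nonsensitive_items].isna().mean(axis=1)
        return (regime_nr - baseline_nr).mean()

    df = pd.DataFrame({
        "trust_government": [5, None, 3, None],   # regime assessment items
        "approve_leader":   [4, None, None, 2],
        "age":              [34, 51, 28, 60],     # nonsensitive items
        "urban":            [1, 0, None, 1],
    })
    idx = self_censorship_index(df, ["trust_government", "approve_leader"],
                                ["age", "urban"])
    print(f"self-censorship index: {idx:.3f}")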
Equivalence Testing for Regression Discontinuity Designs
Regression discontinuity (RD) designs are increasingly common in political science. They have many advantages, including a known and observable treatment assignment mechanism. The literature has emphasized the need for “falsification tests” and ways to assess the validity of the design. When implementing RD designs, researchers typically rely on two falsification tests, based on empirically testable implications of the identifying assumptions, to argue the design is credible. These tests, one for continuity in the regression function for a pretreatment covariate and one for continuity in the density of the forcing variable, use a null of no difference in the parameter of interest at the discontinuity. Common practice can, incorrectly, conflate a failure to reject this null (a mere absence of evidence that the design is flawed) with affirmative evidence that the design is credible. The well-known equivalence testing approach addresses these problems, but how to implement equivalence tests in the RD framework is not straightforward. This paper develops two equivalence tests tailored for RD designs that allow researchers to provide statistical evidence that the design is credible. Simulation studies show the superior performance of equivalence-based tests over tests-of-difference, as used in current practice. The tests are applied to the close elections RD data presented in Eggers et al. (2015b).
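The paper's tests are tailored to RD designs; the underlying logic, however, is the standard two one-sided tests (TOST) setup, in which the null is a flawed design (a difference at least as large as some margin) and rejection is affirmative evidence of credibility. A generic TOST sketch for a pretreatment covariate near a cutoff, not the paper's procedure:

    import numpy as np
    from scipy import stats

    def tost_equivalence(x_below, x_above, margin):
        """TOST for equivalence of a pretreatment covariate's mean just
        below vs. just above an RD cutoff.

        H0: |mean difference| >= margin (design flawed)
        H1: |mean difference| <  margin (design credible)
        Rejecting both one-sided nulls supports equivalence.
        """
        x = np.asarray(x_below, dtype=float)
        y = np.asarray(x_above, dtype=float)
        diff = x.mean() - y.mean()
        se = np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
        df = len(x) + len(y) - 2
        p_lower = 1 - stats.t.cdf((diff + margin) / se, df)  # H0: diff <= -margin
        p_upper = stats.t.cdf((diff - margin) / se, df)      # H0: diff >= +margin
        return diff, max(p_lower, p_upper)  # small p => evidence of equivalence

    rng = np.random.default_rng(1)
    below = rng.normal(0.00, 1.0, 400)   # covariate just below the cutoff
    above = rng.normal(0.05, 1.0, 400)   # covariate just above the cutoff
    diff, p = tost_equivalence(below, above, margin=0.3)
    print(f"difference {diff:.3f}, TOST p-value {p:.4f}")

Unlike a test-of-difference, a noisy sample here yields a large p-value, so imprecision is no longer mistaken for evidence of a credible design.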
How Do Travel Costs Shape Collaboration?
We develop a simple theoretical framework for thinking about how geographic frictions, and in particular travel costs, shape scientists' collaboration decisions and the types of projects that are developed locally versus over distance. We then take advantage of a quasi-experiment—the introduction of new routes by a low-cost airline—to test the predictions of the theory. Results show that travel costs constitute an important friction to collaboration: after a low-cost airline enters, the number of collaborations increases between 0.3 and 1.1 times, a result that is robust to multiple falsification tests and causal in nature. The reduction in geographic frictions is particularly beneficial for high-quality scientists who are otherwise embedded in worse local environments. Consistent with the theory, lower travel costs also endogenously change the types of projects scientists engage in at different levels of distance. After the shock, we observe an increase in higher-quality and novel projects, as well as projects that take advantage of complementary knowledge and skills between subfields, and that rely on specialized equipment. We test the generalizability of our findings from chemistry to a broader data set of scientific publications and to a different field where specialized equipment is less likely to be relevant, mathematics. Last, we discuss implications for the formation of collaborative research and development teams over distance. This paper was accepted by Toby Stuart, entrepreneurship and innovation.
Compositional Falsification of Cyber-Physical Systems with Machine Learning Components
Cyber-physical systems (CPS), such as automotive systems, are starting to include sophisticated machine learning (ML) components. Their correctness, therefore, depends on properties of the inner ML modules. While learning algorithms aim to generalize from examples, they are only as good as the examples provided, and recent efforts have shown that they can produce inconsistent output under small adversarial perturbations. This raises the question: can the output from learning components lead to a failure of the entire CPS? In this work, we address this question by formulating it as a problem of falsifying signal temporal logic specifications for CPS with ML components. We propose a compositional falsification framework where a temporal logic falsifier and a machine learning analyzer cooperate with the aim of finding falsifying executions of the considered model. The efficacy of the proposed technique is shown on an automatic emergency braking system model with a perception component based on deep neural networks.
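The core loop of such a falsification framework can be illustrated with a toy: search the input space for executions that minimize the quantitative robustness of a temporal-logic specification, where negative robustness means a falsifying execution has been found. Everything below (the dynamics, thresholds, and the stand-in "perception" function) is an illustrative assumption, and plain random search stands in for the paper's cooperating temporal logic falsifier and ML analyzer:

    import numpy as np

    def perception(true_distance):
        """Stylized ML detector standing in for the deep network: it
        fails to report the obstacle beyond 40 m."""
        return true_distance if true_distance < 40.0 else np.inf

    def simulate(v0, d0, dt=0.1, steps=200):
        """Toy emergency braking: brake at 8 m/s^2 once the obstacle is
        perceived within 30 m. Returns the trace of distances."""
        v, d, trace = v0, d0, []
        for _ in range(steps):
            if perception(d) < 30.0:
                v = max(v - 8.0 * dt, 0.0)
            d -= v * dt
            trace.append(d)
        return np.array(trace)

    def robustness(trace):
        """Quantitative semantics of the STL spec G (distance > 0): the
        worst-case margin; negative means the spec is falsified."""
        return trace.min()

    rng = np.random.default_rng(2)
    best = (np.inf, None)
    for _ in range(1000):                    # random-search falsifier
        v0 = rng.uniform(5.0, 40.0)          # initial speed, m/s
        d0 = rng.uniform(20.0, 100.0)        # initial obstacle distance, m
        rho = robustness(simulate(v0, d0))
        if rho < best[0]:
            best = (rho, (v0, d0))
    print("minimum robustness and worst inputs:", best)

High initial speeds combined with the detector's blind spot beyond 40 m yield negative robustness, i.e., a collision trace: exactly the kind of system-level failure traced back to the ML component that the compositional approach is designed to surface.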