(1) Overview


Collection Date(s)



Although replication is a central tenet of science [2], replications are rarely performed in psychology [3]. Because of this, some researchers have started to question the validity of scientific research [4]. Additionally, effects observed in individuals from one culture may not generalize to individuals from other cultures [5]. The present project tested the replicability of 13 effects in a large pool of participants drawn from a variety of samples and contexts. The large aggregate N allows for a precise estimate of each effect's size, while testing the effects across numerous samples and settings allows for an examination of whether those factors influence the strength of each effect.

(2) Methods


Our sample comprised 6,344 participants recruited from 36 different sources, including university subject pools, Amazon Mechanical Turk, and Project Implicit. The aggregate sample had a mean age of 25.98 years. Participant ethnicity was 65.1% White, 6.7% Black or African American, 6.5% East Asian, 4.5% South Asian, and 17.2% Other or Unknown. Participant gender was 63.6% female, 29.9% male, and 6.5% no response. Participants who did not complete the 15-25 minute study were excluded from the analysis.


The study materials consisted of 13 effects drawn from 12 papers, recreated as closely as possible to the original implementation (exact wording and implementation can be found in the online supplement: https://osf.io/wx7ck/):

  • Sunk costs (Oppenheimer, Meyvis, & Davidenko, 2009) [6].
  • Gain versus loss framing (Tversky & Kahneman, 1981) [7].
  • Anchoring (Jacowitz & Kahneman, 1995) [8].
  • Retrospective gambler’s fallacy (Oppenheimer & Monin, 2009) [9].
  • Low-vs.-high category scales (Schwarz, Hippler, Deutsch, & Strack, 1985) [10].
  • Norm of reciprocity (Hyman & Sheatsley, 1950) [11].
  • Allowed/Forbidden (Rugg, 1941) [12].
  • Quote Attribution (Lorge & Curtis, 1936) [13].
  • Flag Priming (Carter, Ferguson, & Hassin, 2011; Study 2) [14].
  • Currency priming (Caruso, Vohs, Baxter, & Waytz, 2013) [15].
  • Imagined contact (Husnu & Crisp, 2010; Study 1) [16].
  • Sex differences in implicit math attitudes (Nosek, Banaji, & Greenwald, 2002) [17].


The study was administered through an Internet link provided to all researchers. Researchers then brought participants into the lab to complete the study through that link, or facilitated an online collection where the link would be supplied directly to participants. The twelve studies were presented in random order, except that the study assessing sex differences in implicit and explicit math attitudes was always last. That study was last because we and the original authors were confident order would have no effect on that finding.
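The ordering rule described above (random presentation with one study fixed in the final position) can be sketched as follows. This is a minimal illustration and not the project's actual administration code; the task labels and the function name `presentation_order` are hypothetical.

```python
import random

# Hypothetical short labels for the twelve studies; not taken from the dataset.
TASKS = [
    "sunk_costs", "gain_loss_framing", "anchoring", "gamblers_fallacy",
    "category_scales", "reciprocity", "allowed_forbidden", "quote_attribution",
    "flag_priming", "currency_priming", "imagined_contact", "math_attitudes",
]


def presentation_order(tasks, fixed_last, rng=None):
    """Shuffle every task except `fixed_last`, then append `fixed_last`."""
    rng = rng or random.Random()
    shuffled = [task for task in tasks if task != fixed_last]
    rng.shuffle(shuffled)
    return shuffled + [fixed_last]


# A seeded generator makes the example reproducible.
order = presentation_order(TASKS, "math_attitudes", rng=random.Random(0))
```

Each participant would receive a fresh call, so the first eleven positions vary between participants while the final position stays constant.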

Quality control

The study was administered through a standardized online link to ensure consistency in presentation. In addition, each in-lab data collection site was required to film a “mock session” of the data collection. These mock session videos are available in the online supplement: https://osf.io/wx7ck/.

Ethical issues

IRB approval was obtained at each data collection site (in accordance with local rules). Informed consent was obtained from all participants. The shared dataset was stripped of any potentially identifying variables before being uploaded.

(3) Dataset description

Object name


Data type

Processed data. The .zip file includes a raw dataset with the data collected from each lab site after being stripped of identifying information. The .zip file also includes a “cleaned” dataset file with some variables added for ease of use (e.g. condition assignments are coded).

Format names and versions

Provided in both .sav (SPSS) and .dat (tab-delimited) formats.
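As a rough sketch of how the tab-delimited version might be read, the example below uses only the Python standard library. The file name in the comment and the column names in the stand-in data are hypothetical, since the actual variable names are documented in the dataset itself.

```python
import csv
import io


def read_tab_delimited(handle):
    """Return the rows of a tab-delimited file as dicts keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))


# In-memory stand-in for a .dat file; the columns are made up for illustration.
sample = io.StringIO(
    "site\tcondition\tresponse\n"
    "lab_a\t1\t4\n"
    "lab_b\t0\t2\n"
)
rows = read_tab_delimited(sample)

# With a real file (hypothetical name), the call would look like:
# with open("cleaned_data.dat", newline="", encoding="utf-8") as handle:
#     rows = read_tab_delimited(handle)
```

All values are read as strings, so numeric variables need converting before analysis. The .sav version could instead be read with `pandas.read_spss`, which depends on the `pyreadstat` package.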


Language

English, with some data in other languages (e.g. open response answers from non-English speaking individuals).





Repository location


Publication date

29 November 2013

(4) Reuse potential

This dataset could be used to investigate the specific replication studies more thoroughly (e.g., anchoring-and-adjustment). For the 13 included effects, the data could be included in meta-analyses or re-analyzed to identify moderating variables that were not investigated in the original analysis. They could also be used to formulate new hypotheses about the conditions under which each effect will be stronger or weaker. Finally, the data could be used to investigate or teach replicability more broadly, for instance by demonstrating how the result from any one experiment can be misleading when compared to a larger body of work (in this case, 35 other replications).
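The teaching use mentioned above, contrasting one site's result with the larger body of work, can be illustrated with a small sketch. It computes a standardized mean difference (Cohen's d with a pooled standard deviation) per site and for all sites combined. This is not the analysis used in the original project, and the numbers are toy values invented for the example, not values from the dataset; the site labels and variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample variance."""
    n1, n0 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n0 - 1) * variance(control)
    ) / (n1 + n0 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Toy responses per site: (treatment group, control group). Invented values.
sites = {
    "site_a": ([5, 6, 7], [3, 4, 5]),
    "site_b": ([4, 5, 6], [4, 5, 6]),
}

# Effect size within each site taken alone.
per_site = {name: cohens_d(t, c) for name, (t, c) in sites.items()}

# Effect size after combining the raw responses from every site.
all_treatment = [x for t, _ in sites.values() for x in t]
all_control = [x for _, c in sites.values() for x in c]
aggregate = cohens_d(all_treatment, all_control)
```

In this toy case the two sites disagree sharply while the combined estimate falls between them, which is the kind of contrast the dataset allows with real responses. A fuller treatment would weight sites or model between-site variance rather than simply combining raw responses.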