(1) Background

What is the percentage of psychological papers with open data? Is it more or less than 50%? Is it more or less than the last two digits of your phone number? What do you think the percentage is? Although neither suggestion is informative, subsequent estimates will be biased toward them. More generally, when people make numeric estimates and consider any number beforehand, their estimates are drawn toward the previously considered number. This phenomenon is called anchoring, the anchoring effect, or anchoring-and-adjustment (or even "adjustment and anchoring"; Tversky & Kahneman, 1974, p. 1128). As even entirely random numbers have been shown to bias estimates (e.g., Bergman et al., 2010) and even experts succumb to anchoring effects (e.g., Englich et al., 2006; Northcraft & Neale, 1987), anchoring has been termed one of the most robust phenomena of (social) psychology (e.g., Kahneman, 2012, p. 119).

Despite its robustness, there is currently no generally accepted theoretical account for the wide range of anchoring effects, a state of affairs not helped by contradictory findings and replication failures (e.g., Bahník, 2021a, 2021b; Harris et al., 2019). Furthermore, replication failures have called into question several moderator findings (e.g., Big Five, Cheek & Norem, 2019; Schindler et al., 2021; intelligence, cognitive reflection, and self-control, Röseler, 2021; ego depletion, Röseler et al., 2020; and whether anchors need to be considered explicitly or whether an incidental presentation suffices, Röseler et al., 2021; Shanks et al., 2020; for a discussion, see also Röseler & Schütz, 2022).

To sum up, theories that explain anchoring and its moderators need to be developed, but the replicability of many moderator findings is uncertain. We set out to build a comprehensive empirical dataset upon which future researchers can build new anchoring theories. Specifically, we aggregated all openly available anchoring datasets that include numeric estimates from studies with at least two different anchors and supplemented these with datasets that we received from other researchers’ publications and file-drawers.

In aggregating the data, we tried to capture the full breadth of anchoring paradigms by coding numerous design features and potential moderators. Any two anchoring experiments will differ in procedural details almost every time, as each researcher makes their own decisions with respect to the absolute judgment question (e.g., How many words are there in this paragraph?), the anchors (e.g., Are there more or fewer than 10 words?), whether anchors are framed as random (e.g., write down the last two digits of your phone number and consider whether there are more or fewer words) or as potentially relevant (e.g., another participant estimated the number of words to be 90), whether participants are paid for accurate estimates, what the unit of the estimate is (e.g., meters or miles), and many more parameters, most of which have not received attention in previous research.

The primary goal in constructing the dataset was to test whether susceptibility to anchors has been measured reliably, that is, how likely it is that people who were susceptible to an anchor in one task were also susceptible to an anchor in another task. Measuring a person-specific susceptibility to anchoring effects is a prerequisite for personality research: only if susceptibility can be measured reliably as a trait does it make sense to expect it to correlate with personality traits such as intelligence (e.g., Bergman et al., 2010; Cheek & Norem, 2022) or need for cognition (e.g., Epley & Gilovich, 2006). Additionally, we tested which features of the anchoring paradigm (e.g., anchor extremeness, type of task, response scale), of the study (e.g., incentives), and of the participants (age and gender) affect reliability. This is also why we chose to aggregate participant-level datasets rather than meta-analytic data (e.g., effect sizes only). We tested the reliability of people's susceptibility to anchoring in all paradigms with multiple items; currently, there is no evidence that susceptibility to anchoring is a trait (Röseler et al., 2022). Note that psychometric properties such as reliability are rarely assessed in social psychological tasks, and the lack of reliability might also apply to other tasks (e.g., Berthet, 2021; Hedge et al., 2018; Parsons et al., 2018). Possible reasons for the poor reliability of anchoring are discussed by Röseler et al. (2022). Nevertheless, the aggregated data allow numerous other moderators to be tested, such as the role of incentives, nationality, or specific paradigm features.
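To illustrate the kind of reliability question described above, per-person susceptibility scores from two items can be correlated and stepped up with the Spearman-Brown formula. This is a sketch with simulated, hypothetical data, not the analysis reported in Röseler et al. (2022):

```python
import numpy as np

def split_half_reliability(scores_a, scores_b):
    """Spearman-Brown corrected split-half reliability from two
    per-person susceptibility scores (e.g., two anchoring items)."""
    r = np.corrcoef(scores_a, scores_b)[0, 1]
    return 2 * r / (1 + r)

# Hypothetical per-person susceptibility scores for two items:
# each item score = shared "trait" component + item-specific noise.
rng = np.random.default_rng(0)
trait = rng.normal(size=200)
item1 = trait + rng.normal(size=200)
item2 = trait + rng.normal(size=200)
rel = split_half_reliability(item1, item2)
```

If susceptibility were a stable trait, such coefficients would be substantial across item pairs; a reliability near zero across pairs is what speaks against a trait interpretation.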

We plan to add more anchoring datasets in the foreseeable future. The dataset can be viewed, downloaded, and analyzed interactively via our ShinyApp available at https://metaanalyses.shinyapps.io/OpAQ/ to aid researchers with power analyses, study design, and literature (or data) search.

(2) Methods

2.1 Study design

Each row in the dataset represents one trial (i.e., an estimate) by a person (participant_id) for a given anchoring item (anchoring_item) and a given anchor (anchor). There may be multiple estimates per person and study (i.e., within-subjects manipulation of anchoring item) or only one (i.e., between-subjects manipulation of anchoring item). Studies included up to 30 anchoring items, but some included only one item. An item-wise version of the data with Hedges’s g per anchoring item per study is available online (https://osf.io/k745n/). Sample anchoring items with variable names and codings are provided in Figure 1.

Figure 1

Two Examples for Anchoring Items including Codings for Nine Variables.

Note: Other types of stimuli may be written information, images, videos, or combinations of such stimuli.
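The item-wise version of the data mentioned above reports Hedges's g per anchoring item and study. A minimal sketch of how such a standardized mean difference between high- and low-anchor estimates can be computed using the standard small-sample correction (an illustration with hypothetical values, not the exact script used for the dataset):

```python
import numpy as np

def hedges_g(high, low):
    """Hedges's g: pooled-SD standardized mean difference between
    estimates in the high- and low-anchor conditions, multiplied by
    the small-sample correction factor J."""
    high, low = np.asarray(high, float), np.asarray(low, float)
    n1, n2 = len(high), len(low)
    s_pooled = np.sqrt(((n1 - 1) * high.var(ddof=1) +
                        (n2 - 1) * low.var(ddof=1)) / (n1 + n2 - 2))
    d = (high.mean() - low.mean()) / s_pooled
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # small-sample correction
    return j * d

# Hypothetical estimates for one item under high vs. low anchors
g = hedges_g([120, 150, 140, 160], [80, 90, 100, 70])
```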

Estimates were aggregated from numerous cross-sectional studies, which is why they vary with respect to study design, study type (online versus lab), and many other variables that we coded. An overview of all variables is provided in Table 1.

Table 1

Overview of variables with descriptions and examples included in the dataset.


id Unique ID per case Consecutive number per row

reference APA reference of the dataset or corresponding research article

reference_short Short APA reference with study number (if multiple studies were reported) e.g., “Author et al., 2022, Study 1”

link Link to download dataset

participant_id ID variable from the respective study

sex 1 = male, 2 = female, 3 = non-binary

age Participant age in years

anchoring_item Description of anchoring item from the study (e.g., year the telephone was invented) This is a brief explanation of the anchoring question.

true_value Correct answer for anchoring item (if none exists, unanchored mean estimate can be used) Unanchored mean estimates were computed on the basis of a condition where no anchor was presented, a pretest, or a previous study with a similar setting and similar participants. Unanchored mean estimates were not used if true values differed between participants (e.g., height of their grandfather).

anchor Anchor that was presented in the trial

anchorhigh 1 = high anchor, 0 = low anchor

anchortype 1 = explicitly random, 2 = fixed and provided without explanation, 3 = having some relevance to the target, 4 = self-generated Examples: 1 – Participants are involved in creating the random number, e.g., because it is the last digits of their phone number or because they drew a number from an urn or threw a die. 2 – Was the telephone invented before or after 1830? 3 – The television was invented in 1900. When was the telephone invented? 4 – What is the boiling temperature of water on Mt Everest? (The self-generated anchor is 100°C, but 100 is not explicitly presented.)

comparative_question 0 = no, 1 = yes The comparative question refers to a question such as "Is the distance between Berlin and Prague more or less than X?" For this variable, it does not matter whether participants had to give explicit responses to this question or which responses they gave. Incidental or subliminal anchors or anchors framed as "Hint: The true value is more than 50 km" were coded as 0 (no).

direction 1 = direction of adjustment was known, 0 = direction was unknown Direction was coded as 0 if there was a comparative question (even if all participants gave the same answer to that question). Direction was coded as 1 if participants were told something like “Prices for this product in this store are given to compensate for decreases during negotiation” or “The true value is lower than $100”.

estimate Estimate that was given by participant in the respective trial

experiment_type 1 = online, 2 = lab, 3 = class, 4 = field, 5 = mixed Class refers to experiments conducted as part of a lecture or seminar in a classroom or in a synchronous online meeting. If the class was run online, it was coded as 1 (online).

incentive Only monetary incentives are coded; 0 = not incentivized, 1 = incentivized for participation, 2 = incentivized for accurate estimation (can be coupled with incentives for participation), 3 = anchored estimate was a price participants would pay (WTP) or accept (WTA) for some product. Receiving feedback does not count as an incentive. Course credit and feedback were coded as 0; vouchers/coupons and ready money were coded as 1 or 2.

preregistered “0” = no; if yes, the link was provided; if the pre-registration was under embargo, it was coded as “embargoed” until the embargo ends

preregtype Preregistration type AsPredicted = AsPredicted.org;
OSF-Standard Reg = OSF-Standard Pre-Data Collection Registration;
Repl Recipe Reg = Replication Recipe (Brandt et al., 2014): Pre-Registration;
Open-Ended Reg = Open-Ended Registration;
PreReg in SocPsy = Pre-Registration in Social Psychology (van ’t Veer & Giner-Sorolla, 2016): Pre-Registration;
RegRep Protocol = Registered Report Protocol Preregistration;
OSF PreReg = OSF Preregistration

published 0 = no, 1 = yes; refers to whether the data or the corresponding article has been published in a peer-reviewed journal or conference (preprints = 0, data that is part of a published study = 1)

sampletype 1 = lay, 2 = professional or expert, 3 = mixed Examples: Professionals/experts include judges estimating punishments or car mechanics estimating values of cars.

scaletype 1 = open, 2 = closed, 3 = visual Examples: 1 – Textbox. 2 – Textbox with limited range of possible answers (e.g., a probability that has to lie between 0 and 100%). 3 – Visual analogue scale.

stimulitype 1 = no additional information, 2 = additional information in text form, 3 = image/audio/video, 4 = 2&3, 9 = other Examples: 1 – What is the population of Chicago? [no additional information] 2 – Newspaper article about Chicago prior to the question that does not state the true value. 3 – Image or Map of Chicago. 4 – Newspaper article with an image of Chicago. 9 – People who have visited Chicago during the last year are asked.

tasktype Type of estimation task (e.g. price, quantity, age, distance, …)

adjustment Difference between estimate and anchor

absadjustment Absolute difference between estimate and anchor

score Difference between estimate and anchor divided by difference between true value and anchor

restr_score The above score but with cut-offs at 0 and 1

Descriptions of individual studies are available for all data that were part of a published research article or preprint (variables: reference, link).
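The four derived variables at the end of Table 1 (adjustment, absadjustment, score, restr_score) follow directly from estimate, anchor, and true_value. A sketch of the arithmetic, with hypothetical input values:

```python
def anchoring_scores(estimate, anchor, true_value=None):
    """Derived scores as described in Table 1.
    score and restr_score require a true value."""
    adjustment = estimate - anchor
    absadjustment = abs(adjustment)
    score = restr_score = None
    if true_value is not None and true_value != anchor:
        # 0 = estimate stayed at the anchor, 1 = estimate hit the true value
        score = adjustment / (true_value - anchor)
        restr_score = min(max(score, 0.0), 1.0)  # cut-offs at 0 and 1
    return adjustment, absadjustment, score, restr_score

# Hypothetical trial: anchor 1830, true value 1876, estimate 1850
# for "year the telephone was invented"
adj, absadj, sc, rsc = anchoring_scores(1850, 1830, 1876)
```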

2.2 Time of data collection

Secondary data were collected from May 2021 through September 2022. Original data were collected between 2010 and 2022. The variable yearofpublication states the latest year of collection for unpublished datasets.

2.3 Location of data collection

Data were collected worldwide and stem from European, Asian, North American, and South American participants.

2.4 Sampling, sample and data collection

The dataset includes k = 96 studies from 57 references. The total sample size is N = 21,359 participants who provided estimates for some of 412 unique anchoring items, yielding a total of 88,914 trials.

There are 6,941 male, 9,243 female, and 81 non-binary participants; gender data for the remaining 5,094 participants are not available. The mean age of the participants with available age data was 32.69 years (median = 28, N = 15,322). Of the participants, 8,978 did not receive monetary incentives for participating in the respective anchoring study, 11,255 received monetary incentives for participation, and 694 received monetary incentives for accurate estimates. For 432 participants, estimates were coupled to prices they would pay or accept for products.

2.5 Materials/Survey instruments

The dataset includes 412 anchoring items. True values are available for 355 (86.2%) of these items. Adjustment and absolute adjustment susceptibility scores were computed for all estimates; 0–1 scores and restricted 0–1 scores could be computed only for items with true values. A list of all items is available online (https://osf.io/g95hp/). Links to single datasets are stored in the variable “link” in the dataset and are available for 74 studies (77.1%).

2.6 Quality Control

  • All study-level data and the first trial of all trial-level data were checked by one of the authors. All trial-level data were furthermore checked by the respective data contributors.
  • We checked whether anchoring effects differed between published and unpublished studies or between preregistered and non-preregistered studies and found no differences.

2.7 Data anonymisation and ethical issues

No ethical approval was obtained for the data collection as only secondary data that had already been anonymized were used. No further steps to anonymize the data were taken.

2.8 Existing use of data

In cases where the original data has been published, the reference is visible in the variable reference. A full list of included studies is available online (https://osf.io/r9h7c/).

Based on the dataset, three presentations have been held and three preprints have been published:


  • Röseler, L. (2021, March). Are some people more susceptible to anchoring effects than others? Talk (online) at the 63rd Conference of Experimental Psychologists, Ulm, Germany.
  • Röseler, L., Weber, L., Stich, E., Günther, M., & Schütz, A. (2021, September). The Open Anchoring Quest (OpAQ): Tackling the reliability problem and boosting the power of anchoring research. Talk (online) at the Biennial Conference of the German Psychological Society – Personality Psychology and Psychological Diagnostics (DPPD) Section, Ulm, Germany.
  • Röseler, L., Weber, L., Stich, E., Günther, M., & Schütz, A. (2022, March). The Open Anchoring Quest (OpAQ): Explaining variance of the heterogeneous but large anchoring effects. Talk (online) at the 64th Conference of Experimental Psychologists, Köln, Germany.

Research Articles

(3) Dataset description and access

All datasets and associated materials are shared via the OpAQ’s OSF-project (https://osf.io/ygnvb/).

3.1 Repository location

DOI of the OSF-project: https://dx.doi.org/10.17605/OSF.IO/YGNVB.

Link to the OSF-project: https://osf.io/ygnvb.

3.2 Object/file name

OPAQ_JOPD.csv, available at https://osf.io/5gkf9.

3.3 Data type

Secondary data, processed data, aggregated data.

3.4 Format names and versions

Datasets are available in .csv and .xlsx formats. We recommend opening both with GNU R (version 4 or above; R Core Team, 2018) or Microsoft Office Excel (Version 2004 or above).
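For readers working in Python instead, the .csv can be parsed with the standard library. The snippet below reads a tiny inline sample that mimics a subset of the codebook's columns; the sample values are hypothetical, and the full file is OPAQ_JOPD.csv on the OSF project:

```python
import csv
import io

# Hypothetical inline sample with a subset of the codebook's columns
sample = io.StringIO(
    "id,participant_id,anchoring_item,anchor,anchorhigh,estimate\n"
    "1,p01,year the telephone was invented,1830,0,1850\n"
    "2,p02,year the telephone was invented,1920,1,1900\n"
)
rows = list(csv.DictReader(sample))

# Mean estimate per anchor condition (anchorhigh: 0 = low, 1 = high)
means = {}
for cond in ("0", "1"):
    ests = [float(r["estimate"]) for r in rows if r["anchorhigh"] == cond]
    means[cond] = sum(ests) / len(ests)
```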

3.5 Language


3.6 License

Creative Commons Attribution 4.0 International (CC BY 4.0).

3.7 Limits to sharing

The data are not under embargo and do not contain identifying information. The data may be updated with further anchoring data at a later date.

3.8 Publication date

The first version of the dataset including data from four anchoring studies was published on 23/06/2021. The latest version has been available since 01/04/2022.

3.9 FAIR data/Codebook

The datasets have been posted publicly on the Open Science Framework (OSF), documented with meta-data, and assigned a DOI. Code with which the datasets have been created is available and can be run with open source software (e.g., GNU-R).

(4) Reuse potential

Researchers can use the data for questions related to anchoring effects and, more generally, to numeric estimation, advice taking, or judgment and decision making.

As the data provide detailed information about anchoring paradigms, such as true values of anchoring items (where applicable), researchers can use different anchoring scores (e.g., the absolute difference between anchor and estimate) or define new scores to study the influence of participant-, item-, or study-level features. In contrast to previous meta-analyses (Bystranowski et al., 2021; Li et al., 2021; Orr & Guthrie, 2006; Shanks et al., 2020; Townson, 2019), we did not find evidence of publication bias, and there was no difference in effect size between published and unpublished studies. We plan to maintain the dataset for the foreseeable future and will add data from new studies. Thus, the dataset may become a starting point for reviews of anchoring research as well as a solid base upon which researchers can build new theoretical accounts of the topic.