Collection date

The data were collected between July 2014 and September 2014.


Meta-analysis provides an important tool to synthesize evidence across different studies on the same topic. For the analysis the researcher typically needs to adopt either a fixed-effect model or a random-effects model. A fixed-effect meta-analysis assumes one true underlying effect size with variation in observed effect sizes arising solely because of sampling error. In contrast, a random-effects meta-analysis assumes a distribution of true effect sizes; consequently, observed differences in effect sizes arise both from sampling error and from true differences between effect sizes. The latter form of variation, or between-study heterogeneity, is the focus of this data set.

The random-effects model for i = 1, …, n independent effect size estimates y_i from n studies is given by:

$$y_i = \theta_i + \varepsilon_i \qquad (1)$$
where the sampling error ε_i is assumed to be normally distributed, i.e., ε_i ~ N(0, σ_ε²), and the effect sizes θ_i are assumed to follow a normal distribution with mean μ and variance τ², i.e., θ_i ~ N(μ, τ²). The estimate of the between-study heterogeneity, τ², is of interest because it provides information about the robustness of the effect across different contexts. If the heterogeneity is large, the effect shows considerable variation across studies, and the researcher might consider investigating possible moderators. For example, the meta-analysis by Snyder (2013) on cognitive impairments in patients with major depressive disorder found that effect sizes for verbal working memory were larger for patients receiving a specific type of medication, compared to patients who did not receive this medication.
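
To make the two sources of variation concrete, the following R sketch simulates effect size estimates from a random-effects model. All numerical values (number of studies, μ, τ², and a sampling variance taken to be equal across studies) are illustrative assumptions and are not taken from the data set.

```r
set.seed(1)                  # for reproducibility

n          <- 20             # number of studies (illustrative)
mu         <- 0.3            # mean of the true effect sizes (illustrative)
tau2       <- 0.04           # between-study variance (illustrative)
sigma2_eps <- 0.02           # sampling variance, assumed equal across studies

# True effect sizes vary between studies: theta_i ~ N(mu, tau2)
theta <- rnorm(n, mean = mu, sd = sqrt(tau2))

# Sampling error: eps_i ~ N(0, sigma2_eps)
eps <- rnorm(n, mean = 0, sd = sqrt(sigma2_eps))

# Observed effect size estimates: y_i = theta_i + eps_i
y <- theta + eps

# Under the model, the variance of y_i is the sum of both components
var_y_theoretical <- tau2 + sigma2_eps
```

Setting `tau2` to 0 in this sketch corresponds to the fixed-effect case, in which the observed estimates differ only because of sampling error.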

It is unclear, however, what constitutes a “large” heterogeneity in the field of psychology. Therefore, it is important to obtain an overview of between-study heterogeneity estimates encountered in meta-analyses in psychology. Such an overview allows researchers to compare the τ² estimate in their meta-analysis to the general distribution of estimates and gauge the size of their between-study heterogeneity. In addition, when conducting a Bayesian meta-analysis, this overview can provide a basis for the construction of an informed prior distribution for between-study heterogeneity.

In the medical literature, several attempts have been made to present an overview of between-study heterogeneity estimates from a large number of meta-analyses and construct an informed prior distribution based on these estimates [1, 2, 3]. In the field of psychology, an early overview of meta-analyses on the efficacy of psychological, educational, and behavioral treatments is given by [4]. Based on the resulting 302 meta-analyses, [4] investigated several characteristics, including effect size, methodological quality, publication bias, and sample size. A second overview in psychology is given by [5], who focused on the use of fixed-effect and random-effects models in meta-analyses reported in Psychological Bulletin from 1978 to 2006 (169 studies). However, none of the existing overviews of meta-analyses in psychology includes between-study heterogeneity. Therefore, the aim of this study was to provide a general distribution of between-study heterogeneity estimates, based on τ² estimates extracted from a large number of meta-analyses in the field of psychology. The data set presented here is the result of this literature review.



We extracted 705 between-study heterogeneity estimates τ² from meta-analyses reported in 61 articles published in Psychological Bulletin between 1990 and 2013. Most articles featured different outcome measures and therefore provided multiple estimates. For example, when considering sex differences in sexuality, Petersen and Hyde (2010) considered 14 sexual behaviors and 16 sexual attitudes, resulting in 30 heterogeneity estimates. Therefore, the heterogeneity estimates are not fully independent of each other.

The selected time frame featured 255 articles reporting meta-analyses, but we included only studies that provided estimates of the between-study variance τ² or standard deviation τ. Some studies performed the meta-analysis with multiple models (e.g., fixed and random effects, with and without moderators). In these cases, we included only the random-effects analysis without moderators, in order to obtain a maximum indication of the between-study heterogeneity.


No materials were used, apart from a laptop to search for meta-analyses.


First, we searched for articles containing overviews of meta-analyses; however, these articles almost never reported the estimated τ² values. Instead, we focused on separate meta-analyses published in Psychological Bulletin, a journal that primarily publishes qualitative and quantitative reviews in psychology, as is evident from the journal description:

“Psychological Bulletin publishes evaluative and integrative research reviews and interpretations of issues in scientific psychology. Both qualitative (narrative) and quantitative (meta-analytic) reviews will be considered, depending on the nature of the database under consideration for review.” [6].

Through the search engine Ebscohost, we searched each issue from 1990 up to and including 2013 by selecting “meta-analysis” as methodology or by searching on the keyword “meta-analysis”. If this gave no results, all titles in the issue were scanned for the keywords “meta-analysis”, “review”, or “synthesis”. The resulting 255 articles were scanned to determine whether a meta-analysis was performed and an estimate of τ² provided. This was the case in 61 articles. Since most articles included multiple analyses (such as separate meta-analyses for multiple outcome variables), we extracted several heterogeneity estimates per article, resulting in a final data set of 705 estimates. Note that different estimators for τ² exist (e.g., Hunter-Schmidt, DerSimonian-Laird, maximum likelihood), which can provide varying or even conflicting estimates of the between-study heterogeneity [7]. Unfortunately, most articles did not report which estimator for τ² was used. All meta-analyses were based on the random-effects model in Equation 1. When an article provided only the between-study standard deviation τ, we computed the variance τ² by squaring it; when it provided only the variance, we computed the standard deviation as its square root. As a result, the data set includes both τ and τ² estimates for each meta-analysis. The data extraction and coding were performed manually by the first author.
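
As an illustration of one of the estimators named above, the following R sketch computes the DerSimonian-Laird estimate of τ² from effect size estimates and their sampling variances. The function name `dl_tau2` and the example values are hypothetical; the sketch is not the procedure used by any particular article in the data set, most of which did not report their estimator.

```r
# DerSimonian-Laird estimator of the between-study variance (a minimal sketch).
# y: vector of effect size estimates; v: vector of their sampling variances.
dl_tau2 <- function(y, v) {
  stopifnot(length(y) == length(v), length(y) > 1, all(v > 0))
  w    <- 1 / v                          # inverse-variance (fixed-effect) weights
  mu_w <- sum(w * y) / sum(w)            # weighted mean effect size
  Q    <- sum(w * (y - mu_w)^2)          # Q-statistic
  df   <- length(y) - 1                  # degrees of freedom
  C    <- sum(w) - sum(w^2) / sum(w)     # scaling constant
  tau2 <- max(0, (Q - df) / C)           # truncated at zero
  list(tau2 = tau2, tau = sqrt(tau2), Q = Q, df = df)
}

# Hypothetical example: four studies with equal sampling variances
y   <- c(0.10, 0.30, 0.50, 0.70)
v   <- rep(0.01, 4)
est <- dl_tau2(y, v)
est$tau2   # between-study variance
est$tau    # between-study standard deviation (square root of the variance)
```

The truncation at zero in the last step of the function is one reason why heterogeneity estimates of exactly zero can occur in random-effects analyses.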

Quality control

The data were collected with the utmost care and conscientiousness. In addition, the Appendix provides an overview of all articles included in the data set so that the full data set can be easily checked.

Ethical issues

Only secondary data were used to construct the data set.

Data set description

Figure 1 shows a screenshot of the data file. For each article, the following information was recorded: (1) the variables in the meta-analysis; (2) the number of effect sizes per analysis; (3) the between-study standard deviation τ; (4) the between-study variance τ²; (5) the Q-statistic (i.e., the test statistic of a commonly used statistical test to determine whether there exists significant heterogeneity; [8]); (6) the I²-statistic (i.e., the percentage of total variation in effect sizes due to true between-study heterogeneity; computed based on the Q-statistic as I² = 100 × (Q − df)/Q, where df denotes the degrees of freedom; [9]); (7) the type of effect size; (8) the test(s) used to assess publication bias; and (9) whether or not publication bias was present. We have included a recoded variable for publication bias with ‘No publication bias’ = 0 and ‘Publication bias’ = 1. Publication bias is coded as ‘NA’ if no test for publication bias was reported in the article. This was the case in 385 instances. Histograms of the between-study standard deviation for mean differences and correlations are available in Figure 2 and Figure 3, respectively. Note that 18 heterogeneity estimates were not based on Cohen’s d, Hedges’ g, or Pearson r effect sizes and these estimates have not been included in the histograms. Figure 4 shows the histogram for the I²-statistic, which does not depend on the type of effect size used. Note that the I²-statistic is fixed to zero when Q is smaller than the degrees of freedom (i.e., the number of effect sizes minus 1).
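
The computation of the I²-statistic from the Q-statistic and the number of effect sizes can be sketched in R as follows. The function name `i_squared` and the input values are hypothetical and serve only to illustrate the formula and the truncation at zero described above.

```r
# I-squared (in percent) from the Q-statistic and the number of effect sizes k.
# The statistic is fixed to zero when Q does not exceed the degrees of freedom.
i_squared <- function(Q, k) {
  df <- k - 1                              # degrees of freedom
  ifelse(Q > df, (Q - df) / Q * 100, 0)
}

i2_large <- i_squared(Q = 20, k = 4)   # Q exceeds df = 3
i2_zero  <- i_squared(Q = 2,  k = 4)   # Q below df = 3, so fixed to zero
```

Because `ifelse()` is vectorized, the same function can be applied to whole columns of Q-statistics and numbers of effect sizes at once.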

Figure 1 

Screenshot of the data file “Data 1990–2013 with tau values.xlsx”.

Figure 2 

Histogram of the between-study standard deviation for mean difference effect sizes (Cohen’s d and Hedges’ g).

Figure 3 

Histogram of the between-study standard deviation for correlation effect sizes (Pearson’s r).

Figure 4 

Histogram of the I²-statistic for all types of effect sizes.

Object name

The name of the file is “Data 1990–2013 with tau values.xlsx”.

Data type

Secondary data.

Format names and versions

This is the third and final version of the data, in Excel, comma-separated values (CSV), and text format. In the first version, the type of effect size was missing for one article (Hartwig and Bond Jr., 2011). In the second version, we included the columns “Entry in reference”, “Analysis description”, “I2”, and “Test for publication bias” and removed a column in which the type of effect size was recoded. Moreover, we discovered coding errors for the column “Publication bias?”. Specifically, several articles which did not test for publication bias were coded as having no publication bias. This has been changed to NA for the articles: Else-Quest et al. (2012); Hagger et al. (2010); van Zomeren et al. (2008); Postmes and Spears (1998); Ambady and Rosenthal (1992); and Raz and Raz (1990). In addition, we changed the recoded publication bias variable to remain NA if information regarding publication bias was not reported in the article, instead of coding this as 2.

Reading and cleaning the data

The following R code can be used to read in the data and clean it:

library(xlsx) # to read in xlsx files
# setwd("path/to/data") # uncomment and replace the placeholder path with the directory containing the data
dat <- read.xlsx("Data 1990–2013 with tau values.xlsx", sheetIndex = 1)
# remove rows with a missing heterogeneity estimate
# (logical indexing also works when no estimates are missing):
dat.noNA <- dat[!is.na(dat$tau.2), ]
sum(dat.noNA$tau.2 == 0) # number of estimates equal to 0

Some heterogeneity estimates are missing for articles in which meta-analyses were conducted for multiple variables. In these cases, sometimes only one study was found for a reported variable, so that only one effect size estimate is available. These effect sizes have been included in the data set, but naturally there is no corresponding heterogeneity estimate available. An example is the article by DeNeve and Cooper (1998). In addition, heterogeneity estimates were sometimes missing despite the fact that multiple effect sizes were available. This occurred once in the article by DeNeve and Cooper (1998), who did not provide an explanation. It occurred multiple times in the article by Mathieu and Zajac (1990) when the between-study heterogeneity was completely attributable to statistical artifacts.

A total of 90 heterogeneity estimates are equal to zero. When there exists no between-study variance, a fixed-effect meta-analysis is appropriate. However, some articles reported random-effects meta-analyses by default or reported both fixed-effect and random-effects meta-analyses, in which case the heterogeneity estimate can be zero.

Data collectors

Sara van Erp (Tilburg University) collected the data.




CC0 1.0 Universal.


The data set is not under embargo.

Repository location

The data set is published in the Open Science Framework repository, available at https://osf.io/wyhve/.

Publication date

The data set was originally published on 29/11/2015. The second version was published on 08/04/2017 and the final version was published on 17/07/2017.

Reuse potential

The data set provides an overview of between-study heterogeneity estimates based on meta-analyses reported in 61 articles in psychology. The main application of this data set is to facilitate the construction of an informed prior for the between-study heterogeneity in Bayesian meta-analysis, which has been done in [10] for heterogeneity in mean difference effect sizes. Researchers interested in a Bayesian meta-analysis on correlations can use the heterogeneity estimates for the correlation effect sizes to construct an informed prior. More generally, the description of between-study heterogeneity available in this data set is useful as a reference. For example, researchers can compare an obtained τ² estimate to the distribution of estimates and assess the relative size of the between-study heterogeneity in their meta-analysis.
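
One simple way to make such a comparison is to locate an obtained τ² estimate in the empirical distribution of reference estimates. The R sketch below uses a short made-up vector in place of the real estimates so that it runs on its own; in practice, one would presumably substitute the estimates from the data set (e.g., `dat.noNA$tau.2` from the code in the section "Reading and cleaning the data"), restricted to the same type of effect size. The obtained value is also hypothetical.

```r
# Made-up reference estimates; replace with the tau^2 estimates from the data set
tau2_reference <- c(0, 0, 0.005, 0.01, 0.02, 0.03, 0.05, 0.08, 0.12, 0.25)

# Hypothetical estimate from the researcher's own meta-analysis
tau2_obtained <- 0.04

# Empirical cumulative distribution function of the reference estimates
reference_cdf <- ecdf(tau2_reference)

# Percentage of reference estimates that are smaller than or equal to the obtained one
percentile <- reference_cdf(tau2_obtained) * 100
percentile
```

Since the estimates in the data set are not fully independent of each other and depend on the type of effect size, a percentile obtained in this way is best read as a rough indication.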

In addition to the τ² estimates of the between-study heterogeneity, the data set contains information on the number of studies in each meta-analysis, the Q-statistic, and the presence of publication bias. This information can be used, for example, in Monte Carlo simulation studies to determine a realistic range of population values for these variables. Moreover, the data set can be used by researchers interested in the effects of publication bias on heterogeneity estimates or the Q-test. Finally, as new methods to test for publication bias become available, researchers might want to use the information on publication bias as a reference point against which to compare the new methods.

Additional Files

The additional files for this article can be found as follows:


List of included meta-analyses (all studies are published in Psychological Bulletin). DOI: https://doi.org/10.5334/jopd.33.s1


Data set containing 705 between-study heterogeneity estimates from meta-analyses published in Psychological Bulletin from 1990-2013. DOI: https://doi.org/10.5334/jopd.33.s2