Estimates of between-study heterogeneity for 705 meta-analyses reported in Psychological Bulletin from 1990-2013

We present a data set containing 705 between-study heterogeneity estimates τ² as reported in 61 articles published in Psychological Bulletin from 1990–2013. The data set also includes information about the number and type of effect sizes, the Q- and I²-statistics, and publication bias. The data set is stored in the Open Science Framework repository (https://osf.io/wyhve/) and can be used for several purposes: (1) to compare a specific heterogeneity estimate to the distribution of between-study heterogeneity estimates in psychology; (2) to construct an informed prior distribution for the between-study heterogeneity in psychology; (3) to obtain realistic population values for Monte Carlo simulations investigating the performance of meta-analytic methods.

The data were collected between July 2014 and September 2014.

Background
Meta-analysis provides an important tool to synthesize evidence across different studies on the same topic. For the analysis, the researcher typically needs to adopt either a fixed-effect model or a random-effects model. A fixed-effect meta-analysis assumes one true underlying effect size, with variation in observed effect sizes arising solely because of sampling error. In contrast, a random-effects meta-analysis assumes a distribution of true effect sizes; consequently, observed differences in effect sizes arise both from sampling error and from true differences between effect sizes. The latter form of variation, or between-study heterogeneity, is the focus of this data set.
The random-effects model for i = 1, …, n independent effect size estimates $y_i$ from n studies is given by:

$$y_i = \theta_i + \varepsilon_i, \qquad (1)$$

where the sampling error $\varepsilon_i$ is assumed to be normally distributed, i.e., $\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$, and the effect sizes $\theta_i$ are assumed to follow a normal distribution with mean $\mu$ and variance $\tau^2$, i.e., $\theta_i \sim N(\mu, \tau^2)$. The estimate of the between-study heterogeneity, or τ², is an interesting parameter since it provides information about the robustness of the effect across different contexts. If the heterogeneity is large, the effect shows considerable variation across studies, and the researcher might consider investigating possible moderators. For example, the meta-analysis by Snyder (2013) on cognitive impairments in patients with major depressive disorder found that effect sizes for verbal working memory were larger for patients receiving a specific type of medication, compared to patients who did not receive this medication.
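To make the two sources of variation in the random-effects model concrete, the model can be simulated directly. The following is a minimal sketch in R, assuming each observed effect is the sum of a true study effect and normally distributed sampling error with a common variance. All numeric values are illustrative assumptions and are not taken from the data set.

```r
# Minimal simulation of the random-effects model: y_i = theta_i + epsilon_i.
# All numeric values are illustrative assumptions, not values from the data set.
set.seed(1)
n     <- 20    # number of studies (assumed)
mu    <- 0.3   # mean of the true effect sizes (assumed)
tau   <- 0.2   # between-study standard deviation (assumed)
sigma <- 0.15  # sampling standard deviation, equal across studies (assumed)

theta <- rnorm(n, mean = mu, sd = tau)          # true effects: theta_i ~ N(mu, tau^2)
y     <- theta + rnorm(n, mean = 0, sd = sigma) # observed effects: add sampling error

# Marginally, y_i ~ N(mu, tau^2 + sigma^2): the total variance has two sources.
total_var <- tau^2 + sigma^2
c(mean_y = mean(y), var_y = var(y), implied_var = total_var)
```

With tau set to 0 the same code reduces to the fixed-effect case, in which all observed variation is sampling error.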
It is unclear, however, what constitutes a "large" heterogeneity in the field of psychology. Therefore, it is important to obtain an overview of between-study heterogeneity estimates encountered in meta-analyses in psychology. Such an overview allows researchers to compare the τ² estimate in their meta-analysis to the general distribution of estimates and gauge the size of their between-study heterogeneity. In addition, when conducting a Bayesian meta-analysis, this overview can provide a basis for the construction of an informed prior distribution for between-study heterogeneity.
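One simple way to gauge the size of an estimate is to locate it in the empirical distribution of reference estimates. The sketch below uses R's empirical cumulative distribution function. The reference values in tau_ref and the new estimate tau_new are hypothetical placeholders, not values from the data set; in practice the reference vector would hold the between-study standard deviations for one type of effect size.

```r
# Hypothetical reference values of tau; placeholders only, NOT from the data set.
tau_ref <- c(0.00, 0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.22, 0.30, 0.45)
tau_new <- 0.20  # estimate from the researcher's own meta-analysis (assumed)

F_ref <- ecdf(tau_ref)         # empirical cumulative distribution function
pct   <- 100 * F_ref(tau_new)  # percentage of reference estimates <= tau_new
pct
```

Because heterogeneity is expressed on the scale of the effect size, the comparison is only meaningful against estimates based on the same type of effect size (for example, mean differences or correlations).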
In the medical literature, several attempts have been made to present an overview of between-study heterogeneity estimates from a large number of meta-analyses and construct an informed prior distribution based on these estimates [1,2,3]. In the field of psychology, an early overview of meta-analyses on the efficacy of psychological, educational, and behavioral treatments is given by [4]. Based on the resulting 302 meta-analyses, [4] investigated several characteristics, including effect size, methodological quality, publication bias, and sample size. A second overview in psychology is given by [5], who focused on the use of fixed-effect and random-effects models in meta-analyses reported in Psychological Bulletin from 1978 to 2006 (169 studies). However, none of the existing overviews of meta-analyses in psychology include between-study heterogeneity. Therefore, the aim of this study was to provide a general distribution of between-study heterogeneity estimates, based on τ² estimates extracted from a large number of meta-analyses in the field of psychology. The data set presented here is the result of this literature review.

Sample
We extracted 705 between-study heterogeneity estimates τ² from meta-analyses reported in 61 articles published in Psychological Bulletin from 1990-2013. Most articles featured different outcome measures and therefore provided multiple estimates. For example, when considering sex differences in sexuality, Petersen and Hyde (2010) considered 14 sexual behaviors and 16 sexual attitudes, resulting in 30 heterogeneity estimates. Therefore, the heterogeneity estimates are not fully independent of each other.
The selected time frame featured 255 articles reporting meta-analyses, but we included only studies that provided estimates of the between-study variance τ² or standard deviation τ. Some studies performed the meta-analysis with multiple models (e.g., fixed and random effects, with and without moderators). In these cases, we included only the random-effects analysis without moderators, so that the estimate reflects the total between-study heterogeneity before any of it is accounted for by moderators.

Materials
No materials were used, apart from a laptop to search for meta-analyses.

Procedures
First, we searched for articles containing overviews of meta-analyses; however, these articles almost never reported the estimated τ 2 values. Instead, we focused on separate meta-analyses published in Psychological Bulletin, a journal that primarily publishes qualitative and quantitative reviews in psychology, as is evident from the journal description: "Psychological Bulletin publishes evaluative and integrative research reviews and interpretations of issues in scientific psychology. Both qualitative (narrative) and quantitative (meta-analytic) reviews will be considered, depending on the nature of the database under consideration for review." [6].
Through the search engine Ebscohost, we searched each issue from 1990 up to and including 2013 by selecting "meta-analysis" as methodology or by searching on the keyword "meta-analysis". If this gave no results, all titles in the issue were scanned for the keywords "meta-analysis", "review", or "synthesis". The resulting 255 articles were scanned to determine whether a meta-analysis was performed and an estimate of τ² provided. This was the case in 61 articles. Since most articles included multiple analyses (such as separate meta-analyses for multiple outcome variables), we extracted several heterogeneity estimates per article, resulting in a final data set of 705 estimates. Note that different estimators for τ² exist (e.g., Hunter-Schmidt, DerSimonian-Laird, maximum likelihood), which can provide varying or even conflicting estimates of the between-study heterogeneity [7]. Unfortunately, most articles did not report which estimator for τ² was used. All meta-analyses were based on the random-effects model in equation 1. When articles provided the between-study standard deviation (variance), we computed the variance (standard deviation) so that the data set includes both τ and τ² estimates for each meta-analysis. The data extraction and coding were performed manually by the first author.
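As an illustration of one of the estimators named above, the following is a minimal base-R sketch of the DerSimonian-Laird method-of-moments estimator. It is not a reconstruction of what any particular article did, since most did not report their estimator. The function name dl_tau2 and the input values are hypothetical; y holds the effect size estimates and v their sampling variances.

```r
# DerSimonian-Laird estimator of the between-study variance tau^2.
# y: effect size estimates; v: their sampling variances (both illustrative here).
dl_tau2 <- function(y, v) {
  w     <- 1 / v                          # inverse-variance weights
  k     <- length(y)                      # number of effect sizes
  mu_fe <- sum(w * y) / sum(w)            # fixed-effect pooled estimate
  Q     <- sum(w * (y - mu_fe)^2)         # Q-statistic
  C     <- sum(w) - sum(w^2) / sum(w)     # scaling constant
  max(0, (Q - (k - 1)) / C)               # truncate negative estimates at zero
}

y <- c(0.1, 0.3, 0.5, 0.7)  # hypothetical effect sizes
v <- rep(0.01, 4)           # hypothetical sampling variances

tau2 <- dl_tau2(y, v)       # between-study variance
tau  <- sqrt(tau2)          # between-study standard deviation
c(tau2 = tau2, tau = tau)
```

The last two lines mirror the conversion used in the data set: whichever of τ or τ² an article reported, the other follows by squaring or taking the square root.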

Quality control
The data were collected with the utmost care and conscientiousness. In addition, the Appendix provides an overview of all articles included in the data set so that the full data set can be easily checked.

Ethical issues
Only secondary data were used to construct the data set.

Data set description
Figure 1 shows a screenshot of the data file. For each article, the following information was recorded: (1) the variables in the meta-analysis; (2) the number of effect sizes per analysis; (3) the between-study standard deviation τ; (4) the between-study variance τ²; (5) the Q-statistic (i.e., the test statistic of a commonly used statistical test to determine whether there exists significant heterogeneity; [8]); (6) the I²-statistic (i.e., the percentage of total variation in effect sizes due to true between-study heterogeneity; computed based on the Q-statistic as $I^2 = 100\% \times (Q - df)/Q$, with df the degrees of freedom; [9]); (7) the type of effect size; (8) the test(s) used to assess publication bias; and (9) whether or not publication bias was present. We have included a recoded variable for publication bias with 'No publication bias' = 0 and 'Publication bias' = 1. Publication bias is coded as 'NA' if no test for publication bias was reported in the article. This was the case in 385 instances.
Histograms of the between-study standard deviation for mean differences and correlations are available in Figure 2 and Figure 3, respectively. Note that 18 heterogeneity estimates were not based on Cohen's d, Hedges' g, or Pearson r effect sizes and these estimates have not been included in the histograms. Figure 4 shows the histogram for the I²-statistic, which does not depend on the type of effect size used. Note that the I²-statistic is fixed to zero when Q is smaller than the degrees of freedom (i.e., the number of effect sizes minus 1).
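The I²-statistic can be recomputed from the Q-statistic and the number of effect sizes. The sketch below assumes the usual definition, I² = 100% × (Q − df)/Q with df equal to the number of effect sizes minus 1, and sets the result to zero when Q is smaller than df. The function name i_squared and the input values are introduced here for illustration only.

```r
# I^2 as a percentage, computed from the Q-statistic and the number of effect sizes k.
i_squared <- function(Q, k) {
  df <- k - 1                                # degrees of freedom
  ifelse(Q > df, 100 * (Q - df) / Q, 0)      # fixed to zero when Q <= df
}

# Two hypothetical meta-analyses with four effect sizes each:
i_squared(Q = c(20, 2), k = c(4, 4))
```

In the first case Q exceeds its degrees of freedom and I² is positive; in the second Q is below its degrees of freedom, so I² is zero.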

Object name
The name of the file is "Data 1990-2013 with tau values.xlsx".

Data type
Secondary data.

Format names and versions
This is the third and final version of the data, in Excel, comma-separated values (CSV), and text format. In the first version, the type of effect size was missing for one article (Hartwig and Bond Jr., 2011). In the second version, we included the columns "Entry in reference", "Analysis description", "I²", and "Test for publication bias" and removed a column in which the type of effect size was recoded. Moreover, we discovered coding errors for the column "Publication bias?". Specifically, several articles which did not test for publication bias were coded as having no publication bias. This has been changed to NA for the articles: Else-quest et al.

Reading and cleaning the data
The following R code can be used to read in the data and clean it:

    library(xlsx)  # to read in xlsx files
    # setwd("<directory containing the data file>")  # specify where the data is located
    dat <- read.xlsx("Data 1990-2013 with tau values.xlsx", sheetIndex = 1)
    # remove missing estimates:
    dat.noNA <- dat[!is.na(dat$tau.2), ]
    # number of estimates equal to 0:
    sum(dat.noNA$tau.2 == 0)

Some heterogeneity estimates are missing for articles in which meta-analyses were conducted for multiple variables. In these cases, it sometimes happened that only one study was found for some reported variables and thus only one effect size estimate is available. These effect sizes have been included in the data set, but naturally there is no corresponding heterogeneity estimate available. An example is the article by DeNeve and Cooper (1998). In addition, heterogeneity estimates were sometimes missing despite the fact that multiple effect sizes were available. This occurred once in the article by DeNeve and Cooper (1998), who did not provide an explanation. It occurred multiple times in the article by Mathieu and Zajac (1990) when the between-study heterogeneity was completely attributable to statistical artifacts.
A total of 90 heterogeneity estimates are equal to zero. When there exists no between-study variance, a fixed-effect meta-analysis is appropriate. However, some articles reported random-effects meta-analyses by default or reported both fixed-effect and random-effects meta-analyses, in which case the heterogeneity estimate can be zero.

Data collectors
Sara van Erp (Tilburg University) collected the data.

Embargo
The data set is not under embargo.

Repository location
The data set is published in the Open Science Framework repository, available at https://osf.io/wyhve/.

Publication date
The data set was originally published on 29/11/2015. The second version was published on 08/04/2017 and the final version was published on 17/07/2017.

Reuse potential
The data set provides an overview of between-study heterogeneity estimates based on 705 meta-analyses reported in 61 articles in psychology. The main application of this data set is to facilitate the construction of an informed prior for the between-study heterogeneity in Bayesian meta-analysis, which has been done in [10] for heterogeneity in mean difference effect sizes. Researchers interested in a Bayesian meta-analysis on correlations can use the heterogeneity estimates for the correlation effect sizes to construct an informed prior. More generally, the description of between-study heterogeneity available in this data set is useful as a reference. For example, researchers can compare an obtained τ² estimate to the distribution of estimates and assess the relative size of the between-study heterogeneity in their meta-analysis.
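As a rough illustration of how an informed prior might be derived from a set of heterogeneity estimates, the sketch below fits a log-normal distribution to positive τ values by matching the mean and standard deviation of log τ. This is a simplifying assumption made for illustration and is not the procedure used in [10]. The values in tau_ref are hypothetical placeholders, not values from the data set, and zero estimates are left out because their logarithm is undefined.

```r
# Hypothetical positive tau estimates; placeholders only, NOT from the data set.
tau_ref <- c(0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.22, 0.30, 0.45)

# Fit a log-normal distribution by matching moments on the log scale (assumed family).
log_tau <- log(tau_ref)
meanlog <- mean(log_tau)
sdlog   <- sd(log_tau)

# Resulting prior density for tau and two summaries of it.
prior_density <- function(tau) dlnorm(tau, meanlog = meanlog, sdlog = sdlog)
prior_median  <- qlnorm(0.5, meanlog = meanlog, sdlog = sdlog)
prior_95      <- qlnorm(0.95, meanlog = meanlog, sdlog = sdlog)
c(median = prior_median, q95 = prior_95)
```

Other distributional families could be fitted in the same way, and the choice of family should be checked against the histogram of the estimates for the relevant type of effect size.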
In addition to the τ² estimates of the between-study heterogeneity, the data set contains information on the number of studies in each meta-analysis, the Q-statistic, and the presence of publication bias. This information can be used, for example, in Monte Carlo simulation studies to determine a realistic range of population values for these variables. Moreover, the data set can be used by researchers interested in the effects of publication bias on heterogeneity estimates or the Q-test. Finally, as new methods to test for publication bias become available, researchers might want to use the information on publication bias as a reference point against which to compare the new methods.
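One way to use such information in a Monte Carlo study is to resample simulation conditions from observed combinations of the number of effect sizes and the heterogeneity estimate. The following is a minimal sketch; the data frame ref contains hypothetical placeholder values rather than values from the data set, and the helper simulate_one, the mean effect mu, and the sampling standard deviation sigma are assumptions made for illustration.

```r
set.seed(123)
# Hypothetical pairs of (number of effect sizes k, between-study SD tau);
# placeholders only, NOT from the data set.
ref <- data.frame(k   = c(5, 8, 12, 20, 35, 60),
                  tau = c(0.00, 0.08, 0.12, 0.15, 0.22, 0.30))

n_rep      <- 1000
rows       <- sample(nrow(ref), size = n_rep, replace = TRUE)  # resample rows jointly
conditions <- ref[rows, ]

# Generate one meta-analytic data set under the random-effects model.
simulate_one <- function(k, tau, mu = 0.3, sigma = 0.2) {
  theta <- rnorm(k, mean = mu, sd = tau)   # true study effects
  rnorm(k, mean = theta, sd = sigma)       # observed effects with sampling error
}

sims <- Map(simulate_one, conditions$k, conditions$tau)
length(sims)
```

Resampling whole rows keeps the number of effect sizes and the heterogeneity paired as they occurred together, rather than treating them as independent design factors.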
