Questionnaire Data From the Revision of a Chinese Version of Free Will and Determinism Plus Scale

Qing-Lan Liu¹, Fei Wang², Wenjing Yan³, Kaiping Peng², Jie Sui⁴ and Chuan-Peng Hu²,⁵
¹ Department of Psychology, Hubei University, Wuhan, CN
² Department of Psychology, Tsinghua University, Beijing, CN
³ Institute of Psychology and Behaviour Sciences, Wenzhou University, Wenzhou, CN
⁴ Department of Psychology, University of Bath, Bath, UK
⁵ German Resilience Center (DRZ), Mainz, DE
Corresponding author: Chuan-Peng Hu (hcp4715@gmail.com)


Background
The data were accumulated in a project that investigated the reliability and validity of a Chinese version of the Free Will and Determinism Plus (FAD+) scale [1]. The FAD+ is a widely used scale for measuring people's belief in free will [2][3][4][5][6][7][8] and has been translated and revised in other languages, such as Japanese [9], French [10], and Polish [11]. We translated and back-translated the scale and then collected data in three cities in China (Beijing, Wuhan, and Wenzhou). We used these data to estimate the psychometric properties of the Chinese version of the FAD+ [12].
Part of the data were collected along with a series of laboratory experiments. These experiments explored the perceptual prioritization of the morally positive self (the good-self) [13] using an associative learning paradigm [14]. Participants first learned associations between labels (e.g., good-self, bad-self, good-other, and bad-other) and geometric shapes (e.g., triangle, square, circle, and pentagon). After memorizing these associations, participants completed a perceptual matching task in which they pressed one of two buttons to indicate whether a label-shape pair presented on the screen matched the learned association. After the behavioural task, participants completed a battery of questionnaires in the laboratory. The questionnaires were presented online. We report only the questionnaire data here; the reaction times and accuracy data will be made available with the primary reports of these experiments [13]. See Table 1 for details about the questionnaires included in each dataset.
These data were accumulated in four waves, as described below (see Tables 1 and 2).
Dataset 1 was collected in 2014 from students older than 17 years at Hubei University, Wuhan, China. Participants were recruited through advertisements on campus. All participants in this dataset took part in the study voluntarily, without any material compensation.
Dataset 2 was collected in 2015 through the online course Introduction of Psychology, provided at XueTangZaiXian (http://www.xuetangx.com). An advertisement was posted on the online forum of the course, and attendees voluntarily chose to take part in the study. Because the course was open to the public, participants came from diverse backgrounds. Participants who finished the questionnaire were compensated with course credit.
Dataset 3 was collected in 2015 from undergraduates at Tsinghua University who were enrolled in an introductory psychology course. Participants who took part in this study were compensated with course credit.
Dataset 4 was collected from 2015 to 2018, after participants had finished the perceptual matching task in the laboratory. These participants came from two university communities: Tsinghua University in Beijing and Wenzhou University in Wenzhou, China. Note that this dataset was reported in Liu et al. [12] as dataset 5. A monetary bonus was paid to participants who finished all the experimental tasks and questionnaires.
All procedures performed in all waves of data collection were in accordance with the ethical standards of the local research committee at the Department of Psychology, Tsinghua University, and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from every participant. Participants were informed about the objectives of the study and assured that all sensitive information (e.g., IP addresses) would be removed once the data were downloaded for analysis.

Materials
Datasets 1, 2, and 3 included the translated version of the FAD+, family socioeconomic status (family SES; both parents' educational attainments and occupations), and participants' demographic information (age, gender, education). In the retest of datasets 2 and 3, the translated version of the Dualism/anti-reduction subscale from the Free Will Inventory [15] was added to the battery. In dataset 4, more questionnaires were administered: the FAD+, the Big Five Inventory [16,17], the Multidimensional Locus of Control (MLOC) inventory [18,19], the Rosenberg Self-Esteem Scale [20,21], the Justice Sensitivity-Short Form [22,23], the Cognitive Reflection Test [24], the Interpersonal Reactivity Index [25], the Relational Self-Esteem Scale [26,27], the Chinese Disgust Sensitivity Scale [28], the translated version of the General and Personal belief in a Just World Scale [29], a psychological distance task [30], the translated version of the Moral Self-Image Scale [31], the Moral Identity scale [32,33], family income (monthly income per capita), and the MacArthur Scale of Subjective Social Status (MacArthur SSS Scale) [34] as a measure of subjective socioeconomic status. The sample sizes for each scale are shown in Table 3.
The FAD+ [1] is a 27-item scale with four subscales: Fatalistic Determinism, Scientific Determinism, Unpredictability, and Free Will. Participants rated the extent to which they agreed with each statement (such as "I believe that the future has already been determined by fate") on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree), as in the original. The translated and back-translated version of the FAD+ was used. In the revision of this questionnaire, we found that three items (item 6, item 17, and item 18) may need to be removed because of low item-rest correlations and factor loadings [12]. However, we kept these items in the dataset to provide complete data for interested readers. Note that after we posted a preprint of our manuscript about the revision of the FAD+, two teams in China contacted us to say that they had each independently translated the FAD+ and collected their own data. To further improve the scientific rigor of the revision, the three teams decided to collaborate and redo the revision. Following this new revision, the FAD+ items used for data collection in the current manuscript are designated version "Old_V1.1"; see the project page for more details: https://osf.io/2kbyz/wiki/home/.
The Dualism/anti-reduction scale, a subscale of the Free Will Inventory [15], has 5 items, such as "the fact that we have souls that are distinct from our material bodies is what makes humans unique". It measures belief in dualism and anti-reduction. We translated this subscale into Chinese, without back-translation. As in the original scale, a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree) was used.
The Big Five Inventory was translated and revised into Chinese by Niu (2011) [17] from John and Srivastava (1999) [16]. It includes 44 items in 5 dimensions: Agreeableness, Conscientiousness, Neuroticism, Openness, and Extraversion. All items were rated on a 5-point Likert scale, from "strongly disagree" to "strongly agree", as in the original scale.
The Multidimensional Locus of Control Inventory [18,19] measures locus of control. It includes 24 items, 8 for each of three subscales (Internal, Powerful Others, and Chance). Participants responded on a 6-point Likert scale, from -3 (strongly disagree) to 3 (strongly agree), as in the original scale.
The Rosenberg Self-Esteem Scale [20,21] is a 10-item scale that measures both positive and negative feelings about the self (e.g., "On the whole, I am satisfied with myself"). All items were rated on a 4-point Likert scale, from "strongly disagree" to "strongly agree", as in the original scale.
The Justice Sensitivity scale was initially developed by Schmitt et al. (2010) [22] to measure justice sensitivity. Wu et al. (2014) [23] adapted it into an 8-item Chinese version. The scale includes four components: victim sensitivity, observer sensitivity, beneficiary sensitivity, and perpetrator sensitivity. An example item is "It worries me when I have to work hard for things that come easily to others". Participants answered each item on a 6-point rating scale ranging from 0 (not at all) to 5 (exactly), as in the original scale.
The Cognitive Reflection Test was used to assess cognitive ability [24]. It includes 3 mathematical problems, each of which has an intuitive but erroneous answer [35]. For example, "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. How much does the ball cost?" The intuitive answer is probably $0.10, yet the correct answer is $0.05. Reaching the correct answer requires suppressing the intuitive one. The number of erroneous answers, ranging from 0 to 3, is the score for intuitive thinking [24].
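The scoring rule described above can be illustrated with a minimal Python sketch. Only the bat-and-ball item is quoted in this descriptor; the answer key for the other two items (5 minutes and 47 days) is taken from the standard three-item test [24], and the numeric response format, the function name, and the tolerance are assumptions made for illustration. The codebook for dataset 4 documents the coding actually used.

```python
import math

# Assumed answer key for the three CRT items, in the unit of each question
# (dollars, minutes, days). Only the first item is quoted in the text.
CRT_KEY = (0.05, 5.0, 47.0)


def crt_intuitive_score(responses, key=CRT_KEY, tol=1e-9):
    """Count erroneous answers (0 to 3); higher means more intuitive thinking."""
    if len(responses) != len(key):
        raise ValueError("expected one response per CRT item")
    return sum(
        0 if math.isclose(given, correct, abs_tol=tol) else 1
        for given, correct in zip(responses, key)
    )


# A hypothetical participant who gives the intuitive answer to the first item only.
example_score = crt_intuitive_score([0.10, 5, 47])
```

Under this rule any wrong answer counts toward the score, not only the specific intuitive one, which matches the description in the text.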
The Interpersonal Reactivity Index (IRI) [25,36,37] was developed to measure individual differences in empathy. It was revised and shortened into a Chinese version by Rong et al. (2010) [25], resulting in a 14-item scale. Like the original scale, the Chinese version of the IRI has four dimensions: fantasy, empathic concern, perspective taking, and personal distress. Unlike the original scale, in which response options ranged from 0 to 4, Rong et al. [25] used options from 1 to 5 (1 = not at all, 5 = exactly) to indicate the extent to which participants agreed with each statement, e.g., "I often have tender, concerned feelings for people less fortunate than me".
The Relational Self-Esteem Scale [26,27] measures self-worth in relationships with significant others using 8 items, e.g., "In general, most people think my family is very good". It contains two dimensions: the type of relationship and the perspective of evaluation. All items used a 4-point Likert scale from 1 (strongly disagree) to 4 (strongly agree), as in the original scale. A higher mean score indicates higher relational self-esteem.
The Chinese Disgust Sensitivity Scale [28] is a 30-item scale that measures disgust sensitivity. It includes six factors: body products, sex, animal, magical thinking, death, and hygiene. Each statement was rated on a 4-point Likert scale, as in the original scale. For items 1-17, "1" means strongly disagree and "4" means strongly agree; for items 18-30, "1" means not disgusting at all and "4" means very disgusting.
The General and Personal belief in a Just World Scale [29] measures general (e.g., "I believe that, by and large, people get what they deserve") and personal (e.g., "I believe that, by and large, I deserve what happens to me") belief in a just world. A total of 13 items were rated on a 6-point scale (1 = strongly disagree, 6 = strongly agree), as in the original scale.
The Moral Self-Image Scale was translated from Jordan et al. [31]. Participants are presented with nine traits ("caring", "compassionate", "fair", "friendly", "generous", "hard-working", "helpful", "honest", "kind") and indicate how they rate themselves compared with their ideal moral self. The 9 items were answered on a 9-point Likert scale (1 = much less X than the person I want to be; 9 = much more X than the person I want to be; X is replaced by each of the nine moral traits), as in the original scale, e.g., "Compared to the caring person I want to be, I am: 1 (much less caring than the person I want to be), or 5 (exactly as caring as the person I want to be), or 9 (much more caring than the person I want to be)".
The Moral Identity Scale measures moral identity with 16 items [32,33]. Participants were first shown 10 positive moral adjectives ("faithful", "honest", "filial", "responsible", "generous", "polite", "kind", "helpful", "fair", "loyal"). They were then asked to imagine a person who has these characteristics. The person could be the participant or someone else. Participants thought about how this person would think, feel, and act while they answered the moral identity items (e.g., "It would make me feel good to be a person who has these characteristics"). Responses were rated from -2 (strongly disagree) to 2 (strongly agree), as in the Chinese version.
Family SES was measured by self-reported parental educational attainment and occupation, following Shi and Shen [38]. Educational attainment was reported at one of six levels: 1 = no education at all, 2 = primary school, 3 = middle school, 4 = high school or secondary specialized school, 5 = college or equivalent, 6 = postgraduate. For occupation, participants reported their father's and mother's occupations by choosing one of five categories, based on the standard from Lin and Bian [39], ordered from the lowest paid with the lowest social reputation to the highest paid with the highest social reputation. As in Shi and Shen [38], the family SES score is the sum of both parents' education and occupation scores, ranging from 4 to 22. A higher score indicates higher family socioeconomic status.
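The family SES composite described above can be sketched in a few lines of Python. This is an illustration only: the function and argument names are hypothetical, and the coding of the five occupation categories as 1 to 5 is inferred from the stated score range of 4 to 22 rather than given explicitly in the text. The codebooks describe the actual variable names and scoring rules.

```python
def family_ses(father_edu, mother_edu, father_occ, mother_occ):
    """Sum both parents' education (1-6) and occupation (assumed 1-5) codes."""
    for edu in (father_edu, mother_edu):
        if edu not in range(1, 7):
            raise ValueError("education must be an integer from 1 to 6")
    for occ in (father_occ, mother_occ):
        if occ not in range(1, 6):
            raise ValueError("occupation must be an integer from 1 to 5")
    return father_edu + mother_edu + father_occ + mother_occ


# Lowest and highest possible composites reproduce the stated 4-22 range.
lowest = family_ses(1, 1, 1, 1)
highest = family_ses(6, 6, 5, 5)
```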
Subjective SES was collected using the MacArthur Scale of Subjective Social Status (MacArthur SSS Scale) [34]. The MacArthur SSS Scale is a single-item measure that uses a drawing of a ladder with 10 rungs to assess a person's perceived rank relative to others in their group. Participants were asked to choose a number from 1 to 10 to indicate the relative social standing of their family in society: 1 is the lowest rung, representing people who are the worst off (the least money, the least education, the worst jobs or no job), and 10 is the highest rung, representing people who are the best off (the most money, the most education, and the best jobs).
We also included a psychological distance task in dataset 4 to measure the mental distance between two persons. As in a previous study [30], participants were asked to mark two points on a straight line to represent where the two individuals in each question (e.g., self and a good person, self and a bad person) fall in relation to one another. The distance between the two marks (in mm) then serves as a measure of the perceived closeness between the individuals. This method was used to measure the closeness between different targets: the self, a good person, a bad person, a neutral person, or a stranger. Each pair of labels was presented four times.
Participants' own education levels were also recorded. Instead of using the six-level measure described for family SES, we further divided the graduate level into master's and doctoral levels. Thus, participants chose one of seven levels: primary school or less, middle school or equivalent, high school or equivalent, some college (vocational school after high school), college graduate (with a bachelor's degree or currently in college/university), master (with a master's degree or in a master's program), and doctor or higher (with a doctoral degree or in a PhD program).

Procedures
All the data were collected with online questionnaires. Note that the retest data were collected in different ways. For dataset 2, participants were asked whether they were willing to take the test a second time; those who answered yes were invited to take the retest around one month later. For dataset 3, the informed consent form stated that participants were expected to answer the questionnaires twice, with an interval of around 4 weeks between the two tests. Dataset 4 was accumulated across different experiments, which focused on the behavioural tasks, the FAD+, and the psychological distance task; the other scales administered therefore varied across experiments and over time. Some of these experiments included a retest of the questionnaires, whereas others did not.

Quality Control
We added one attention-check item to the FAD+ scale in datasets 1, 2, and 3 to verify that participants completed the questionnaires with at least minimal attention. This item required participants to choose a fixed option, i.e., the third option. Data from participants who did not select that option were regarded as invalid. Note that these participants' data are retained in the shared data files.
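Because the shared files retain participants who failed the attention check, re-users may want to flag or exclude them before analysis. The following Python sketch shows one way to do so under stated assumptions: the column name "attention_check" and the example rows are hypothetical, and the required value 3 follows the description above. The codebooks give the actual variable name.

```python
def flag_attention_check(rows, check_column="attention_check", required_option=3):
    """Return copies of the rows with a boolean 'passed_check' field added."""
    flagged = []
    for row in rows:
        new_row = dict(row)
        try:
            passed = int(row[check_column]) == required_option
        except (KeyError, TypeError, ValueError):
            passed = False  # a missing or non-numeric response counts as failed
        new_row["passed_check"] = passed
        flagged.append(new_row)
    return flagged


# Made-up rows, as they might come from csv.DictReader (values are strings).
rows = [
    {"id": "a", "attention_check": "3"},
    {"id": "b", "attention_check": "5"},
    {"id": "c", "attention_check": ""},
]
valid = [r for r in flag_attention_check(rows) if r["passed_check"]]
```

Flagging rather than deleting keeps the full sample available, so analyses can be run both with and without the excluded participants.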
We calculated the reliabilities of these questionnaires based on the available data. All scales showed relatively good reliability (Cronbach's alpha: .52-.87; McDonald's omega: .63-.91; see Table 3). For questionnaires with retest data, the test-retest correlations ranged from .34 to .87.
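For readers unfamiliar with these indices, the Python sketch below shows the standard formula for Cronbach's alpha and a Pearson correlation that could serve as a test-retest coefficient. It is not the authors' analysis code (the reported values were computed with the shared R script), it does not cover McDonald's omega, and the function names and example data are made up for illustration.

```python
from statistics import mean, variance


def cronbach_alpha(item_scores):
    """item_scores: one list per respondent, each holding k item responses."""
    k = len(item_scores[0])
    if k < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([resp[i] for resp in item_scores]) for i in range(k)
    ]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(resp) for resp in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation, e.g. between test and retest scale scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Made-up responses from three respondents to a two-item scale.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Items that are reverse-keyed need to be recoded before either function is applied; the codebooks list the scoring rules for each questionnaire.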

Ethical issues
The project was approved by the Institutional Review Board at the Department of Psychology, Tsinghua University. All participants were informed about the study and signed the consent form before taking part.
To further anonymize the data, we did not share the subject number that was assigned to each participant in the laboratory experiments. This information is available upon request from the first author or the last author.

(3) Dataset description

Data preprocessing
Data downloaded from the online survey platform were pre-processed. We renamed all the variables to make their names more straightforward. We also corrected some minor errors made by participants (e.g., a participant entering two different experiment IDs at test and retest). We matched all the test and retest records and analysed the reliabilities of the scales. Finally, we removed all sensitive information (e.g., IP addresses). The preprocessed data files were named with the suffix "_clean".
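The preprocessing steps described above (renaming variables, removing sensitive fields, and matching test and retest records) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the raw column names ("ip_address", "Q1", "Q2"), the rename mapping, the "id" key, and the example records are all made up.

```python
import csv
import io

SENSITIVE = {"ip_address"}               # hypothetical raw column name
RENAME = {"Q1": "FAD_1", "Q2": "FAD_2"}  # hypothetical rename mapping


def clean_rows(raw_rows):
    """Rename columns and drop sensitive ones."""
    return [
        {
            RENAME.get(name, name): value
            for name, value in row.items()
            if name not in SENSITIVE
        }
        for row in raw_rows
    ]


def match_retest(test_rows, retest_rows, key="id"):
    """Pair each test record with its retest record by participant id."""
    retest_by_id = {row[key]: row for row in retest_rows}
    return [
        (row, retest_by_id[row[key]])
        for row in test_rows
        if row[key] in retest_by_id
    ]


# Made-up raw exports standing in for the downloaded survey files.
raw_test = io.StringIO("id,ip_address,Q1,Q2\np1,10.0.0.1,4,2\np2,10.0.0.2,3,5\n")
raw_retest = io.StringIO("id,ip_address,Q1,Q2\np2,10.0.0.9,3,4\n")
test_rows = clean_rows(csv.DictReader(raw_test))
retest_rows = clean_rows(csv.DictReader(raw_retest))
pairs = match_retest(test_rows, retest_rows)
```

Participants without a retest record are simply left out of the paired list, so the unmatched test data remain available in the cleaned rows.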

File names
In total, there are nine files in the OSF repository:
• Four data files ("FADGS_dataset1_clean.csv", "FADGS_dataset2_clean.csv", "FADGS_dataset3_clean.csv", and "FADGS_dataset4_clean.csv") that contain all the data in each dataset.
• Three codebooks ("FADGS_codebook_dataset1.xlsx", "FADGS_codebook_dataset2&3.xlsx", "FADGS_codebook_dataset4.xlsx") that include all the information necessary to understand the data files. In these codebooks, the "Variable_Names" column contains the column names used in the data files. Additional information includes the references for the questionnaires, the name of each questionnaire, the exact wording of all items, the value ranges and the meaning of these values, and the scoring rules of each questionnaire. All codebooks are provided in both English and Chinese (in separate sheets).
• One R script ("FADGS_reliability.r"), which we used to calculate the reliabilities of the scales reported in the current descriptor.
• One readme file ("Readme.txt") that describes the content or function of each file mentioned above.

Data type
Self-report survey data from 1232 participants.

Format names and versions
The data are stored in CSV format. The codebooks are in Excel (.xlsx) format.

Data Collectors
Hu C-P collected the data. Liu Q-L, Wang F, and Yan W assisted with part of the data collection. The data were published on 2 July 2019.

(4) Reuse potential
All the scales we used in the data collection are available in our codebooks. Interested readers can refer to the original papers or book chapters for each of those questionnaires. Our codebooks are licensed under CC-BY-NC 4.0. This dataset includes a variety of questionnaires, many of which are widely used (e.g., the BFI and the Rosenberg Self-Esteem Scale). Thus, it can be re-used for both research and educational purposes.
For example, this dataset can be used in cross-cultural studies. Because the data reported here were collected from a young Chinese sample, they can be re-used by researchers who have administered the same questionnaires in other populations (e.g., FAD+ data collected in the US). This dataset can also be used to test the cross-cultural measurement invariance of the scales, which is crucial for cross-cultural studies [40].
In addition, our dataset includes test-retest data for many scales; these data may be re-used to estimate longitudinal measurement invariance.
These data can also be used to examine the psychometric properties of some scales. For example, we translated and back-translated the Moral Self-Image Scale but have not examined its reliability and validity. Our data can be used by researchers who are interested in this topic. Finally, this dataset can be used for educational purposes.