Participants consisted of a convenience sample of students at the University of Amsterdam. The study was approved by the Ethics Review Board of the department of Psychology at the University of Amsterdam (2017-SP-7871) and performed in accordance with relevant institutional guidelines. The budget allowed for scanning of a maximum of 60 participants. After applying the exclusion criteria, the total sample consisted of 54 participants, including 38 women (Mage = 22.4, SD = 2.9) and 16 men (Mage = 23.8, SD = 1.8).
This study used a 2 (choice: active-choice vs. passive-viewing) × 2 (phase: induction vs. relief) × 2 (valence: negative vs. positive) mixed design. The variable choice was varied between participants and consisted of an active-choice condition and a passive-viewing condition. The variable phase was varied within participants and reflected the presentation of the cue (i.e., induction) vs. the presentation of the image (i.e., relief). The variable valence was varied within participants and reflected the negative vs. positive content of the cues/images.
The present study used a choice task7 that presented participants with verbal cues describing negative and positive images, and offered them a choice to see these images or blurred versions. We used a yoked design that isolated the effects of choice, controlling for general affective, semantic and visual processing (see also32,33). This yoked design resulted in two conditions: the active-choice condition and the passive-viewing condition. Participants in the passive-viewing condition did not make choices, but were presented with the choice profile of a yoked participant in the active-choice condition. Tasks in both conditions were programmed in Neurobs Presentation (https://www.neurobs.com/presentation). Behavioral data were preprocessed using Python 3.5 (Python Software Foundation; https://www.python.org/) and analyzed using IBM SPSS Statistics 22.0.
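The core of the yoking procedure can be sketched as follows. This is a minimal illustration, not the original task code (which was written in Neurobs Presentation); the function names and the JSON storage format are assumptions.

```python
import json
import tempfile
from pathlib import Path

def save_choice_profile(participant_id, choices, out_dir):
    """Store one active-choice participant's trial-by-trial 'yes'/'no' pattern.

    `choices` maps each cue ID to True ('yes') or False ('no').
    """
    path = Path(out_dir) / f"profile_{participant_id}.json"
    path.write_text(json.dumps(choices))
    return path

def load_choice_profile(path):
    """Re-use a saved profile as the predetermined choice pattern for one
    yoked participant in the passive-viewing condition."""
    return json.loads(Path(path).read_text())

# Toy profile: choices for three cues from one active-choice participant.
profile = {"cue01": True, "cue02": False, "cue03": True}
out_dir = tempfile.mkdtemp()
yoked = load_choice_profile(save_choice_profile("sub-01", profile, out_dir))
```

Re-using each profile exactly once guarantees that the two conditions see identical sequences of full and blurred images, so any neural difference is attributable to the act of choosing.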
In the active-choice condition, participants were presented, in random order, with 35 negative cues (e.g., rescue workers treat a wounded man; a soldier kicks a civilian against his head) and 35 positive cues (e.g., children throw flower petals at a wedding; partying people carry a crowd surfer) that described images. In each trial participants could choose, based on the cue, whether they wanted to view the corresponding image or not. The choice task consisted of a total of 70 trials. Each trial started with a fixation cross, presented for 500 ms, followed by the cue, presented for 3,000 ms. The presentation of the cue was labelled as the induction phase (see also21,22). The cue was followed by a jittered interval varying between 500 and 2,000 ms. Subsequently, participants saw the words ‘yes’ and ‘no’ on the screen, and chose whether or not they wanted to see the image described by the cue by pressing one of two pre-specified buttons. Immediately following their response, the word ‘yes’ or ‘no’ turned green, indicating that their response was registered. Participants had a 2,000-ms time window to make their choice; if they had not responded within this window, the choice was automatically set to ‘no’. The response phase was followed by a jittered interval varying between 500 and 2,000 ms. The interval was followed by the relief phase, in which participants were presented with the image (1,024 × 768 pixels) when they had chosen ‘yes’. When participants chose not to see the corresponding image, they were presented with a blurred version of the image that was unrecognizably distorted. Images were blurred with the software IrfanView (version 4.44; https://www.irfanview.com/) using the fast Gaussian blur (filter = 150 pixels). Both the image and the blurred image were presented for 3,000 ms. The relief phase was followed by a jittered inter-trial interval varying between 2,000 and 4,000 ms. For a visual representation of the paradigm, please see Fig. 1.
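The trial timeline described above can be summarized in a short sketch. Durations come from the Methods text; the function name and the assumption of uniform sampling for the jittered intervals are illustrative.

```python
import random

# Trial-timing constants from the Methods text (in ms).
FIXATION = 500
CUE = 3000             # induction phase
RESPONSE_WINDOW = 2000
IMAGE = 3000           # relief phase (full or blurred image)

def build_trial_schedule(rng=random):
    """Return one trial's ordered (event, duration_ms) list.

    Jitters are drawn uniformly within the reported ranges; the actual
    jitter distribution is not stated in the Methods.
    """
    return [
        ("fixation", FIXATION),
        ("cue", CUE),
        ("jitter", rng.randint(500, 2000)),
        ("response", RESPONSE_WINDOW),
        ("jitter", rng.randint(500, 2000)),
        ("image", IMAGE),
        ("iti", rng.randint(2000, 4000)),
    ]

schedule = build_trial_schedule()
```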
In the passive-viewing condition, participants were presented with the choice profile of a participant in the active-choice condition (i.e., the exact pattern of ‘yes’ and ‘no’ responses to the positive and negative cues of each participant in the active-choice condition was saved, and then re-used once as the computer-generated, predetermined choice pattern for a participant in the passive-viewing condition). Participants were told in the introduction to the study that the computer would determine which images would be shown. The trial setup was identical to the active-choice condition, except for the following aspect: after participants were presented with the cue, the word ‘yes’ or ‘no’ turned green, indicating the choice of the computer. Participants were asked to confirm the choice made by the computer by pressing one of two pre-specified buttons, mirroring the motor response made in the active-choice condition.
The active-choice condition came with a filling problem: because participants could choose whether they wanted to view an image or not, some participants would see many more images than others. This imbalance complicates modelling of the BOLD response, because contrasts are estimated less efficiently for some participants than for others. Based on individual differences in choosing to view negative social information7, we formulated an a priori, preregistered eligibility criterion: only participants in the active-choice condition who chose to view negative and/or positive images in 40% or more of the trials (14/35 stimuli) would be paired with a participant in the passive-viewing condition. Based on this criterion, five out of 33 participants in the active-choice condition were excluded from the sample. One other participant was excluded because the functional scan was stopped prematurely. This resulted in 27 participants in the active-choice condition, whose choice profiles were yoked with 27 participants in the passive-viewing condition.
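The preregistered criterion amounts to a simple proportion check per valence. A sketch, with the "and/or" of the criterion implemented as a logical OR (function name is illustrative):

```python
def eligible_for_yoking(n_yes_negative, n_yes_positive,
                        n_trials=35, threshold=0.40):
    """Preregistered eligibility criterion: a participant is paired with a
    yoked passive-viewing participant only if they chose to view images on
    >= 40% of trials (14/35) for the negative and/or positive cues."""
    return (n_yes_negative / n_trials >= threshold
            or n_yes_positive / n_trials >= threshold)

assert eligible_for_yoking(14, 0)       # exactly at the 40% cutoff
assert not eligible_for_yoking(13, 13)  # below the cutoff for both valences
```

Comparing the proportion (14/35 = 0.40) rather than a pre-multiplied count avoids floating-point edge cases at the cutoff.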
Cues were written to describe positive and negative images in one sentence. In a pilot study the cues were rated on valence (0 = negative to 100 = positive) and arousal (0 = low arousal to 100 = high arousal). Negative cues were rated more negatively than positive cues (M = 20.69, SD = 8.42 vs. M = 77.99, SD = 4.49), t(68) = −35.51, p < 0.001, and more arousing than positive cues (M = 68.45, SD = 6.38 vs. M = 28.93, SD = 5.99), t(68) = 26.71, p < 0.001. Negative and positive cues were matched in terms of valence extremity. An analysis of mean-centered valence scores demonstrated that, on average, positive cues were perceived as equally positive (M = 27.99, SD = 4.49) as negative cues were perceived as negative (M = 29.31, SD = 8.42), t(68) = 0.82, p = 0.417.
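If the mean-centered valence scores are taken as distances from the scale midpoint (an assumption consistent with the reported means, since the Methods do not state the centering point), the matching check can be sketched with toy data:

```python
from statistics import mean

MIDPOINT = 50  # midpoint of the 0-100 valence scale

def valence_extremity(ratings):
    """Distance of each cue's mean valence rating from the scale midpoint,
    i.e. the 'mean-centered' score used to compare negative and positive
    cues on extremity."""
    return [abs(r - MIDPOINT) for r in ratings]

# Toy mean ratings for three negative and three positive cues (illustrative).
negative = [18, 21, 24]  # low values = negative
positive = [76, 79, 82]  # high values = positive

neg_ext = valence_extremity(negative)  # [32, 29, 26]
pos_ext = valence_extremity(positive)  # [26, 29, 32]
# Matched extremity: the two sets of distances have the same mean.
assert mean(neg_ext) == mean(pos_ext) == 29
```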
Images were selected from the International Affective Picture System (IAPS;35) and the Nencki Affective Picture System (NAPS;36); image codes are presented in the Supplementary Materials (Table S4). We selected negative images that portrayed situations of interpersonal violence, or social scenes involving a dead body or a harmed person. Negative images were selected when they had a valence rating below 4 (on a scale from 1 = negative to 9 = positive) and an arousal rating above 4.5 (on a scale from 1 = not arousing to 9 = extremely arousing). We selected positive images that portrayed joyful, loving or exciting interpersonal interactions. Positive images were selected when they had a valence rating above 6 (on a scale from 1 = negative to 9 = positive) and an arousal rating above 3 (on a scale from 1 = not arousing to 9 = extremely arousing). Negative and positive images differed significantly in terms of valence (M = 2.58, SD = 0.53 vs. M = 7.43, SD = 0.36), t(68) = −44.88, p < 0.001, and arousal (M = 6.18, SD = 0.75 vs. M = 4.78, SD = 0.90), t(68) = 7.07, p < 0.001. Negative and positive images were matched in terms of valence extremity. An analysis of mean-centered valence scores demonstrated that, on average, positive images were perceived as equally positive (M = 2.44, SD = 0.37) as negative images were perceived as negative (M = 2.42, SD = 0.53), t(68) = −0.22, p = 0.824.
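The normative-rating cutoffs above can be expressed as a simple filter. The record fields and image IDs below are illustrative placeholders, not actual IAPS/NAPS norms:

```python
def select_images(candidates):
    """Apply the reported selection cutoffs (1-9 normative rating scales) to
    candidate image records: dicts with 'valence' and 'arousal' ratings."""
    negative = [c for c in candidates
                if c["valence"] < 4 and c["arousal"] > 4.5]
    positive = [c for c in candidates
                if c["valence"] > 6 and c["arousal"] > 3]
    return negative, positive

# Illustrative records only.
pool = [
    {"id": "imgA", "valence": 2.5, "arousal": 6.2},  # meets negative cutoffs
    {"id": "imgB", "valence": 7.4, "arousal": 4.8},  # meets positive cutoffs
    {"id": "imgC", "valence": 5.0, "arousal": 5.0},  # meets neither
]
neg, pos = select_images(pool)
```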
After the scanning session was completed, participants filled in the ‘Morbid curiosity in daily-life’ questionnaire7 and the Dutch version of the Interpersonal Reactivity Index62. A short exit questionnaire asked participants two questions regarding the task they performed in the scanner. Participants in the active-choice condition were asked to rate to what extent they followed their curiosity when making choices for negative cues, and when making choices for positive cues, on a 1 (not at all) to 7 (very much) point scale. Participants in the passive-viewing condition were asked to rate to what extent they were curious about the negative cues, and the positive cues, on a 1 (not at all) to 7 (very much) point scale. The exit questionnaire concluded with demographic questions.
After signing the informed consent form, each participant received a thorough instruction. The active-choice condition was introduced as a study on how the brain represents choice. It was explained to participants how they could make their choice, that they would always see the image of their choice, and that there were no right or wrong answers. Furthermore, participants were presented with an example of a negative and a positive cue, combined with the corresponding full image and blurred image, so that they knew what to expect when choosing the yes or no option. No mention was made of curiosity in the instruction. The passive-viewing condition was introduced as a study on the brain processes involved in reading image descriptions and viewing images. It was explained to participants that the computer determined whether a description would be followed by a corresponding image. As in the active-choice condition, participants were presented with an example of a negative and a positive cue, combined with the corresponding full image and blurred image, so that they knew what to expect when the computer determined the yes or no option.
Once the participant was comfortable and instructed, a structural T1-weighted anatomical scan was made. The participant then performed the choice task or the passive-viewing task during fMRI acquisition. After the scanning session, the participant filled in the questionnaires and received a thorough debriefing.
In the active-choice condition, we compared the extent to which participants followed their curiosity when making choices for negative cues and positive cues. In the passive-viewing condition, we compared the extent to which participants were curious about negative cues and positive cues. A Kolmogorov–Smirnov test indicated a normality violation in the active-choice condition, but not in the passive-viewing condition (active-choice: skewness = 0.640, kurtosis = −0.375, D(25) = 0.197, p = 0.013; passive-viewing: skewness = −0.399, kurtosis = −0.055, D(27) = 0.161, p = 0.071). We report two paired-sample t-tests (two-tailed) to analyze the difference between cues; the results were fully corroborated by non-parametric Wilcoxon signed-rank tests. Effect sizes (Cohen’s dz) were calculated using Lakens’63 spreadsheet.
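Cohen's dz for a paired contrast is the mean of the within-participant difference scores divided by their standard deviation, which is the formula Lakens' spreadsheet implements; the paired t statistic follows as dz × √n. A standard-library sketch with toy ratings (the data below are illustrative, not the study's):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_and_dz(scores_a, scores_b):
    """Paired-sample t statistic and Cohen's dz for two within-participant
    score lists (e.g., curiosity ratings for negative vs. positive cues)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    dz = mean(diffs) / stdev(diffs)  # Cohen's dz
    t = dz * sqrt(len(diffs))        # t = dz * sqrt(n)
    return t, dz

# Toy ratings from four participants (1-7 scale), illustrative only.
neg = [6, 5, 7, 6]
pos = [5, 3, 4, 2]
t, dz = paired_t_and_dz(neg, pos)
```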
Image acquisition. Participants were tested using a Philips Achieva 3 T MRI scanner and a 32-channel SENSE head coil. A survey scan was made for spatial planning of the subsequent scans. Following the survey scan, a 3-min structural T1-weighted scan was acquired using 3D fast field echo (TR: 8.2 ms, TE: 3.8 ms, flip angle: 8°, FOV: 240 × 188 mm, in-plane resolution 240 × 188, 220 slices acquired using single-shot ascending slice order and a voxel size of 1.0 × 1.0 × 1.0 mm). After the T1-weighted scan, functional T2*-weighted sequences were acquired using single-shot gradient-echo, echo-planar imaging (TR: 2,000 ms, TE: 27.63 ms, flip angle: 76.1°, FOV: 240 × 240 mm, in-plane resolution 64 × 64, 37 slices with ascending acquisition, slice thickness 3 mm, slice gap 0.3 mm, voxel size 3 × 3 × 3 mm), covering the entire brain. For the functional run, 495 volumes were acquired. After the functional run, a “B0” fieldmap scan (based on the phase difference between two consecutive echoes) was acquired using 3D fast field echo (TR: 11 ms, TE: 3 ms and 8 ms, flip angle: 8°, FOV: 256 × 208, in-plane resolution 128 × 104, 128 slices).
Results included in this manuscript come from preprocessing performed using FMRIPREP version 1.0.064,65, a Nipype66,67 based tool. Each T1-weighted volume was corrected for bias field using N4BiasFieldCorrection v2.1.068 and skullstripped using antsBrainExtraction.sh v2.1.0 (using the OASIS template). The cortical surface was estimated using FreeSurfer v6.0.069. The skullstripped T1w volume was segmented (using FSL FAST;70) and coregistered to the skullstripped ICBM 152 Nonlinear Asymmetrical template version 2009c71 using the nonlinear transformation implemented in ANTs v2.1.072.
Functional data was motion corrected using MCFLIRT v5.0.973. Distortion correction was performed using phase-difference fieldmaps processed with FUGUE (74; FSL v5.0.9). This was followed by co-registration to the corresponding T1w using boundary-based registration75 with 9 degrees of freedom, using bbregister (FreeSurfer v6.0.0). Motion correcting transformations, field distortion correcting warp, BOLD-to-T1w transformation and T1w-to-template (MNI) warp were concatenated and applied in a single step using antsApplyTransforms (ANTs v2.1.0) using Lanczos interpolation.
Many internal operations of FMRIPREP use nilearn76, principally within the BOLD-processing workflow. For more details of the pipeline see https://fmriprep.readthedocs.io/en/1.0.0/workflows.html.
We modeled the participants’ preprocessed time series in a “first-level” GLM using FSL FEAT (77; FSL v6.0.0). The first-level modeling procedure was identical for participants in the active-choice and passive-viewing conditions. As predictors, we included regressors for both the induction phase (i.e., the written description) and the relief phase (i.e., the full image). We separated trials with positive descriptions/images from trials with negative descriptions/images, and separated trials in which participants saw the full version of the image from trials in which they saw a blurred version of the image. Note that in the active-choice condition participants chose to see the full or blurred image, whereas in the passive-viewing condition it was predetermined whether participants saw the full or blurred image. The final model held eight task predictors: 2 (phase: induction vs. relief) × 2 (valence: negative vs. positive) × 2 (seen: full image vs. blurred image). If participants did not have any blurred-image trials, the associated predictors were left out. Additionally, we added a single predictor for the actual decision (i.e., modelled at the onset of the button press) and six motion predictors based on the estimated motion-correction parameters.
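The resulting regressor set can be enumerated as follows. The regressor names are illustrative assumptions; the actual model was specified through FSL FEAT event files.

```python
from itertools import product

# The 2 x 2 x 2 task regressors described above, plus the decision regressor
# (the six motion regressors come from the motion-correction parameters).
PHASES = ("induction", "relief")
VALENCES = ("negative", "positive")
SEEN = ("full", "blurred")

task_regressors = [f"{p}_{v}_{s}" for p, v, s in product(PHASES, VALENCES, SEEN)]
all_regressors = task_regressors + ["decision"]

def label_trial_events(valence, saw_full):
    """Map one trial to the regressors its induction- and relief-phase
    events contribute to (a sketch of the trial-to-regressor assignment)."""
    s = "full" if saw_full else "blurred"
    return [f"induction_{valence}_{s}", f"relief_{valence}_{s}"]
```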
Before model estimation, we applied a high-pass filter (σ = 50 s) and spatially smoothed the data (FWHM = 5 mm). Standard prewhitening, as implemented in FSL, was applied. First-level contrasts only involved predictors associated with full-image trials; that is, predictors associated with blurred-image trials were not used for further analysis. For the remaining four predictors of interest—2 (phase|full image) × 2 (valence|full image)—we defined contrasts against baseline, i.e., βpredictor ≠ 0, and valence contrasts, i.e., (βneg|induction − βpos|induction) ≠ 0 and (βneg|relief − βpos|relief) ≠ 0. The results (images with parameter and variance estimates) were subsequently registered to FSL’s default template (“MNI152NLin6Asym”) for group analysis, using a translation-only (3-parameter) affine transform computed with FSL FLIRT (part of FSL FEAT).
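Under an assumed ordering of the four remaining full-image predictors, these contrasts can be written as weight vectors (the regressor names and ordering are illustrative):

```python
# Assumed ordering of the four full-image predictors of interest.
regressors = ["induction_negative_full", "induction_positive_full",
              "relief_negative_full", "relief_positive_full"]

contrasts = {
    # each predictor against baseline (beta != 0)
    **{name: [1 if r == name else 0 for r in regressors] for r_ in [0]
       for name in regressors},
    # valence contrasts within each phase
    "neg_gt_pos_induction": [1, -1, 0, 0],
    "neg_gt_pos_relief":    [0, 0, 1, -1],
}
```

The valence-contrast weights sum to zero, so they test a difference between conditions rather than activation against baseline.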
ROI-based group analysis
We tested two confirmatory hypotheses in this ROI-based group analysis, separately for the induction and relief phase:
(βneg|active − βneg|passive) > 0
(βneg|active − βneg|passive) − (βpos|active − βpos|passive) > 0
Note that the parameters (e.g., βneg|active) reflect the average of the first-level parameters (e.g., βneg) for a particular condition (e.g., active choice). As such, we tested four different group-level contrasts—2 (phase) × 2 (hypothesis)—across two ROIs (striatum and IFG) in our group-level model.
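At the group level, each of these comparisons reduces to a two-sample (active-choice vs. passive-viewing) design applied to the relevant first-level maps: hypothesis 1 uses the negative-vs-baseline maps, hypothesis 2 the first-level (negative − positive) difference maps. A sketch of the design matrix with the sample sizes reported above (function and variable names are illustrative):

```python
def two_group_design(n_active, n_passive):
    """Group-level design for a two-sample comparison: one column per
    condition mean (active-choice, passive-viewing), one row per
    participant's first-level contrast map."""
    rows = [[1, 0] for _ in range(n_active)] + \
           [[0, 1] for _ in range(n_passive)]
    contrast_active_gt_passive = [1, -1]
    return rows, contrast_active_gt_passive

design, con = two_group_design(27, 27)
```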
For these confirmatory ROI-based group analyses, we used nonparametric permutation-based inference in combination with Threshold-Free Cluster Enhancement (TFCE;39) as implemented in FSL randomise40. We ran randomise with 5,000 permutations, corrected for multiple comparisons using the maximum statistic method (the method’s default multiple comparison correction procedure), and thresholded voxelwise results at p < 0.025 (correction for two ROIs). Note that this analysis allows for voxel-wise inference (i.e., no cluster-based correction is used).
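A call matching these settings could be assembled as follows. The file names are illustrative placeholders; the flags (-T for TFCE, -n for the number of permutations) are standard FSL randomise options.

```python
def randomise_command(cope_img, mask_img, design, contrasts, out_prefix):
    """Assemble (but do not run) an FSL randomise call with TFCE and
    5,000 permutations, as described above."""
    return ["randomise",
            "-i", cope_img,    # 4D image of per-participant contrast maps
            "-o", out_prefix,  # output rootname
            "-d", design,      # design matrix (design.mat)
            "-t", contrasts,   # contrast file (design.con)
            "-m", mask_img,    # ROI mask (e.g., striatum or IFG)
            "-n", "5000",      # number of permutations
            "-T"]              # Threshold-Free Cluster Enhancement

cmd = randomise_command("copes.nii.gz", "striatum_mask.nii.gz",
                        "design.mat", "design.con", "roi_striatum")
```

In practice the assembled list could be passed to `subprocess.run`; the maximum-statistic correction across the mask is randomise's default behavior.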
In these ROI-based analyses, we restricted the analysis to voxels within two a-priori specified ROIs: bilateral striatum and bilateral inferior frontal gyrus (IFG). The ROIs are based on the Harvard–Oxford Subcortical Atlas (striatum; caudate, putamen and nucleus accumbens) and the Harvard–Oxford Cortical Atlas (IFG; pars opercularis and pars triangularis) with a threshold for probabilistic ROIs > 038.
Whole-brain group analysis. In addition to the confirmatory ROI-based analysis, we conducted an exploratory whole-brain group analysis. Besides the two hypotheses mentioned in the previous section, we tested the following hypotheses, again for both the induction and relief phase:
(βpos|active − βpos|passive) > 0
(βpos|active − βpos|passive) − (βneg|active − βneg|passive) > 0
(βneg|active − βneg|passive) ⋂ (βpos|active − βpos|passive)
The ⋂ symbol in hypothesis 5 represents a conjunction analysis between two contrasts. For these exploratory whole-brain group analyses, we used FSL FEAT78 with a FLAME1 mixed-effects model and automatic outlier detection79. Resulting brain maps were thresholded with cluster-based correction80 using an initial (one-tailed) voxel-wise p-value cutoff of 0.005 (corresponding to a z-value above 2.576) and a cluster-wise significance level of 0.05. For the conjunction analysis (hypothesis 5), we used the minimum statistic approach81 in combination with cluster-based correction using the same cutoff and significance value as for the other two (non-conjunction based) hypotheses.
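The correspondence between the one-tailed voxel-wise p cutoff and the cluster-forming z-value can be verified directly with the Python standard library:

```python
from statistics import NormalDist

# A one-tailed p of 0.005 corresponds to the z-value used as the
# cluster-forming threshold in the whole-brain analysis.
z_cutoff = NormalDist().inv_cdf(1 - 0.005)
assert round(z_cutoff, 3) == 2.576
```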
Further exploratory analyses
To aid interpretation of the results, we “decoded” the brain maps resulting from the whole-brain analysis using Neurosynth41 (analyzed on March 4, 2019). In Supplementary Table 1, we list the ten Neurosynth terms (excluding anatomical terms) with the highest overall spatial correlation with our unthresholded brain maps (which are available on Neurovault, see below).