Re-imagining fMRI for awake behaving infants


New data were collected in two cohorts of infants: Cohort I from the Scully Center for the Neuroscience of Mind and Behavior at Princeton University and Cohort II from the Magnetic Resonance Research Center at Yale University. In Cohort I, 11 unique infants (5 females) aged 6 to 33 months were scanned across 23 sessions (1–8 sessions per participant). Not included in this total were five sessions without fMRI data because the infant would not lie down (4 additional unique infants, 1 infant included above who contributed a usable session on another occasion). In Cohort II, 15 unique infants (8 females) aged 4 to 10 months were scanned across 22 sessions (1–2 sessions per participant). One other session was excluded because the infant would not lie down (1 additional unique infant). The parent(s) and/or guardian(s) of each infant in a cohort provided informed consent to a protocol approved by the Institutional Review Board of each university. Previously published data21 collected from 16 adults at Princeton University were re-analyzed for SFNR comparison.

Orientation session

When a family expressed interest in participating, we first brought them in for an orientation session. This involved meeting a member of our team to discuss research goals, review procedures and safety measures, answer questions, and complete forms (informed consent and preliminary metal screening).

We typically then introduced the family to the scanning environment using a mock scanner, which consisted of a plastic shell that looked like a scanner but lacked the magnet and other hardware (Psychology Software Tools). This system also allowed for playback of scanner sounds to let parents know what the scanner sounded like. A parent placed the infant on their back on a motorized scanner table, which we then slid into the simulated bore. This helped us judge the infant’s ability to lie still. This also helped us judge the parent’s comfort level with separation, though the infant always remained within arm’s reach. We have now shifted away from the mock scanner to a simpler, less cumbersome system that has proven equally effective in infants. Specifically, we created a simulated bore from a 55-gallon white plastic barrel that was sawed in half lengthwise (into a half circle tunnel), into which we inserted a clear plastic window to show a screen. The parent places the infant on a changing mat inside the tunnel.

We did not use the infant or parent response to the mock scanner/tunnel for formal screening purposes. Although we monitored infant behavior and parental comfort throughout, these casual observations were not predictive of future scanning success. Mock scanning can be helpful in children older than two years31,32, although not always33. Indeed, our experience with infants has been that success is primarily determined by factors that are variable from session to session. This is likely because of dramatic developmental changes every few weeks and because of idiosyncratic factors related to sleep, hunger, illness, teething, time of day, etc.

Hearing protection

The final part of the orientation was to familiarize the infant with hearing protection, which parents were encouraged to continue practicing at home. Our goal was to reduce the sound level they experienced during scanning to the range of daily experiences (e.g., musical toys, daycare environments, walking on the street). Sounds around 70 dB are thought to be safe, whereas sounds at or above 85 dB (roughly a loud school environment34) could cause damage after extended exposure without protection35. Although it is theoretically possible for MRI machines to exceed 110 dB (roughly a loud sports stadium36), sound insulation and sequence selection can result in lower levels. Indeed, in our scanning environments and for the sequences we used, the measured sound pressure level reached a maximum of 90 dB.

With hearing protection, it is possible to reduce the sound level by 33 dB, bringing even the loudest possible scanner sounds to safe levels for the duration of the scan. To achieve this noise reduction, we combined three forms of hearing protection: first, silicone earplugs (Mack’s Pillow Soft Kids Silicone Earplugs) were inserted into the opening of each ear and expanded over the ear canal (they are tacky and so stay in place better than foam plugs). Although extremely effective and the best option we have found, these do take a couple of minutes to apply and can cause the infant to become fussy; if this occurred during the orientation session, we typically provided parents with a sample to practice with at home. Second, soft foam cups with hydrogel adhesive at the rim (MiniMuffs, Natus) were placed over the earplugs and attached to the outer ear. Third, MRI-safe passive circumaural headphones (MRI Pediatric Earmuffs, Magmedix) were placed over the foam cups. These three layers were intended to provide redundancy, so that if the earplugs became dislodged the headphones would provide adequate protection and vice versa. Whenever there was a break in scanning, we verified that the hearing protection was intact and re-applied if not. As anecdotal evidence that the final sound level was comfortable, infants did not startle when scans started and sometimes fell asleep. We initially piloted other types of hearing protection outside of the scanner, such as stickers over the tragus, or circumaural headphones that were held around the head by an elastic band, but found that they could not be applied securely and were more prone to failure. The parent(s) in the scanning room were given traditional hearing protection throughout the scan: foam ear plugs and circumaural headphones. Although the scanner is loud, we were able to talk over the noise during the scan to communicate with parents.
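As a rough check on these figures, the arithmetic can be written out. The sketch below only subtracts the quoted 33 dB attenuation from the measured (90 dB) and theoretical (110 dB) maxima. The function and constant names are illustrative, and treating the attenuation as a single number that applies equally at all frequencies is a simplifying assumption.

```python
# Illustrative arithmetic only; all values are the ones quoted in the text.
DAMAGE_THRESHOLD_DB = 85.0  # extended exposure at or above this could cause damage
ATTENUATION_DB = 33.0       # reduction attributed to the combined hearing protection


def attenuated_level(source_db: float, attenuation_db: float = ATTENUATION_DB) -> float:
    """Level at the ear after protection (attenuation in dB is subtracted)."""
    return source_db - attenuation_db


measured_max = attenuated_level(90.0)      # loudest level measured for the sequences used
theoretical_max = attenuated_level(110.0)  # loudest level an MRI machine might reach

print(f"measured maximum with protection:    {measured_max:.0f} dB")
print(f"theoretical maximum with protection: {theoretical_max:.0f} dB")
print("below damage threshold:", theoretical_max < DAMAGE_THRESHOLD_DB)
```

Under this simplification, both cases fall below the 85 dB level cited above.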

Scan sessions

We scheduled a scan as soon as possible after the orientation. We left it up to the parent to decide when in the day they thought their child would be best able to participate. They tended to choose times after napping and feeding (often in the morning), though work and childcare constraints and scanner availability also influenced scan time.

Upon arrival, we performed extensive metal screening. Every session, parents filled out a metal screening form for themselves and on behalf of their child, which checked for a medical or occupational history of metal in or on their body. After removing clothing, shoes, and jewelry with metal and emptying their pockets, the parent carried the child into a walk-through metal detector. If the detector sounded, often because of a small amount of metal on the parent (e.g., a clasp), one of the experimenters walked with the infant through the metal detector. To double check the child, we passed a high-sensitivity metal-detecting wand (Adams ER300), able to find small internal or ingested metal, over their front and back. We additionally asked the parent if they had seen their child eat anything metallic in the past few days and did not proceed if that was a possibility. Finally, we encouraged parents to bring metal-free toys, pacifiers, and blankets into the scanner to comfort the child; we screened all of these items with the wand before taking them in.

The parent(s) and infant entered the scanner room with one or two experimenters. The hearing protection was put on the infant by one of the experimenters while the other experimenter and parent(s) entertained the child. The infant was then placed on the scanner table. The infant’s head was rested on a pillowcase covering a foam pad in the bottom half of a 20-channel Siemens head/neck coil. The headphones formed a somewhat snug fit reducing the amount of lateral head motion possible, but no additional padding or restraint was used around the head. The infant’s body from the neck down was rested on a vacuum pillow filled with soft foam beads (S & S Technology) covered by a sheet. The edges of the vacuum pillow were lifted and loosely wrapped around the infant to form a taco shape, and the air was pumped out of the pillow until it conformed to the infant’s body shape. This prevented the infant from rolling off the table or turning over, while also reducing body motion during scans. On occasion, and with the recommendation of the parent(s), we swaddled young infants in a muslin blanket before placing them on the vacuum pillow. Overall, however, we found that infants tended to move their head and body less when snug but not constricted. The infant’s eyes were covered by an experimenter’s hand while the head was isocentered in the scanner with a laser.

Unlike typical fMRI studies, we did not attach the top half of the head coil. This decision was made for several reasons. It would have obscured the view of the infant’s face from outside the bore, limiting the ability of the parents and experimenters to monitor the infant. The top of the coil would have also blocked the line of sight between the infant and the ceiling of the bore, interfering with our eye-tracking camera and preventing infants from seeing the entire screen projected on the bore ceiling. We also worried that covering the infant’s face would induce unnecessary anxiety and that the hard plastic of the top coil presented an injury risk if the child attempted to raise their head (which occurred regularly).

Fig. 7 illustrates the configuration of the research team during infant scanning. We found that it was critical for an experimenter with exceptional bedside manner to remain inside the scanner room adjacent to the parent(s). They monitored and supported infant comfort using a combination of physical contact, viewing the infant in the bore directly, and watching them on the video camera. They additionally provided explanations and directions to the parent(s). This experimenter also adjusted and focused the video camera (12M-i camera, MRC Systems) that was attached to the ceiling of the bore in order to get a clear view of the infant’s eyes. The video feed from the camera was streamed to a monitor, which further helped the experimenter and parent(s) in the scanner room monitor the infant. We placed the monitor against the glass of the window between the control room and scanner room, though as an alternative the video feed can be displayed on a screen or projector inside the scanner room.

The experimenter in the scanner room communicated with the research team in the control room about which tasks to run, when to start and stop scans, and how data quality was looking. For Cohort I, the control room spoke over an intercom to the experimenter in the scanner room wearing headphones (Slimline, Siemens), who, in turn, communicated back with the control room using hand signals visible through the window. For Cohort II, a two-way communication system was used, allowing the experimenter in the scanner room to listen to the control room over headphones (OptoActive II, Optoacoustics) and speak to them through a microphone affixed with velcro tape to the front of the scanner bore (FOMRI III, Optoacoustics). The two experimenters in the control room operated computers and equipment. One experimenter controlled the Siemens console computer (e.g., setting up sequences, adjusting alignment, monitoring data quality) and was responsible for communicating with the experimenter in the scanner room. The other experimenter controlled another computer running experimental tasks, the eye-tracker, and the vacuum pump.

Experiment menu

Given the unpredictability of working with infants, we developed an experiment menu software system that provides complete flexibility in running cognitive tasks during fMRI. This system dynamically generates and executes experimental code in Psychtoolbox37 for MATLAB (MathWorks). The experimenter could easily navigate to an experiment from a library of tasks and choose a specific starting block (allowing tasks to be interrupted and resumed), or they could review the progress of an experiment so far. The code coordinated all timing information, receiving and organizing triggers from the scanner, and starting and stopping eye-tracker recordings. After each block ended, there was a short delay before the next block started, allowing the experimenter to determine whether to continue to the next block in the same task, switch to a new task, or stop altogether. It was also possible to rapidly switch to a movie at any point, which we showed during anatomical scans, to regain infant interest and attention, and for certain naturalistic experiments.
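The interrupt-and-resume logic described above can be sketched in a few lines. This is a hypothetical Python illustration, not the released software (which generates Psychtoolbox code for MATLAB); the names `Task`, `run_task`, `run_block`, and `should_continue` are assumptions introduced here, and scanner triggers and eye-tracker control are omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Task:
    """Hypothetical record of one task in the menu and the blocks already run."""
    name: str
    n_blocks: int
    completed: List[int] = field(default_factory=list)

    def next_block(self) -> int:
        """First block (1-indexed) not yet completed, or 0 if the task is finished."""
        for block in range(1, self.n_blocks + 1):
            if block not in self.completed:
                return block
        return 0


def run_task(task: Task, start_block: int,
             run_block: Callable[[str, int], None],
             should_continue: Callable[[str, int], bool]) -> int:
    """Run blocks from start_block onward, pausing for a decision after each one.

    Returns the block at which the task should resume (0 if it is finished).
    """
    for block in range(start_block, task.n_blocks + 1):
        run_block(task.name, block)  # present stimuli for this block
        if block not in task.completed:
            task.completed.append(block)
        # The short delay between blocks is where the experimenter decides
        # whether to continue, switch to another task, or stop.
        if block < task.n_blocks and not should_continue(task.name, block):
            break
    return task.next_block()


# Example: the experimenter stops after block 2 and later resumes at block 3.
log: List[int] = []
fractals = Task("looming_fractals", n_blocks=4)
resume_at = run_task(fractals, 1, lambda n, b: log.append(b), lambda n, b: b < 2)
finished = run_task(fractals, resume_at, lambda n, b: log.append(b), lambda n, b: True)
```

Keeping the completed blocks on the task record is one simple way to let a task be abandoned and picked up again later in the same session.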

This integrated and semi-automated framework for experiments and eye-tracking reduces the burden on experimenters and the possibility of manual errors during already complex procedures. The ability to switch between tasks efficiently within a single ecosystem reduced downtime where the infant was lying in the scanner without any task. This not only increased the amount of time during which usable fMRI data could be collected, but also reduced fussing out that was more likely to occur when nothing was on the screen. Although we developed this system for infants, it could also be used for patient testing and other special populations who present similar complications.

The experiment menu system can flexibly incorporate a range of cognitive tasks. Any task that can be designed in Psychtoolbox can be ported to this system, regardless of its response inputs, experiment duration, display parameters, or other factors. To help users understand how the experiment menu works, we have provided two sample experiments in the software release that interact with the system in different ways and can be easily modified.


Different types of eye-trackers can be integrated with our software architecture (e.g., EyeLink from SR Research, iViewX from SMI). For Cohort I, we used the frame-grabber capabilities of iViewX eye-tracker software to receive and record input from the MRC video camera. This set-up required manually starting and stopping eye-tracking. For Cohort II, the same camera fed a dedicated eye-tracking computer via a frame-grabber (DVI2USB 3.1, Epiphan). This additional computer ran Python code to save every video frame with a time stamp and was connected to the main experiment computer via ethernet to receive messages, start and stop recording, and perform handshakes. These frames were corrected for acquisition lag and manually coded offline by two or more raters.

To facilitate manual gaze coding, the provided software includes a tool to display relevant video frames offline and convert coded responses into a format compatible with the analysis pipeline. The system was designed to make this laborious task more efficient, allowing coders to quit and resume, accelerate their coding speed, and adjust their FOV. Response code options are flexible (e.g., eyes open vs. closed, fixation vs. saccade, gaze left/center/right, etc.) and could even be used for non-eye behavior (e.g., head motion). This tool also computes coding reliability across raters.

Manual gaze coding is a time-intensive and somewhat crude procedure compared to modern eye-tracking standards. However, in our experience, the currently available automated eye-tracking systems are infeasible for infant fMRI. Most such systems used in infant behavioral or adult fMRI studies require a calibration phase in which visual transients appear in different locations. In our experience, infant gaze is captured with insufficient reliability in the scanner to make these calibrations viable. Additional problems, such as needing to adjust the infrared emitter and FOV whenever the infant moves, make automated eye-tracking difficult to manage during a protocol that is already challenging. However, we remain optimistic that computer vision algorithms may be capable of automating some of the gaze coding humans currently perform in our protocol38, reducing the burden and potentially increasing the reliability of this approach.

Ceiling projection

We developed a stimulus display system for infants. When using a typical rear-projection system for fMRI, the stimulus is projected on a screen at the back of the bore and the screen is viewed on an angled mirror attached to the top of the head coil. As a result, the stimulus usually covers a small part of the visual field and requires a specific vantage point through the mirror. We could not be sure that these displays would grab the attention of infants. At a more basic level, mirrors may confuse or distract infants. Another approach could be to use a goggle system22, which guarantees that the infant can see the stimulus. However, it is hard to monitor the infant with such a system and taking it off can be disorienting.

Instead, we projected visual stimuli onto the ceiling of the scanner bore over the infant’s face. We mounted a projector (Hyperion, Psychology Software Tools) approximately six feet high on the back wall behind the scanner, tilted downward to project at the back of the bore. A large mirror placed in the back of the bore behind the scanner table at a low angle reflected the image up onto the bore ceiling, as shown in Fig. 7b. This provided a high-resolution display (1080p) and wide FOV stimulation (approximately 115 degrees of visual angle). The thrown image suffered from keystone, elliptical, and stretching geometric distortions as a result of the angled projection, reflection, and curved bore, but these were corrected automatically in software by a preset screen calibration in the experiment menu code. For Cohort I, we projected directly on the white plastic surface of the bore ceiling. For Cohort II, we taped a piece of white paper to the ceiling to hide the plastic grain. We believe that this large and direct display kept the infants engaged and was natural for them to view. It also gave the parent(s) and experimenters a clear view of the child’s face and allowed for seamless video eye-tracking without calibration. The experiment menu code can be used to set up and calibrate ceiling projection, but is also compatible with other display types, including rear-projection screens or goggles; tools are included to equate stimulus sizes across display formats.

Fig. 7: Schematic of the scanning environment.

a Overview of the setup and wiring diagram of the equipment and communications. b Key elements inside the scanner room as well as a view of the screen projected onto the ceiling of the bore. The camera is adjacent to the screen projection. The video feed is depicted on a screen on the wall but could be shown on a monitor through the window. 3D rendering created using Sweet Home 3D from Sweet Home 3D assets shared under a Free Art 1.3 license and CC-BY 3.0 license. Additional assets are from Trimble 3D Warehouse in accordance with their license. All rights reserved.


We report visual responses from fMRI data combined across a variety of stimuli used in several ongoing experiments, including blocks of: looming colorful fractals (Cohort I: 11 runs, Cohort II: 14 runs; 14.6° max size), looming toy photographs (Cohort I: 7 runs, Cohort II: 6 runs; 8°), looming face photographs (Cohort I: 5 runs, Cohort II: 0 runs; 8°), and moving shapes (Cohort I: 9 runs, Cohort II: 6 runs; 10–15°).

Each experimental task was designed to be short, entertaining, and modular. Task blocks generally lasted less than 40 s, though sometimes were longer, as in the case of movies. The tasks used visual effects to maintain attention, including fast motion and onsets (e.g., looming), high-contrast textures, bright colors, and relevant stimuli (e.g., faces). Our goal for each session was to obtain up to three full experiments’ worth of data, which we achieved on occasion. However, the tasks were designed and counterbalanced internally to provide useful data even when incomplete. Infants sometimes found a given task boring and began fussing or moving, and in such circumstances, we adapted by changing to a new experiment (sometimes later returning to the original experiment). We found that fussing out of one task did not predict that the child would fuss out of other tasks; hence, being able to switch tasks within a participant increased data yield. The menu system automatically handled timing, scheduled rest periods between blocks/tasks, counterbalanced conditions, and tracked stimulus order and novelty.

At some point during the session, typically after at least one attempted experiment, we collected an anatomical scan. This scan was used for registration of functional data and alignment to anatomical templates. Obtaining a high-quality scan was especially difficult because the infant had to remain still for the entire duration of 3.25 min (whereas for functional scans, discrete motion only impacted a small number of 2-s volumes). If the infant was awake, we did our best to keep them entertained with either a compelling visual task (e.g., fireworks appearing in different parts of the display) or a movie (e.g., Daniel Tiger, Sesame Street). If asleep, we blanked the screen. We attempted as many anatomical scans as needed to obtain one of sufficient quality (and as time allowed), though often succeeded in one try.


Our goal was to make the session as fun and as enjoyable as possible, but it was inevitable that some infants got fussy. In our experience, this happened most often at one of three stages: when putting the hearing protection on the infant, when first laying the infant down on the scanner table, and/or when the infant got bored with a task. It was rare that other events, such as the scanner starting, triggered unhappiness. In fact, many infants seemed to be soothed by the scanner sounds and vibrations, and some enjoyed the visual displays so much that they fussed only when removed from the scanner. We did find that talking (loudly) to the infant between scans and patting or holding their hands was soothing. Neither the parents nor experimenters climbed into the bore with the infant. We did not encourage this, in order to avoid distracting the infant and inducing motion or potential confounds. Without such distraction, we found that infants on their own (within arm’s reach) quickly became enraptured by the visual display. We did allow infants to use pacifiers, soothers, bottles, or blankets while in the scanner, which generally had a soothing effect. Although the movement of their jaw while sucking on a pacifier could add noise, this noise was less than that from the motion of an unhappy infant, and accepting it was preferable to collecting much less data in some sessions.

If a fussy infant could not be soothed or attempted to roll over or climb out, or if the parent(s) asked for a break, we took the infant out of the scanner until they were calm again and ready to resume. The parent would often nurse the infant or give them a bottle or snack, and would change their diaper if needed. In some cases, we had to start and stop 3–4 times before the infant became sufficiently comfortable to provide high-quality data. When the infant had completed all planned experiments, had been in the scanner room for an hour, became too fussy, or fell asleep for a long time (after we completed anatomical scans) we ended the session. In addition to monetary compensation for the family’s time and travel, and a board book for the infant, we also printed a 3-D model of the infant’s brain whenever possible (using Ultimaker 2+ to print surface reconstructions from FreeSurfer39). We encouraged families to come in for multiple sessions, and many were happy to do so, generally with one month or more between visits.

Inter-session variability

Our protocol was designed to be flexible across participants and within-participant across sessions, in order to account for infant temperament, reaction to tasks, and non-experimental disruptions (e.g., play, feeding, diapers, sleeping, etc.). This flexibility has the benefit of increasing retention and maximizing the amount of data and number of tasks that can be administered per session. However, it comes at the cost of increased variability across sessions, which complicates the analysis and interpretation of data. For example, the sequence of tasks is hard to control, raising the possibility of order effects and habituation.

We have taken several steps to attempt to mitigate this variability: First, we do our best to avoid introducing unnecessary variability. We try to be as consistent as possible across sessions about the scanner environment (location, waiting room, toys, etc.), personnel (parents and researchers), head coil and padding, hearing protection supplies and application, MRI sequences, presentation stimuli, and preprocessing parameters. Second, all tasks are designed using within-subject manipulations for which there are no parsimonious accounts about how prior tasks or habituation could drive condition-wise differences. Third, we use different categories of stimuli (faces, objects, shapes, cartoons, etc.) and presentation styles (looming, oscillating, dynamic, etc.) across tasks to minimize habituation. Fourth, we present tasks in a pseudo-random order across sessions and participants, which would serve to counterbalance task history effects under ideal circumstances. Finally, it is worth noting that although attempting multiple tasks within a session may pose these complications, collecting multiple measures from a sufficient number of infants makes it possible to test for order effects, habituation, and more theoretically, how different cognitive capacities relate and interact.

Data acquisition

Infant data were acquired on a 3T Siemens Skyra MRI in Cohort I and on a 3T Siemens Prisma MRI in Cohort II using anatomical and functional sequences (see Supplementary Table 1 for a summary of parameters). For anatomical scans, we used a T1-weighted PETRA sequence in all participants ($\mathrm{TR}_{1}$ = 3.32 ms, $\mathrm{TR}_{2}$ = 2250 ms, TE = 0.07 ms, flip angle = 6°, matrix = 320 × 320, slices = 320, resolution = 0.94 mm isotropic, radial lines = 30,000). In two young infants, we additionally piloted a T2-weighted SPACE sequence (TR = 3200 ms, TE = 563 ms, flip angle = 120°, matrix = 192 × 192, slices = 176, resolution = 1 mm isotropic). For functional scans, we used a T2*-weighted gradient-echo EPI sequence in all participants (TR = 2000 ms, TE = 28 ms, flip angle = 71°, matrix = 64 × 64, slices = 36, resolution = 3 mm isotropic, interleaved slice acquisition). The FOV covered the whole brain during slice positioning, but occasionally parts of the brain were cropped as a result of head motion, typically the cerebellum and brain stem. We did not use multi-band slice acceleration (common in modern adult fMRI) because of concerns about peripheral nerve stimulation that could not be reported by our preverbal infants. The adult data for SFNR comparison (Fig. 2) were acquired on the same scanner as Cohort I and with the same functional sequence, except that the top of the head coil was attached. We had whole-brain coverage in adults, although the cerebellum and brain stem were cropped in some participants.


We developed an efficient analysis pipeline for preprocessing infant fMRI data. This software has been released publicly with this paper and pairs particularly well with the experiment menu system described above. The code is modular and easily editable, while also largely unsupervised. Indeed, any task-based fMRI experiment could in principle be analyzed with this pipeline. Despite the variability noted above in the order and duration of tasks and the amount of data collected per session, many of the consistent aspects of our protocol (e.g., experiment menu system, within-task experimental design and timing, MRI acquisition parameters, etc.) standardize the processing such that it takes less than 2 h on average to run raw infant fMRI data through the pipeline. Supplementary Fig. 1 depicts the overall preprocessing pipeline schematically. To help users learn how this pipeline works, we include extensive documentation in a step-by-step tutorial with the pipeline.

Videos of the infant’s face collected during scanning were blindly coded offline for eye-gaze by 2–7 naive coders, based on task-specific criteria40. Tasks that only required fixation (e.g., movie watching) were coded for whether the eyes were on-screen or off-screen/closed. Tasks that involved viewing images on the left and right of the display were coded for the direction of looking. Gaze location was labeled by calculating the modal response across coders for a time window of five video frames (100 ms). A tie in the coding was resolved by assigning the label from the most recent frame that was not a tie. We calculated inter-rater reliability by comparing the consistency of responses across coders.
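The labeling rule described above can be sketched as follows. This is a minimal illustration rather than the released coding tool. It assumes non-overlapping five-frame windows, pools all coder responses within a window before taking the mode, and resolves a tie with the label of the most recent untied window. The function names and the response codes ("L", "R") are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence


def modal_label(responses: Sequence[str]) -> Optional[str]:
    """Return the most common response, or None if the top count is tied."""
    ranked = Counter(responses).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def label_gaze(codes: Sequence[Sequence[str]], window: int = 5) -> List[Optional[str]]:
    """codes[c][f] is coder c's response for video frame f.

    Returns one label per window of `window` frames. A tied window takes the
    label of the most recent untied window (None if there has been none yet).
    """
    n_frames = min(len(coder) for coder in codes)
    labels: List[Optional[str]] = []
    last_untied: Optional[str] = None
    for start in range(0, n_frames, window):
        stop = min(start + window, n_frames)
        pooled = [coder[f] for coder in codes for f in range(start, stop)]
        label = modal_label(pooled)
        if label is None:
            label = last_untied
        else:
            last_untied = label
        labels.append(label)
    return labels


# Two coders, fifteen frames: agreement, then a 5-5 tie, then mostly "R".
coder_a = list("LLLLL" + "LLRRR" + "RRRRR")
coder_b = list("LLLLL" + "RRLLL" + "RRRRL")
gaze = label_gaze([coder_a, coder_b])
```

In this example the middle window is tied and therefore inherits the label of the first window.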

After the data from the scanner were converted into NIfTI format, we calculated motion parameters for the functional data. The movement behavior of infants was different from adults because it was often punctate, large in magnitude, and unpredictable, rather than slow and drifting. Hence, the best reference volume to use for motion correction might be different in infants and adults. Specifically, rather than using the first, last, or middle time-point, as is typical in adults, we selected the volume with the minimum average absolute Euclidean distance from all other volumes (the centroid volume) as the reference. We used FSL41 (5.0.9 predominantly) for calculating frame-to-frame translations and identified time-points that ought to be excluded because of motion greater than our threshold (3 mm).
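One way to pick a centroid volume is sketched below with NumPy. It assumes that the distance between two volumes is the Euclidean distance between their voxel intensity vectors, which is our reading of the description rather than a confirmed detail of the released pipeline. The function name is hypothetical.

```python
import numpy as np


def centroid_volume_index(data: np.ndarray) -> int:
    """Index of the volume with the smallest mean Euclidean distance to all others.

    data: 4D array (x, y, z, time) of functional volumes.
    """
    n_vols = data.shape[-1]
    flat = data.reshape(-1, n_vols).T.astype(np.float64)  # (time, voxels)
    # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(flat ** 2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (flat @ flat.T)
    dist = np.sqrt(np.clip(sq_dist, 0.0, None))  # clip tiny negative rounding errors
    np.fill_diagonal(dist, 0.0)
    mean_dist = dist.sum(axis=1) / max(n_vols - 1, 1)  # average over the other volumes
    return int(np.argmin(mean_dist))


# Toy example: five constant volumes; the last one is an outlier (a large movement).
toy = np.stack([np.full((2, 2, 2), v, dtype=float) for v in [0, 1, 2, 3, 10]], axis=-1)
reference = centroid_volume_index(toy)
```

Unlike the first or middle volume, a reference chosen this way is unlikely to be a volume acquired during a large, punctate movement.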

The stimulus and timing information from each task were converted into FSL timing files using a script. Epochs of data (trials, blocks, or runs) were marked for exclusion at this stage if there was excessive motion and/or if the infant’s eyes were off-screen/closed for more than half of the time-points in the epoch or during a critical part of the epoch for the task. Manual exclusions of data were also specified here, such as when the infant moved out of the FOV of the scan. The anatomical data were preprocessed using AFNI’s homogenization tools, combined with other anatomical data if available, and skull stripped (AFNI’s 3dSkullStrip42).
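The epoch-level exclusion rule can be expressed compactly. The sketch below is illustrative only. The gaze criterion (eyes off-screen or closed for more than half of the time-points) follows the text, whereas the definition of "excessive motion" at the epoch level is not given here, so the `max_motion_fraction` parameter is a placeholder assumption. Task-specific rules about critical parts of an epoch are not modeled.

```python
import numpy as np


def epoch_is_excluded(motion_flags, gaze_on_screen,
                      max_motion_fraction: float = 0.5,
                      max_off_fraction: float = 0.5) -> bool:
    """Decide whether one epoch (trial, block, or run) should be dropped.

    motion_flags:   per-time-point booleans, True where motion exceeded threshold.
    gaze_on_screen: per-time-point booleans, True where the eyes were on-screen.
    max_motion_fraction is a hypothetical criterion; only the gaze rule
    (more than half of the time-points off-screen/closed) comes from the text.
    """
    motion_flags = np.asarray(motion_flags, dtype=bool)
    gaze_on_screen = np.asarray(gaze_on_screen, dtype=bool)
    too_much_motion = motion_flags.mean() > max_motion_fraction
    eyes_off_too_long = (~gaze_on_screen).mean() > max_off_fraction
    return bool(too_much_motion or eyes_off_too_long)


still = [False] * 10
mostly_looking = [True] * 5 + [False] * 5   # exactly half off-screen: kept
mostly_away = [True] * 4 + [False] * 6      # more than half off-screen: dropped
keep = not epoch_is_excluded(still, mostly_looking)
drop = epoch_is_excluded(still, mostly_away)
```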

If more than one task was tested within a run, we created pseudo-runs in which time-points corresponding to the different tasks were extracted and used to create new run data. This happened more often in Cohort II, in part because we realized between cohorts that we obtained more usable data with less downtime when we scanned continuously rather than stopping arbitrarily when infants finished an experiment. Indeed, Supplementary Table 2 shows that we tended to terminate more runs when an experiment finished in Cohort I than in Cohort II. Centroid volumes and motion exclusions were recomputed for these pseudo-runs, which were then input to the preprocessing analyses as if they were collected as separate runs.
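Extracting pseudo-runs amounts to indexing the time axis by task. The following is a minimal sketch, assuming each time-point of a run has been tagged with the task that was on screen (or `None` for gaps between tasks). The function name and the label format are hypothetical.

```python
from typing import Dict, Optional, Sequence

import numpy as np


def split_pseudo_runs(run_data: np.ndarray,
                      task_labels: Sequence[Optional[str]]) -> Dict[str, np.ndarray]:
    """Split one run into per-task pseudo-runs.

    run_data:    4D array (x, y, z, time).
    task_labels: one entry per time-point naming the task, or None between tasks.
    """
    if run_data.shape[-1] != len(task_labels):
        raise ValueError("need exactly one task label per time-point")
    pseudo_runs: Dict[str, np.ndarray] = {}
    for task in dict.fromkeys(t for t in task_labels if t is not None):
        idx = [i for i, t in enumerate(task_labels) if t == task]
        pseudo_runs[task] = run_data[..., idx]  # keep only this task's time-points
    return pseudo_runs


# Toy run: 6 time-points, two tasks separated by one unlabeled time-point.
toy_run = np.arange(2 * 1 * 1 * 6).reshape(2, 1, 1, 6)
labels = ["fractals", "fractals", None, "shapes", "shapes", "shapes"]
pseudo = split_pseudo_runs(toy_run, labels)
```

Each resulting array can then be treated like a separate run, with the reference volume and motion exclusions recomputed as described above.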

First-level analyses were performed to preprocess each run. We started from FSL’s FEAT but added modifications to better accommodate infant fMRI data. We discarded three burn-in volumes from the beginning of each run. We interpolated any time-points that were excluded due to motion, so that they did not bias the linear detrending (in later analyses these time-points were again excluded). We performed motion correction using MCFLIRT in FSL, referenced to the centroid volume as described above. The slices in each volume were acquired in an interleaved order, and so we realigned them with slice-time correction. To create the mask of brain and non-brain voxels we calculated SFNR20 for each voxel. This produced a bimodal distribution of SFNR values reflecting the signal properties of brain and non-brain voxels. We thresholded the brain voxels based on the trough between these two peaks. The data were spatially smoothed with a Gaussian kernel (5 mm FWHM) and linearly detrended in time. AFNI’s despiking algorithm was used to attenuate aberrant time-points within voxels.
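The SFNR-based masking step can be illustrated with a short NumPy sketch. It is not the pipeline's implementation. SFNR is computed here simply as the temporal mean divided by the temporal standard deviation, and the trough is found with a crude smoothed-histogram search. The function names and the `bins`, `smooth`, and `min_sep` parameters are assumptions.

```python
import numpy as np


def sfnr_map(data: np.ndarray) -> np.ndarray:
    """Voxel-wise temporal mean / temporal SD for a 4D array (x, y, z, time)."""
    mean = data.mean(axis=-1)
    std = data.std(axis=-1)
    out = np.zeros(mean.shape, dtype=float)
    np.divide(mean, std, out=out, where=std > 0)  # voxels with zero SD stay at 0
    return out


def trough_threshold(values: np.ndarray, bins: int = 50,
                     smooth: int = 5, min_sep: int = 10) -> float:
    """Value at the lowest point of the histogram between its two main peaks."""
    counts, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    smoothed = np.convolve(counts, np.ones(smooth) / smooth, mode="same")
    first = int(np.argmax(smoothed))                   # tallest peak
    far = np.abs(np.arange(bins) - first) >= min_sep   # bins well away from it
    second = int(np.argmax(np.where(far, smoothed, -np.inf)))  # second peak
    lo, hi = sorted((first, second))
    trough = lo + int(np.argmin(smoothed[lo:hi + 1]))
    return float(centers[trough])


def brain_mask(data: np.ndarray) -> np.ndarray:
    """Boolean mask of voxels whose SFNR lies above the trough."""
    sfnr = sfnr_map(data)
    return sfnr > trough_threshold(sfnr.ravel())


# Synthetic bimodal SFNR values: low (non-brain) and high (brain) clusters.
rng = np.random.default_rng(0)
non_brain = rng.normal(5.0, 1.0, 5000)
brain = rng.normal(50.0, 5.0, 5000)
threshold = trough_threshold(np.concatenate([non_brain, brain]))
```

On real data the two modes overlap more than in this synthetic example, so the bin count and smoothing width would likely need tuning.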

We registered each participant’s functional volumes to their anatomical scan using FLIRT in FSL with a normalized mutual information cost function. However, we found that this automatic registration was insufficient for infants. With this as a starting point, we used mrAlign (mrTools, Gardner lab) to perform manual registration (6 degrees of freedom). One functional run from each session was aligned to the anatomical scan and then each additional run was aligned to the anatomically aligned functional data, all in native resolution. This process was repeated as necessary to improve alignment.

As with registration of functional data to anatomical space, a combination of automatic and manual alignment steps (9 degrees of freedom) was usually needed to register the anatomical scan to standard space (using Freeview from FreeSurfer39). The standard space for each infant was chosen to be the infant MNI template closest to their age43. These infant templates were then aligned to the adult MNI standard (MNI152 1 mm). This alignment step into adult standard space was performed for two reasons. First, it ensured that data were analyzed in a common space across the age span (even if the detailed anatomy does not fully correspond). Second, it allowed us to define anatomical ROIs, which are based on templates in adult space. For the present work, alignment to standard space was performed only after generating the statistic maps in native resolution.

After preprocessing and registration, the data can be reorganized into individual experiments. We did not perform this step for the data reported here because we wanted to include as much data as possible. That is, we analyzed runs from multiple visual tasks and collapsed across these runs for session-wise analyses. Nevertheless, we describe this step in detail below because it has been implemented in our shared software pipeline and will generally be useful for future studies of a particular task. Reorganizing data into individual experiments helps account for the fact that any given experiment could be spread across multiple runs (e.g., because of breaks or fussiness). Pseudo-runs of the same task (extracted from runs with multiple tasks) and entire runs in which only that task was tested would be concatenated into a single experiment dataset per participant. This dataset can then be checked for counterbalancing across task conditions to prevent biases or confounds related to run number or time. For instance, if an experiment has two conditions, an equal number of epochs from each condition can be selected per run. The voxel time-series for the usable epochs should be z-scored within run prior to concatenation, to eliminate generic run-wise differences in the mean and variance of BOLD activity. The corresponding timing and motion information would also get concatenated. These datasets can then be used as inputs for analyses of individual experiments.
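The within-run z-scoring and concatenation step could look roughly like the following sketch. It assumes each run has already been reduced to its usable time-points and stored as a time-by-voxel array. The function name and the handling of constant voxels are illustrative assumptions.

```python
import numpy as np

def zscore_within_run_and_concatenate(runs):
    """Z-score each run's voxel time-series, then stack runs in time.

    runs: list of 2D arrays shaped (time-points, voxels), usable data only.
    """
    normalized = []
    for run in runs:
        run = np.asarray(run, dtype=float)
        mean = run.mean(axis=0)
        std = run.std(axis=0)
        std[std == 0] = 1.0  # assumption: leave constant voxels at zero
        normalized.append((run - mean) / std)
    return np.concatenate(normalized, axis=0)
```

Normalizing before concatenation means that a run with a higher baseline or larger variance does not dominate the combined dataset.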


SFNR was calculated from the raw infant and adult fMRI data. For each voxel in the brain mask, the mean activity was divided by the standard deviation of the detrended activity. The detrending was performed with a second-order polynomial to account for low-frequency drift20. We analyzed data from 16 adults with one run each (16 total) containing 260 volumes and from 19 infant sessions with 1–6 runs each (64 total) containing 6–335 volumes (M = 112.3). One run (6 TRs) was excluded because of severe aliasing. To quantify posterior-to-anterior changes in signal, SFNR was estimated for each coronal slice. These slices were taken along the y-axis of the acquisition slab, and thus were not precisely aligned with the posterior-to-anterior axis in the reference frame of the head or brain. The average SFNR for each coronal slice (with at least 1000 brain voxels) was computed by sampling the SFNR values of 1000 voxels in that slice. We used this subsampling approach to control for the number of voxels used in averaging across infant and adult brain sizes. The coronal slices were then median split into posterior and anterior halves, which served as a within-subject factor in a repeated measures ANOVA of SFNR, along with age group (infants or adults) as a between-subject factor. Note that because infant brains are smaller on average, fewer of their slices met the 1000-voxel criterion, particularly at the posterior and anterior edges. Sampling fewer voxels (e.g., 100) and thus including more slices led to consistent results.
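As a rough illustration of the SFNR computation and the slice-wise subsampling, the sketch below assumes the data are held in NumPy arrays with coronal slices along the second axis. The function names, the array layout, and the choice to sample without replacement are assumptions made for the example.

```python
import numpy as np

def sfnr(timeseries, order=2):
    """Mean divided by the SD of polynomially detrended activity.

    timeseries: 2D array shaped (time-points, voxels).
    """
    ts = np.asarray(timeseries, dtype=float)
    t = np.linspace(-1.0, 1.0, ts.shape[0])
    design = np.vander(t, order + 1)  # columns: t^2, t, 1 for order=2
    beta, *_ = np.linalg.lstsq(design, ts, rcond=None)
    residuals = ts - design @ beta
    return ts.mean(axis=0) / residuals.std(axis=0)

def slice_means(sfnr_volume, brain_mask, n_sample=1000, rng=None):
    """Average SFNR per coronal slice from a fixed-size voxel sample.

    Slices with fewer than n_sample brain voxels are skipped.
    Assumes axis 1 indexes coronal slices (hypothetical layout).
    """
    rng = np.random.default_rng() if rng is None else rng
    means = {}
    for y in range(sfnr_volume.shape[1]):
        values = sfnr_volume[:, y, :][brain_mask[:, y, :]]
        if values.size >= n_sample:
            sample = rng.choice(values, size=n_sample, replace=False)
            means[y] = sample.mean()
    return means
```

Using a fixed sample size per slice keeps the number of voxels entering each average the same for infant and adult brains.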

To understand how head motion impacted SFNR, the translational motion between each TR was first computed using MCFLIRT in FSL and averaged across the run. Participants were selected as low-motion if they had average translational motion of less than 0.2 mm across a run. The relationship of motion to SFNR was quantified by correlating the average motion for each run with the average SFNR across all coronal slices for each run. Both average motion and average SFNR were non-normal, with long tails, and so we estimated the non-parametric Spearman’s rank correlation coefficient (ρ). The p-value was computed by resampling participants with replacement 10,000 times, recomputing ρ for each sample, and then calculating the proportion of samples with a positive sign (as the true relationship was negative).
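The bootstrap test for a negative motion-SFNR relationship could be sketched as follows. The example assumes one pair of values per resampling unit and implements Spearman's ρ as the Pearson correlation of average ranks. All function names are hypothetical.

```python
import numpy as np

def average_ranks(x):
    """1-based ranks, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(np.asarray(x), return_inverse=True,
                                   return_counts=True)
    ends = np.cumsum(counts)
    starts = ends - counts + 1
    return ((starts + ends) / 2.0)[inverse]

def spearman_rho(x, y):
    return np.corrcoef(average_ranks(x), average_ranks(y))[0, 1]

def bootstrap_spearman(motion, sfnr_values, n_boot=10000, seed=0):
    """Return rho and the proportion of resamples with a positive rho."""
    motion = np.asarray(motion, dtype=float)
    sfnr_values = np.asarray(sfnr_values, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(motion)
    rho = spearman_rho(motion, sfnr_values)
    boot = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample pairs with replacement
        boot[i] = spearman_rho(motion[idx], sfnr_values[idx])
    return rho, np.mean(boot > 0)
```

Counting the resamples whose sign opposes the observed effect gives a one-sided p-value without assuming normality.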

The analysis of visual evoked activity included all of the runs/pseudo-runs that contained at least two task blocks, excluding epochs that were not usable because of motion, eye-gaze, or inappropriate data type (e.g., movie watching or resting state). We fit a univariate GLM across the whole brain, modeling the response with a task regressor convolved with a double-gamma hemodynamic response function as the basis function. Nuisance regressors were specified for motion relative to the centroid volume (six degrees of motion, including x, y, z translation and yaw, pitch, roll rotation), as well as a single time-point regressor for each excluded time-point. The z-statistic map for the task regressor in every run was then aligned into adult standard space.
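To make the structure of this design matrix concrete, here is a minimal single-voxel sketch using ordinary least squares. The HRF parameters (gamma shapes 6 and 16 with a 1/6 undershoot ratio) are commonly used illustrative values and are not necessarily FSL's defaults. The function names and the TR used in any call are assumptions. The sketch returns a t-statistic, whereas FSL additionally converts to z.

```python
import numpy as np
from math import gamma

def double_gamma_hrf(tr, duration=32.0, peak=6.0, undershoot=16.0,
                     ratio=1.0 / 6.0):
    """Double-gamma HRF sampled every `tr` seconds (illustrative shapes)."""
    t = np.arange(0.0, duration, tr)

    def gamma_pdf(x, shape):
        return x ** (shape - 1) * np.exp(-x) / gamma(shape)

    hrf = gamma_pdf(t, peak) - ratio * gamma_pdf(t, undershoot)
    return hrf / hrf.sum()

def spike_regressors(excluded, n_timepoints):
    """One column per excluded time-point, 1 at that time-point only."""
    excluded = np.asarray(excluded, dtype=int)
    cols = np.zeros((n_timepoints, len(excluded)))
    cols[excluded, np.arange(len(excluded))] = 1.0
    return cols

def fit_task_glm(y, boxcar, motion_params, excluded, tr):
    """Return (beta, t-statistic) for the task regressor of one voxel."""
    n = len(y)
    task = np.convolve(boxcar, double_gamma_hrf(tr))[:n]
    X = np.column_stack([task, motion_params,
                         spike_regressors(excluded, n), np.ones(n)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    dof = n - np.linalg.matrix_rank(X)
    sigma2 = residuals @ residuals / dof
    contrast = np.zeros(X.shape[1])
    contrast[0] = 1.0
    se = np.sqrt(sigma2 * contrast @ np.linalg.pinv(X.T @ X) @ contrast)
    return beta[0], beta[0] / se
```

Each single time-point regressor absorbs all of the variance at its time-point, so excluded volumes do not influence the task estimate.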

For the ROI analyses, we defined anatomical ROIs for V1, LOC, and A1 using the Harvard-Oxford atlas. For each ROI and run/pseudo-run, we quantified the proportion of voxels with a z-score corresponding to p < 0.05. The significance of these proportions across runs relative to chance (0.05) was calculated with bootstrap resampling44. We sampled with replacement the same number of run proportions from each ROI 10,000 times to produce a sampling distribution of the mean (or mean difference between ROIs). The p-value corresponded to the proportion of resampling iterations with mean below chance (or below zero for differences between ROIs). For exploratory voxelwise analyses, we used randomise in FSL to compute a t-statistic in each voxel and then applied an uncorrected threshold of p < 0.005.
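The bootstrap test against chance can be sketched as below, assuming the per-run proportions for one ROI are collected in a 1D array. The function name is hypothetical. For a comparison between ROIs, the same function could be applied to the per-run differences with a null value of zero.

```python
import numpy as np

def bootstrap_mean_below(values, null=0.05, n_boot=10000, seed=0):
    """Proportion of bootstrap means that fall below the null value."""
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row holds indices for one resample of the same size as the data
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    means = values[idx].mean(axis=1)
    return np.mean(means < null)
```

A returned value of 0 means that no resample fell below chance, which would conventionally be reported as p < 1/n_boot.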

In the analyses above, runs/pseudo-runs were treated as independent samples, to assess the statistical reliability of visual responses at the run level. However, there was often more than one run per session, allowing us to additionally examine reliability at the session level. Hence we repeated the same analyses above after concatenating all runs/pseudo-runs within the session so that there was only one GLM and set of voxelwise z-scores per participant. Despite the smaller number of values entered into the final statistical analyses, the results were largely unchanged (Supplementary Fig. 5), likely the result of including more data in computing each value and thus obtaining cleaner estimates.

To explore how preprocessing decisions affected these results, we repeated the univariate analyses above while varying several parameters in our pipeline one at a time: the threshold for excluding individual time-points based on head motion; whether to exclude time-points after instances of motion (after a brain moves there is an imbalance in net magnetization of each slice that can take several seconds to correct); the FWHM of the smoothing kernel; whether to exclude components of the data extracted with independent components analysis (ICA, MELODIC in FSL) that correlate with the six motion parameters (we varied the minimum correlation threshold for regressing out components); whether to use voxelwise despiking to remove aberrant data that can result from motion in voxels near the skull or ventricles; and whether to include the temporal derivatives of the regressors in the GLMs, to account for latency differences.

Some of the preprocessing decisions (e.g., motion threshold) affected our time-point exclusion procedure and, in turn, the amount of data retained. To calculate these retention rates, we excluded individual time-points because of above-threshold motion and entire blocks when the majority (>50%) of their time-points were excluded. That is, time-points that are themselves usable but part of an unusable block become unusable. For reference, 100% would mean that in runs with at least two usable task blocks, all time-points from all participants were usable. Even when no motion exclusions are performed, some blocks will still not be usable due to eye exclusions or because we terminated the block prior to completion. Note that these rates do not account for usable data that were excluded from the analysis of visual responses because the corresponding task was unsuitable for estimating evoked responses (e.g., movie watching).
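The block-level exclusion rule can be expressed compactly. The sketch below assumes a Boolean array marking which time-points pass the motion threshold and an integer block label for each time-point. It simplifies the procedure by ignoring eye-gaze exclusions and the requirement of at least two usable blocks per run. The function name is hypothetical.

```python
import numpy as np

def retention_rate(motion_ok, block_ids):
    """Fraction of time-points retained after block-level exclusion.

    motion_ok: Boolean array, True where motion is below threshold.
    block_ids: integer block label for each time-point.
    """
    motion_ok = np.asarray(motion_ok, dtype=bool)
    block_ids = np.asarray(block_ids)
    usable = motion_ok.copy()
    for block in np.unique(block_ids):
        in_block = block_ids == block
        # A block is dropped when more than half of its time-points fail
        if (~motion_ok[in_block]).mean() > 0.5:
            usable[in_block] = False
    return usable.mean()
```

For example, with two 10-volume blocks in which 6 and 2 time-points exceed the motion threshold, the first block is dropped entirely and the retention rate is 8/20 = 0.4.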

To compare across parameters for each type of preprocessing, we ran a linear mixed model with condition (parameter) as a fixed effect and run (or session) as a random effect. This approach was chosen instead of a repeated measures ANOVA in order to deal fairly with missing data in cases where a run/session was excluded (e.g., because there were no longer two usable blocks). The Wald Chi-Square test was used as an omnibus test to determine whether there were any significant differences in the model, and simple effect comparisons were used to test our default parameter setting against the other options.


Cohort II served as a replication and generalization sample of Cohort I. Although the ages and precise composition of tasks differed between cohorts, the pattern of results was identical, with significant visual evoked activity in V1 and LOC, but not in A1.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.
