We provide evidence that auditory WM for frequency is an important factor in SiN perception. In this study, WM precision for frequency correlated significantly with thresholds for reporting sentences in multi-talker babble. The association with frequency precision highlights the importance of frequency cues in the perception of target speech from background noise. In contrast to some previous studies, peripheral audiometric hearing measures4 and phonological working memory11 measures were not significantly associated with SiN1,2. In addition, performance in the WM task for frequency precision correlated with years of musical training, potentially implicating musical ability as a modifiable factor for auditory WM.
SiN and hearing loss
Previous studies4 have identified that hearing loss, as measured by the pure-tone audiogram, is a strong correlate of SiN ability among people with hearing loss, and may also explain some of the variability in SiN performance among people who have audiometric thresholds at clinically measured frequencies (0.25–8 kHz) in the normal hearing range. There is also evidence that extended high-frequency hearing (8–20 kHz) can enhance SiN perception19. Although pure-tone audiometric thresholds may be used synonymously with ‘hearing loss’, this test measures tone-sensitivity in quiet, which is different from SiN. For example, even when tone-sensitivity thresholds are corrected by a hearing aid, up to 15% of people report difficulty when communicating in noisy environments17. Given that neuroimaging studies demonstrate that SiN perception evokes cortical activity, peripheral damage may be anatomically distinct from some SiN deficits18.
Here, we were interested in SiN in people with normal hearing and we utilised a stringent criterion to exclude participants with mild tone sensitivity impairment in the high-frequency range. We did not measure audiometric thresholds above 8 kHz because these thresholds are not measured routinely in clinical practice, although they may have some effect on SiN19. Although there was a relationship between high-frequency (4–8 kHz) audiometric thresholds and SiN in the full sample, consistent with the results of Holmes and Griffiths4, we did not find a significant correlation between audiometric hearing thresholds and SiN performance after excluding participants with audiometric thresholds > 15 dB HL. This may be because our participants were younger than in that previous experiment. One previous experiment that used a similar age range found no correlation20, and one previous experiment that used a similar threshold criterion found no correlation21. In addition, our sample size was smaller than the experiment reported by Holmes and Griffiths4 and therefore the absence of a correlation may be explained by lower power for detecting significant correlations, particularly after excluding participants.
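The point about power can be illustrated with a rough calculation. The sketch below uses the Fisher z approximation to estimate the power of a two-sided test of a Pearson correlation; the function name, the effect size and the sample sizes are hypothetical choices for illustration and are not values taken from this study.

```python
import math
from statistics import NormalDist

def approx_power(true_r, n, alpha=0.05):
    """Approximate power of a two-sided test of H0: rho = 0.

    Uses the Fisher z approximation: atanh(r) is treated as normal
    with standard error 1 / sqrt(n - 3). Requires n > 3.
    """
    norm = NormalDist()
    crit = norm.inv_cdf(1 - alpha / 2)            # two-sided critical value
    shift = math.atanh(true_r) * math.sqrt(n - 3)  # expected test statistic
    # Probability that the statistic falls in either rejection region.
    return (1 - norm.cdf(crit - shift)) + norm.cdf(-crit - shift)

# Hypothetical values: the same assumed correlation at two sample sizes.
for n in (40, 100):
    print(n, round(approx_power(0.3, n), 2))
```

Under these assumptions, power falls as the sample shrinks, so a true but modest correlation is more likely to be missed after participants are excluded.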
SiN performance and auditory working memory
The novel auditory WM tasks used here tested specific aspects of WM and did not include any linguistic aspects, which could affect performance22. We found that frequency WM precision correlated significantly with SiN thresholds, but AM WM precision did not. However, the difference in correlations between frequency precision and SiN, and AM precision and SiN was not statistically significant. Our findings indicate that people who are best able to hold in mind frequency over time, arguably a source property (as opposed to the temporal envelope related to events), have an advantage in SiN perception.
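For readers unfamiliar with comparing two correlations that share a variable (here, SiN), the following is a minimal sketch of one common approach, the Meng, Rosenthal and Rubin (1992) z test. It is an assumption that this test is appropriate here; the text above does not specify which test was used, and the input values in the example are hypothetical.

```python
import math
from statistics import NormalDist

def compare_dependent_correlations(r1, r2, r12, n):
    """Test whether two correlations with a shared variable differ.

    r1, r2 : correlations of predictors 1 and 2 with the shared outcome
    r12    : correlation between the two predictors
    n      : number of participants (same sample for all correlations)
    Returns the z statistic and the two-sided p value.
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)       # Fisher transforms
    mean_r_sq = (r1 ** 2 + r2 ** 2) / 2
    f = min((1 - r12) / (2 * (1 - mean_r_sq)), 1.0)  # capped at 1
    h = (1 - f * mean_r_sq) / (1 - mean_r_sq)
    z = (z1 - z2) * math.sqrt((n - 3) / (2 * (1 - r12) * h))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical inputs, not data from this study.
z, p = compare_dependent_correlations(r1=0.45, r2=0.20, r12=0.30, n=40)
print(round(z, 2), round(p, 3))
```

A non-significant result from such a test means that one significant and one non-significant correlation cannot, on their own, be taken as evidence that the two predictors differ in their relationship to SiN.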
There is little information on how higher precision for frequency may aid SiN perception, but clues can be drawn from some previous studies. Previous studies using a small number of competing talkers have shown that talkers who are separated from maskers in fundamental frequency are more intelligible than talkers who are closer in pitch to maskers23,24. Thus, at a broad scale, it seems plausible that the ability to hold frequency information in memory contributes to speech intelligibility in noise. The frequency of sounds in an auditory scene may also help group them into an auditory object, and the precision with which a listener can do this over time could aid the sequential segregation needed for successful SiN perception21. We might have expected AM WM to correlate with SiN. However, it is worth noting here that the SiN masker we used was 16-talker babble and therefore contained less prominent AM than a single-talker masker. AM WM may potentially come into play when a fluctuating masker means that temporal glimpses of the target are more prevalent.
We designed our task to specifically test auditory WM and attempted to minimise the influence of perception during the matching phase of the task. For example, both the frequency and AM rate could only be altered by a step increment of either +2% or −2% in the matching phase of a trial, which is above the discrimination thresholds for frequency and AM rate25. However, we did not formally measure the difference limens for each participant. Further work is needed to study how frequency discrimination is related to working memory precision with our task. This may also be relevant for participants with tone deafness, who may have poorer perceptual thresholds than the normal population and may therefore have been impaired in our task.
We suspect that the WM measures used here tap into different processes than the fundamental grouping processes measured by Holmes and Griffiths4. Those processes are carried out over hundreds of milliseconds, whereas the working memory precision measured here reflects a process carried out over seconds. It will be interesting in future studies to examine the extent to which the figure-ground task and auditory working memory tasks explain separate variance in SiN.
SiN performance and cognitive measures
Several studies have shown that cognitive function is correlated with SiN ability26,27. These studies have found relevance in measures of fluid intelligence, processing speed, and working memory batteries. A large UK Biobank study looked at correlations between the Triple-Digits Task (TDT), as a measure of SiN, and the above measures in half a million participants in midlife27. They found a significant relationship with all cognitive measures, independent of tone-sensitivity, as age increases from 40 to 70 years. A large meta-analysis28 looking at the link between cognitive function and SiN perception found that when all cognitive domains and SiN perception tasks were collapsed across all studies, they correlated significantly (r = 0.31). Inhibitory control, WM, episodic memory and processing speed were all deemed to be important correlates of SiN.
From first principles, there are multiple cognitive factors that could plausibly affect SiN perception. A listener must attend to the correct auditory stream, then track and remember this over time. There may also be top-down effects of language which may help anticipate and resolve any ambiguities that are encountered. However, the extent to which individual differences in these cognitive abilities relate to individual differences in speech-in-noise perception among young and healthy participants is equivocal. A review by Akeroyd2 suggested that no single test produces reliable correlations, although there is perhaps a more reliable link with phonological WM than with other measures.
We found that a verbal reading test correlated significantly with SiN performance. However, non-verbal measures, such as block design, did not correlate and neither did measures of long-term memory, such as list learning (see Table 1). This is consistent with evidence suggesting that linguistic experience can have an effect on speech discrimination due to advantages in lexical access29. Previous work28 has shown a wide variation when measures of crystallised intelligence have been used. The overall pooled correlation with the SiN perception tasks used in the literature was 0.19; however, this increased when SiN tasks that required a listener to correctly identify sentences, as opposed to words, were used. Within this meta-analysis, one study found a much stronger correlation than ours (0.70) when participants' scores from the vocabulary section of the Wechsler Adult Intelligence Scale were correlated with performance on a SiN perception task using a background of multi-talker babble with 20 speakers. However, some other studies have not found a significant relationship with tests of intelligence26.
Considering phonological working memory tests, we found no correlation between SiN performance and forward or backward digit span, although this has been identified as a significant correlate of SiN performance in large studies with participants who are older and/or have hearing loss2. Researchers have used a variety of tests, including the reading span and the forward and backward digit span. Some but not all studies have found significant correlations between the reading span and SiN performance. There have also been studies finding a positive association between the digit span and SiN measures26. Neither the forward nor the backward digit span correlated significantly with SiN performance in our study, but the strength of association for the backward digit span (r = 0.32) was similar to that in meta-analyses (r = 0.31)28. One of the reasons for this may be the smaller sample size of our study compared to the studies discussed above. The fine-grained measure of WM precision may also be more sensitive to detecting trends than the traditional phonological measures. Additionally, the mean level of hearing loss in these studies was around 40 dB HL whereas it was 7.1 dB HL in our study, which could have influenced this finding as suggested by Füllgrabe and Rosen3.
Role of musical instrument experience in working memory and SiN performance
There is a large body of literature which has identified that musical instrument experience modulates cognitive ability. Playing a musical instrument has been associated with better global cognitive scores as one gets older30, whereas this relationship is absent in musicians not actively playing an instrument. A twin study31 has also shown that the twin who played an instrument was less likely to develop dementia later in life. Musical training involves the simultaneous use of auditory, motor and visuospatial abilities, and the use of these abilities in tandem over time may have general protective effects on the brain.
Although our auditory working memory task for frequency precision requires the maintenance and comparison of tones in the musical range, over seconds, it does not include other aspects of music including melody or rhythm. However, playing music does require a sustained period of working memory for sound which is reflected in our task. Therefore, performance in our task may have been directly related to this. Further work is necessary to clarify whether the nature of musical expertise, the intensity of training and/or the type of instrument that is learned affects frequency working memory precision. It is also possible that the relationship may be simply driven by a tendency for people with better sensitivity for frequency to persist in musical activities.
Musical training has also been linked specifically with better SiN performance and frequency discrimination although this is controversial13,15,25,26. A study by Parbery-Clark et al.32 found that years of musical experience correlated with phonological working memory and frequency discrimination, as well as SiN. It is worth noting that their frequency discrimination task may contain elements of working memory. They used an adaptive paradigm to obtain a threshold for identifying the ‘higher’ sound out of two tones played in a sequence. Although this is described as a perceptual task, this task requires participants to maintain both tones over a few seconds, and compare them in memory (however, explicit details about timing are not mentioned in the study). It is plausible that the expertise gained from musical training in analysing and attending to auditory streams on a background of different music is transferable into more general and non-musical domains of SiN perception and cognition. To this end, further work is needed to establish whether expertise in attending to auditory streams of music in a background of other music (i.e. playing an instrument in an orchestra) adds an advantage above doing so without background music (i.e. soloists). This is a limitation of our study as the participants did not provide further details about the nature of their previous experience with a musical instrument (e.g. amateur vs. professional, self-taught vs. formal musical training). There is also other work that has not found a direct link between musical ability and SiN33.