Refereed Journal Publications

2021

September 2021
Bonnie K Lau, Andrew J Oxenham, Lynne A Werner
Adult listeners perceive pitch with fine precision, with many adults capable of discriminating less than a 1% change in fundamental frequency (F0). Although there is variability across individuals, this precise pitch perception is an ability ascribed to cortical functions that are also important for speech and music perception. Infants display neural immaturity in the auditory cortex, suggesting...
August 2021
Chhayakanta Patro, Heather A Kreft, Magdalena Wojtczak
Older adults often experience difficulties understanding speech in adverse listening conditions. It has been suggested that for listeners with normal and near-normal audiograms, these difficulties may, at least in part, arise from age-related cochlear synaptopathy. The aim of this study was to assess whether performance on auditory tasks relying on temporal envelope processing reveals age-related...
July 2021
Coral E Dirks, Peggy B Nelson, Andrew J Oxenham
CONCLUSION: Adjusting the FAT to optimize sensitivity to interaural temporal-envelope disparities did not improve localization or speech perception. The clinical frequency-to-place alignment may already be sufficient, given the inherently poor spectral resolution of CIs. Alternatively, other factors, such as temporal misalignment between the two ears, may need to be addressed before any benefits...
July 2021
Anahita H Mehta, Lei Feng, Andrew J Oxenham
The perception of sensory events can be enhanced or suppressed by the surrounding spatial and temporal context in ways that facilitate the detection of novel objects and contribute to the perceptual constancy of those objects under variable conditions. In the auditory system, the phenomenon known as auditory enhancement reflects a general principle of contrast enhancement, in which a target sound...
April 2021
Juraj Mesik, Lucia Ray, Magdalena Wojtczak
Speech-in-noise comprehension difficulties are common among the elderly population, yet traditional objective measures of speech perception are largely insensitive to this deficit, particularly in the absence of clinical hearing loss. In recent years, a growing body of research in young normal-hearing adults has demonstrated that high-level features related to speech semantics and lexical...
April 2021
Sara M K Madsen, Torsten Dau, Andrew J Oxenham
Differences in fundamental frequency (F0) or pitch between competing voices facilitate our ability to segregate a target voice from interferers, thereby enhancing speech intelligibility. Although lower-numbered harmonics elicit a stronger and more accurate pitch sensation than higher-numbered harmonics, it is unclear whether the stronger pitch leads to an increased benefit of pitch differences...
March 2021
Hao Lu, Martin F McKinney, Tao Zhang, Andrew J Oxenham
Although beamforming algorithms for hearing aids can enhance performance, the wearer's head may not always face the target talker, potentially limiting real-world benefits. This study aimed to determine the extent to which eye tracking improves the accuracy of locating the current talker in three-way conversations and to test the hypothesis that eye movements become more likely to track the...
February 2021
Erin R O'Neill, Morgan N Parke, Heather A Kreft, Andrew J Oxenham
This study assessed the impact of semantic context and talker variability on speech perception by cochlear-implant (CI) users and compared their overall performance and between-subjects variance with that of normal-hearing (NH) listeners under vocoded conditions. Thirty post-lingually deafened adult CI users were tested, along with 30 age-matched and 30 younger NH listeners, on sentences with and...

2020

December 2020
Juraj Mesik, Magdalena Wojtczak
Recent studies on amplitude modulation (AM) detection for tones in noise reported that AM-detection thresholds improve when the AM stimulus is preceded by a noise precursor. The physiological mechanisms underlying this AM unmasking are unknown. One possibility is that adaptation to the level of the noise precursor facilitates AM encoding by causing a shift in neural rate-level functions to...
December 2020
Alice E Milne, Roberta Bianco, Katarina C Poole, Sijia Zhao, Andrew J Oxenham, Alexander J Billig, Maria Chait
Online experimental platforms can be used as an alternative to, or complement, lab-based research. However, when conducting auditory experiments via online methods, the researcher has limited control over the participants' listening environment. We offer a new method to probe one aspect of that environment, headphone use. Headphones not only provide better control of sound presentation but can...
October 2020
Erin R O'Neill, Morgan N Parke, Heather A Kreft, Andrew J Oxenham
Purpose: The goal of this study was to develop and validate a new corpus of sentences without semantic context to facilitate research aimed at isolating the effects of semantic context in speech perception. Method: The newly developed corpus contains nonsensical sentences but is matched in vocabulary and syntactic structure to the existing Basic English Lexicon (BEL) corpus. It consists of 20...
September 2020
Kelly L Whiteford, Heather A Kreft, Andrew J Oxenham
Natural sounds convey information via frequency and amplitude modulations (FM and AM). Humans are acutely sensitive to the slow rates of FM that are crucial for speech and music. This sensitivity has long been thought to rely on precise stimulus-driven auditory-nerve spike timing (time code), whereas a coarser code, based on variations in the cochlear place of stimulation (place code), represents...
August 2020
Erin R O'Neill, Heather A Kreft, Andrew J Oxenham
An error in interpreting the statistical analysis output led to reporting errors in some of the effect sizes for the three-way repeated-measures ANOVAs in Experiment 1.
June 2020
Hao Lu, Anahita H Mehta, Hari M Bharadwaj, Barbara G Shinn-Cunningham, Andrew J Oxenham
No abstract
June 2020
Coral E Dirks, Peggy B Nelson, Matthew B Winn, Andrew J Oxenham
For cochlear-implant users with near-normal contralateral hearing, a mismatch between the frequency-to-place mapping in the two ears could produce a suboptimal performance. This study assesses tonotopic matches via binaural interactions. Dynamic interaural time-difference sensitivity was measured using bandpass-filtered pulse trains at different rates in the acoustic and implanted ear, creating...
May 2020
Andrew J Oxenham
We are generally able to identify sounds and understand speech with ease, despite the large variations in the acoustics of each sound, which occur due to factors such as different talkers, background noise, and room acoustics. This form of perceptual constancy is likely to be mediated in part by the auditory system's ability to adapt to the ongoing environment or context in which sounds are...
May 2020
Anahita H Mehta, Andrew J Oxenham
This study investigated the relationship between fundamental frequency difference limens (F0DLs) and the lowest harmonic number present over a wide range of F0s (30-2000 Hz) for 12-component harmonic complex tones that were presented in either sine or random phase. For fundamental frequencies (F0s) between 100 and 400 Hz, a transition from low (∼1%) to high (∼5%) F0DLs occurred as the lowest...
February 2020
Anahita H Mehta, Hao Lu, Andrew J Oxenham
Cochlear implant (CI) listeners typically perform poorly on tasks involving the pitch of complex tones. This limitation in performance is thought to be mainly due to the restricted number of active channels and the broad current spread that leads to channel interactions and subsequent loss of precise spectral information, with temporal information limited primarily to temporal-envelope cues....
December 2019
Lei Feng, Andrew J Oxenham
CONCLUSIONS: Only the spectral contrasts used by listeners contributed to the spectral contrast effects in vowel identification. These results explain why CI users can experience larger-than-normal context effects under specific conditions. The results also suggest that adaptation to new spectral cues can be very rapid for vowel discrimination, but may follow a longer time course to influence...

2019

Spectral Contrast Effects Reveal Different Acoustic Cues for Vowel Recognition in Cochlear-Implant Users.

Ear Hear. 2019 Dec 02

Authors: Feng L, Oxenham AJ

Abstract
OBJECTIVES: The identity of a speech sound can be affected by the spectrum of a preceding stimulus in a contrastive manner. Although such aftereffects are often reduced in people with hearing loss and cochlear implants (CIs), one recent study demonstrated larger spectral contrast effects in CI users than in normal-hearing (NH) listeners. The present study aimed to shed light on this puzzling finding. We hypothesized that poorer spectral resolution leads CI users to rely on different acoustic cues not only to identify speech sounds but also to adapt to the context.
DESIGN: Thirteen postlingually deafened adult CI users and 33 NH participants (listening to either vocoded or unprocessed speech) participated in this study. Psychometric functions were estimated in a vowel categorization task along the /I/ to /ε/ (as in "bit" and "bet") continuum following a context sentence, the long-term average spectrum of which was manipulated at the level of either fine-grained local spectral cues or coarser global spectral cues.
RESULTS: In NH listeners with unprocessed speech, the aftereffect was determined solely by the fine-grained local spectral cues, resulting in a surprising insensitivity to the larger, global spectral cues utilized by CI users. Restricting the spectral resolution available to NH listeners via vocoding resulted in patterns of responses more similar to those found in CI users. However, the size of the contrast aftereffect remained smaller in NH listeners than in CI users.
CONCLUSIONS: Only the spectral contrasts used by listeners contributed to the spectral contrast effects in vowel identification. These results explain why CI users can experience larger-than-normal context effects under specific conditions. The results also suggest that adaptation to new spectral cues can be very rapid for vowel discrimination, but may follow a longer time course to influence spectral contrast effects.

PMID: 31815819

Auditory enhancement under forward masking in normal-hearing and hearing-impaired listeners.

J Acoust Soc Am. 2019 Nov;146(5):3448

Authors: Kreft HA, Oxenham AJ

Abstract
A target within a spectrally notched masker can be enhanced by a preceding copy of the masker. Enhancement can also increase the effectiveness of the target as a forward masker. Enhancement has been reported in hearing-impaired listeners under simultaneous but not forward masking. However, previous studies of enhancement under forward masking did not fully assess the potential effect of differences in sensation level or spectral resolution between the normal-hearing and hearing-impaired listeners. This study measured enhancement via forward masking in hearing-impaired and age-matched normal-hearing listeners with different spectral notches in the masker, to account for potential differences in frequency selectivity, and with levels equated by adding a background masking noise to equate both sensation level and sound pressure level or by reducing the sound pressure level of the stimuli to equate sensation level. Hearing-impaired listeners showed no significant enhancement, regardless of spectral notch width. Normal-hearing listeners showed enhancement at high levels, but showed less enhancement when sensation levels were reduced to match those of the hearing-impaired group, either by reducing sound levels or by adding a masking noise. The results confirm a lack of forward-masked enhancement in hearing-impaired listeners but suggest this may be partly due to reduced sensation level.

PMID: 31795651

Short- and long-term memory for pitch and non-pitch contours: Insights from congenital amusia.

Brain Cogn. 2019 Sep 20;136:103614

Authors: Graves JE, Pralus A, Fornoni L, Oxenham AJ, Caclin A, Tillmann B

Abstract
Congenital amusia is a neurodevelopmental disorder characterized by deficits in music perception, including discriminating and remembering melodies and melodic contours. As non-amusic listeners can perceive contours in dimensions other than pitch, such as loudness and brightness, our present study investigated whether amusics' pitch contour deficits also extend to these other auditory dimensions. Amusic and control participants performed an identification task for ten familiar melodies and a short-term memory task requiring the discrimination of changes in the contour of novel four-tone melodies. For both tasks, melodic contour was defined by pitch, brightness, or loudness. Amusic participants showed some ability to extract contours in all three dimensions. For familiar melodies, amusic participants showed impairment in all conditions, perhaps reflecting the fact that the long-term memory representations of the familiar melodies were defined in pitch. In the contour discrimination task with novel melodies, amusic participants exhibited less impairment for loudness-based melodies than for pitch- or brightness-based melodies, suggesting some specificity of the deficit for spectral changes, if not for pitch alone. The results suggest pitch and brightness may not be processed by the same mechanisms as loudness, and that short-term memory for loudness contours may be spared to some degree in congenital amusia.

PMID: 31546175

No effects of attention or visual perceptual load on cochlear function, as measured with stimulus-frequency otoacoustic emissions.

J Acoust Soc Am. 2019 Aug;146(2):1475

Authors: Beim JA, Oxenham AJ, Wojtczak M

Abstract
The effects of selectively attending to a target stimulus in a background containing distractors can be observed in cortical representations of sound as an attenuation of the representation of distractor stimuli. The locus in the auditory system at which attentional modulations first arise is unknown, but anatomical evidence suggests that cortically driven modulation of neural activity could extend as peripherally as the cochlea itself. Previous studies of selective attention have used otoacoustic emissions to probe cochlear function under varying conditions of attention with mixed results. In the current study, two experiments combined visual and auditory tasks to maximize sustained attention, perceptual load, and cochlear dynamic range in an attempt to improve the likelihood of observing selective attention effects on cochlear responses. Across a total of 45 listeners in the two experiments, no systematic effects of attention or perceptual load were observed on stimulus-frequency otoacoustic emissions. The results revealed significant between-subject variability in the otoacoustic-emission measure of cochlear function that does not depend on listener performance in the behavioral tasks and is not related to movement-generated noise. The findings suggest that attentional modulation of auditory information in humans arises at stages of processing beyond the cochlea.

PMID: 31472524

Cognitive factors contribute to speech perception in cochlear-implant users and age-matched normal-hearing listeners under vocoded conditions.

J Acoust Soc Am. 2019 Jul;146(1):195

Authors: O'Neill ER, Kreft HA, Oxenham AJ

Abstract
This study examined the contribution of perceptual and cognitive factors to speech-perception abilities in cochlear-implant (CI) users. Thirty CI users were tested on word intelligibility in sentences with and without semantic context, presented in quiet and in noise. Performance was compared with measures of spectral-ripple detection and discrimination, thought to reflect peripheral processing, as well as with cognitive measures of working memory and non-verbal intelligence. Thirty age-matched and thirty younger normal-hearing (NH) adults also participated, listening via tone-excited vocoders, adjusted to produce mean performance for speech in noise comparable to that of the CI group. Results suggest that CI users may rely more heavily on semantic context than younger or older NH listeners, and that non-auditory working memory explains significant variance in the CI and age-matched NH groups. Between-subject variability in spectral-ripple detection thresholds was similar across groups, despite the spectral resolution for all NH listeners being limited by the same vocoder, whereas speech perception scores were more variable between CI users than between NH listeners. The results highlight the potential importance of central factors in explaining individual differences in CI users and question the extent to which standard measures of spectral resolution in CIs reflect purely peripheral processing.

PMID: 31370651

Speech perception is similar for musicians and non-musicians across a wide range of conditions.

Sci Rep. 2019 Jul 18;9(1):10404

Authors: Madsen SMK, Marschall M, Dau T, Oxenham AJ

Abstract
It remains unclear whether musical training is associated with improved speech understanding in a noisy environment, with different studies reaching differing conclusions. Even in those studies that have reported an advantage for highly trained musicians, it is not known whether the benefits measured in laboratory tests extend to more ecologically valid situations. This study aimed to establish whether musicians are better than non-musicians at understanding speech in a background of competing speakers or speech-shaped noise under more realistic conditions, involving sounds presented in space via a spherical array of 64 loudspeakers, rather than over headphones, with and without simulated room reverberation. The study also included experiments testing fundamental frequency difference limens (F0DLs), interaural time difference limens (ITDLs), and attentive tracking. Sixty-four participants (32 non-musicians and 32 musicians) were tested, with the two groups matched in age, sex, and IQ as assessed with Raven's Advanced Progressive Matrices. There was a significant benefit of musicianship for F0DLs, ITDLs, and attentive tracking. However, speech scores were not significantly different between the two groups. The results suggest no musician advantage for understanding speech in background noise or talkers under a variety of conditions.

PMID: 31320656

The role of pitch and harmonic cancellation when listening to speech in harmonic background sounds.

J Acoust Soc Am. 2019 May;145(5):3011

Authors: Guest DR, Oxenham AJ

Abstract
Fundamental frequency differences (ΔF0) between competing talkers aid in the perceptual segregation of the talkers (ΔF0 benefit), but the underlying mechanisms remain incompletely understood. A model of ΔF0 benefit based on harmonic cancellation proposes that a masker's periodicity can be used to cancel (i.e., filter out) its neural representation. Earlier work suggested that an octave ΔF0 provided little benefit, an effect predicted by harmonic cancellation due to the shared periodicity of masker and target. Alternatively, this effect can be explained by spectral overlap between the harmonic components of the target and masker. To assess these competing explanations, speech intelligibility of a monotonized target talker, masked by a speech-shaped harmonic complex tone, was measured as a function of ΔF0, masker spectrum (all harmonics or odd harmonics only), and masker temporal envelope (amplitude modulated or unmodulated). Removal of the masker's even harmonics when the target was one octave above the masker improved speech reception thresholds by about 5 dB. Because this manipulation eliminated spectral overlap between target and masker components but preserved shared periodicity, the finding is consistent with the explanation for the lack of ΔF0 benefit at the octave based on spectral overlap, but not with the explanation based on harmonic cancellation.

PMID: 31153349

Exploring the Role of Medial Olivocochlear Efferents on the Detection of Amplitude Modulation for Tones Presented in Noise.

J Assoc Res Otolaryngol. 2019 Aug;20(4):395-413

Authors: Wojtczak M, Klang AM, Torunsky NT

Abstract
The medial olivocochlear reflex has been hypothesized to improve the detection and discrimination of dynamic signals in noisy backgrounds. This hypothesis was tested here by comparing behavioral outcomes with otoacoustic emissions. The effects of a precursor on amplitude-modulation (AM) detection were measured for a 1- and 6-kHz carrier at levels of 40, 60, and 80 dB SPL in a two-octave-wide noise masker with a level designed to produce poor, but above-chance, performance. Three types of precursor were used: a two-octave noise band, an inharmonic complex tone, and a pure tone. Precursors had the same overall level as the simultaneous noise masker that immediately followed the precursor. The noise precursor produced a large improvement in AM detection for both carrier frequencies and at all three levels. The complex tone produced a similarly large improvement in AM detection at the highest level but had a smaller effect for the two lower carrier levels. The tonal precursor did not significantly affect AM detection in noise. Comparisons of behavioral thresholds and medial olivocochlear efferent effects on stimulus frequency otoacoustic emissions measured with similar stimuli did not support the hypothesis that efferent-based reduction of cochlear responses contributes to the precursor effects on AM detection.

PMID: 31140010

Comparing Rapid and Traditional Forward-Masked Spatial Tuning Curves in Cochlear-Implant Users.

Trends Hear. 2019 Jan-Dec;23:2331216519851306

Authors: Kreft HA, DeVries LA, Arenberg JG, Oxenham AJ

Abstract
A rapid forward-masked spatial tuning curve measurement procedure, based on Bekesy tracking, was adapted and evaluated for use with cochlear implants. Twelve postlingually-deafened adult cochlear-implant users participated. Spatial tuning curves using the new procedure and using a traditional forced-choice adaptive procedure resulted in similar estimates of parameters. The Bekesy-tracking method was almost 3 times faster than the forced-choice procedure, but its test-retest reliability was significantly poorer. Although too time-consuming for general clinical use, the new method may have some benefits in individual cases, where identifying electrodes with poor spatial selectivity as candidates for deactivation is deemed necessary.

PMID: 31134842

Pitch discrimination with mixtures of three concurrent harmonic complexes.

J Acoust Soc Am. 2019 Apr;145(4):2072

Authors: Graves JE, Oxenham AJ

Abstract
In natural listening contexts, especially in music, it is common to hear three or more simultaneous pitches, but few empirical or theoretical studies have addressed how this is achieved. Place and pattern-recognition theories of pitch require at least some harmonics to be spectrally resolved for pitch to be extracted, but it is unclear how often such conditions exist when multiple complex tones are presented together. In three behavioral experiments, mixtures of three concurrent complexes were filtered into a single bandpass spectral region, and the relationship between the fundamental frequencies and spectral region was varied in order to manipulate the extent to which harmonics were resolved either before or after mixing. In experiment 1, listeners discriminated major from minor triads (a difference of 1 semitone in one note of the triad). In experiments 2 and 3, listeners compared the pitch of a probe tone with that of a subsequent target, embedded within two other tones. All three experiments demonstrated above-chance performance, even in conditions where the combinations of harmonic components were unlikely to be resolved after mixing, suggesting that fully resolved harmonics may not be necessary to extract the pitch from multiple simultaneous complexes.

PMID: 31046318

The upper frequency limit for the use of phase locking to code temporal fine structure in humans: A compilation of viewpoints.

Hear Res. 2019 Jun;377:109-121

Authors: Verschooten E, Shamma S, Oxenham AJ, Moore BCJ, Joris PX, Heinz MG, Plack CJ

Abstract
The relative importance of neural temporal and place coding in auditory perception is still a matter of much debate. The current article is a compilation of viewpoints from leading auditory psychophysicists and physiologists regarding the upper frequency limit for the use of neural phase locking to code temporal fine structure in humans. While phase locking is used for binaural processing up to about 1500 Hz, there is disagreement regarding the use of monaural phase-locking information at higher frequencies. Estimates of the general upper limit proposed by the contributors range from 1500 to 10000 Hz. The arguments depend on whether or not phase locking is needed to explain psychophysical discrimination performance at frequencies above 1500 Hz, and whether or not the phase-locked neural representation is sufficiently robust at these frequencies to provide useable information. The contributors suggest key experiments that may help to resolve this issue, and experimental findings that may cause them to change their minds. This issue is of crucial importance to our understanding of the neural basis of auditory perception in general, and of pitch perception in particular.

PMID: 30927686

Mechanisms of Localization and Speech Perception with Colocated and Spatially Separated Noise and Speech Maskers Under Single-Sided Deafness with a Cochlear Implant.

Ear Hear. 2019 Mar 07

Authors: Dirks C, Nelson PB, Sladen DP, Oxenham AJ

Abstract
OBJECTIVES: This study tested listeners with a cochlear implant (CI) in one ear and acoustic hearing in the other ear, to assess their ability to localize sound and to understand speech in collocated or spatially separated noise or speech maskers.
DESIGN: Eight CI listeners with contralateral acoustic hearing ranging from normal hearing to moderate sensorineural hearing loss were tested. Localization accuracy was measured in five of the listeners using stimuli that emphasized the separate contributions of interaural level differences (ILDs) and interaural time differences (ITD) in the temporal envelope and/or fine structure. Sentence recognition was tested in all eight CI listeners, using collocated and spatially separated speech-shaped Gaussian noise and two-talker babble. Performance was compared with that of age-matched normal-hearing listeners via loudspeakers or via headphones with vocoder simulations of CI processing.
RESULTS: Localization improved with the CI but only when high-frequency ILDs were available. Listeners experienced no additional benefit via ITDs in the stimulus envelope or fine structure using real or vocoder-simulated CIs. Speech recognition in two-talker babble improved with a CI in seven of the eight listeners when the target was located at the front and the babble was presented on the side of the acoustic-hearing ear, but otherwise showed little or no benefit of a CI.
CONCLUSION: Sound localization can be improved with a CI in cases of significant residual hearing in the contralateral ear, but only for sounds with high-frequency content, and only based on ILDs. In speech understanding, the CI contributed most when it was in the ear with the better signal to noise ratio with a speech masker.

PMID: 30870240

Cortical Correlates of Attention to Auditory Features.

J Neurosci. 2019 Apr 24;39(17):3292-3300

Authors: Allen EJ, Burton PC, Mesik J, Olman CA, Oxenham AJ

Abstract
Pitch and timbre are two primary features of auditory perception that are generally considered independent. However, an increase in pitch (produced by a change in fundamental frequency) can be confused with an increase in brightness (an attribute of timbre related to spectral centroid) and vice versa. Previous work indicates that pitch and timbre are processed in overlapping regions of the auditory cortex, but are separable to some extent via multivoxel pattern analysis. Here, we tested whether attention to one or other feature increases the spatial separation of their cortical representations and if attention can enhance the cortical representation of these features in the absence of any physical change in the stimulus. Ten human subjects (four female, six male) listened to pairs of tone triplets varying in pitch, timbre, or both and judged which tone triplet had the higher pitch or brighter timbre. Variations in each feature engaged common auditory regions with no clear distinctions at a univariate level. Attending to one did not improve the separability of the neural representations of pitch and timbre at the univariate level. At the multivariate level, the classifier performed above chance in distinguishing between conditions in which pitch or timbre was discriminated. The results confirm that the computations underlying pitch and timbre perception are subserved by strongly overlapping cortical regions, but reveal that attention to one or other feature leads to distinguishable activation patterns even in the absence of physical differences in the stimuli.

SIGNIFICANCE STATEMENT: Although pitch and timbre are generally thought of as independent auditory features of a sound, pitch height and timbral brightness can be confused for one another. This study shows that pitch and timbre variations are represented in overlapping regions of auditory cortex, but that they produce distinguishable patterns of activation. Most importantly, the patterns of activation can be distinguished based on whether subjects attended to pitch or timbre even when the stimuli remained physically identical. The results therefore show that variations in pitch and timbre are represented by overlapping neural networks, but that attention to different features of the same sound can lead to distinguishable patterns of activation.

PMID: 30804086 [PubMed - in process]


Corrigendum to "Learning for pitch and melody discrimination in congenital amusia" [Cortex 103 (2018) 167-178].

Cortex. 2019 Jun;115:371

Authors: Whiteford KL, Oxenham AJ

PMID: 30803741 [PubMed]


Speech Perception with Spectrally Non-overlapping Maskers as Measure of Spectral Resolution in Cochlear Implant Users.

J Assoc Res Otolaryngol. 2019 Apr;20(2):151-167

Authors: O'Neill ER, Kreft HA, Oxenham AJ

Abstract
Poor spectral resolution contributes to the difficulties experienced by cochlear implant (CI) users when listening to speech in noise. However, correlations between measures of spectral resolution and speech perception in noise have not always been found to be robust. It may be that the relationship between spectral resolution and speech perception in noise becomes clearer in conditions where the speech and noise are not spectrally matched, so that improved spectral resolution can assist in separating the speech from the masker. To test this prediction, speech intelligibility was measured with noise or tone maskers that were presented either in the same spectral channels as the speech or in interleaved spectral channels. Spectral resolution was estimated via a spectral ripple discrimination task. Results from vocoder simulations in normal-hearing listeners showed increasing differences in speech intelligibility between spectrally overlapped and interleaved maskers as well as improved spectral ripple discrimination with increasing spectral resolution. However, no clear differences were observed in CI users between performance with spectrally interleaved and overlapped maskers, or between tone and noise maskers. The results suggest that spectral resolution in current CIs is too poor to take advantage of the spectral separation produced by spectrally interleaved speech and maskers. Overall, the spectrally interleaved and tonal maskers produce a much larger difference in performance between normal-hearing listeners and CI users than do traditional speech-in-noise measures, and thus provide a more sensitive test of speech perception abilities for current and future implantable devices.

PMID: 30456730 [PubMed - in process]


2018


Fundamental-frequency discrimination based on temporal-envelope cues: Effects of bandwidth and interference.

J Acoust Soc Am. 2018 Nov;144(5):EL423

Authors: Mehta AH, Oxenham AJ

Abstract
Both music and speech perception rely on hearing out one pitch in the presence of others. Pitch discrimination of narrowband sounds based only on temporal-envelope cues is rendered nearly impossible by introducing interferers in both normal-hearing listeners and cochlear-implant (CI) users. This study tested whether performance improves in normal-hearing listeners if the target is presented over a broad spectral region. The results indicate that performance is still strongly affected by spectrally remote interferers, despite increases in bandwidth, suggesting that envelope-based pitch is unlikely to allow CI users to perceive pitch when multiple harmonic sounds are presented at once.

PMID: 30522318 [PubMed - in process]


Examining replicability of an otoacoustic measure of cochlear function during selective attention.

J Acoust Soc Am. 2018 Nov;144(5):2882

Authors: Beim JA, Oxenham AJ, Wojtczak M

Abstract
Attention to a target stimulus within a complex scene often results in enhanced cortical representations of the target relative to the background. It remains unclear where along the auditory pathways attentional effects can first be measured. Anatomy suggests that attentional modulation could occur through corticofugal connections extending as far as the cochlea itself. Earlier attempts to investigate the effects of attention on human cochlear processing have revealed small and inconsistent effects. In this study, stimulus-frequency otoacoustic emissions were recorded from a total of 30 human participants as they performed tasks that required sustained selective attention to auditory or visual stimuli. In the first sample of 15 participants, emission magnitudes were significantly weaker when participants attended to the visual stimuli than when they attended to the auditory stimuli, by an average of 5.4 dB. However, no such effect was found in the second sample of 15 participants. When the data were pooled across samples, the average attentional effect was significant, but small (2.48 dB), with 12 of 30 listeners showing a significant effect, based on bootstrap analysis of the individual data. The results highlight the need for considering sources of individual differences and using large sample sizes in future investigations.

PMID: 30522315 [PubMed - in process]



Cortical markers of auditory stream segregation revealed for streaming based on tonotopy but not pitch.

J Acoust Soc Am. 2018 Oct;144(4):2424

Authors: Ruggles DR, Tausend AN, Shamma SA, Oxenham AJ

Abstract
The brain decomposes mixtures of sounds, such as competing talkers, into perceptual streams that can be attended to individually. Attention can enhance the cortical representation of streams, but it is unknown what acoustic features the enhancement reflects, or where in the auditory pathways attentional enhancement is first observed. Here, behavioral measures of streaming were combined with simultaneous low- and high-frequency envelope-following responses (EFR) that are thought to originate primarily from cortical and subcortical regions, respectively. Repeating triplets of harmonic complex tones were presented with alternating fundamental frequencies. The tones were filtered to contain either low-numbered spectrally resolved harmonics, or only high-numbered unresolved harmonics. The behavioral results confirmed that segregation can be based on either tonotopic or pitch cues. The EFR results revealed no effects of streaming or attention on subcortical responses. Cortical responses revealed attentional enhancement under conditions of streaming, but only when tonotopic cues were available, not when streaming was based only on pitch cues. The results suggest that the attentional modulation of phase-locked responses is dominated by tonotopically tuned cortical neurons that are insensitive to pitch or periodicity cues.

PMID: 30404514 [PubMed - in process]


Mammalian behavior and physiology converge to confirm sharper cochlear tuning in humans.

Proc Natl Acad Sci U S A. 2018 Oct 30;115(44):11322-11326

Authors: Sumner CJ, Wells TT, Bergevin C, Sollini J, Kreft HA, Palmer AR, Oxenham AJ, Shera CA

Abstract
Frequency analysis of sound by the cochlea is the most fundamental property of the auditory system. Despite its importance, the resolution of this frequency analysis in humans remains controversial. The controversy persists because the methods used to estimate tuning in humans are indirect and have not all been independently validated in other species. Some data suggest that human cochlear tuning is considerably sharper than that of laboratory animals, while others suggest little or no difference between species. We show here in a single species (ferret) that behavioral estimates of tuning bandwidths obtained using perceptual masking methods, and objective estimates obtained using otoacoustic emissions, both also employed in humans, agree closely with direct physiological measurements from single auditory-nerve fibers. Combined with human behavioral data, this outcome indicates that the frequency analysis performed by the human cochlea is of significantly higher resolution than found in common laboratory animals. This finding raises important questions about the evolutionary origins of human cochlear tuning, its role in the emergence of speech communication, and the mechanisms underlying our ability to separate and process natural sounds in complex acoustic environments.

PMID: 30322908 [PubMed - indexed for MEDLINE]


Hearing, Emotion, Amplification, Research, and Training Workshop: Current Understanding of Hearing Loss and Emotion Perception and Priorities for Future Research.

Trends Hear. 2018 Jan-Dec;22:2331216518803215

Authors: Picou EM, Singh G, Goy H, Russo F, Hickson L, Oxenham AJ, Buono GH, Ricketts TA, Launer S

Abstract
The question of how hearing loss and hearing rehabilitation affect patients' momentary emotional experiences is one that has received little attention but has considerable potential to affect patients' psychosocial function. This article is a product from the Hearing, Emotion, Amplification, Research, and Training workshop, which was convened to develop a consensus document describing research on emotion perception relevant for hearing research. This article outlines conceptual frameworks for the investigation of emotion in hearing research; available subjective, objective, neurophysiologic, and peripheral physiologic data acquisition research methods; the effects of age and hearing loss on emotion perception; potential rehabilitation strategies; priorities for future research; and implications for clinical audiologic rehabilitation. More broadly, this article aims to increase awareness about emotion perception research in audiology and to stimulate additional research on the topic.

PMID: 30270810 [PubMed - indexed for MEDLINE]


Auditory enhancement and the role of spectral resolution in normal-hearing listeners and cochlear-implant users.

J Acoust Soc Am. 2018 Aug;144(2):552

Authors: Feng L, Oxenham AJ

Abstract
Detection of a target tone in a simultaneous multi-tone masker can be improved by preceding the stimulus with the masker alone. The mechanisms underlying this auditory enhancement effect may enable the efficient detection of new acoustic events and may help to produce perceptual constancy under varying acoustic conditions. Previous work in cochlear-implant (CI) users has suggested reduced or absent enhancement, due perhaps to poor spatial resolution in the cochlea. This study used a supra-threshold enhancement paradigm that in normal-hearing listeners results in large enhancement effects, exceeding 20 dB. Results from vocoder simulations using normal-hearing listeners showed that near-normal enhancement was observed if the simulated spread of excitation was limited to spectral slopes no shallower than 24 dB/oct. No significant enhancement was observed on average in CI users with their clinical monopolar stimulation strategy. The variability in enhancement between CI users, and between electrodes in a single CI user, could not be explained by the spread of excitation, as estimated from auditory nerve evoked potentials. Enhancement remained small, but did reach statistical significance, under the narrower partial-tripolar stimulation strategy. The results suggest that enhancement may be at least partially restored by improvements in the spatial resolution of current CIs.

PMID: 30180692 [PubMed - in process]


Effects of spectral resolution on spectral contrast effects in cochlear-implant users.

J Acoust Soc Am. 2018 Jun;143(6):EL468

Authors: Feng L, Oxenham AJ

Abstract
The identity of a speech sound can be affected by the long-term spectrum of a preceding stimulus. Poor spectral resolution of cochlear implants (CIs) may affect such context effects. Here, spectral contrast effects on a phoneme category boundary were investigated in CI users and normal-hearing (NH) listeners. Surprisingly, larger contrast effects were observed in CI users than in NH listeners, even when spectral resolution in NH listeners was limited via vocoder processing. The results may reflect a different weighting of spectral cues by CI users, based on poorer spectral resolution, which in turn may enhance some spectral contrast effects.

PMID: 29960500 [PubMed - in process]


Spectral contrast effects produced by competing speech contexts.

J Exp Psychol Hum Percept Perform. 2018 Sep;44(9):1447-1457

Authors: Feng L, Oxenham AJ

Abstract
The long-term spectrum of a preceding sentence can alter the perception of a following speech sound in a contrastive manner. This speech context effect contributes to our ability to extract reliable spectral characteristics of the surrounding acoustic environment and to compensate for the voice characteristics of different speakers or spectral colorations in different listening environments to maintain perceptual constancy. The extent to which such effects are mediated by low-level "automatic" processes, or require directed attention, remains unknown. This study investigated spectral context effects by measuring the effects of two competing sentences on the phoneme category boundary between /i/ and /ε/ in a following target word, while directing listeners' attention to one or the other context sentence. Spatial separation of the context sentences was achieved either by presenting them to different ears, or by presenting them to both ears but imposing an interaural time difference (ITD) between the ears. The results confirmed large context effects based on ear of presentation. Smaller effects were observed based on either ITD or attention. The results, combined with predictions from a two-stage model, suggest that ear-specific factors dominate speech context effects but that the effects can be modulated by higher-level features, such as perceived location, and by attention.

PMID: 29847973 [PubMed - indexed for MEDLINE]


Learning for pitch and melody discrimination in congenital amusia.

Cortex. 2018 Jun;103:164-178

Authors: Whiteford KL, Oxenham AJ

Abstract
Congenital amusia is currently thought to be a life-long neurogenetic disorder in music perception, impervious to training in pitch or melody discrimination. This study provides an explicit test of whether amusic deficits can be reduced with training. Twenty amusics and 20 matched controls participated in four sessions of psychophysical training involving either pure-tone (500 Hz) pitch discrimination or a control task of lateralization (interaural level differences for bandpass white noise). Pure-tone pitch discrimination at low, medium, and high frequencies (500, 2000, and 8000 Hz) was measured before and after training (pretest and posttest) to determine the specificity of learning. Melody discrimination was also assessed before and after training using the full Montreal Battery of Evaluation of Amusia, the most widely used standardized test to diagnose amusia. Amusics performed more poorly than controls in pitch but not localization discrimination, but both groups improved with practice on the trained stimuli. Learning was broad, occurring across all three frequencies and melody discrimination for all groups, including those who trained on the non-pitch control task. Following training, 11 of 20 amusics no longer met the global diagnostic criteria for amusia. A separate group of untrained controls (n = 20), who also completed melody discrimination and pretest, improved by an equal amount as trained controls on all measures, suggesting that the bulk of learning for the control group occurred very rapidly from the pretest. Thirty-one trained participants (13 amusics) returned one year later to assess long-term maintenance of pitch and melody discrimination. On average, there was no change in performance between posttest and one-year follow-up, demonstrating that improvements on pitch- and melody-related tasks in amusics and controls can be maintained. 
The findings indicate that amusia is not always a life-long deficit when using the current standard diagnostic criteria.

PMID: 29655041 [PubMed - in process]


Effect of age and hearing loss on auditory stream segregation of speech sounds.

Hear Res. 2018 Jul;364:118-128

Authors: David M, Tausend AN, Strelcyk O, Oxenham AJ

Abstract
Segregating and understanding speech in complex environments is a major challenge for hearing-impaired (HI) listeners. It remains unclear to what extent these difficulties are dominated by direct interference, such as simultaneous masking, or by a failure of the mechanisms of stream segregation. This study compared older HI listeners' performance with that of young and older normal-hearing (NH) listeners in stream segregation tasks involving speech sounds. Listeners were presented with sequences of speech tokens, each consisting of a fricative consonant and a voiced vowel (CV). The CV tokens were concatenated into interleaved sequences that alternated in fundamental frequency (F0) and/or simulated vocal tract length (VTL). Each pair of interleaved sequences was preceded by a "word" consisting of two random tokens. The listeners were asked to indicate whether the word was present in the following interleaved sequences. The word, if present, occurred within one of the interleaved sequences, so that performance improved if the listeners were able to perceptually segregate the two sequences. Although HI listeners' identification of the speech tokens in isolation was poorer than that of the NH listeners, HI listeners were generally able to use both F0 and VTL cues to segregate the interleaved sequences. The results suggest that the difficulties experienced by HI listeners in complex acoustic environments cannot be explained by a loss of basic stream segregation abilities.

PMID: 29602593 [PubMed - indexed for MEDLINE]


A Dynamically Focusing Cochlear Implant Strategy Can Improve Vowel Identification in Noise.

Ear Hear. 2018 Nov/Dec;39(6):1136-1145

Authors: Arenberg JG, Parkinson WS, Litvak L, Chen C, Kreft HA, Oxenham AJ

Abstract
OBJECTIVES: The standard, monopolar (MP) electrode configuration used in commercially available cochlear implants (CI) creates a broad electrical field, which can lead to unwanted channel interactions. Use of more focused configurations, such as tripolar and phased array, has led to mixed results for improving speech understanding. The purpose of the present study was to assess the efficacy of a physiologically inspired configuration called dynamic focusing, using focused tripolar stimulation at low levels and less focused stimulation at high levels. Dynamic focusing may better mimic cochlear excitation patterns in normal acoustic hearing, while reducing the current levels necessary to achieve sufficient loudness at high levels.
DESIGN: Twenty postlingually deafened adult CI users participated in the study. Speech perception was assessed in quiet and in a four-talker babble background noise. Speech stimuli were closed-set spondees in noise, and medial vowels at 50 and 60 dB SPL in quiet and in noise. The signal to noise ratio was adjusted individually such that performance was between 40 and 60% correct with the MP strategy. Subjects were fitted with three experimental strategies matched for pulse duration, pulse rate, filter settings, and loudness on a channel-by-channel basis. The strategies included 14 channels programmed in MP, fixed partial tripolar (σ = 0.8), and dynamic partial tripolar (σ at 0.8 at threshold and 0.5 at the most comfortable level). Fifteen minutes of listening experience was provided with each strategy before testing. Sound quality ratings were also obtained.
RESULTS: Speech perception performance for vowel identification in quiet at 50 and 60 dB SPL and for spondees in noise was similar for the three tested strategies. However, performance on vowel identification in noise was significantly better for listeners using the dynamic focusing strategy. Sound quality ratings were similar for the three strategies. Some subjects obtained more benefit than others, with some individual differences explained by the relation between loudness growth and the rate of change from focused to broader stimulation.
CONCLUSIONS: These initial results suggest that further exploration of dynamic focusing is warranted. Specifically, optimizing such strategies on an individual basis may lead to improvements in speech perception for more adult listeners and improve how CIs are tailored. Some listeners may also need a longer period of time to acclimate to a new program.

PMID: 29529006 [PubMed - indexed for MEDLINE]


Auditory enhancement under simultaneous masking in normal-hearing and hearing-impaired listeners.

J Acoust Soc Am. 2018 Feb;143(2):901

Authors: Kreft HA, Wojtczak M, Oxenham AJ

Abstract
Auditory enhancement, where a target sound within a masker is rendered more audible by the prior presentation of the masker alone, may play an important role in auditory perception under variable everyday acoustic conditions. Cochlear hearing loss may reduce enhancement effects, potentially contributing to the difficulties experienced by hearing-impaired (HI) individuals in noisy and reverberant environments. However, it remains unknown whether, and by how much, enhancement under simultaneous masking is reduced in HI listeners. Enhancement of a pure tone under simultaneous masking with a multi-tone masker was measured in HI listeners and age-matched normal-hearing (NH) listeners as a function of the spectral notch width of the masker, using stimuli at equal sensation levels as well as at equal sound pressure levels; in the equal-sensation-level conditions, the stimuli were presented in noise to the NH listeners to maintain equal sensation levels between listener groups. The results showed that HI listeners exhibited some enhancement in all conditions. However, even when conditions were made as comparable as possible, in terms of effective spectral notch width and presentation level, the enhancement effect in HI listeners under simultaneous masking was reduced relative to that observed in NH listeners.

PMID: 29495696 [PubMed - in process]


Encoding of natural timbre dimensions in human auditory cortex.

Neuroimage. 2018 Feb 1;166:60-70

Authors: Allen EJ, Moerel M, Lage-Castellanos A, De Martino F, Formisano E, Oxenham AJ

Abstract
Timbre, or sound quality, is a crucial but poorly understood dimension of auditory perception that is important in describing speech, music, and environmental sounds. The present study investigates the cortical representation of different timbral dimensions. Encoding models have typically incorporated the physical characteristics of sounds as features when attempting to understand their neural representation with functional MRI. Here we test an encoding model that is based on five subjectively derived dimensions of timbre to predict cortical responses to natural orchestral sounds. Results show that this timbre model can outperform other models based on spectral characteristics, and can perform as well as a complex joint spectrotemporal modulation model. In cortical regions at the medial border of Heschl's gyrus, bilaterally, and regions at its posterior adjacency in the right hemisphere, the timbre model outperforms even the complex joint spectrotemporal modulation model. These findings suggest that the responses of cortical neuronal populations in auditory cortex may reflect the encoding of perceptual timbre dimensions.

PMID: 29080711 [PubMed - indexed for MEDLINE]


How We Hear: The Perception and Neural Coding of Sound.

Annu Rev Psychol. 2018 Jan 4;69:27-50

Authors: Oxenham AJ

Abstract
Auditory perception is our main gateway to communication with others via speech and music, and it also plays an important role in alerting and orienting us to new events. This review provides an overview of selected topics pertaining to the perception and neural coding of sound, starting with the first stage of filtering in the cochlea and its profound impact on perception. The next topic, pitch, has been debated for millennia, but recent technical and theoretical developments continue to provide us with new insights. Cochlear filtering and pitch both play key roles in our ability to parse the auditory scene, enabling us to attend to one auditory object or stream while ignoring others. An improved understanding of the basic mechanisms of auditory perception will aid us in the quest to tackle the increasingly important problem of hearing loss in our aging population.

PMID: 29035691 [PubMed - indexed for MEDLINE]

2017


Sequential stream segregation of voiced and unvoiced speech sounds based on fundamental frequency.

Hear Res. 2017 Feb;344:235-243

Authors: David M, Lavandier M, Grimault N, Oxenham AJ

Abstract
Differences in fundamental frequency (F0) between voiced sounds are known to be a strong cue for stream segregation. However, speech consists of both voiced and unvoiced sounds, and less is known about whether and how the unvoiced portions are segregated. This study measured listeners' ability to integrate or segregate sequences of consonant-vowel tokens, comprising a voiceless fricative and a vowel, as a function of the F0 difference between interleaved sequences of tokens. A performance-based measure was used, in which listeners detected the presence of a repeated token either within one sequence or between the two sequences (measures of voluntary and obligatory streaming, respectively). The results showed a systematic increase of voluntary stream segregation as the F0 difference between the two interleaved sequences increased from 0 to 13 semitones, suggesting that F0 differences allowed listeners to segregate speech sounds, including the unvoiced portions. In contrast to the consistent effects of voluntary streaming, the trend towards obligatory stream segregation at large F0 differences failed to reach significance. Listeners were no longer able to perform the voluntary-streaming task reliably when the unvoiced portions were removed from the stimuli, suggesting that the unvoiced portions were used and correctly segregated in the original task. The results demonstrate that streaming based on F0 differences occurs for natural speech sounds, and that the unvoiced portions are correctly assigned to the corresponding voiced portions.

PMID: 27923739 [PubMed - indexed for MEDLINE]


Representations of Pitch and Timbre Variation in Human Auditory Cortex.

J Neurosci. 2017 Feb 1;37(5):1284-1293

Authors: Allen EJ, Burton PC, Olman CA, Oxenham AJ

Abstract
Pitch and timbre are two primary dimensions of auditory perception, but how they are represented in the human brain remains a matter of contention. Some animal studies of auditory cortical processing have suggested modular processing, with different brain regions preferentially coding for pitch or timbre, whereas other studies have suggested a distributed code for different attributes across the same population of neurons. This study tested whether variations in pitch and timbre elicit activity in distinct regions of the human temporal lobes. Listeners were presented with sequences of sounds that varied in either fundamental frequency (eliciting changes in pitch) or spectral centroid (eliciting changes in brightness, an important attribute of timbre), with the degree of pitch or timbre variation in each sequence parametrically manipulated. The BOLD responses from auditory cortex increased with increasing sequence variance along each perceptual dimension. The spatial extent, region, and laterality of the cortical regions most responsive to variations in pitch or timbre at the univariate level of analysis were largely overlapping. However, patterns of activation in response to pitch or timbre variations were discriminable in most subjects at an individual level using multivoxel pattern analysis, suggesting a distributed coding of the two dimensions bilaterally in human auditory cortex.
SIGNIFICANCE STATEMENT: Pitch and timbre are two crucial aspects of auditory perception. Pitch governs our perception of musical melodies and harmonies, and conveys both prosodic and (in tone languages) lexical information in speech. Brightness-an aspect of timbre or sound quality-allows us to distinguish different musical instruments and speech sounds. Frequency-mapping studies have revealed tonotopic organization in primary auditory cortex, but the use of pure tones or noise bands has precluded the possibility of dissociating pitch from brightness. Our results suggest a distributed code, with no clear anatomical distinctions between auditory cortical regions responsive to changes in either pitch or timbre, but also reveal a population code that can differentiate between changes in either dimension within the same cortical regions.

PMID: 28025255 [PubMed - indexed for MEDLINE]


An auditory illusion reveals the role of streaming in the temporal misallocation of perceptual objects.

Philos Trans R Soc Lond B Biol Sci. 2017 Feb 19;372(1714)

Authors: Mehta AH, Jacoby N, Yasin I, Oxenham AJ, Shamma SA

Abstract
This study investigates the neural correlates and processes underlying the ambiguous percept produced by a stimulus similar to Deutsch's 'octave illusion', in which each ear is presented with a sequence of alternating pure tones of low and high frequencies. The same sequence is presented to each ear, but in opposite phase, such that the left and right ears receive a high-low-high … and a low-high-low … pattern, respectively. Listeners generally report hearing the illusion of an alternating pattern of low and high tones, with all the low tones lateralized to one side and all the high tones lateralized to the other side. The current explanation of the illusion is that it reflects an illusory feature conjunction of pitch and perceived location. Using psychophysics and electroencephalogram measures, we test this and an alternative hypothesis involving synchronous and sequential stream segregation, and investigate potential neural correlates of the illusion. We find that the illusion of alternating tones arises from the synchronous tone pairs across ears rather than sequential tones in one ear, suggesting that the illusion involves a misattribution of time across perceptual streams, rather than a misattribution of location within a stream. The results provide new insights into the mechanisms of binaural streaming and synchronous sound segregation. This article is part of the themed issue 'Auditory and visual scene analysis'.

PMID: 28044024 [PubMed - indexed for MEDLINE]

Authors: Lu K, Xu Y, Yin P, Oxenham AJ, Fritz JB, Shamma SA

Temporal coherence structure rapidly shapes neuronal interactions.

Nat Commun. 2017 01 05;8:13900

Abstract
Perception of segregated sources is essential in navigating cluttered acoustic environments. A basic mechanism to implement this process is the temporal coherence principle. It postulates that a signal is perceived as emitted from a single source only when all of its features are temporally modulated coherently, causing them to bind perceptually. Here we report on neural correlates of this process as rapidly reshaped interactions in primary auditory cortex, measured in three different ways: as changes in response rates, as adaptations of spectrotemporal receptive fields following stimulation by temporally coherent and incoherent tone sequences, and as changes in spiking correlations during the tone sequences. Responses, sensitivity and presumed connectivity were rapidly enhanced by synchronous stimuli, and suppressed by alternating (asynchronous) sounds, but only when the animals engaged in task performance and were attentive to the stimuli. Temporal coherence and attention are therefore both important factors in auditory scene analysis.

PMID: 28054545 [PubMed - indexed for MEDLINE]

Authors: Wojtczak M, Mehta AH, Oxenham AJ

Rhythm judgments reveal a frequency asymmetry in the perception and neural coding of sound synchrony.

Proc Natl Acad Sci U S A. 2017 01 31;114(5):1201-1206

Abstract
In modern Western music, melody is commonly conveyed by pitch changes in the highest-register voice, whereas meter or rhythm is often carried by instruments with lower pitches. An intriguing and recently suggested possibility is that the custom of assigning rhythmic functions to lower-pitch instruments may have emerged because of fundamental properties of the auditory system that result in superior time encoding for low pitches. Here we compare rhythm and synchrony perception between low- and high-frequency tones, using both behavioral and EEG techniques. Both methods were consistent in showing no superiority in time encoding for low over high frequencies. However, listeners were consistently more sensitive to timing differences between two nearly synchronous tones when the high-frequency tone followed the low-frequency tone than vice versa. The results demonstrate no superiority of low frequencies in timing judgments but reveal a robust asymmetry in the perception and neural coding of synchrony that reflects greater tolerance for delays of low- relative to high-frequency sounds than vice versa. We propose that this asymmetry exists to compensate for inherent and variable time delays in cochlear processing, as well as the acoustical properties of sound sources in the natural environment, thereby providing veridical perceptual experiences of simultaneity.

PMID: 28096408 [PubMed - indexed for MEDLINE]

Authors: Lau BK, Ruggles DR, Katyal S, Engel SA, Oxenham AJ

Sustained Cortical and Subcortical Measures of Auditory and Visual Plasticity following Short-Term Perceptual Learning.

PLoS One. 2017;12(1):e0168858

Abstract
Short-term training can lead to improvements in behavioral discrimination of auditory and visual stimuli, as well as enhanced EEG responses to those stimuli. In the auditory domain, fluency with tonal languages and musical training has been associated with long-term cortical and subcortical plasticity, but less is known about the effects of shorter-term training. This study combined electroencephalography (EEG) and behavioral measures to investigate short-term learning and neural plasticity in both auditory and visual domains. Forty adult participants were divided into four groups. Three groups trained on one of three tasks, involving discrimination of auditory fundamental frequency (F0), auditory amplitude modulation rate (AM), or visual orientation (VIS). The fourth (control) group received no training. Pre- and post-training tests, as well as retention tests 30 days after training, involved behavioral discrimination thresholds, steady-state visually evoked potentials (SSVEP) to the flicker frequencies of visual stimuli, and auditory envelope-following responses simultaneously evoked and measured in response to rapid stimulus F0 (EFR), thought to reflect subcortical generators, and slow amplitude modulation (ASSR), thought to reflect cortical generators. Enhancement of the ASSR was observed in both auditory-trained groups, not specific to the AM-trained group, whereas enhancement of the SSVEP was found only in the visually-trained group. No evidence was found for changes in the EFR. The results suggest that some aspects of neural plasticity can develop rapidly and may generalize across tasks but not across modalities. Behaviorally, the pattern of learning was complex, with significant cross-task and cross-modal learning effects.

PMID: 28107359 [PubMed - indexed for MEDLINE]

Authors: Kreft HA, Oxenham AJ

Auditory Enhancement in Cochlear-Implant Users Under Simultaneous and Forward Masking.

J Assoc Res Otolaryngol. 2017 Jun;18(3):483-493

Abstract
Auditory enhancement is the phenomenon whereby the salience or detectability of a target sound within a masker is enhanced by the prior presentation of the masker alone. Enhancement has been demonstrated using both simultaneous and forward masking in normal-hearing listeners and may play an important role in auditory and speech perception within complex and time-varying acoustic environments. The few studies of enhancement in hearing-impaired listeners have reported reduced or absent enhancement effects under forward masking, suggesting a potentially peripheral locus of the effect. Here, auditory enhancement was measured in eight cochlear-implant (CI) users with direct stimulation. Masked thresholds were measured under simultaneous and forward masking as a function of the number of masking electrodes, and the electrode spacing between the maskers and the target. Evidence for auditory enhancement was obtained under simultaneous masking, qualitatively consistent with results from normal-hearing listeners. However, no significant enhancement was observed under forward masking, in contrast to earlier results with normal-hearing listeners. The results suggest that the normal effects of auditory enhancement are partially but not fully experienced by CI users. To the extent that the CI users' results differ from normal, it may be possible to apply signal processing to restore the missing aspects of enhancement.

PMID: 28303412 [PubMed - indexed for MEDLINE]

Authors: Whiteford KL, Oxenham AJ

Auditory deficits in amusia extend beyond poor pitch perception.

Neuropsychologia. 2017 05;99:213-224

Abstract
Congenital amusia is a music perception disorder believed to reflect a deficit in fine-grained pitch perception and/or short-term or working memory for pitch. Because most measures of pitch perception include memory and segmentation components, it has been difficult to determine the true extent of pitch processing deficits in amusia. It is also unclear whether pitch deficits persist at frequencies beyond the range of musical pitch. To address these questions, experiments were conducted with amusics and matched controls, manipulating both the stimuli and the task demands. First, we assessed pitch discrimination at low (500 Hz and 2000 Hz) and high (8000 Hz) frequencies using a three-interval forced-choice task. Amusics exhibited deficits even at the highest frequency, which lies beyond the existence region of musical pitch. Next, we assessed the extent to which frequency coding deficits persist in one- and two-interval frequency-modulation (FM) and amplitude-modulation (AM) detection tasks at 500 Hz at slow (fm = 4 Hz) and fast (fm = 20 Hz) modulation rates. Amusics still exhibited deficits in one-interval FM detection tasks that should not involve memory or segmentation. Surprisingly, amusics were also impaired on AM detection, which should not involve pitch processing. Finally, direct comparisons between the detection of continuous and discrete FM demonstrated that amusics suffer deficits in both coding and segmenting pitch information. Our results reveal auditory deficits in amusia extending beyond pitch perception that are subtle when controlling for memory and segmentation, and are likely exacerbated in more complex contexts such as musical listening.

PMID: 28315696 [PubMed - indexed for MEDLINE]

Authors: Whiteford KL, Kreft HA, Oxenham AJ

Assessing the Role of Place and Timing Cues in Coding Frequency and Amplitude Modulation as a Function of Age.

J Assoc Res Otolaryngol. 2017 Aug;18(4):619-633

Abstract
Natural sounds can be characterized by their fluctuations in amplitude and frequency. Ageing may affect sensitivity to some forms of fluctuations more than others. The present study used individual differences across a wide age range (20-79 years) to test the hypothesis that slow-rate, low-carrier frequency modulation (FM) is coded by phase-locked auditory-nerve responses to temporal fine structure (TFS), whereas fast-rate FM is coded via rate-place (tonotopic) cues, based on amplitude modulation (AM) of the temporal envelope after cochlear filtering. Using a low (500 Hz) carrier frequency, diotic FM and AM detection thresholds were measured at slow (1 Hz) and fast (20 Hz) rates in 85 listeners. Frequency selectivity and TFS coding were assessed using forward masking patterns and interaural phase disparity tasks (slow dichotic FM), respectively. Comparable interaural level disparity tasks (slow and fast dichotic AM and fast dichotic FM) were measured to control for effects of binaural processing not specifically related to TFS coding. Thresholds in FM and AM tasks were correlated, even across tasks thought to use separate peripheral codes. Age was correlated with slow and fast FM thresholds in both diotic and dichotic conditions. The relationship between age and AM thresholds was generally not significant. Once accounting for AM sensitivity, only diotic slow-rate FM thresholds remained significantly correlated with age. Overall, results indicate stronger effects of age on FM than AM. However, because of similar effects for both slow and fast FM when not accounting for AM sensitivity, the effects cannot be unambiguously ascribed to TFS coding.

PMID: 28429126 [PubMed - indexed for MEDLINE]

Authors: Mehta AH, Oxenham AJ

Vocoder Simulations Explain Complex Pitch Perception Limitations Experienced by Cochlear Implant Users.

J Assoc Res Otolaryngol. 2017 Dec;18(6):789-802

Abstract
Pitch plays a crucial role in speech and music, but is highly degraded for people with cochlear implants, leading to severe communication challenges in noisy environments. Pitch is determined primarily by the first few spectrally resolved harmonics of a tone. In implants, access to this pitch is limited by poor spectral resolution, due to the limited number of channels and interactions between adjacent channels. Here we used noise-vocoder simulations to explore how many channels, and how little channel interaction, are required to elicit pitch. Results suggest that two to four times the number of channels are needed, along with interactions reduced by an order of magnitude, than available in current devices. These new constraints not only provide insights into the basic mechanisms of pitch coding in normal hearing but also suggest that spectrally based complex pitch is unlikely to be generated in implant users without significant changes in the method or site of stimulation.

PMID: 28733803 [PubMed - indexed for MEDLINE]

Authors: Lau BK, Mehta AH, Oxenham AJ

Superoptimal Perceptual Integration Suggests a Place-Based Representation of Pitch at High Frequencies.

J Neurosci. 2017 09 13;37(37):9013-9021

Abstract
Pitch, the perceptual correlate of sound repetition rate or frequency, plays an important role in speech perception, music perception, and listening in complex acoustic environments. Despite the perceptual importance of pitch, the neural mechanisms that underlie it remain poorly understood. Although cortical regions responsive to pitch have been identified, little is known about how pitch information is extracted from the inner ear itself. The two primary theories of peripheral pitch coding involve stimulus-driven spike timing, or phase locking, in the auditory nerve (time code), and the spatial distribution of responses along the length of the cochlear partition (place code). To rule out the use of timing information, we tested pitch discrimination of very high-frequency tones (>8 kHz), well beyond the putative limit of phase locking. We found that high-frequency pure-tone discrimination was poor, but when the tones were combined into a harmonic complex, a dramatic improvement in discrimination ability was observed that exceeded performance predicted by the optimal integration of peripheral information from each of the component frequencies. The results are consistent with the existence of pitch-sensitive neurons that rely only on place-based information from multiple harmonically related components. The results also provide evidence against the common assumption that poor high-frequency pure-tone pitch perception is the result of peripheral neural-coding constraints. The finding that place-based spectral coding is sufficient to elicit complex pitch at high frequencies has important implications for the design of future neural prostheses to restore hearing to deaf individuals.

SIGNIFICANCE STATEMENT: The question of how pitch is represented in the ear has been debated for over a century. Two competing theories involve timing information from neural spikes in the auditory nerve (time code) and the spatial distribution of neural activity along the length of the cochlear partition (place code). By using very high-frequency tones unlikely to be coded via time information, we discovered that information from the individual harmonics is combined so efficiently that performance exceeds theoretical predictions based on the optimal integration of information from each harmonic. The findings have important implications for the design of auditory prostheses because they suggest that enhanced spatial resolution alone may be sufficient to restore pitch via such implants.

PMID: 28821642 [PubMed - indexed for MEDLINE]

Authors: David M, Lavandier M, Grimault N, Oxenham AJ

Discrimination and streaming of speech sounds based on differences in interaural and spectral cues.

J Acoust Soc Am. 2017 09;142(3):1674

Abstract
Differences in spatial cues, including interaural time differences (ITDs), interaural level differences (ILDs) and spectral cues, can lead to stream segregation of alternating noise bursts. It is unknown how effective such cues are for streaming sounds with realistic spectro-temporal variations. In particular, it is not known whether the high-frequency spectral cues associated with elevation remain sufficiently robust under such conditions. To answer these questions, sequences of consonant-vowel tokens were generated and filtered by non-individualized head-related transfer functions to simulate the cues associated with different positions in the horizontal and median planes. A discrimination task showed that listeners could discriminate changes in interaural cues both when the stimulus remained constant and when it varied between presentations. However, discrimination of changes in spectral cues was much poorer in the presence of stimulus variability. A streaming task, based on the detection of repeated syllables in the presence of interfering syllables, revealed that listeners can use both interaural and spectral cues to segregate alternating syllable sequences, despite the large spectro-temporal differences between stimuli. However, only the full complement of spatial cues (ILDs, ITDs, and spectral cues) resulted in obligatory streaming in a task that encouraged listeners to integrate the tokens into a single stream.

PMID: 28964066 [PubMed - indexed for MEDLINE]

Authors: Oxenham AJ, Boucher JE, Kreft HA

Speech intelligibility is best predicted by intensity, not cochlea-scaled entropy.

J Acoust Soc Am. 2017 09;142(3):EL264

Abstract
Cochlea-scaled entropy (CSE) is a measure of spectro-temporal change that has been reported to predict the contribution of speech segments to overall intelligibility. This paper confirms that CSE is highly correlated with intensity, making it impossible to determine empirically whether it is CSE or simply intensity that determines speech importance. A more perceptually relevant version of CSE that uses dB-scaled differences, rather than differences in linear amplitude, failed to predict speech intelligibility. Overall, a parsimonious account of the available data is that the importance of speech segments to overall intelligibility is best predicted by their relative intensity, not by CSE.

PMID: 28964094 [PubMed - indexed for MEDLINE]

Authors: Madsen SMK, Whiteford KL, Oxenham AJ

Musicians do not benefit from differences in fundamental frequency when listening to speech in competing speech backgrounds.

Sci Rep. 2017 10 03;7(1):12624

Abstract
Recent studies disagree on whether musicians have an advantage over non-musicians in understanding speech in noise. However, it has been suggested that musicians may be able to use differences in fundamental frequency (F0) to better understand target speech in the presence of interfering talkers. Here we studied a relatively large (N = 60) cohort of young adults, equally divided between non-musicians and highly trained musicians, to test whether the musicians were better able to understand speech either in noise or in a two-talker competing speech masker. The target speech and competing speech were presented with either their natural F0 contours or on a monotone F0, and the F0 difference between the target and masker was systematically varied. As expected, speech intelligibility improved with increasing F0 difference between the target and the two-talker masker for both natural and monotone speech. However, no significant intelligibility advantage was observed for musicians over non-musicians in any condition. Although F0 discrimination was significantly better for musicians than for non-musicians, it was not correlated with speech scores. Overall, the results do not support the hypothesis that musical training leads to improved speech intelligibility in complex speech or noise backgrounds.

PMID: 28974705 [PubMed - indexed for MEDLINE]

Authors: Oxenham AJ

How We Hear: The Perception and Neural Coding of Sound.

Annu Rev Psychol. 2018 01 04;69:27-50

Abstract
Auditory perception is our main gateway to communication with others via speech and music, and it also plays an important role in alerting and orienting us to new events. This review provides an overview of selected topics pertaining to the perception and neural coding of sound, starting with the first stage of filtering in the cochlea and its profound impact on perception. The next topic, pitch, has been debated for millennia, but recent technical and theoretical developments continue to provide us with new insights. Cochlear filtering and pitch both play key roles in our ability to parse the auditory scene, enabling us to attend to one auditory object or stream while ignoring others. An improved understanding of the basic mechanisms of auditory perception will aid us in the quest to tackle the increasingly important problem of hearing loss in our aging population.

PMID: 29035691 [PubMed - indexed for MEDLINE]

Authors: Graves JE, Oxenham AJ

Familiar Tonal Context Improves Accuracy of Pitch Interval Perception.

Front Psychol. 2017;8:1753

Abstract
A fundamental feature of everyday music perception is sensitivity to familiar tonal structures such as musical keys. Many studies have suggested that a tonal context can enhance the perception and representation of pitch. Most of these studies have measured response time, which may reflect expectancy as opposed to perceptual accuracy. We instead used a performance-based measure, comparing participants' ability to discriminate between a "small, in-tune" interval and a "large, mistuned" interval in conditions that involved familiar tonal relations (diatonic, or major, scale notes), unfamiliar tonal relations (whole-tone or mistuned-diatonic scale notes), repetition of a single pitch, or no tonal context. The context was established with a brief sequence of tones in Experiment 1 (melodic context), and a cadence-like two-chord progression in Experiment 2 (harmonic context). In both experiments, performance significantly differed across the context conditions, with a diatonic context providing a significant advantage over no context; however, no correlation with years of musical training was observed. The diatonic tonal context also provided an advantage over the whole-tone scale context condition in Experiment 1 (melodic context), and over the mistuned scale or repetition context conditions in Experiment 2 (harmonic context). However, the relatively small benefit to performance suggests that the main advantage of tonal context may be priming of expected stimuli, rather than enhanced accuracy of pitch interval representation.

PMID: 29062295 [PubMed]

Authors: Allen EJ, Moerel M, Lage-Castellanos A, De Martino F, Formisano E, Oxenham AJ

Encoding of natural timbre dimensions in human auditory cortex.

Neuroimage. 2018 02 01;166:60-70

Abstract
Timbre, or sound quality, is a crucial but poorly understood dimension of auditory perception that is important in describing speech, music, and environmental sounds. The present study investigates the cortical representation of different timbral dimensions. Encoding models have typically incorporated the physical characteristics of sounds as features when attempting to understand their neural representation with functional MRI. Here we test an encoding model that is based on five subjectively derived dimensions of timbre to predict cortical responses to natural orchestral sounds. Results show that this timbre model can outperform other models based on spectral characteristics, and can perform as well as a complex joint spectrotemporal modulation model. In cortical regions at the medial border of Heschl's gyrus, bilaterally, and regions at its posterior adjacency in the right hemisphere, the timbre model outperforms even the complex joint spectrotemporal modulation model. These findings suggest that the responses of cortical neuronal populations in auditory cortex may reflect the encoding of perceptual timbre dimensions.

PMID: 29080711 [PubMed - indexed for MEDLINE]

Authors: Wojtczak M, Beim JA, Oxenham AJ

Weak Middle-Ear-Muscle Reflex in Humans with Noise-Induced Tinnitus and Normal Hearing May Reflect Cochlear Synaptopathy.

eNeuro. 2017 Nov-Dec;4(6):

Abstract
Chronic tinnitus is a prevalent hearing disorder, and yet no successful treatments or objective diagnostic tests are currently available. The aim of this study was to investigate the relationship between the presence of tinnitus and the strength of the middle-ear-muscle reflex (MEMR) in humans with normal and near-normal hearing. Clicks were used as test stimuli to obtain a wideband measure of the effect of reflex activation on ear-canal sound pressure. The reflex was elicited using a contralateral broadband noise. The results show that the reflex strength is significantly reduced in individuals with noise-induced continuous tinnitus and normal or near-normal audiometric thresholds compared with no-tinnitus controls. Due to a shallower growth of the reflex strength in the tinnitus group, the difference between the two groups increased with increasing elicitor level. No significant difference in the effect of tinnitus on the strength of the middle-ear muscle reflex was found between males and females. The weaker reflex could not be accounted for by differences in audiometric hearing thresholds between the tinnitus and control groups. Similarity between our findings in humans and the findings of a reduced middle-ear muscle reflex in noise-exposed animals suggests that noise-induced tinnitus in individuals with clinically normal hearing may be a consequence of cochlear synaptopathy, a loss of synaptic connections between inner hair cells (IHCs) in the cochlea and auditory-nerve (AN) fibers that has been termed hidden hearing loss.

PMID: 29181442 [PubMed - indexed for MEDLINE]

2016

Effects of auditory enhancement on the loudness of masker and target components.

Hear Res. 2016 Mar;333:150-156

Authors: Wang N, Oxenham AJ

Abstract
Auditory enhancement refers to the observation that the salience of one spectral region (the "signal") of a broadband sound can be enhanced and can "pop out" from the remainder of the sound (the "masker") if it is preceded by the broadband sound without the signal. The present study investigated auditory enhancement as an effective change in loudness, to determine whether it reflects a change in the loudness of the signal, the masker, or both. In the first experiment, the 500-ms precursor, an inharmonic complex with logarithmically spaced components, was followed after a 50-ms gap by the 100-ms signal or masker alone, the loudness of which was compared with that of the same signal or masker presented 2 s later. In the second experiment, the loudness of the signal embedded in the masker was assessed with and without a precursor using the same method, as was the loudness of the entire signal-plus-masker complex. The results suggest that the precursor does not affect the loudness of the signal or the masker alone, but enhances the loudness of the signal in the presence of the masker, while leaving the loudness of the surrounding masker unaffected. The results are consistent with an explanation based on "adaptation of inhibition" [Viemeister and Bacon (1982). J. Acoust. Soc. Am. 71, 1502-1507].

PMID: 26805025 [PubMed - indexed for MEDLINE]

Induced Loudness Reduction and Enhancement in Acoustic and Electric Hearing.

J Assoc Res Otolaryngol. 2016 08;17(4):383-91

Authors: Wang N, Kreft H, Oxenham AJ

Abstract
The loudness of a tone can be reduced by preceding it with a more intense tone. This effect, known as induced loudness reduction (ILR), has been reported to last for several seconds. The underlying neural mechanisms are unknown. One possible contributor to the effect involves changes in cochlear gain via the medial olivocochlear (MOC) efferents. Since cochlear implants (CIs) bypass the cochlea, investigating whether and how CI users experience ILR should help provide a better understanding of the underlying mechanisms. In the present study, ILR was examined in both normal-hearing listeners and CI users by examining the effects of an intense precursor (50 or 500 ms) on the loudness of a 50-ms target, as judged by comparing it to a spectrally remote 50-ms comparison sound. The interstimulus interval (ISI) between the precursor and the target was varied between 10 and 1000 ms to estimate the time course of ILR. In general, the patterns of results from the CI users were similar to those found in the normal-hearing listeners. However, in the short-precursor short-ISI condition, an enhancement in the loudness of the target was observed in CI subjects that was not present in the normal-hearing listeners, consistent with the effects of an additional attenuation present in the normal-hearing listeners but not in the CI users. The results suggest that the MOC may play a role but that it is not the only source of these loudness context effects.

PMID: 27033086 [PubMed - indexed for MEDLINE]

Speech Masking in Normal and Impaired Hearing: Interactions Between Frequency Selectivity and Inherent Temporal Fluctuations in Noise.

Adv Exp Med Biol. 2016;894:125-132

Authors: Oxenham AJ, Kreft HA

Abstract
Recent studies in normal-hearing listeners have used envelope-vocoded stimuli to show that the masking of speech by noise is dominated by the temporal-envelope fluctuations inherent in noise, rather than just overall power. Because these studies were based on vocoding, it was expected that cochlear-implant (CI) users would demonstrate a similar sensitivity to inherent fluctuations. In contrast, it was found that CI users showed no difference in speech intelligibility between maskers with and without inherent envelope fluctuations. Here, these initial findings in CI users were extended to listeners with cochlear hearing loss and the results were compared with those from normal-hearing listeners at either equal sensation level or equal sound pressure level. The results from hearing-impaired listeners (and in normal-hearing listeners at high sound levels) are consistent with a relative reduction in low-frequency inherent noise fluctuations due to broader cochlear filtering. The reduced effect of inherent temporal fluctuations in noise, due to either current spread (in CI users) or broader cochlear filters (in hearing-impaired listeners), provides a new way to explain the loss of masking release experienced in CI users and hearing-impaired listeners when additional amplitude fluctuations are introduced in noise maskers.

PMID: 27080653 [PubMed - indexed for MEDLINE]

Neural correlates of attention and streaming in a perceptually multistable auditory illusion.

J Acoust Soc Am. 2016 Oct;140(4):2225

Authors: Mehta AH, Yasin I, Oxenham AJ, Shamma S

Abstract
In a complex acoustic environment, acoustic cues and attention interact in the formation of streams within the auditory scene. In this study, a variant of the "octave illusion" [Deutsch (1974). Nature 251, 307-309] was used to investigate the neural correlates of auditory streaming, and to elucidate the effects of attention on the interaction between sequential and concurrent sound segregation in humans. By directing subjects' attention to different frequencies and ears, it was possible to elicit several different illusory percepts with the identical stimulus. The first experiment tested the hypothesis that the illusion depends on the ability of listeners to perceptually stream the target tones from within the alternating sound sequences. In the second experiment, concurrent psychophysical measures and electroencephalography recordings provided neural correlates of the various percepts elicited by the multistable stimulus. The results show that the perception and neural correlates of the auditory illusion can be manipulated robustly by attentional focus and that the illusion is constrained in much the same way as auditory stream segregation, suggesting common underlying mechanisms.

PMID: 27794350 [PubMed - indexed for MEDLINE]

Sequential stream segregation of voiced and unvoiced speech sounds based on fundamental frequency.

Hear Res. 2017 Feb;344:235-243

Authors: David M, Lavandier M, Grimault N, Oxenham AJ

Abstract
Differences in fundamental frequency (F0) between voiced sounds are known to be a strong cue for stream segregation. However, speech consists of both voiced and unvoiced sounds, and less is known about whether and how the unvoiced portions are segregated. This study measured listeners' ability to integrate or segregate sequences of consonant-vowel tokens, comprising a voiceless fricative and a vowel, as a function of the F0 difference between interleaved sequences of tokens. A performance-based measure was used, in which listeners detected the presence of a repeated token either within one sequence or between the two sequences (measures of voluntary and obligatory streaming, respectively). The results showed a systematic increase of voluntary stream segregation as the F0 difference between the two interleaved sequences increased from 0 to 13 semitones, suggesting that F0 differences allowed listeners to segregate speech sounds, including the unvoiced portions. In contrast to the consistent effects of voluntary streaming, the trend towards obligatory stream segregation at large F0 differences failed to reach significance. Listeners were no longer able to perform the voluntary-streaming task reliably when the unvoiced portions were removed from the stimuli, suggesting that the unvoiced portions were used and correctly segregated in the original task. The results demonstrate that streaming based on F0 differences occurs for natural speech sounds, and that the unvoiced portions are correctly assigned to the corresponding voiced portions.

PMID: 27923739 [PubMed - indexed for MEDLINE]
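
The F0 separations in this study are expressed in semitones. As a quick illustration of the underlying arithmetic (a sketch of the standard conversion, not code from the study), an n-semitone difference corresponds to a frequency ratio of 2^(n/12), so the largest separation of 13 semitones slightly exceeds an octave:

```python
# Convert a semitone separation to a fundamental-frequency ratio.
# Illustrative only; the function names are not from the study.

def semitones_to_ratio(semitones: float) -> float:
    """An n-semitone interval corresponds to a frequency ratio of 2**(n/12)."""
    return 2.0 ** (semitones / 12.0)

def shifted_f0(base_f0_hz: float, semitones: float) -> float:
    """F0 of the interleaved sequence, given a base F0 and the separation."""
    return base_f0_hz * semitones_to_ratio(semitones)

# The study's largest separation, 13 semitones, is just over an octave:
print(round(semitones_to_ratio(13), 3))   # → 2.119
print(round(shifted_f0(100.0, 13), 1))    # → 211.9
```

At 13 semitones the frequency ratio is about 2.12, i.e., slightly more than a doubling of F0.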

Predicting the Perceptual Consequences of Hidden Hearing Loss.

Trends Hear. 2016 Jan-Dec;20:2331216516686768

Authors: Oxenham AJ

Abstract
Recent physiological studies in several rodent species have revealed that permanent damage can occur to the auditory system after exposure to a noise that produces only a temporary shift in absolute thresholds. The damage has been found to occur in the synapses between the cochlea's inner hair cells and the auditory nerve, effectively severing part of the connection between the ear and the brain. This synaptopathy has been termed hidden hearing loss because its effects are not thought to be revealed in standard clinical, behavioral, or physiological measures of absolute threshold. It is currently unknown whether humans suffer from similar deficits after noise exposure. Even if synaptopathy occurs in humans, it remains unclear what the perceptual consequences might be or how they should best be measured. Here, we apply a simple theoretical model, taken from signal detection theory, to provide some predictions for what perceptual effects could be expected for a given loss of synapses. Predictions are made for a number of basic perceptual tasks, including tone detection in quiet and in noise, frequency discrimination, level discrimination, and binaural lateralization. The model's predictions are in line with the empirical observations that a 50% loss of synapses leads to changes in threshold that are too small to be reliably measured. Overall, the model provides a simple initial quantitative framework for understanding and predicting the perceptual effects of synaptopathy in humans.

PMID: 28024462 [PubMed - indexed for MEDLINE]
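
The model's central prediction can be illustrated with a back-of-the-envelope sketch (a hypothetical illustration under a stated assumption, not the paper's code): if d' grows with the square root of the number of independent synapses, then keeping d' at threshold after a 50% synapse loss requires the stimulus increment to grow by only a factor of √2:

```python
# Signal-detection sketch: assume d' ∝ sqrt(N) * (stimulus increment),
# where N is the number of surviving synapses. The function name is
# illustrative, not from the paper.
import math

def threshold_after_loss(baseline_threshold: float,
                         surviving_fraction: float) -> float:
    """Increment needed to hold d' constant when only a fraction of
    synapses survives, given d' ∝ sqrt(N) * increment."""
    return baseline_threshold / math.sqrt(surviving_fraction)

# A 50% synapse loss raises a 1-dB discrimination threshold to only ~1.41 dB:
print(round(threshold_after_loss(1.0, 0.5), 2))   # → 1.41
```

This is the quantitative sense in which a 50% loss of synapses produces threshold changes "too small to be reliably measured."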

Representations of Pitch and Timbre Variation in Human Auditory Cortex.

J Neurosci. 2017 Feb 1;37(5):1284-1293

Authors: Allen EJ, Burton PC, Olman CA, Oxenham AJ

Abstract
Pitch and timbre are two primary dimensions of auditory perception, but how they are represented in the human brain remains a matter of contention. Some animal studies of auditory cortical processing have suggested modular processing, with different brain regions preferentially coding for pitch or timbre, whereas other studies have suggested a distributed code for different attributes across the same population of neurons. This study tested whether variations in pitch and timbre elicit activity in distinct regions of the human temporal lobes. Listeners were presented with sequences of sounds that varied in either fundamental frequency (eliciting changes in pitch) or spectral centroid (eliciting changes in brightness, an important attribute of timbre), with the degree of pitch or timbre variation in each sequence parametrically manipulated. The BOLD responses from auditory cortex increased with increasing sequence variance along each perceptual dimension. The spatial extent, region, and laterality of the cortical regions most responsive to variations in pitch or timbre at the univariate level of analysis were largely overlapping. However, patterns of activation in response to pitch or timbre variations were discriminable in most subjects at an individual level using multivoxel pattern analysis, suggesting a distributed coding of the two dimensions bilaterally in human auditory cortex.
SIGNIFICANCE STATEMENT: Pitch and timbre are two crucial aspects of auditory perception. Pitch governs our perception of musical melodies and harmonies, and conveys both prosodic and (in tone languages) lexical information in speech. Brightness-an aspect of timbre or sound quality-allows us to distinguish different musical instruments and speech sounds. Frequency-mapping studies have revealed tonotopic organization in primary auditory cortex, but the use of pure tones or noise bands has precluded the possibility of dissociating pitch from brightness. Our results suggest a distributed code, with no clear anatomical distinctions between auditory cortical regions responsive to changes in either pitch or timbre, but also reveal a population code that can differentiate between changes in either dimension within the same cortical regions.

PMID: 28025255 [PubMed - indexed for MEDLINE]

2015

Exploring the role of feedback-based auditory reflexes in forward masking by Schroeder-phase complexes.

J Assoc Res Otolaryngol. 2015 Feb;16(1):81-99

Authors: Wojtczak M, Beim JA, Oxenham AJ

Abstract
Several studies have postulated that psychoacoustic measures of auditory perception are influenced by efferent-induced changes in cochlear responses, but these postulations have generally remained untested. This study measured the effect of stimulus phase curvature and temporal envelope modulation on the medial olivocochlear reflex (MOCR) and on the middle-ear muscle reflex (MEMR). The role of the MOCR was tested by measuring changes in the ear-canal pressure at 6 kHz in the presence and absence of a band-limited harmonic complex tone with various phase curvatures, centered either at (on-frequency) or well below (off-frequency) the 6-kHz probe frequency. The influence of possible MEMR effects was examined by measuring phase-gradient functions for the elicitor effects and by measuring changes in the ear-canal pressure with a continuous suppressor of the 6-kHz probe. Both on- and off-frequency complex tone elicitors produced significant changes in ear canal sound pressure. However, the pattern of results was not consistent with the earlier hypotheses postulating that efferent effects produce the psychoacoustic dependence of forward-masked thresholds on masker phase curvature. The results also reveal unexpectedly long time constants associated with some efferent effects, the source of which remains unknown.

PMID: 25338224 [PubMed - indexed for MEDLINE]

Congenital amusia: a cognitive disorder limited to resolved harmonics and with no peripheral basis.

Neuropsychologia. 2015 Jan;66:293-301

Authors: Cousineau M, Oxenham AJ, Peretz I

Abstract
Pitch plays a fundamental role in audition, from speech and music perception to auditory scene analysis. Congenital amusia is a neurogenetic disorder that appears to affect primarily pitch and melody perception. Pitch is normally conveyed by the spectro-temporal fine structure of low harmonics, but some pitch information is available in the temporal envelope produced by the interactions of higher harmonics. Using 10 amusic subjects and 10 matched controls, we tested the hypothesis that amusics suffer exclusively from impaired processing of spectro-temporal fine structure. We also tested whether the inability of amusics to process acoustic temporal fine structure extends beyond pitch by measuring sensitivity to interaural time differences, which also rely on temporal fine structure. Further tests were carried out on basic intensity and spectral resolution. As expected, pitch perception based on spectro-temporal fine structure was impaired in amusics; however, no significant deficits were observed in amusics' ability to perceive the pitch conveyed via temporal-envelope cues. Sensitivity to interaural time differences was also not significantly different between the amusic and control groups, ruling out deficits in the peripheral coding of temporal fine structure. Finally, no significant differences in intensity or spectral resolution were found between the amusic and control groups. The results demonstrate a pitch-specific deficit in fine spectro-temporal information processing in amusia that seems unrelated to temporal or spectral coding in the auditory periphery. These results are consistent with the view that there are distinct mechanisms dedicated to processing resolved and unresolved harmonics in the general population, the former being altered in congenital amusia while the latter is spared.

PMID: 25433224 [PubMed - indexed for MEDLINE]

A fast method for measuring psychophysical thresholds across the cochlear implant array.

Trends Hear. 2015 Feb 04;19:

Authors: Bierer JA, Bierer SM, Kreft HA, Oxenham AJ

Abstract
A rapid threshold measurement procedure, based on Bekesy tracking, is proposed and evaluated for use with cochlear implants (CIs). Fifteen postlingually deafened adult CI users participated. Absolute thresholds for 200-ms trains of biphasic pulses were measured using the new tracking procedure and were compared with thresholds obtained with a traditional forced-choice adaptive procedure under both monopolar and quadrupolar stimulation. Virtual spectral sweeps across the electrode array were implemented in the tracking procedure via current steering, which divides the current between two adjacent electrodes and varies the proportion of current directed to each electrode. Overall, no systematic differences were found between threshold estimates with the new channel sweep procedure and estimates using the adaptive forced-choice procedure. Test-retest reliability for the thresholds from the sweep procedure was somewhat poorer than for thresholds from the forced-choice procedure. However, the new method was about 4 times faster for the same number of repetitions. Overall, the reliability and speed of the new tracking procedure give it the potential to estimate thresholds in a clinical setting. Rapid methods for estimating thresholds could be of particular clinical importance in combination with focused stimulation techniques that result in larger threshold variations between electrodes.

PMID: 25656797 [PubMed - indexed for MEDLINE]
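
Bekesy-style tracking, on which the new procedure is based, can be sketched with a toy simulation (a hypothetical illustration, not the study's implementation): the level descends while the listener reports hearing the stimulus, ascends when they do not, and the threshold is estimated from the reversal levels:

```python
# Toy Bekesy tracking with a deterministic simulated listener who
# "hears" any level above the true threshold. Names are illustrative.
from statistics import mean

def bekesy_track(true_threshold: float, start_level: float = 60.0,
                 step: float = 1.0, n_reversals: int = 8) -> float:
    level, direction = start_level, -1          # start descending
    reversals = []
    while len(reversals) < n_reversals:
        heard = level > true_threshold          # deterministic toy listener
        new_direction = -1 if heard else +1     # down while heard, up if not
        if new_direction != direction:          # response changed: a reversal
            reversals.append(level)
            direction = new_direction
        level += direction * step
    return mean(reversals)                      # threshold = mean of reversals

print(round(bekesy_track(35.3), 1))   # → 35.5
```

With a 1-dB step, the track oscillates across the true threshold and the reversal mean lands within one step of it.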

Loudness Context Effects in Normal-Hearing Listeners and Cochlear-Implant Users.

J Assoc Res Otolaryngol. 2015 Aug;16(4):535-45

Authors: Wang N, Kreft HA, Oxenham AJ

Abstract
Context effects in loudness have been observed in normal auditory perception and may reflect a general gain control of the auditory system. However, little is known about such effects in cochlear-implant (CI) users. Discovering whether and how CI users experience loudness context effects should help us better understand the underlying mechanisms. In the present study, we examined the effects of a long-duration (1-s) intense precursor on the loudness relations between shorter-duration (200-ms) target and comparison stimuli. The precursor and target were separated by a silent gap of 50 ms, and the target and comparison were separated by a silent gap of 2 s. For normal-hearing listeners, the stimuli were narrowband noises; for CI users, all stimuli were delivered as pulse trains directly to the implant. Significant changes in loudness were observed in normal-hearing listeners, in line with earlier studies. The CI users also experienced some loudness changes but, in contrast to the results from normal-hearing listeners, the effect did not increase with increasing level difference between precursor and target. A "dual-process" hypothesis, used to explain earlier data from normal-hearing listeners, may provide an account of the present data by assuming that one of the two mechanisms, involving "induced loudness reduction," was absent or reduced in CI users.

PMID: 26040213 [PubMed - indexed for MEDLINE]

Stimulus Frequency Otoacoustic Emissions Provide No Evidence for the Role of Efferents in the Enhancement Effect.

J Assoc Res Otolaryngol. 2015 Oct;16(5):613-29

Authors: Beim JA, Elliott M, Oxenham AJ, Wojtczak M

Abstract
Auditory enhancement refers to the perceptual phenomenon that a target sound is heard out more readily from a background sound if the background is presented alone first. Here we used stimulus-frequency otoacoustic emissions (SFOAEs) to test the hypothesis that activation of the medial olivocochlear efferent system contributes to auditory enhancement effects. The SFOAEs were used as a tool to measure changes in cochlear responses to a target component and the neighboring components of a multitone background between conditions producing enhancement and conditions producing no enhancement. In the "enhancement" condition, the target and multitone background were preceded by a precursor stimulus with a spectral notch around the signal frequency; in the control (no-enhancement) condition, the target and multitone background were presented without the precursor. In an experiment using a wideband multitone stimulus known to produce significant psychophysical enhancement effects, SFOAEs showed no changes consistent with enhancement, but some aspects of the results indicated possible contamination of the SFOAE magnitudes by the activation of the middle-ear-muscle reflex. The same SFOAE measurements performed using narrower-band stimuli at lower sound levels also showed no SFOAE changes consistent with either absolute or relative enhancement despite robust psychophysical enhancement effects observed in the same listeners with the same stimuli. The results suggest that cochlear efferent control does not play a significant role in auditory enhancement effects.

PMID: 26153415 [PubMed - indexed for MEDLINE]

New perspectives on the measurement and time course of auditory enhancement.

J Exp Psychol Hum Percept Perform. 2015 Dec;41(6):1696-708

Authors: Feng L, Oxenham AJ

Abstract
A target sound can become more audible and may "pop out" from a simultaneously presented masker if the masker is presented first by itself, as a precursor. This phenomenon, known as auditory enhancement, may reflect the general perceptual principle of contrast enhancement, which facilitates adaptation to ongoing acoustic conditions and the detection of new events. Little is known about the mechanisms underlying enhancement, and potential confounding factors have made the size of the effect and its time course a point of contention. Here we measured enhancement as a function of precursor duration and delay between precursor offset and target onset, using 2 single-interval pitch comparison tasks, which involve either same-different or up-down judgments, to avoid the potential confounds of earlier studies. Although these 2 tasks elicit different levels of performance and may reflect different underlying mechanisms, they produced similar amounts of enhancement. The effect decreased with decreasing precursor duration, but remained present for precursors as short as 62.5 ms, and decreased with increasing gap between the precursor and target, but remained measurable 1 s after the precursor. Additional conditions, examining the effect of precursor/masker similarity and the possible role of grouping and cueing, suggest multiple sources of auditory enhancement.

PMID: 26280269 [PubMed - indexed for MEDLINE]

Retroactive Streaming Fails to Improve Concurrent Vowel Identification.

PLoS One. 2015;10(10):e0140466

Authors: Brandewie EJ, Oxenham AJ

Abstract
The sequential organization of sound over time can interact with the concurrent organization of sounds across frequency. Previous studies using simple acoustic stimuli have suggested that sequential streaming cues can retroactively affect the perceptual organization of sounds that have already occurred. It is unknown whether such effects generalize to the perception of speech sounds. Listeners' ability to identify two simultaneously presented vowels was measured in the following conditions: no context, a preceding context stream (precursors), and a following context stream (postcursors). The context stream was comprised of brief repetitions of one of the two vowels, and the primary measure of performance was listeners' ability to identify the other vowel. Results in the precursor condition showed a significant advantage for the identification of the second vowel compared to the no-context condition, suggesting that sequential grouping mechanisms aided the segregation of the concurrent vowels, in agreement with previous work. However, performance in the postcursor condition was significantly worse compared to the no-context condition, providing no evidence for an effect of stream segregation, and suggesting a possible interference effect. Two additional experiments involving inharmonic (jittered) vowels were performed to provide additional cues to aid retroactive stream segregation; however, neither manipulation enabled listeners to improve their identification of the target vowel. Taken together with earlier studies, the results suggest that retroactive streaming may require large spectral differences between concurrent sources and thus may not provide a robust segregation cue for natural broadband sounds such as speech.

PMID: 26451598 [PubMed - indexed for MEDLINE]

Using individual differences to test the role of temporal and place cues in coding frequency modulation.

J Acoust Soc Am. 2015 Nov;138(5):3093-104

Authors: Whiteford KL, Oxenham AJ

Abstract
The question of how frequency is coded in the peripheral auditory system remains unresolved. Previous research has suggested that slow rates of frequency modulation (FM) of a low carrier frequency may be coded via phase-locked temporal information in the auditory nerve, whereas FM at higher rates and/or high carrier frequencies may be coded via a rate-place (tonotopic) code. This hypothesis was tested in a cohort of 100 young normal-hearing listeners by comparing individual sensitivity to slow-rate (1-Hz) and fast-rate (20-Hz) FM at a carrier frequency of 500 Hz with independent measures of phase-locking (using dynamic interaural time difference, ITD, discrimination), level coding (using amplitude modulation, AM, detection), and frequency selectivity (using forward-masking patterns). All FM and AM thresholds were highly correlated with each other. However, no evidence was obtained for stronger correlations between measures thought to reflect phase-locking (e.g., slow-rate FM and ITD sensitivity), or between measures thought to reflect tonotopic coding (fast-rate FM and forward-masking patterns). The results suggest that either psychoacoustic performance in young normal-hearing listeners is not limited by peripheral coding, or that similar peripheral mechanisms limit both high- and low-rate FM coding.

PMID: 26627783 [PubMed - indexed for MEDLINE]

2014

Perceptual asymmetry induced by the auditory continuity illusion.

J Exp Psychol Hum Percept Perform. 2014 Jun;40(3):908-14

Authors: Ruggles DR, Oxenham AJ

Abstract
The challenges of daily communication require listeners to integrate both independent and complementary auditory information to form holistic auditory scenes. As part of this process listeners are thought to fill in missing information to create continuous perceptual streams, even when parts of messages are masked or obscured. One example of this filling-in process-the auditory continuity illusion-has been studied primarily using stimuli presented in isolation, leaving it unclear whether the illusion occurs in more complex situations with higher perceptual and attentional demands. In this study, young normal-hearing participants listened for long target tones, either real or illusory, in "clouds" of shorter masking tone and noise bursts with pseudorandom spectrotemporal locations. Patterns of detection suggest that illusory targets are salient within mixtures, although they do not produce the same level of performance as the real targets. The results suggest that the continuity illusion occurs in the presence of competing sounds and can be used to aid in the detection of partially obscured objects within complex auditory scenes.

PMID: 24364709 [PubMed - indexed for MEDLINE]

Influence of musical training on understanding voiced and whispered speech in noise.

PLoS One. 2014;9(1):e86980

Authors: Ruggles DR, Freyman RL, Oxenham AJ

Abstract
This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.

PMID: 24489819 [PubMed - indexed for MEDLINE]

Symmetric interactions and interference between pitch and timbre.

J Acoust Soc Am. 2014 Mar;135(3):1371-9

Authors: Allen EJ, Oxenham AJ

Abstract
Variations in the spectral shape of harmonic tone complexes are perceived as timbre changes and can lead to poorer fundamental frequency (F0) or pitch discrimination. Less is known about the effects of F0 variations on spectral shape discrimination. The aims of the study were to determine whether the interactions between pitch and timbre are symmetric, and to test whether musical training affects listeners' ability to ignore variations in irrelevant perceptual dimensions. Difference limens (DLs) for F0 were measured with and without random, concurrent, variations in spectral centroid, and vice versa. Additionally, sensitivity was measured as the target parameter and the interfering parameter varied by the same amount, in terms of individual DLs. Results showed significant and similar interference between pitch (F0) and timbre (spectral centroid) dimensions, with upward spectral motion often confused for upward F0 motion, and vice versa. Musicians had better F0DLs than non-musicians on average, but similar spectral centroid DLs. Both groups showed similar interference effects, in terms of decreased sensitivity, in both dimensions. Results reveal symmetry in the interference effects between pitch and timbre, once differences in sensitivity between dimensions and subjects are controlled. Musical training does not reliably help to overcome these effects.

PMID: 24606275 [PubMed - indexed for MEDLINE]

Assessing the effects of temporal coherence on auditory stream formation through comodulation masking release.

J Acoust Soc Am. 2014 Jun;135(6):3520-9

Authors: Christiansen SK, Oxenham AJ

Abstract
Recent studies of auditory streaming have suggested that repeated synchronous onsets and offsets over time, referred to as "temporal coherence," provide a strong grouping cue between acoustic components, even when they are spectrally remote. This study uses a measure of auditory stream formation, based on comodulation masking release (CMR), to assess the conditions under which a loss of temporal coherence across frequency can lead to auditory stream segregation. The measure relies on the assumption that the CMR, produced by flanking bands remote from the masker and target frequency, only occurs if the masking and flanking bands form part of the same perceptual stream. The masking and flanking bands consisted of sequences of narrowband noise bursts, and the temporal coherence between the masking and flanking bursts was manipulated in two ways: (a) By introducing a fixed temporal offset between the flanking and masking bands that varied from zero to 60 ms and (b) by presenting the flanking and masking bursts at different temporal rates, so that the asynchronies varied from burst to burst. The results showed reduced CMR in all conditions where the flanking and masking bands were temporally incoherent, in line with expectations of the temporal coherence hypothesis.

PMID: 24907815 [PubMed - indexed for MEDLINE]

Spectral motion contrast as a speech context effect.

J Acoust Soc Am. 2014 Sep;136(3):1237

Authors: Wang N, Oxenham AJ

Abstract
Spectral contrast effects may help "normalize" the incoming sound and produce perceptual constancy in the face of the variable acoustics produced by different rooms, talkers, and backgrounds. Recent studies have concentrated on the after-effects produced by the long-term average power spectrum. The present study examined contrast effects based on spectral motion, analogous to visual-motion after-effects. In experiment 1, the existence of spectral-motion after-effects with word-length inducers was established by demonstrating that the identification of the direction of a target spectral glide was influenced by the spectral motion of a preceding inducer glide. In experiment 2, the target glide was replaced with a synthetic sine-wave speech sound, including a formant transition. The speech category boundary was shifted by the presence and direction of the inducer glide. Finally, in experiment 3, stimuli based on synthetic sine-wave speech sounds were used as both context and target stimuli to show that the spectral-motion after-effects could occur even with inducers with relatively short speech-like durations and small frequency excursions. The results suggest that spectral motion may play a complementary role to the long-term average power spectrum in inducing speech context effects.

PMID: 25190397 [PubMed - indexed for MEDLINE]

Speech perception in tones and noise via cochlear implants reveals influence of spectral resolution on temporal processing.

Trends Hear. 2014 Oct 13;18:

Authors: Oxenham AJ, Kreft HA

Abstract
Under normal conditions, human speech is remarkably robust to degradation by noise and other distortions. However, people with hearing loss, including those with cochlear implants, often experience great difficulty in understanding speech in noisy environments. Recent work with normal-hearing listeners has shown that the amplitude fluctuations inherent in noise contribute strongly to the masking of speech. In contrast, this study shows that speech perception via a cochlear implant is unaffected by the inherent temporal fluctuations of noise. This qualitative difference between acoustic and electric auditory perception does not seem to be due to differences in underlying temporal acuity but can instead be explained by the poorer spectral resolution of cochlear implants, relative to the normally functioning ear, which leads to an effective smoothing of the inherent temporal-envelope fluctuations of noise. The outcome suggests an unexpected trade-off between the detrimental effects of poorer spectral resolution and the beneficial effects of a smoother noise temporal envelope. This trade-off provides an explanation for the long-standing puzzle of why strong correlations between speech understanding and spectral resolution have remained elusive. The results also provide a potential explanation for why cochlear-implant users and hearing-impaired listeners exhibit reduced or absent masking release when large and relatively slow temporal fluctuations are introduced in noise maskers. The multitone maskers used here may provide an effective new diagnostic tool for assessing functional hearing loss and reduced spectral resolution.

PMID: 25315376 [PubMed - indexed for MEDLINE]

Expectations for melodic contours transcend pitch.

J Exp Psychol Hum Percept Perform. 2014 Dec;40(6):2338-47

Authors: Graves JE, Micheyl C, Oxenham AJ

Abstract
The question of what makes a good melody has interested composers, music theorists, and psychologists alike. Many of the observed principles of good "melodic continuation" involve melodic contour-the pattern of rising and falling pitch within a sequence. Previous work has shown that contour perception can extend beyond pitch to other auditory dimensions, such as brightness and loudness. Here, we show that the generalization of contour perception to nontraditional dimensions also extends to melodic expectations. In the first experiment, subjective ratings for 3-tone sequences that vary in brightness or loudness conformed to the same general contour-based expectations as pitch sequences. In the second experiment, we modified the sequence of melody presentation such that melodies with the same beginning were blocked together. This change produced substantively different results, but the patterns of ratings remained similar across the 3 auditory dimensions. Taken together, these results suggest that (a) certain well-known principles of melodic expectation (such as the expectation for a reversal following a skip) are dependent on long-term context, and (b) these expectations are not unique to the dimension of pitch and may instead reflect more general principles of perceptual organization.

PMID: 25365571 [PubMed - indexed for MEDLINE]

Congenital amusia: a cognitive disorder limited to resolved harmonics and with no peripheral basis.

Neuropsychologia. 2015 Jan;66:293-301

Authors: Cousineau M, Oxenham AJ, Peretz I

Abstract
Pitch plays a fundamental role in audition, from speech and music perception to auditory scene analysis. Congenital amusia is a neurogenetic disorder that appears to affect primarily pitch and melody perception. Pitch is normally conveyed by the spectro-temporal fine structure of low harmonics, but some pitch information is available in the temporal envelope produced by the interactions of higher harmonics. Using 10 amusic subjects and 10 matched controls, we tested the hypothesis that amusics suffer exclusively from impaired processing of spectro-temporal fine structure. We also tested whether the inability of amusics to process acoustic temporal fine structure extends beyond pitch by measuring sensitivity to interaural time differences, which also rely on temporal fine structure. Further tests were carried out on basic intensity and spectral resolution. As expected, pitch perception based on spectro-temporal fine structure was impaired in amusics; however, no significant deficits were observed in amusics' ability to perceive the pitch conveyed via temporal-envelope cues. Sensitivity to interaural time differences was also not significantly different between the amusic and control groups, ruling out deficits in the peripheral coding of temporal fine structure. Finally, no significant differences in intensity or spectral resolution were found between the amusic and control groups. The results demonstrate a pitch-specific deficit in fine spectro-temporal information processing in amusia that seems unrelated to temporal or spectral coding in the auditory periphery. These results are consistent with the view that there are distinct mechanisms dedicated to processing resolved and unresolved harmonics in the general population, the former being altered in congenital amusia while the latter is spared.

PMID: 25433224 [PubMed - indexed for MEDLINE]

2013

Revisiting place and temporal theories of pitch.

Acoust Sci Technol. 2013;34(6):388-396

Authors: Oxenham AJ

Abstract
The nature of pitch and its neural coding have been studied for over a century. A popular debate has revolved around the question of whether pitch is coded via "place" cues in the cochlea, or via timing cues in the auditory nerve. In the most recent incarnation of this debate, the role of temporal fine structure has been emphasized in conveying important pitch and speech information, particularly because the lack of temporal fine structure coding in cochlear implants might explain some of the difficulties faced by cochlear implant users in perceiving music and pitch contours in speech. In addition, some studies have postulated that hearing-impaired listeners may have a specific deficit related to processing temporal fine structure. This article reviews some of the recent literature surrounding the debate, and argues that much of the recent evidence suggesting the importance of temporal fine structure processing can also be accounted for using spectral (place) or temporal-envelope cues.

PMID: 25364292 [PubMed - as supplied by publisher]

Effects of temporal stimulus properties on the perception of across-frequency asynchrony.

J Acoust Soc Am. 2013 Feb;133(2):982-97

Authors: Wojtczak M, Beim JA, Micheyl C, Oxenham AJ

Abstract
The role of temporal stimulus parameters in the perception of across-frequency synchrony and asynchrony was investigated using pairs of 500-ms tones consisting of a 250-Hz tone and a tone with a higher frequency of 1, 2, 4, or 6 kHz. Subjective judgments suggested veridical perception of across-frequency synchrony but with greater sensitivity to changes in asynchrony for pairs in which the lower-frequency tone was leading than for pairs in which it was lagging. Consistent with the subjective judgments, thresholds for the detection of asynchrony measured in a three-alternative forced-choice task were lower when the signal interval contained a pair with the low-frequency tone leading than a pair with a high-frequency tone leading. A similar asymmetry was observed for asynchrony discrimination when the standard asynchrony was relatively small (≤20 ms) but not for larger standard asynchronies. Independent manipulation of onset and offset ramp durations indicated a dominant role of onsets in the perception of across-frequency asynchrony. A physiologically inspired model, involving broadly tuned monaural coincidence detectors that receive inputs from frequency-selective onset detectors, was able to accurately reproduce the asymmetric distributions of synchrony judgments. The model provides testable predictions for future physiological investigations of responses to broadband stimuli with across-frequency delays.

PMID: 23363115 [PubMed - indexed for MEDLINE]

Temporal coherence versus harmonicity in auditory stream formation.

J Acoust Soc Am. 2013 Mar;133(3):EL188-94

Authors: Micheyl C, Kreft H, Shamma S, Oxenham AJ

Abstract
This study sought to investigate the influence of temporal incoherence and inharmonicity on concurrent stream segregation, using performance-based measures. Subjects discriminated frequency shifts in a temporally regular sequence of target pure tones, embedded in a constant or randomly varying multi-tone background. Depending on the condition tested, the target tones were either temporally coherent or incoherent with, and either harmonically or inharmonically related to, the background tones. The results provide further evidence that temporal incoherence facilitates stream segregation and they suggest that deviations from harmonicity can cause similar facilitation effects, even when the targets and the maskers are temporally coherent.

PMID: 23464127 [PubMed - indexed for MEDLINE]

Perception of across-frequency asynchrony by listeners with cochlear hearing loss.

J Assoc Res Otolaryngol. 2013 Aug;14(4):573-89

Authors: Wojtczak M, Beim JA, Micheyl C, Oxenham AJ

Abstract
Cochlear hearing loss is often associated with broader tuning of the cochlear filters. Cochlear response latencies are dependent on the filter bandwidths, so hearing loss may affect the relationship between latencies across different characteristic frequencies. This prediction was tested by investigating the perception of synchrony between two tones exciting different regions of the cochlea in listeners with hearing loss. Subjective judgments of synchrony were compared with thresholds for asynchrony discrimination in a three-alternative forced-choice task. In contrast to earlier data from normal-hearing (NH) listeners, the synchronous-response functions obtained from the hearing-impaired (HI) listeners differed in patterns of symmetry and often had a very low peak (i.e., maximum proportion of "synchronous" responses). Also in contrast to data from NH listeners, the quantitative and qualitative correspondence between the data from the subjective and the forced-choice tasks was often poor. The results do not provide strong evidence for the influence of changes in cochlear mechanics on the perception of synchrony in HI listeners, and it remains possible that age, independent of hearing loss, plays an important role in temporal synchrony and asynchrony perception.

PMID: 23612740 [PubMed - indexed for MEDLINE]

Modulation frequency discrimination with modulated and unmodulated interference in normal hearing and in cochlear-implant users.

J Assoc Res Otolaryngol. 2013 Aug;14(4):591-601

Authors: Kreft HA, Nelson DA, Oxenham AJ

Abstract
Differences in fundamental frequency (F0) provide an important cue for segregating simultaneous sounds. Cochlear implants (CIs) transmit F0 information primarily through the periodicity of the temporal envelope of the electrical pulse trains. Successful segregation of sounds with different F0s requires the ability to process multiple F0s simultaneously, but it is unknown whether CI users have this ability. This study measured modulation frequency discrimination thresholds for half-wave rectified sinusoidal envelopes modulated at 115 Hz in CI users and normal-hearing (NH) listeners. The target modulation was presented in isolation or in the presence of an interferer. Discrimination thresholds were strongly affected by the presence of an interferer, even when it was unmodulated and spectrally remote. Interferer modulation increased interference and often led to very high discrimination thresholds, especially when the interfering modulation frequency was lower than that of the target. Introducing a temporal offset between the interferer and the target led to at best modest improvements in performance in CI users and NH listeners. The results suggest no fundamental difference between acoustic and electric hearing in processing single or multiple envelope-based F0s, but confirm that differences in F0 are unlikely to provide a robust cue for perceptual segregation in CI users.

PMID: 23632651 [PubMed - indexed for MEDLINE]

Mechanisms and mechanics of auditory masking.

J Physiol. 2013 May 15;591(10):2375

Authors: Oxenham AJ

PMID: 23678149 [PubMed - indexed for MEDLINE]

Pitch perception: dissociating frequency from fundamental-frequency discrimination.

Adv Exp Med Biol. 2013;787:137-45

Authors: Oxenham AJ, Micheyl C

Abstract
High-frequency pure tones (>6 kHz), which alone do not produce salient melodic pitch information, provide melodic pitch information when they form part of a harmonic complex tone with a lower fundamental frequency (F0). We explored this phenomenon in normal-hearing listeners by measuring F0 difference limens (F0DLs) for harmonic complex tones and pure-tone frequency difference limens (FDLs) for each of the tones within the harmonic complexes. Two spectral regions were tested. The low- and high-frequency band-pass regions comprised harmonics 6-11 of a 280- or 1,400-Hz F0, respectively; thus, for the high-frequency region, audible frequencies present were all above 7 kHz. Frequency discrimination of inharmonic log-spaced tone complexes was also tested in control conditions. All tones were presented in a background of noise to limit the detection of distortion products. As found in previous studies, F0DLs in the low region were typically no better than the FDL for each of the constituent pure tones. In contrast, F0DLs for the high-region complex were considerably better than the FDLs found for most of the constituent (high-frequency) pure tones. The data were compared with models of optimal spectral integration of information, to assess the relative influence of peripheral and more central noise in limiting performance. The results demonstrate a dissociation in the way pitch information is integrated at low and high frequencies and provide new challenges and constraints in the search for the underlying neural mechanisms of pitch.

PMID: 23716218 [PubMed - indexed for MEDLINE]

Behavioral measures of cochlear compression and temporal resolution as predictors of speech masking release in hearing-impaired listeners.

J Acoust Soc Am. 2013 Oct;134(4):2895-912

Authors: Gregan MJ, Nelson PB, Oxenham AJ

Abstract
Hearing-impaired (HI) listeners often show less masking release (MR) than normal-hearing listeners when temporal fluctuations are imposed on a steady-state masker, even when accounting for overall audibility differences. This difference may be related to a loss of cochlear compression in HI listeners. Behavioral estimates of compression, using temporal masking curves (TMCs), were compared with MR for band-limited (500-4000 Hz) speech and pure tones in HI listeners and age-matched, noise-masked normal-hearing (NMNH) listeners. Compression and pure-tone MR estimates were made at 500, 1500, and 4000 Hz. The amount of MR was defined as the difference in performance between steady-state and 10-Hz square-wave-gated speech-shaped noise. In addition, temporal resolution was estimated from the slope of the off-frequency TMC. No significant relationship was found between estimated cochlear compression and MR for either speech or pure tones. NMNH listeners had significantly steeper off-frequency temporal masking recovery slopes than did HI listeners, and a small but significant correlation was observed between poorer temporal resolution and reduced MR for speech. The results suggest either that the effects of hearing impairment on MR are not determined primarily by changes in peripheral compression, or that the TMC does not provide a sufficiently reliable measure of cochlear compression.

PMID: 24116426 [PubMed - indexed for MEDLINE]

Auditory frequency and intensity discrimination explained using a cortical population rate code.

PLoS Comput Biol. 2013;9(11):e1003336

Authors: Micheyl C, Schrater PR, Oxenham AJ

Abstract
The nature of the neural codes for pitch and loudness, two basic auditory attributes, has been a key question in neuroscience for over a century. A currently widespread view is that sound intensity (subjectively, loudness) is encoded in spike rates, whereas sound frequency (subjectively, pitch) is encoded in precise spike timing. Here, using information-theoretic analyses, we show that the spike rates of a population of virtual neural units with frequency-tuning and spike-count correlation characteristics similar to those measured in the primary auditory cortex of primates contain sufficient statistical information to account for the smallest frequency-discrimination thresholds measured in human listeners. The same population, and the same spike-rate code, can also account for the intensity-discrimination thresholds of humans. These results demonstrate the viability of a unified rate-based cortical population code for both sound frequency (pitch) and sound intensity (loudness), and thus suggest a resolution to a long-standing puzzle in auditory neuroscience.
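The rate-code argument above can be illustrated with a toy calculation (this is not the paper's actual information-theoretic analysis): for a bank of independent units with Gaussian tuning and Poisson-like variability, sensitivity to a small frequency change accumulates across the population as the root sum of squared per-unit d' values. All parameters below (number of units, tuning width, peak rate) are hypothetical.

```python
import numpy as np

def population_dprime(delta_f, centers, width, peak_rate, f0=1000.0):
    """Root-sum-of-squares d' for discriminating f0 from f0 + delta_f,
    read out from the spike rates of independent Gaussian-tuned units
    whose variance roughly tracks the mean rate (Poisson-like)."""
    def rates(f):
        return peak_rate * np.exp(-0.5 * ((f - centers) / width) ** 2)
    r0, r1 = rates(f0), rates(f0 + delta_f)
    var = 0.5 * (r0 + r1) + 1e-12   # guard against zero variance
    return float(np.sqrt(np.sum((r1 - r0) ** 2 / var)))

# Hypothetical population: 50 units tuned between 500 and 2000 Hz.
centers = np.linspace(500.0, 2000.0, 50)
print(population_dprime(1.0, centers, width=200.0, peak_rate=50.0))
```

Sensitivity grows with the size of the frequency step and with the number of informative units, which is the sense in which a pure rate code can, in principle, support very fine discrimination thresholds.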

PMID: 24244142 [PubMed - indexed for MEDLINE]

Perceptual asymmetry induced by the auditory continuity illusion.

J Exp Psychol Hum Percept Perform. 2014 Jun;40(3):908-14

Authors: Ruggles DR, Oxenham AJ

Abstract
The challenges of daily communication require listeners to integrate both independent and complementary auditory information to form holistic auditory scenes. As part of this process listeners are thought to fill in missing information to create continuous perceptual streams, even when parts of messages are masked or obscured. One example of this filling-in process-the auditory continuity illusion-has been studied primarily using stimuli presented in isolation, leaving it unclear whether the illusion occurs in more complex situations with higher perceptual and attentional demands. In this study, young normal-hearing participants listened for long target tones, either real or illusory, in "clouds" of shorter masking tone and noise bursts with pseudorandom spectrotemporal locations. Patterns of detection suggest that illusory targets are salient within mixtures, although they do not produce the same level of performance as the real targets. The results suggest that the continuity illusion occurs in the presence of competing sounds and can be used to aid in the detection of partially obscured objects within complex auditory scenes.

PMID: 24364709 [PubMed - indexed for MEDLINE]

2012

Perception of across-frequency asynchrony and the role of cochlear delays.

J Acoust Soc Am. 2012 Jan;131(1):363-77

Authors: Wojtczak M, Beim JA, Micheyl C, Oxenham AJ

Abstract
Cochlear filtering results in earlier responses to high than to low frequencies. This study examined potential perceptual correlates of cochlear delays by measuring the perception of relative timing between tones of different frequencies. A brief 250-Hz tone was combined with a brief 1-, 2-, 4-, or 6-kHz tone. Two experiments were performed, one involving subjective judgments of perceived synchrony, the other involving asynchrony detection and discrimination. The functions relating the proportion of "synchronous" responses to the delay between the tones were similar for all tone pairs. Perceived synchrony was maximal when the tones in a pair were gated synchronously. The perceived-synchrony function slopes were asymmetric, being steeper on the low-frequency-leading side. In the second experiment, asynchrony-detection thresholds were lower for low-frequency rather than for high-frequency leading pairs. In contrast with previous studies, but consistent with the first experiment, thresholds did not depend on frequency separation between the tones, perhaps because of the elimination of within-channel cues. The results of the two experiments were related quantitatively using a decision-theoretic model, and were found to be highly correlated. Overall the results suggest that frequency-dependent cochlear group delays are compensated for at higher processing stages, resulting in veridical perception of timing relationships across frequency.

PMID: 22280598 [PubMed - indexed for MEDLINE]

Global not local masker features govern the auditory continuity illusion.

J Neurosci. 2012 Mar 28;32(13):4660-4

Authors: Riecke L, Micheyl C, Oxenham AJ

Abstract
When an acoustic signal is temporarily interrupted by another sound, it is sometimes heard as continuing through, even when the signal is actually turned off during the interruption-an effect known as the "auditory continuity illusion." A widespread view is that the illusion can only occur when peripheral neural responses contain no evidence that the signal was interrupted. Here we challenge this view using a combination of psychophysical measures from human listeners and computational simulations with a model of the auditory periphery. The results reveal that the illusion seems to depend more on the overall specific loudness than on the peripheral masking properties of the interrupting sound. This finding indicates that the continuity illusion is determined by the global features, rather than the fine-grained temporal structure, of the interrupting sound, and argues against the view that the illusion arises in the auditory periphery.

PMID: 22457512 [PubMed - indexed for MEDLINE]

Effects of pulsing of a target tone on the ability to hear it out in different types of complex sounds.

J Acoust Soc Am. 2012 Apr;131(4):2927-37

Authors: Moore BC, Glasberg BR, Oxenham AJ

Abstract
Judgments of whether a sinusoidal probe is higher or lower in frequency than the closest partial ("target") in a multi-partial complex are improved when the target is pulsed on and off. These experiments explored the contribution of reduction in perceptual confusion and recovery from adaptation to this effect. In experiment 1, all partials except the target were replaced by noise to reduce perceptual confusion. Performance was much better than when the background was composed of multiple partials. When the level of the target was reduced to avoid ceiling effects, no effect of pulsing the target occurred. In experiment 2, the target and background partials were irregularly and independently amplitude modulated. This gave a large effect of pulsing the target, suggesting that if recovery from adaptation contributes to the effect, amplitude fluctuations do not prevent this. In experiment 3, the background was composed of multiple steady partials, but the target was irregularly amplitude modulated. This gave better performance than when the target was unmodulated and a moderate effect of pulsing the target. It is argued that when the target and background are steady tones, pulsing the target may result both in reduction of perceptual confusion and recovery from adaptation.

PMID: 22501070 [PubMed - indexed for MEDLINE]

Comparing models of the combined-stimulation advantage for speech recognition.

J Acoust Soc Am. 2012 May;131(5):3970-80

Authors: Micheyl C, Oxenham AJ

Abstract
The "combined-stimulation advantage" refers to an improvement in speech recognition when cochlear-implant or vocoded stimulation is supplemented by low-frequency acoustic information. Previous studies have been interpreted as evidence for "super-additive" or "synergistic" effects in the combination of low-frequency and electric or vocoded speech information by human listeners. However, this conclusion was based on predictions of performance obtained using a suboptimal high-threshold model of information combination. The present study shows that a different model, based on Gaussian signal detection theory, can predict surprisingly large combined-stimulation advantages, even when performance with either information source alone is close to chance, without involving any synergistic interaction. A reanalysis of published data using this model reveals that previous results, which have been interpreted as evidence for super-additive effects in perception of combined speech stimuli, are actually consistent with a more parsimonious explanation, according to which the combined-stimulation advantage reflects an optimal combination of two independent sources of information. The present results do not rule out the possible existence of synergistic effects in combined stimulation; however, they emphasize the possibility that the combined-stimulation advantages observed in some studies can be explained simply by non-interactive combination of two information sources.
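The "optimal combination of two independent sources" favored by this account has a standard signal-detection form: per-source sensitivities add in quadrature, d'_comb = sqrt(d'_1² + d'_2²). A minimal sketch, assuming made-up per-source sensitivities and a 2AFC task for concreteness (the paper's analysis covers other task structures as well):

```python
from math import erf, sqrt

def phi(x):
    # Standard normal CDF.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def pc_2afc(d_prime):
    # Proportion correct in a 2AFC task under Gaussian SDT.
    return phi(d_prime / sqrt(2.0))

def combined_d_prime(d1, d2):
    # Optimal combination of two independent information sources.
    return sqrt(d1 ** 2 + d2 ** 2)

d_acoustic, d_electric = 0.5, 0.5   # hypothetical per-source sensitivities
print(pc_2afc(d_acoustic))                                 # each source alone
print(pc_2afc(combined_d_prime(d_acoustic, d_electric)))   # combined
```

Even with both sources near chance on their own, the combined proportion correct exceeds either alone without any synergistic interaction, which is the paper's parsimonious reading of the combined-stimulation advantage.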

PMID: 22559370 [PubMed - indexed for MEDLINE]

Further evidence that fundamental-frequency difference limens measure pitch discrimination.

J Acoust Soc Am. 2012 May;131(5):3989-4001

Authors: Micheyl C, Ryan CM, Oxenham AJ

Abstract
Difference limens for complex tones (DLCs) that differ in F0 are widely regarded as a measure of periodicity-pitch discrimination. However, because F0 changes are inevitably accompanied by changes in the frequencies of the harmonics, DLCs may actually reflect the discriminability of individual components. To test this hypothesis, DLCs were measured for complex tones, the component frequencies of which were shifted coherently upward or downward by ΔF = 0%, 25%, 37.5%, or 50% of the F0, yielding fully harmonic (ΔF = 0%), strongly inharmonic (ΔF = 25%, 37.5%), or odd-harmonic (ΔF = 50%) tones. If DLCs truly reflect periodicity-pitch discriminability, they should be larger (worse) for inharmonic tones than for harmonic and odd-harmonic tones because inharmonic tones have a weaker pitch. Consistent with this prediction, the results of two experiments showed a non-monotonic dependence of DLCs on ΔF, with larger DLCs for ΔF's of ± 25% or ± 37.5% than for ΔF's of 0 or ± 50% of F0. These findings are consistent with models of pitch perception that involve harmonic templates or with an autocorrelation-based model provided that more than just the highest peak in the summary autocorrelogram is taken into account.
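The shifted-harmonic stimuli described above are straightforward to synthesize: each component sits at n·F0 plus a fixed offset of ΔF percent of F0, so all components move coherently. A minimal sketch, assuming illustrative harmonic numbers, duration, and sampling rate that are not taken from the paper:

```python
import numpy as np

def shifted_complex(f0, shift_pct, harmonics=range(6, 12),
                    dur=0.5, fs=48000.0):
    """Complex tone whose components are coherently shifted by
    shift_pct percent of F0 (0 = harmonic, 50 = odd-harmonic-like)."""
    t = np.arange(int(dur * fs)) / fs
    freqs = [n * f0 + (shift_pct / 100.0) * f0 for n in harmonics]
    return sum(np.sin(2.0 * np.pi * f * t) for f in freqs)

tone = shifted_complex(200.0, 25.0)   # a strongly inharmonic example
```

With ΔF = 50% every component lands exactly halfway between harmonics, reproducing the odd-harmonic-like spacing that restores a clearer pitch in the experiments.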

PMID: 22559372 [PubMed - indexed for MEDLINE]

Vowel enhancement effects in cochlear-implant users.

J Acoust Soc Am. 2012 Jun;131(6):EL421-6

Authors: Wang N, Kreft H, Oxenham AJ

Abstract
Auditory enhancement of certain frequencies can occur through prior stimulation of surrounding frequency regions. The underlying neural mechanisms are unknown, but may involve stimulus-driven changes in cochlear gain via the medial olivocochlear complex (MOC) efferents. Cochlear implants (CIs) bypass the cochlea and stimulate the auditory nerve directly. If the MOC plays a critical role in enhancement then CI users should not exhibit this effect. Results using vowel stimuli, with and without preceding sounds designed to enhance formants, provided evidence of auditory enhancement in both normal-hearing listeners and CI users, suggesting that vowel enhancement is not mediated solely by cochlear effects.

PMID: 22713016 [PubMed - indexed for MEDLINE]

Characterizing the dependence of pure-tone frequency difference limens on frequency, duration, and level.

Hear Res. 2012 Oct;292(1-2):1-13

Authors: Micheyl C, Xiao L, Oxenham AJ

Abstract
This study examined the relationship between the difference limen for frequency (DLF) of pure tones and three commonly explored stimulus parameters of frequency, duration, and sensation level. Data from 12 published studies of pure-tone frequency discrimination (a total of 583 DLF measurements across 77 normal-hearing listeners) were analyzed using hierarchical (or "mixed-effects") generalized linear models. Model parameters were estimated using two approaches (Bayesian and maximum likelihood). A model in which log-transformed DLFs were predicted using a sum of power-law functions plus a random subject- or group-specific term was found to explain a substantial proportion of the variability in the psychophysical data. The results confirmed earlier findings of an inverse-square-root relationship between log-transformed DLFs and duration, and of an inverse relationship between log(DLF) and sensation level. However, they did not confirm earlier suggestions that log(DLF) increases approximately linearly with the square-root of frequency; instead, the relationship between frequency and log(DLF) was best fitted using a power function of frequency with an exponent of about 0.8. These results, and the comprehensive quantitative analysis of pure-tone frequency discrimination on which they are based, provide a new reference for the quantitative evaluation of models of frequency (or pitch) discrimination.
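The fitted model's shape can be sketched as a sum of power-law terms: roughly f^0.8 in frequency, an inverse square root in duration, and an inverse in sensation level. The coefficients below are invented purely to show the functional form and are not the paper's estimates:

```python
import math

# Illustrative coefficients only -- the study fits these with
# hierarchical generalized linear models; these values are made up.
A_F, A_D, A_SL, C = 0.002, 2.0, 6.0, -1.0

def log_dlf(freq_hz, dur_ms, sl_db):
    """log(DLF) modeled as a sum of power-law terms:
    ~ f**0.8 in frequency, ~ 1/sqrt(duration), ~ 1/SL."""
    return (A_F * freq_hz ** 0.8
            + A_D / math.sqrt(dur_ms)
            + A_SL / sl_db + C)
```

The sketch preserves the qualitative behavior reported above: predicted DLFs worsen with frequency and improve with longer durations and higher sensation levels.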

PMID: 22841571 [PubMed - indexed for MEDLINE]

Pitch perception.

J Neurosci. 2012 Sep 26;32(39):13335-8

Authors: Oxenham AJ

Abstract
Pitch is one of the primary auditory sensations and plays a defining role in music, speech, and auditory scene analysis. Although the main physical correlate of pitch is acoustic periodicity, or repetition rate, there are many interactions that complicate the relationship between the physical stimulus and the perception of pitch. In particular, the effects of other acoustic parameters on pitch judgments, and the complex interactions between perceptual organization and pitch, have uncovered interesting perceptual phenomena that should help to reveal the underlying neural mechanisms.

PMID: 23015422 [PubMed - indexed for MEDLINE]

Intelligibility of whispered speech in stationary and modulated noise maskers.

J Acoust Soc Am. 2012 Oct;132(4):2514-23

Authors: Freyman RL, Griffin AM, Oxenham AJ

Abstract
This study investigated the role of natural periodic temporal fine structure in helping listeners take advantage of temporal valleys in amplitude-modulated masking noise when listening to speech. Young normal-hearing participants listened to natural, whispered, and/or vocoded nonsense sentences in a variety of masking conditions. Whispering alters normal waveform temporal fine structure dramatically but, unlike vocoding, does not degrade spectral details created by vocal tract resonances. The improvement in intelligibility, or masking release, due to introducing 16-Hz square-wave amplitude modulations in an otherwise steady speech-spectrum noise was reduced substantially with vocoded sentences relative to natural speech, but was not reduced for whispered sentences. In contrast to natural speech, masking release for whispered sentences was observed even at positive signal-to-noise ratios. Whispered speech has a different short-term amplitude distribution relative to natural speech, and this appeared to explain the robust masking release for whispered speech at high signal-to-noise ratios. Recognition of whispered speech was not disproportionately affected by unpredictable modulations created by a speech-envelope modulated noise masker. Overall, the presence or absence of periodic temporal fine structure did not have a major influence on the degree of benefit obtained from imposing temporal fluctuations on a noise masker.

PMID: 23039445 [PubMed - indexed for MEDLINE]

Forward masking of frequency modulation.

J Acoust Soc Am. 2012 Nov;132(5):3375-86

Authors: Byrne AJ, Wojtczak M, Viemeister NF

Abstract
Forward masking of sinusoidal frequency modulation (FM) was measured with three types of maskers: FM, amplitude modulation (AM), and a masker created by combining the magnitude spectrum of an FM tone with random component phases. For the signal FM rates used (5, 20, and 40 Hz), an FM masker raised detection thresholds in terms of frequency deviation by a factor of about 5 relative to thresholds without a masker. The AM masker produced a much smaller effect, suggesting that FM-to-AM conversion did not contribute substantially to the FM forward masking. The modulation depth of an FM masker had a nonmonotonic effect, with maximal masking observed at an intermediate value within the range of possible depths, while the random-phase FM masker produced less masking, arguing against a spectrally-based explanation for FM forward masking. Broad FM-rate selectivity for forward masking was observed for both 4-kHz and 500-Hz carriers. Thresholds measured as a function of the masker-signal delay showed slow recovery from FM forward masking, with residual masking for delays up to 500 ms. The FM forward-masking effect resembles that observed for AM [Wojtczak and Viemeister (2005). J. Acoust. Soc. Am. 118, 3198-3210] and may reflect modulation-rate selective neural adaptation to FM.

PMID: 23145618 [PubMed - indexed for MEDLINE]

On the possibility of a place code for the low pitch of high-frequency complex tones.

J Acoust Soc Am. 2012 Dec;132(6):3883-95

Authors: Santurette S, Dau T, Oxenham AJ

Abstract
Harmonics are considered unresolved when they interact with neighboring harmonics and cannot be heard out separately. Several studies have suggested that the pitch derived from unresolved harmonics is coded via temporal fine-structure cues emerging from their peripheral interactions. Such conclusions rely on the assumption that the components of complex tones with harmonic ranks down to at least 9 were indeed unresolved. The present study tested this assumption via three different measures: (1) the effects of relative component phase on pitch matches, (2) the effects of dichotic presentation on pitch matches, and (3) listeners' ability to hear out the individual components. No effects of relative component phase or dichotic presentation on pitch matches were found in the tested conditions. Large individual differences were found in listeners' ability to hear out individual components. Overall, the results are consistent with the coding of individual harmonic frequencies, based on the tonotopic activity pattern or phase locking to individual harmonics, rather than with temporal coding of single-channel interactions. However, they are also consistent with more general temporal theories of pitch involving the across-channel summation of information from resolved and/or unresolved harmonics. Simulations of auditory-nerve responses to the stimuli suggest potential benefits to a spatiotemporal mechanism.

PMID: 23231119 [PubMed - indexed for MEDLINE]

Assessing the role of spectral and intensity cues in spectral ripple detection and discrimination in cochlear-implant users.

J Acoust Soc Am. 2012 Dec;132(6):3925-34

Authors: Anderson ES, Oxenham AJ, Nelson PB, Nelson DA

Abstract
Measures of spectral ripple resolution have become widely used psychophysical tools for assessing spectral resolution in cochlear-implant (CI) listeners. The objective of this study was to compare spectral ripple discrimination and detection in the same group of CI listeners. Ripple detection thresholds were measured over a range of ripple frequencies and were compared to spectral ripple discrimination thresholds previously obtained from the same CI listeners. The data showed that performance on the two measures was correlated, but that individual subjects' thresholds (at a constant spectral modulation depth) for the two tasks were not equivalent. In addition, spectral ripple detection was often found to be possible at higher rates than expected based on the available spectral cues, making it likely that temporal-envelope cues played a role at higher ripple rates. Finally, spectral ripple detection thresholds were compared to previously obtained speech-perception measures. Results confirmed earlier reports of a robust relationship between detection of widely spaced ripples and measures of speech recognition. In contrast, intensity difference limens for broadband noise did not correlate with spectral ripple detection measures, suggesting a dissociation between the ability to detect small changes in intensity across frequency and across time.

PMID: 23231122 [PubMed - indexed for MEDLINE]
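The spectral ripple stimuli discussed here are broadband or octave-band noises with a sinusoidal spectral envelope on a logarithmic frequency axis. A minimal sketch of one way to synthesize such a stimulus follows; the band edges echo the 350-5600 Hz range mentioned above, but the ripple density, depth, and FFT-based method are assumptions for illustration, not the procedure of the paper.

```python
import numpy as np

def rippled_noise(ripples_per_octave=1.0, depth_db=20.0, f_lo=350.0,
                  f_hi=5600.0, fs=44100, duration=0.5):
    """Broadband noise with a sinusoidal (in log2 frequency) spectral ripple.

    Illustrative sketch: shape the magnitude spectrum of Gaussian noise
    with a log-frequency sinusoid inside [f_lo, f_hi]; components outside
    the band are heavily attenuated.
    """
    n = int(duration * fs)
    spec = np.fft.rfft(np.random.randn(n))
    f = np.fft.rfftfreq(n, 1 / fs)
    band = (f >= f_lo) & (f <= f_hi)
    gain_db = np.full_like(f, -120.0)                    # suppress out-of-band energy
    gain_db[band] = 0.5 * depth_db * np.sin(
        2 * np.pi * ripples_per_octave * np.log2(f[band] / f_lo))
    spec *= 10 ** (gain_db / 20)
    return np.fft.irfft(spec, n)
```

In a ripple-detection task, such a stimulus would be compared against a flat-spectrum noise of the same bandwidth, with the ripple density varied across trials.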

2011

Perceptual grouping affects pitch judgments across time and frequency.

J Exp Psychol Hum Percept Perform. 2011 Feb;37(1):257-69

Authors: Borchert EM, Micheyl C, Oxenham AJ

Abstract
Pitch, the perceptual correlate of fundamental frequency (F0), plays an important role in speech, music, and animal vocalizations. Changes in F0 over time help define musical melodies and speech prosody, while comparisons of simultaneous F0 are important for musical harmony, and for segregating competing sound sources. This study compared listeners' ability to detect differences in F0 between pairs of sequential or simultaneous tones that were filtered into separate, nonoverlapping spectral regions. The timbre differences induced by filtering led to poor F0 discrimination in the sequential, but not the simultaneous, conditions. Temporal overlap of the two tones was not sufficient to produce good performance; instead performance appeared to depend on the two tones being integrated into the same perceptual object. The results confirm the difficulty of comparing the pitches of sequential sounds with different timbres and suggest that, for simultaneous sounds, pitch differences may be detected through a decrease in perceptual fusion rather than an explicit coding and comparison of the underlying F0s.

PMID: 21077719 [PubMed - indexed for MEDLINE]

Forward masking in the amplitude-modulation domain for tone carriers: psychophysical results and physiological correlates.

J Assoc Res Otolaryngol. 2011 Jun;12(3):361-73

Authors: Wojtczak M, Nelson PC, Viemeister NF, Carney LH

Abstract
Wojtczak and Viemeister (J Acoust Soc Am 118:3198-3210, 2005) demonstrated forward masking in the amplitude-modulation (AM) domain. The present study examined whether this effect has correlates in physiological responses to AM at the level of the auditory midbrain. The human psychophysical experiment used 40-Hz, 100% AM (masker AM) that was imposed on a 5.5-kHz carrier during the first 150 ms of its duration. The masker AM was followed by a 50-ms burst of AM of the same rate (signal AM) imposed on the same (uninterrupted) carrier, either immediately after the masker or with a delay. In the physiological experiment, single-unit extracellular recordings in the awake rabbit inferior colliculus (IC) were obtained for stimuli designed to be similar to the uninterrupted-carrier conditions used in the psychophysics. The masker AM was longer (500 ms compared with 150 ms in the psychophysical experiment), and the carrier and modulation rate were chosen based on each neuron's audio- and envelope-frequency selectivity. Based on the average discharge rates of the responses or on the temporal correlation between neural responses to masked and unmasked stimuli, only a small subset of the population of IC cells exhibited suppression of signal AM following the masker. In contrast, changes in the discharge rates between the temporal segments of the carrier immediately preceding the signal AM and during the signal AM varied as a function of masker-signal delay with a trend that matched the psychophysical results. Unless the physiological observations were caused by species differences, they suggest that stages of processing higher than the IC must be considered to account for the AM-processing time constants measured perceptually in humans.

PMID: 21181225 [PubMed - indexed for MEDLINE]

Recovering sound sources from embedded repetition.

Proc Natl Acad Sci U S A. 2011 Jan 18;108(3):1188-93

Authors: McDermott JH, Wrobleski D, Oxenham AJ

Abstract
Cocktail parties and other natural auditory environments present organisms with mixtures of sounds. Segregating individual sound sources is thought to require prior knowledge of source properties, yet these presumably cannot be learned unless the sources are segregated first. Here we show that the auditory system can bootstrap its way around this problem by identifying sound sources as repeating patterns embedded in the acoustic input. Due to the presence of competing sounds, source repetition is not explicit in the input to the ear, but it produces temporal regularities that listeners detect and use for segregation. We used a simple generative model to synthesize novel sounds with naturalistic properties. We found that such sounds could be segregated and identified if they occurred more than once across different mixtures, even when the same sounds were impossible to segregate in single mixtures. Sensitivity to the repetition of sound sources can permit their recovery in the absence of other segregation cues or prior knowledge of sounds, and could help solve the cocktail party problem.

PMID: 21199948 [PubMed - indexed for MEDLINE]

Pitch perception beyond the traditional existence region of pitch.

Proc Natl Acad Sci U S A. 2011 May 03;108(18):7629-34

Authors: Oxenham AJ, Micheyl C, Keebler MV, Loper A, Santurette S

Abstract
Humans' ability to recognize musical melodies is generally limited to pure-tone frequencies below 4 or 5 kHz. This limit coincides with the highest notes on modern musical instruments and is widely believed to reflect the upper limit of precise stimulus-driven spike timing in the auditory nerve. We tested the upper limits of pitch and melody perception in humans using pure and harmonic complex tones, such as those produced by the human voice and musical instruments, in melody recognition and pitch-matching tasks. We found that robust pitch perception can be elicited by harmonic complex tones with fundamental frequencies below 2 kHz, even when all of the individual harmonics are above 6 kHz--well above the currently accepted existence region of pitch and above the currently accepted limits of neural phase locking. The results suggest that the perception of musical pitch at high frequencies is not constrained by temporal phase locking in the auditory nerve but may instead stem from higher-level constraints shaped by prior exposure to harmonic sounds.

PMID: 21502495 [PubMed - indexed for MEDLINE]

Comparing spatial tuning curves, spectral ripple resolution, and speech perception in cochlear implant users.

J Acoust Soc Am. 2011 Jul;130(1):364-75

Authors: Anderson ES, Nelson DA, Kreft H, Nelson PB, Oxenham AJ

Abstract
Spectral ripple discrimination thresholds were measured in 15 cochlear-implant users with broadband (350-5600 Hz) and octave-band noise stimuli. The results were compared with spatial tuning curve (STC) bandwidths previously obtained from the same subjects. Spatial tuning curve bandwidths did not correlate significantly with broadband spectral ripple discrimination thresholds but did correlate significantly with ripple discrimination thresholds when the rippled noise was confined to an octave-wide passband, centered on the STC's probe electrode frequency allocation. Ripple discrimination thresholds were also measured for octave-band stimuli in four contiguous octaves, with center frequencies from 500 Hz to 4000 Hz. Substantial variations in thresholds with center frequency were found in individuals, but no general trends of increasing or decreasing resolution from apex to base were observed in the pooled data. Neither ripple nor STC measures correlated consistently with speech measures in noise and quiet in the sample of subjects in this study. Overall, the results suggest that spectral ripple discrimination measures provide a reasonable measure of spectral resolution that correlates well with more direct, but more time-consuming, measures of spectral resolution, but that such measures do not always provide a clear and robust predictor of performance in speech perception tasks.

PMID: 21786905 [PubMed - indexed for MEDLINE]

Behavioral estimates of basilar-membrane compression: additivity of forward masking in noise-masked normal-hearing listeners.

J Acoust Soc Am. 2011 Nov;130(5):2835-44

Authors: Gregan MJ, Nelson PB, Oxenham AJ

Abstract
Cochlear hearing loss is often associated with a loss of basilar-membrane (BM) compression, which in turn may contribute to degraded processing of suprathreshold stimuli. Behavioral estimates of compression may therefore be useful as long as they are valid over a wide range of levels and frequencies. Additivity of forward masking (AFM) may provide such a measure, but research to date lacks normative data from normal-hearing (NH) listeners at high sound levels, which is necessary to evaluate data from hearing-impaired (HI) listeners. The present study measured AFM in six NH listeners for signal frequencies of 500, 1500, and 4000 Hz in the presence of background noise, designed to elevate signal thresholds to levels similar to those experienced by HI listeners. Results consistent with compressive BM responses were found for all six listeners at 500 Hz, five listeners at 1500 Hz, but only two listeners at 4000 Hz. Further measurements in the absence of background noise also indicated a lack of consistent compression at 4000 Hz at higher signal levels, in contrast to earlier results collected at lower levels. A better understanding of this issue will be required before AFM can be used as a general behavioral estimate of BM compression.

PMID: 22087912 [PubMed - indexed for MEDLINE]

The effect of carrier level on tuning in amplitude-modulation masking.

J Acoust Soc Am. 2011 Dec;130(6):3916-25

Authors: Wojtczak M

Abstract
The effect of carrier level on tuning in modulation masking was investigated for noise and tonal carriers. Bandwidths of the modulation filters, estimated from the masked detection thresholds using an envelope power spectrum model, were independent of level for the noise carrier but seemed to decrease with increasing level for the tonal carrier. However, the apparently sharper tuning could be explained by increased modulation sensitivity and modulation dynamic range with increasing level rather than improved modulation-frequency selectivity. Consistent with this interpretation, the addition of a high-pass noise with a level adjusted to maintain the same threshold for the detection of the signal modulation for each carrier level used eliminated the effect of level on tuning. Overall, modulation filters estimated from psychophysical data do not depend on level in contrast to the modulation transfer functions obtained from neural recordings in the inferior colliculus in physiological studies. The results highlight differences between the characteristics of modulation processing obtained from neural data and perception. The discrepancies indicate the need for further investigation into physiological correlates of tuning in modulation processing.

PMID: 22225047 [PubMed - indexed for MEDLINE]

2010

Pitch, harmonicity and concurrent sound segregation: psychoacoustical and neurophysiological findings.

Hear Res. 2010 Jul;266(1-2):36-51

Authors: Micheyl C, Oxenham AJ

Abstract
Harmonic complex tones are a particularly important class of sounds found in both speech and music. Although these sounds contain multiple frequency components, they are usually perceived as a coherent whole, with a pitch corresponding to the fundamental frequency (F0). However, when two or more harmonic sounds occur concurrently, e.g., at a cocktail party or in a symphony, the auditory system must separate harmonics and assign them to their respective F0s so that a coherent and veridical representation of the different sounds sources is formed. Here we review both psychophysical and neurophysiological (single-unit and evoked-potential) findings, which provide some insight into how, and how well, the auditory system accomplishes this task. A survey of computational models designed to estimate multiple F0s and segregate concurrent sources is followed by a review of the empirical literature on the perception and neural coding of concurrent harmonic sounds, including vowels, as well as findings obtained using single complex tones with mistuned harmonics.

PMID: 19788920 [PubMed - indexed for MEDLINE]

Modulation rate discrimination using half-wave rectified and sinusoidally amplitude modulated stimuli in cochlear-implant users.

J Acoust Soc Am. 2010 Feb;127(2):656-9

Authors: Kreft HA, Oxenham AJ, Nelson DA

Abstract
Detection and modulation rate discrimination were measured in cochlear-implant users for pulse-trains that were either sinusoidally amplitude modulated or were modulated with half-wave rectified sinusoids, which in acoustic hearing have been used to simulate the response to low-frequency temporal fine structure. In contrast to comparable results from acoustic hearing, modulation rate discrimination was not statistically different for the two stimulus types. The results suggest that, in contrast to binaural perception, pitch perception in cochlear-implant users does not benefit from using stimuli designed to more closely simulate the cochlear response to low-frequency pure tones.

PMID: 20136187 [PubMed - indexed for MEDLINE]
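The two modulator shapes compared in this study, sinusoidal amplitude modulation and a half-wave rectified sinusoid, can be sketched as below. The modulation rate, sampling rate, and duration are assumed values for illustration; the paper applied these modulators to electrical pulse trains, not to the raw waveforms shown here.

```python
import numpy as np

def modulators(mod_rate=100.0, fs=10000, duration=0.1):
    """Return the two modulator shapes: sinusoidal AM (raised sine, 0..1)
    and a half-wave rectified sinusoid (negative half-cycles zeroed).
    Parameters are illustrative, not those of the study.
    """
    t = np.arange(int(duration * fs)) / fs
    sam = 0.5 * (1 + np.sin(2 * np.pi * mod_rate * t))       # sinusoidal AM
    hwr = np.maximum(np.sin(2 * np.pi * mod_rate * t), 0.0)  # half-wave rectified
    return sam, hwr
```

In acoustic-hearing simulations, the half-wave rectified shape is used because it approximates the hair-cell response to a low-frequency pure tone, which is the motivation the abstract alludes to.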

Otoacoustic estimation of cochlear tuning: validation in the chinchilla.

J Assoc Res Otolaryngol. 2010 Sep;11(3):343-65

Authors: Shera CA, Guinan JJ, Oxenham AJ

Abstract
We analyze published auditory-nerve and otoacoustic measurements in chinchilla to test a network of hypothesized relationships between cochlear tuning, cochlear traveling-wave delay, and stimulus-frequency otoacoustic emissions (SFOAEs). We find that the physiological data generally corroborate the network of relationships, including predictions from filter theory and the coherent-reflection model of OAE generation, at locations throughout the cochlea. The results support the use of otoacoustic emissions as noninvasive probes of cochlear tuning. Developing this application, we find that tuning ratios-defined as the ratio of tuning sharpness to SFOAE phase-gradient delay in periods-have a nearly species-invariant form in cat, guinea pig, and chinchilla. Analysis of the tuning ratios identifies a species-dependent parameter that locates a transition between "apical-like" and "basal-like" behavior involving multiple aspects of cochlear physiology. Approximate invariance of the tuning ratio allows determination of cochlear tuning from SFOAE delays. We quantify the procedure and show that otoacoustic estimates of chinchilla cochlear tuning match direct measures obtained from the auditory nerve. By assuming that invariance of the tuning ratio extends to humans, we derive new otoacoustic estimates of human cochlear tuning that remain mutually consistent with independent behavioral measurements obtained using different rationales, methodologies, and analysis procedures. The results confirm that at any given characteristic frequency (CF) human cochlear tuning appears sharper than that in the other animals studied, but varies similarly with CF. We show, however, that the exceptionality of human tuning can be exaggerated by the ways in which species are conventionally compared, which take no account of evident differences between the base and apex of the cochlea. Finally, our estimates of human tuning suggest that the spatial spread of excitation of a pure tone along the human basilar membrane is comparable to that in other common laboratory animals.

PMID: 20440634 [PubMed - indexed for MEDLINE]

Individual differences reveal the basis of consonance.

Curr Biol. 2010 Jun 08;20(11):1035-41

Authors: McDermott JH, Lehr AJ, Oxenham AJ

Abstract
Some combinations of musical notes are consonant (pleasant), whereas others are dissonant (unpleasant), a distinction central to music. Explanations of consonance in terms of acoustics, auditory neuroscience, and enculturation have been debated for centuries. We utilized individual differences to distinguish the candidate theories. We measured preferences for musical chords as well as nonmusical sounds that isolated particular acoustic factors--specifically, the beating and the harmonic relationships between frequency components, two factors that have long been thought to potentially underlie consonance. Listeners preferred stimuli without beats and with harmonic spectra, but across more than 250 subjects, only the preference for harmonic spectra was consistently correlated with preferences for consonant over dissonant chords. Harmonicity preferences were also correlated with the number of years subjects had spent playing a musical instrument, suggesting that exposure to music amplifies preferences for harmonic frequencies because of their musical importance. Harmonic spectra are prominent features of natural sounds, and our results indicate that they also underlie the perception of consonance.

PMID: 20493704 [PubMed - indexed for MEDLINE]

Neural adaptation to tone sequences in the songbird forebrain: patterns, determinants, and relation to the build-up of auditory streaming.

J Comp Physiol A Neuroethol Sens Neural Behav Physiol. 2010 Aug;196(8):543-57

Authors: Bee MA, Micheyl C, Oxenham AJ, Klump GM

Abstract
Neural responses to tones in the mammalian primary auditory cortex (A1) exhibit adaptation over the course of several seconds. Important questions remain about the taxonomic distribution of multi-second adaptation and its possible roles in hearing. It has been hypothesized that neural adaptation could explain the gradual "build-up" of auditory stream segregation. We investigated the influence of several stimulus-related factors on neural adaptation in the avian homologue of mammalian A1 (field L2) in starlings (Sturnus vulgaris). We presented awake birds with sequences of repeated triplets of two interleaved tones (ABA-ABA-...) in which we varied the frequency separation between the A and B tones (DeltaF), the stimulus onset asynchrony (time from tone onset to onset within a triplet), and tone duration. We found that stimulus onset asynchrony generally had larger effects on adaptation compared with DeltaF and tone duration over the parameter range tested. Using a simple model, we show how time-dependent changes in neural responses can be transformed into neurometric functions that make testable predictions about the dependence of the build-up of stream segregation on various spectral and temporal stimulus properties.

PMID: 20563587 [PubMed - indexed for MEDLINE]

Recovery from on- and off-frequency forward masking in listeners with normal and impaired hearing.

J Acoust Soc Am. 2010 Jul;128(1):247-56

Authors: Wojtczak M, Oxenham AJ

Abstract
The aim of this study was to investigate the possible mechanisms underlying an effect reported earlier [Wojtczak, M., and Oxenham, A. J. (2009). J. Acoust. Soc. Am. 125, 270-281] in normal-hearing listeners, whereby recovery from forward masking can be slower for off-frequency tonal maskers than for on-frequency tonal maskers that produce the same amount of masking at a 0-ms masker-signal delay. To rule out potential effects of confusion between the tonal signal and tonal masker, one condition used a noise-band forward masker. To test whether the effect involved temporal build-up, another condition used a short-duration (30-ms) forward masker. To test whether the effect is dependent on normal cochlear function, conditions were tested in five listeners with sensorineural hearing loss. For the 150-ms noise maskers, the data from normal-hearing listeners replicated the findings from the previous study that used tonal maskers. In contrast, no significant difference in recovery from on- and off-frequency masking was observed for the 30-ms tonal maskers in normal-hearing listeners, or for the 150-ms tonal maskers in hearing-impaired listeners. Overall, the results are consistent with a mechanism based on efferent feedback that affects the recovery from forward masking in the normal auditory system.

PMID: 20649220 [PubMed - indexed for MEDLINE]

Pitch perception for mixtures of spectrally overlapping harmonic complex tones.

J Acoust Soc Am. 2010 Jul;128(1):257-69

Authors: Micheyl C, Keebler MV, Oxenham AJ

Abstract
This study measured difference limens for fundamental frequency (DLF0s) for a target harmonic complex in the presence of a simultaneous spectrally overlapping harmonic masker. The resolvability of the target harmonics was manipulated by bandpass filtering the stimuli into a low (800-2400 Hz) or high (1600-3200 Hz) spectral region, using different nominal F0s for the targets (100, 200, and 400 Hz), and different masker F0s (0, +9, or -9 semitones) relative to the target. Three different modes of masker presentation, relative to the target, were tested: ipsilateral, contralateral, and dichotic, with a higher masker level in the contralateral ear. Ipsilateral and dichotic maskers generally caused marked elevations in DLF0s compared to both the unmasked and contralateral masker conditions. Analyses based on excitation patterns revealed that ipsilaterally masked F0 difference limens were small (<2%) only when the excitation patterns evoked by the target-plus-masker mixture contained several salient (>1 dB) peaks at or close to target harmonic frequencies, even though these peaks were rarely produced by the target alone. The findings are discussed in terms of place- or place-time mechanisms of pitch perception.

PMID: 20649221 [PubMed - indexed for MEDLINE]

Objective and subjective psychophysical measures of auditory stream integration and segregation.

J Assoc Res Otolaryngol. 2010 Dec;11(4):709-24

Authors: Micheyl C, Oxenham AJ

Abstract
The perceptual organization of sound sequences into auditory streams involves the integration of sounds into one stream and the segregation of sounds into separate streams. "Objective" psychophysical measures of auditory streaming can be obtained using behavioral tasks where performance is facilitated by segregation and hampered by integration, or vice versa. Traditionally, these two types of tasks have been tested in separate studies involving different listeners, procedures, and stimuli. Here, we tested subjects in two complementary temporal-gap discrimination tasks involving similar stimuli and procedures. One task was designed so that performance in it would be facilitated by perceptual integration; the other, so that performance would be facilitated by perceptual segregation. Thresholds were measured in both tasks under a wide range of conditions produced by varying three stimulus parameters known to influence stream formation: frequency separation, tone-presentation rate, and sequence length. In addition to these performance-based measures, subjective judgments of perceived segregation were collected in the same listeners under corresponding stimulus conditions. The patterns of results obtained in the two temporal-discrimination tasks, and the relationships between thresholds and perceived-segregation judgments, were mostly consistent with the hypothesis that stream segregation helped performance in one task and impaired performance in the other task. The tasks and stimuli described here may prove useful in future behavioral or neurophysiological experiments, which seek to manipulate and measure neural correlates of auditory streaming while minimizing differences between the physical stimuli.

PMID: 20658165 [PubMed - indexed for MEDLINE]

Behavioral measures of auditory streaming in ferrets (Mustela putorius).

J Comp Psychol. 2010 Aug;124(3):317-30

Authors: Ma L, Micheyl C, Yin P, Oxenham AJ, Shamma SA

Abstract
An important aspect of the analysis of auditory "scenes" relates to the perceptual organization of sound sequences into auditory "streams." In this study, we adapted two auditory perception tasks, used in recent human psychophysical studies, to obtain behavioral measures of auditory streaming in ferrets (Mustela putorius). One task involved the detection of shifts in the frequency of tones within an alternating tone sequence. The other task involved the detection of a stream of regularly repeating target tones embedded within a randomly varying multitone background. In both tasks, performance was measured as a function of various stimulus parameters, which previous psychophysical studies in humans have shown to influence auditory streaming. Ferret performance in the two tasks was found to vary as a function of these parameters in a way that is qualitatively consistent with the human data. These results suggest that auditory streaming occurs in ferrets, and that the two tasks described here may provide a valuable tool in future behavioral and neurophysiological studies of the phenomenon.

PMID: 20695663 [PubMed - indexed for MEDLINE]

Auditory stream segregation and the perception of across-frequency synchrony.

J Exp Psychol Hum Percept Perform. 2010 Aug;36(4):1029-39

Authors: Micheyl C, Hunter C, Oxenham AJ

Abstract
This study explored the extent to which sequential auditory grouping affects the perception of temporal synchrony. In Experiment 1, listeners discriminated between 2 pairs of asynchronous "target" tones at different frequencies, A and B, in which the B tone either led or lagged. Thresholds were markedly higher when the target tones were temporally surrounded by "captor tones" at the A frequency than when the captor tones were absent or at a remote frequency. Experiment 2 extended these findings to asynchrony detection, revealing that the perception of synchrony, one of the most potent cues for simultaneous auditory grouping, is not immune to competing effects of sequential grouping. Experiment 3 examined the influence of ear separation on the interactions between sequential and simultaneous grouping cues. The results showed that, although ear separation could facilitate perceptual segregation and impair asynchrony detection, it did not prevent the perceptual integration of simultaneous sounds.

PMID: 20695716 [PubMed - indexed for MEDLINE]

Does fundamental-frequency discrimination measure virtual pitch discrimination?

J Acoust Soc Am. 2010 Oct;128(4):1930-42

Authors: Micheyl C, Divis K, Wrobleski DM, Oxenham AJ

Abstract
Studies of pitch perception often involve measuring difference limens for complex tones (DLCs) that differ in fundamental frequency (F0). These measures are thought to reflect F0 discrimination and to provide an indirect measure of subjective pitch strength. However, in many situations discrimination may be based on cues other than the pitch or the F0, such as differences in the frequencies of individual components or timbre (brightness). Here, DLCs were measured for harmonic and inharmonic tones under various conditions, including a randomized or fixed lowest harmonic number, with and without feedback. The inharmonic tones were produced by shifting the frequencies of all harmonics upwards by 6.25%, 12.5%, or 25% of F0. It was hypothesized that, if DLCs reflect residue-pitch discrimination, these frequency-shifted tones, which produced a weaker and more ambiguous pitch than the harmonic tones, would yield larger DLCs. However, if DLCs reflect comparisons of component pitches, or timbre, they should not be systematically influenced by frequency shifting. The results showed larger DLCs and more scattered pitch matches for inharmonic than for harmonic complexes, confirming that the inharmonic tones produced a less consistent pitch than the harmonic tones, and consistent with the idea that DLCs reflect F0 pitch discrimination.

PMID: 20968365 [PubMed - indexed for MEDLINE]

Musical intervals and relative pitch: frequency resolution, not interval resolution, is special.

J Acoust Soc Am. 2010 Oct;128(4):1943-51

Authors: McDermott JH, Keebler MV, Micheyl C, Oxenham AJ

Abstract
Pitch intervals are central to most musical systems, which utilize pitch at the expense of other acoustic dimensions. It seemed plausible that pitch might uniquely permit precise perception of the interval separating two sounds, as this could help explain its importance in music. To explore this notion, a simple discrimination task was used to measure the precision of interval perception for the auditory dimensions of pitch, brightness, and loudness. Interval thresholds were then expressed in units of just-noticeable differences for each dimension, to enable comparison across dimensions. Contrary to expectation, when expressed in these common units, interval acuity was actually worse for pitch than for loudness or brightness. This likely indicates that the perceptual dimension of pitch is unusual not for interval perception per se, but rather for the basic frequency resolution it supports. The ubiquity of pitch in music may be due in part to this fine-grained basic resolution.

PMID: 20968366 [PubMed - indexed for MEDLINE]

Effects of background noise level on behavioral estimates of basilar-membrane compression.

J Acoust Soc Am. 2010 May;127(5):3018-25

Authors: Gregan MJ, Nelson PB, Oxenham AJ

Abstract
Hearing-impaired (HI) listeners often show poorer performance on psychoacoustic tasks than do normal-hearing (NH) listeners. Although some such deficits may reflect changes in suprathreshold sound processing, others may be due to stimulus audibility and the elevated absolute thresholds associated with hearing loss. Masking noise can be used to raise the thresholds of NH listeners to equal the thresholds in quiet of HI listeners. However, such noise may have other effects, including changing peripheral response characteristics, such as the compressive input-output function of the basilar membrane in the normal cochlea. This study estimated compression behaviorally across a range of background noise levels in NH listeners at a 4 kHz signal frequency, using a growth-of-forward-masking paradigm. For signals 5 dB or more above threshold in noise, no significant effect of broadband noise level was found on estimates of compression. This finding suggests that broadband noise does not significantly alter the compressive response of the basilar membrane to sounds that are presented well above their threshold in the noise. Similarities between the performance of HI listeners and NH listeners in threshold-equalizing noise are therefore unlikely to be due to a linearization of basilar-membrane responses to suprathreshold stimuli in the NH listeners.

PMID: 21117751 [PubMed - indexed for MEDLINE]

Forward masking in the amplitude-modulation domain for tone carriers: psychophysical results and physiological correlates.

J Assoc Res Otolaryngol. 2011 Jun;12(3):361-73

Authors: Wojtczak M, Nelson PC, Viemeister NF, Carney LH

Abstract
Wojtczak and Viemeister (J Acoust Soc Am 118:3198-3210, 2005) demonstrated forward masking in the amplitude-modulation (AM) domain. The present study examined whether this effect has correlates in physiological responses to AM at the level of the auditory midbrain. The human psychophysical experiment used 40-Hz, 100% AM (masker AM) that was imposed on a 5.5-kHz carrier during the first 150 ms of its duration. The masker AM was followed by a 50-ms burst of AM of the same rate (signal AM) imposed on the same (uninterrupted) carrier, either immediately after the masker or with a delay. In the physiological experiment, single-unit extracellular recordings in the awake rabbit inferior colliculus (IC) were obtained for stimuli designed to be similar to the uninterrupted-carrier conditions used in the psychophysics. The masker AM was longer (500 ms compared with 150 ms in the psychophysical experiment), and the carrier and modulation rate were chosen based on each neuron's audio- and envelope-frequency selectivity. Based on the average discharge rates of the responses or on the temporal correlation between neural responses to masked and unmasked stimuli, only a small subset of the population of IC cells exhibited suppression of signal AM following the masker. In contrast, changes in the discharge rates between the temporal segments of the carrier immediately preceding the signal AM and during the signal AM varied as a function of masker-signal delay with a trend that matched the psychophysical results. Unless the physiological observations were caused by species differences, they suggest that stages of processing higher than the IC must be considered to account for the AM-processing time constants measured perceptually in humans.

PMID: 21181225 [PubMed - indexed for MEDLINE]

2009

Pitfalls in behavioral estimates of basilar-membrane compression in humans.

J Acoust Soc Am. 2009 Jan;125(1):270-81

Authors: Wojtczak M, Oxenham AJ

Abstract
Psychoacoustic estimates of basilar-membrane compression often compare on- and off-frequency forward masking. Such estimates involve assuming that the recovery from forward masking for a given signal frequency is independent of masker frequency. To test this assumption, thresholds for a brief 4-kHz signal were measured as a function of masker-signal delay. Comparisons were made between on-frequency (4 kHz) and off-frequency (either 2.4 or 4.4 kHz) maskers, adjusted in level to produce the same amount of masking at a 0-ms delay between masker offset and signal onset. Consistent with the assumption, forward-masking recovery from a moderate-level (83 dB SPL) 2.4-kHz masker and a high-level (92 dB SPL) 4.4-kHz masker was the same as from the equivalent on-frequency maskers. In contrast, recovery from a high-level (92 dB SPL) 2.4-kHz forward masker was slower than from the equivalent on-frequency masker. The results were used to simulate temporal masking curves, taking into account the differences in on- and off-frequency masking recoveries at high levels. The predictions suggest that compression estimates assuming frequency-independent masking recovery may overestimate compression by as much as a factor of 2. The results suggest caution in interpreting forward-masking data in terms of basilar-membrane compression, particularly when high-level maskers are involved.

PMID: 19173414 [PubMed - indexed for MEDLINE]

Masking release for low- and high-pass-filtered speech in the presence of noise and single-talker interference.

J Acoust Soc Am. 2009 Jan;125(1):457-68

Authors: Oxenham AJ, Simonson AM

Abstract
Speech intelligibility was measured for sentences presented in spectrally matched steady noise, single-talker interference, or speech-modulated noise. The stimuli were unfiltered or were low-pass (LP) (1200 Hz cutoff) or high-pass (HP) (1500 Hz cutoff) filtered. The cutoff frequencies were selected to produce equal performance in both LP and HP conditions in steady noise and to limit access to the temporal fine structure of resolved harmonics in the HP conditions. Masking release, or the improvement in performance between the steady noise and single-talker interference, was substantial with no filtering. Under LP and HP filtering, masking release was roughly equal but was much less than in unfiltered conditions. When the average F0 of the interferer was shifted lower than that of the target, similar increases in masking release were observed under LP and HP filtering. Similar LP and HP results were also obtained for the speech-modulated-noise masker. The findings are not consistent with the idea that pitch conveyed by the temporal fine structure of low-order harmonics plays a crucial role in masking release. Instead, any reduction in speech redundancy, or manipulation that increases the target-to-masker ratio necessary for intelligibility to beyond around 0 dB, may result in reduced masking release.

PMID: 19173431 [PubMed - indexed for MEDLINE]

Temporal coherence in the perceptual organization and cortical representation of auditory scenes.

Neuron. 2009 Jan 29;61(2):317-29

Authors: Elhilali M, Ma L, Micheyl C, Oxenham AJ, Shamma SA

Abstract
Just as the visual system parses complex scenes into identifiable objects, the auditory system must organize sound elements scattered in frequency and time into coherent "streams." Current neurocomputational theories of auditory streaming rely on tonotopic organization of the auditory system to explain the observation that sequential spectrally distant sound elements tend to form separate perceptual streams. Here, we show that spectral components that are well separated in frequency are no longer heard as separate streams if presented synchronously rather than consecutively. In contrast, responses from neurons in primary auditory cortex of ferrets show that both synchronous and asynchronous tone sequences produce comparably segregated responses along the tonotopic axis. The results argue against tonotopic separation per se as a neural correlate of stream segregation. Instead we propose a computational model of stream segregation that can account for the data by using temporal coherence as the primary criterion for predicting stream formation.
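
The temporal-coherence criterion proposed above can be illustrated with a toy computation: channels whose envelopes rise and fall together are grouped into one stream, while channels gated in alternation are not. This is only a schematic of the coherence idea, not the paper's cortical model; the signals and function below are invented for illustration.

```python
import numpy as np

# On-off envelope of one tone-burst channel (1 = tone on, 0 = off)
burst = np.tile([1.0, 1.0, 0.0, 0.0], 8)

sync = np.vstack([burst, burst])              # A and B gated together
alt = np.vstack([burst, np.roll(burst, 2)])   # B gated in the gaps of A

def coherence(envelopes):
    """Pairwise correlation of two channel envelopes over time:
    high coherence suggests one stream, low (or negative) suggests two."""
    return np.corrcoef(envelopes)[0, 1]

# coherence(sync) -> 1.0 (grouped); coherence(alt) -> -1.0 (segregated)
```

The key point of the abstract survives even in this caricature: tonotopic separation alone (two distinct channels) does not predict segregation; the relative timing of activity across channels does.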

PMID: 19186172 [PubMed - indexed for MEDLINE]

Sensory noise explains auditory frequency discrimination learning induced by training with identical stimuli.

Atten Percept Psychophys. 2009 Jan;71(1):5-7

Authors: Micheyl C, McDermott JH, Oxenham AJ

Abstract
Thresholds in various visual and auditory perception tasks have been found to improve markedly with practice at intermediate levels of task difficulty. Recently, however, there have been reports that training with identical stimuli, which, by definition, were impossible to discriminate correctly beyond chance, could induce as much discrimination learning as could training with different stimuli. These surprising findings have been interpreted as evidence that discrimination learning can occur in the absence of perceived differences between stimuli and need not involve the fine-tuning of a discrimination mechanism. Here, we show that these counterintuitive findings of discrimination learning without discrimination can be understood simply by considering the effect of internal noise on sensory representations. Because of such noise, physically identical stimuli are unlikely to be perceived as being strictly identical. We show that, given empirically derived levels of sensory noise, perceived differences evoked by identical stimuli are actually not much smaller than those induced by the physical differences typically used in discrimination-learning experiments. We suggest that findings of discrimination learning with identical stimuli can be explained without implicating any fundamentally new learning mechanism.
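
The internal-noise argument can be reproduced in a few lines: when each observation of a stimulus is perturbed by independent sensory noise, two physically identical stimuli still evoke a nonzero perceived difference, whose average is comparable to that produced by a small physical difference. The noise level and trial count below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0           # internal (sensory) noise, in JND-like units
n = 100_000           # simulated trial pairs

# Two observations of physically IDENTICAL stimuli: each is corrupted by
# independent internal noise, so the perceived difference is rarely zero
# (the mean absolute difference works out to about 1.13 * sigma).
same = np.abs(rng.normal(0, sigma, n) - rng.normal(0, sigma, n))

# Two observations separated by a small physical difference of one sigma,
# comparable to steps used in discrimination-learning experiments.
diff = np.abs(rng.normal(sigma, sigma, n) - rng.normal(0, sigma, n))
```

The average perceived difference in the identical-stimuli condition is only modestly smaller than in the physically-different condition, which is the core of the explanation offered in the abstract.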

PMID: 19304592 [PubMed - indexed for MEDLINE]

Auditory stream formation affects comodulation masking release retroactively.

J Acoust Soc Am. 2009 Apr;125(4):2182-8

Authors: Dau T, Ewert S, Oxenham AJ

Abstract
Many sounds in the environment have temporal envelope fluctuations that are correlated in different frequency regions. Comodulation masking release (CMR) illustrates how such coherent fluctuations can improve signal detection. This study assesses how perceptual grouping mechanisms affect CMR. Detection thresholds for a 1-kHz sinusoidal signal were measured in the presence of a narrowband (20-Hz-wide) on-frequency masker with or without four comodulated or independent flanking bands that were spaced apart by either 1/6 octave (narrow spacing) or 1 octave (wide spacing). As expected, CMR was observed for the narrow and wide comodulated flankers. However, in the wide (but not narrow) condition, this CMR was eliminated by adding a series of gated flanking bands after the signal. Control experiments showed that this effect was not due to long-term adaptation or general distraction. The results are interpreted in terms of the sequence of "postcursor" flanking bands forming a perceptual stream with the original flanking bands, resulting in perceptual segregation of the flanking bands from the masker. The results are consistent with the idea that modulation analysis occurs within, not across, auditory objects, and that across-frequency CMR only occurs if the on-frequency and flanking bands fall within the same auditory object or stream.

PMID: 19354394 [PubMed - indexed for MEDLINE]

Can temporal fine structure represent the fundamental frequency of unresolved harmonics?

J Acoust Soc Am. 2009 Apr;125(4):2189-99

Authors: Oxenham AJ, Micheyl C, Keebler MV

Abstract
At least two modes of pitch perception exist: in one, the fundamental frequency (F0) of harmonic complex tones is estimated using the temporal fine structure (TFS) of individual low-order resolved harmonics; in the other, F0 is derived from the temporal envelope of high-order unresolved harmonics that interact in the auditory periphery. Pitch is typically more accurate in the former than in the latter mode. Another possibility is that pitch can sometimes be coded via the TFS from unresolved harmonics. A recent study supporting this third possibility [Moore et al. (2006a). J. Acoust. Soc. Am. 119, 480-490] based its conclusion on a condition where phase interaction effects (implying unresolved harmonics) accompanied accurate F0 discrimination (implying TFS processing). The present study tests whether these results were influenced by audible distortion products. Experiment 1 replicated the original results, obtained using a low-level background noise. However, experiments 2-4 found no evidence for the use of TFS cues with unresolved harmonics when the background noise level was raised, or the stimulus level was lowered, to render distortion inaudible. Experiment 5 measured the presence and phase dependence of audible distortion products. The results provide no evidence that TFS cues are used to code the F0 of unresolved harmonics.

PMID: 19354395 [PubMed - indexed for MEDLINE]

On- and off-frequency forward masking by Schroeder-phase complexes.

J Assoc Res Otolaryngol. 2009 Dec;10(4):595-607

Authors: Wojtczak M, Oxenham AJ

Abstract
Forward masking by harmonic tone complexes was measured for on- and off-frequency maskers as a function of masker phase curvature for two masker durations (30 and 200 ms). For the lowest signal frequency (1 kHz), the results matched predictions based on the expected interactions between the phase curvature and amplitude compression of peripheral auditory filtering. For the higher signal frequencies (2 and 6 kHz), the data increasingly departed from predictions in two respects. First, the effects of the masker phase curvature became stronger with increasing masker duration, inconsistent with the expected effects of the fast-acting compression and time-invariant phase response of basilar membrane filtering. Second, significant effects of masker phase curvature were observed for the off-frequency masker using a 6-kHz signal, inconsistent with predictions based on linear processing of stimuli well below the signal frequency. New predictions were generated assuming an additional effect with a longer time constant, consistent with the influence of medial olivocochlear efferent activation on otoacoustic emissions in humans. Reasonable agreement between the predicted and the measured effects suggests that efferent activation is a potential candidate mechanism to explain certain spectro-temporal masking effects in human hearing.

PMID: 19626368 [PubMed - indexed for MEDLINE]

Pitch, harmonicity and concurrent sound segregation: psychoacoustical and neurophysiological findings.

Hear Res. 2010 Jul;266(1-2):36-51

Authors: Micheyl C, Oxenham AJ

Abstract
Harmonic complex tones are a particularly important class of sounds found in both speech and music. Although these sounds contain multiple frequency components, they are usually perceived as a coherent whole, with a pitch corresponding to the fundamental frequency (F0). However, when two or more harmonic sounds occur concurrently, e.g., at a cocktail party or in a symphony, the auditory system must separate harmonics and assign them to their respective F0s so that a coherent and veridical representation of the different sounds sources is formed. Here we review both psychophysical and neurophysiological (single-unit and evoked-potential) findings, which provide some insight into how, and how well, the auditory system accomplishes this task. A survey of computational models designed to estimate multiple F0s and segregate concurrent sources is followed by a review of the empirical literature on the perception and neural coding of concurrent harmonic sounds, including vowels, as well as findings obtained using single complex tones with mistuned harmonics.

PMID: 19788920 [PubMed - indexed for MEDLINE]

2008

Effects of level and background noise on interaural time difference discrimination for transposed stimuli.

J Acoust Soc Am. 2008 Jan;123(1):EL1-7

Authors: Dreyer AA, Oxenham AJ

Abstract
Just-noticeable interaural time differences were measured for low-frequency pure tones, high-frequency sinusoidally amplitude-modulated (SAM) tones, and high-frequency transposed stimuli, at multiple levels with or without a spectrally notched diotic noise to prevent spread of excitation. Performance with transposed stimuli and pure tones was similar in quiet; however, in noise, performance was poorer for transposed stimuli than for pure tones. Performance with SAM tones was always poorest. In all conditions, performance improved slightly with increasing level. The results suggest that the equivalence postulated between transposed stimuli and pure tones is not valid in the presence of a spectrally notched background noise.

PMID: 18177063 [PubMed - indexed for MEDLINE]

The pulse-train auditory aftereffect and the perception of rapid amplitude modulations.

J Acoust Soc Am. 2008 Feb;123(2):935-45

Authors: Gutschalk A, Micheyl C, Oxenham AJ

Abstract
Prolonged listening to a pulse train with repetition rates around 100 Hz induces a striking aftereffect, whereby subsequently presented sounds are heard with an unusually "metallic" timbre [Rosenblith et al., Science 106, 333-335 (1947)]. The mechanisms responsible for this auditory aftereffect are currently unknown. Whether the aftereffect is related to an alteration of the perception of temporal envelope fluctuations was evaluated. Detection thresholds for sinusoidal amplitude modulation (AM) imposed onto noise-burst carriers were measured for different AM frequencies (50-500 Hz), following the continuous presentation of a periodic pulse train, a temporally jittered pulse train, or an unmodulated noise. AM detection thresholds for AM frequencies of 100 Hz and above were significantly elevated compared to thresholds in quiet, following the presentation of the pulse-train inducers, and both induced a subjective auditory aftereffect. Unmodulated noise, which produced no audible aftereffect, left AM detection thresholds unchanged. Additional experiments revealed that, like the Rosenblith et al. aftereffect, the effect on AM thresholds does not transfer across ears, is not eliminated by protracted training, and can last several tens of seconds. The results suggest that the Rosenblith et al. aftereffect is related to a temporary alteration in the perception of fast temporal envelope fluctuations in sounds.

PMID: 18247896 [PubMed - indexed for MEDLINE]

Spectral completion of partially masked sounds.

Proc Natl Acad Sci U S A. 2008 Apr 15;105(15):5939-44

Authors: McDermott JH, Oxenham AJ

Abstract
Natural environments typically contain multiple sound sources. The sounds from these sources frequently overlap in time and often mask each other. Masking could potentially distort the representation of a sound's spectrum, altering its timbre and impairing object recognition. Here, we report that the auditory system partially corrects for the effects of masking in such situations, by using the audible, unmasked portions of an object's spectrum to fill in the inaudible portions. This spectral completion mechanism may help to achieve perceptual constancy and thus aid object recognition in complex auditory scenes.

PMID: 18391210 [PubMed - indexed for MEDLINE]

Perception of suprathreshold amplitude modulation and intensity increments: Weber's law revisited.

J Acoust Soc Am. 2008 Apr;123(4):2220-36

Authors: Wojtczak M, Viemeister NF

Abstract
The perceived strength of intensity fluctuations evoked by suprathreshold sinusoidal amplitude modulation (AM) and the perceived size of intensity increments were compared across levels of a wideband noise and a 1-kHz tone. For the 1-kHz tone, the comparisons were made in quiet and in a high-pass noise. The data indicate that suprathreshold modulation depths and intensity increments, perceived as equivalent across levels, follow a pattern resembling Weber's law for noise and the "near miss" to Weber's law for a tone. The effect of a high-pass noise was largely consistent with that observed for AM and increment detection. The data suggest that Weber's law is not a direct consequence of the dependence of internal noise on stimulus level, as suggested by multiplicative internal noise models. Equal loudness ratios and equal loudness differences (computed using loudness for the stationary portions before and after the increment) accounted for the increment-matching data for noise and for the tone, respectively, but neither measure predicted the results for both types of stimuli. Predictions based on log-transformed excitation patterns and predictions using an equal number of intensity just-noticeable differences were in qualitative, but not quantitative, agreement with the data.

PMID: 18397028 [PubMed - indexed for MEDLINE]

Estimates of compression at low and high frequencies using masking additivity in normal and impaired ears.

J Acoust Soc Am. 2008 Jun;123(6):4321-30

Authors: Plack CJ, Oxenham AJ, Simonson AM, O'Hanlon CG, Drga V, Arifianto D

Abstract
Auditory compression was estimated at 250 and 4000 Hz by using the additivity of forward masking technique, which measures the effects on signal threshold of combining two temporally nonoverlapping forward maskers. The increase in threshold in the combined-masker condition compared to the individual-masker conditions can be used to estimate compression. The signal was a 250 or 4000 Hz tone burst and the maskers (M1 and M2) were bands of noise. Signal thresholds were measured in the presence of M1 and M2 alone and combined for a range of masker levels. The results were used to derive response functions at each frequency. The procedure was conducted with normal-hearing and hearing-impaired listeners. The results suggest that the response function in normal ears is similar at 250 and 4000 Hz with a mid level compression exponent of about 0.2. However, compression extends over a smaller range of levels at 250 Hz. The results confirm previous estimates of compression using temporal masking curves (TMCs) without assuming a linear off-frequency reference as in the TMC procedure. The impaired ears generally showed less compression. Importantly, some impaired ears showed a linear response at 250 Hz, providing a further indication that low-frequency compression originates in the cochlea.
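
The roughly 0.2 mid-level compression exponent reported above corresponds to an input-output function whose output grows by only about 2 dB for every 10 dB of input growth in the compressive region. A minimal piecewise sketch, in which the knee level and the linear low-level segment are illustrative assumptions rather than values from the study:

```python
import numpy as np

def basilar_io(level_db, exponent=0.2, knee_db=40.0):
    """Toy basilar-membrane input-output function: linear (slope 1)
    below an assumed knee level, compressive (slope = exponent) above it.
    The 0.2 exponent follows the mid-level estimate in the abstract."""
    level_db = np.asarray(level_db, dtype=float)
    return np.where(level_db <= knee_db,
                    level_db,
                    knee_db + exponent * (level_db - knee_db))

# A 10 dB input increase above the knee yields only 2 dB of output growth
out = basilar_io([50.0, 60.0])
```

A linearized (impaired) response is the special case `exponent=1.0`, which is what some of the impaired ears showed at 250 Hz.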

PMID: 18537383 [PubMed - indexed for MEDLINE]

Neural correlates of auditory perceptual awareness under informational masking.

PLoS Biol. 2008 Jun 10;6(6):e138

Authors: Gutschalk A, Micheyl C, Oxenham AJ

Abstract
Our ability to detect target sounds in complex acoustic backgrounds is often limited not by the ear's resolution, but by the brain's information-processing capacity. The neural mechanisms and loci of this "informational masking" are unknown. We combined magnetoencephalography with simultaneous behavioral measures in humans to investigate neural correlates of informational masking and auditory perceptual awareness in the auditory cortex. Cortical responses were sorted according to whether or not target sounds were detected by the listener in a complex, randomly varying multi-tone background known to produce informational masking. Detected target sounds elicited a prominent, long-latency response (50-250 ms), whereas undetected targets did not. In contrast, both detected and undetected targets produced equally robust auditory middle-latency, steady-state responses, presumably from the primary auditory cortex. These findings indicate that neural correlates of auditory awareness in informational masking emerge between early and late stages of processing within the auditory cortex.

PMID: 18547141 [PubMed - indexed for MEDLINE]

Music perception, pitch, and the auditory system.

Curr Opin Neurobiol. 2008 Aug;18(4):452-63

Authors: McDermott JH, Oxenham AJ

Abstract
The perception of music depends on many culture-specific factors, but is also constrained by properties of the auditory system. This has been best characterized for those aspects of music that involve pitch. Pitch sequences are heard in terms of relative as well as absolute pitch. Pitch combinations give rise to emergent properties not present in the component notes. In this review we discuss the basic auditory mechanisms contributing to these and other perceptual effects in music.

PMID: 18824100 [PubMed - indexed for MEDLINE]

Pitch perception and auditory stream segregation: implications for hearing loss and cochlear implants.

Trends Amplif. 2008 Dec;12(4):316-31

Authors: Oxenham AJ

Abstract
Pitch is important for speech and music perception, and may also play a crucial role in our ability to segregate sounds that arrive from different sources. This article reviews some basic aspects of pitch coding in the normal auditory system and explores the implications for pitch perception in people with hearing impairments and cochlear implants. Data from normal-hearing listeners suggest that the low-frequency, low-numbered harmonics within complex tones are of prime importance in pitch perception and in the perceptual segregation of competing sounds. The poorer frequency selectivity experienced by many hearing-impaired listeners leads to less access to individual harmonics, and the coding schemes currently employed in cochlear implants provide little or no representation of individual harmonics. These deficits in the coding of harmonic sounds may underlie some of the difficulties experienced by people with hearing loss and cochlear implants, and may point to future areas where sound representation in auditory prostheses could be improved.

PMID: 18974203 [PubMed - indexed for MEDLINE]

Harmonic segregation through mistuning can improve fundamental frequency discrimination.

J Acoust Soc Am. 2008 Sep;124(3):1653-67

Authors: Bernstein JG, Oxenham AJ

Abstract
This study investigated the relationship between harmonic frequency resolution and fundamental frequency (f0) discrimination. Consistent with earlier studies, f0 discrimination of a diotic bandpass-filtered harmonic complex deteriorated sharply as the f0 decreased to the point where only harmonics above the tenth were presented. However, when the odd harmonics were mistuned by 3%, performance improved dramatically, such that performance nearly equaled that found with only even harmonics present. Mistuning also improved performance when alternating harmonics were presented to opposite ears (dichotic condition). In a task involving frequency discrimination of individual harmonics within the complexes, mistuning the odd harmonics yielded no significant improvement in the resolution of individual harmonics. Pitch matches to the mistuned complexes suggested that the even harmonics dominated the pitch for f0s at which a benefit of mistuning was observed. The results suggest that f0 discrimination performance can benefit from perceptual segregation based on inharmonicity, and that poor performance when only high-numbered harmonics are present is not due to limited peripheral harmonic resolvability. Taken together with earlier results, the findings suggest that f0 discrimination may depend on auditory filter bandwidths, but that spectral resolution of individual harmonics is neither necessary nor sufficient for accurate f0 discrimination.

PMID: 19045656 [PubMed - indexed for MEDLINE]

Is relative pitch specific to pitch?

Psychol Sci. 2008 Dec;19(12):1263-71

Authors: McDermott JH, Lehr AJ, Oxenham AJ

Abstract
Melodies, speech, and other stimuli that vary in pitch are processed largely in terms of the relative pitch differences between sounds. Relative representations permit recognition of pitch patterns despite variations in overall pitch level between instruments or speakers. A key component of relative pitch is the sequence of pitch increases and decreases from note to note, known as the melodic contour. Here we report that contour representations are also produced by patterns in loudness and brightness (an aspect of timbre). The representations of contours in different dimensions evidently have much in common, as contours in one dimension can be readily recognized in other dimensions. Moreover, contours in loudness and brightness are nearly as useful as pitch contours for recognizing familiar melodies that are normally conveyed via pitch. Our results indicate that relative representations via contour extraction are a general feature of the auditory system, and may have a common central locus.

PMID: 19121136 [PubMed - indexed for MEDLINE]

2007

Cortical FMRI activation to sequences of tones alternating in frequency: relationship to perceived rate and streaming.

J Neurophysiol. 2007 Mar;97(3):2230-8

Authors: Wilson EC, Melcher JR, Micheyl C, Gutschalk A, Oxenham AJ

Abstract
Human listeners were functionally imaged while reporting their perception of sequences of alternating-frequency tone bursts separated by 0, 1/8, 1, or 20 semitones. Our goal was to determine whether functional magnetic resonance imaging (fMRI) activation of auditory cortex changes with frequency separation in a manner predictable from the perceived rate of the stimulus. At the null and small separations, the tones were generally heard as a single stream with a perceived rate equal to the physical tone presentation rate. fMRI activation in auditory cortex was appreciably phasic, showing prominent peaks at the sequence onset and offset. At larger frequency separations, the higher- and lower-frequency tones perceptually separated into two streams, each with a rate equal to half the overall tone presentation rate. Under those conditions, fMRI activation in auditory cortex was more sustained throughout the sequence duration and was larger in magnitude and extent. Phasic to sustained changes in fMRI activation with changes in frequency separation and perceived rate are comparable to, and consistent with, those produced by changes in the physical rate of a sequence and are far greater than the effects produced by changing other physical stimulus variables, such as sound level or bandwidth. We suggest that the neural activity underlying the changes in fMRI activation with frequency separation contributes to the coding of the co-occurring changes in perceived rate and perceptual organization of the sound sequences into auditory streams.

PMID: 17202231 [PubMed - indexed for MEDLINE]

A low-power asynchronous interleaved sampling algorithm for cochlear implants that encodes envelope and phase information.

IEEE Trans Biomed Eng. 2007 Jan;54(1):138-49

Authors: Sit JJ, Simonson AM, Oxenham AJ, Faltys MA, Sarpeshkar R

Abstract
Cochlear implants currently fail to convey phase information, which is important for perceiving music, tonal languages, and for hearing in noisy environments. We propose a bio-inspired asynchronous interleaved sampling (AIS) algorithm that encodes both envelope and phase information, in a manner that may be suitable for delivery to cochlear implant users. Like standard continuous interleaved sampling (CIS) strategies, AIS naturally meets the interleaved-firing requirement, which is to stimulate only one electrode at a time, minimizing electrode interactions. The majority of interspike intervals are distributed over 1-4 ms, thus staying within the absolute refractory limit of neurons, and form a more natural, pseudostochastic pattern of firing due to complex channel interactions. Stronger channels are selected to fire more often but the strategy ensures that weaker channels are selected to fire in proportion to their signal strength as well. The resulting stimulation rates are considerably lower than those of most modern implants, saving power yet delivering higher potential performance. Correlations with original sounds were found to be significantly higher in AIS reconstructions than in signal reconstructions using only envelope information. Two perceptual tests on normal-hearing listeners verified that the reconstructed signals enabled better melody and speech recognition in noise than those processed using tone-excited envelope-vocoder simulations of cochlear implant processing. Thus, our strategy could potentially save power and improve hearing performance in cochlear implant users.

PMID: 17260865 [PubMed - indexed for MEDLINE]

The role of auditory cortex in the formation of auditory streams.

Hear Res. 2007 Jul;229(1-2):116-31

Authors: Micheyl C, Carlyon RP, Gutschalk A, Melcher JR, Oxenham AJ, Rauschecker JP, Tian B, Courtenay Wilson E

Abstract
Auditory streaming refers to the perceptual parsing of acoustic sequences into "streams", which makes it possible for a listener to follow the sounds from a given source amidst other sounds. Streaming is currently regarded as an important function of the auditory system in both humans and animals, crucial for survival in environments that typically contain multiple sound sources. This article reviews recent findings concerning the possible neural mechanisms behind this perceptual phenomenon at the level of the auditory cortex. The first part is devoted to intra-cortical recordings, which provide insight into the neural "micromechanisms" of auditory streaming in the primary auditory cortex (A1). In the second part, recent results obtained using functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) in humans, which suggest a contribution from cortical areas other than A1, are presented. Overall, the findings converge to demonstrate that many important features of sequential streaming can be explained relatively simply based on neural responses in the auditory cortex.

PMID: 17307315 [PubMed - indexed for MEDLINE]

Across-frequency pitch discrimination interference between complex tones containing resolved harmonics.

J Acoust Soc Am. 2007 Mar;121(3):1621-31

Authors: Micheyl C, Oxenham AJ

Abstract
Pitch discrimination interference (PDI) refers to an impairment in the ability to discriminate changes in the fundamental frequency (F0) of a target harmonic complex, caused by another harmonic complex (the interferer) presented simultaneously in a remote spectral region. So far, PDI has been demonstrated for target complexes filtered into a higher spectral region than the interferer and containing no peripherally resolved harmonics in their passband. Here, it is shown that PDI also occurs when the target harmonic complex contains resolved harmonics in its passband (experiment 1). PDI was also observed when the target was filtered into a lower spectral region than that of the interferer (experiment 2), revealing that differences in relative harmonic dominance and pitch salience between the simultaneous target and the interferer, as confirmed using pitch matches (experiment 3), do not entirely explain PDI. When the target was in the higher spectral region, and the F0 separation between the target and the interferer was around 7% or 10%, dramatic PDI effects were observed despite the relatively large F0 separation between the two sequential targets (14%-20%). Overall, the results suggest that PDI is more general than previously thought, and is not limited to targets consisting only of unresolved harmonics.

PMID: 17407899 [PubMed - indexed for MEDLINE]

Evaluation of companding-based spectral enhancement using simulated cochlear-implant processing.

J Acoust Soc Am. 2007 Mar;121(3):1709-16

Authors: Oxenham AJ, Simonson AM, Turicchia L, Sarpeshkar R

Abstract
This study tested a time-domain spectral enhancement algorithm that was recently proposed by Turicchia and Sarpeshkar [IEEE Trans. Speech Audio Proc. 13, 243-253 (2005)]. The algorithm uses a filter bank, with each filter channel comprising broadly tuned amplitude compression, followed by more narrowly tuned expansion (companding). Normal-hearing listeners were tested in their ability to recognize sentences processed through a noise-excited envelope vocoder that simulates aspects of cochlear-implant processing. The sentences were presented in a steady background noise at signal-to-noise ratios of 0, 3, and 6 dB and were either passed directly through an envelope vocoder, or were first processed by the companding algorithm. Using an eight-channel envelope vocoder, companding produced small but significant improvements in speech reception. Parametric variations of the companding algorithm showed that the improvement in intelligibility was robust to changes in filter tuning, whereas decreases in the time constants resulted in a decrease in intelligibility. Companding continued to provide a benefit when the number of vocoder frequency channels was increased to sixteen. When integrated within a sixteen-channel cochlear-implant simulator, companding also led to significant improvements in sentence recognition. Thus, companding may represent a readily implementable way to provide some speech recognition benefits to current cochlear-implant users.
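
The compress-then-expand (companding) scheme described above can be sketched for a single filter channel as follows. This is a minimal pure-Python illustration, not the Turicchia and Sarpeshkar implementation: the biquad band-pass design, envelope follower, and power-law exponents (p and 1/p) are all illustrative assumptions.

```python
import math

def biquad_bandpass(x, fs, fc, q):
    """RBJ-style band-pass biquad (constant 0 dB peak gain), run sample by sample."""
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        y.append(yn)
        x2, x1, y2, y1 = x1, xn, y1, yn
    return y

def envelope(x, fs, cutoff=50.0):
    """Crude envelope follower: half-wave rectification plus a one-pole low-pass."""
    a = math.exp(-2 * math.pi * cutoff / fs)
    env, state = [], 0.0
    for xn in x:
        state = a * state + (1 - a) * max(xn, 0.0)
        env.append(state)
    return env

def compand_channel(x, fs, fc, wide_q=1.0, narrow_q=8.0, p=0.3):
    """One companding channel: broadly tuned compression (gain env**(p-1))
    followed by narrowly tuned expansion (gain env**(1/p - 1))."""
    wide = biquad_bandpass(x, fs, fc, wide_q)
    env_w = envelope(wide, fs)
    compressed = [w * (e + 1e-9) ** (p - 1) for w, e in zip(wide, env_w)]
    narrow = biquad_bandpass(compressed, fs, fc, narrow_q)
    env_n = envelope(narrow, fs)
    return [n * (e + 1e-9) ** (1 / p - 1) for n, e in zip(narrow, env_n)]
```

Because the compressive stage is broadly tuned and the expansive stage narrowly tuned, channels whose center frequency sits on a spectral peak expand back more than off-peak channels, which is the spectral-enhancement effect evaluated in the study.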

PMID: 17407907 [PubMed - indexed for MEDLINE]

A sound element gets lost in perceptual competition.

Proc Natl Acad Sci U S A. 2007 Jul 17;104(29):12223-7

Authors: Shinn-Cunningham BG, Lee AK, Oxenham AJ

Abstract
Our ability to understand auditory signals depends on properly separating the mixture of sound arriving from multiple sources. Sound elements tend to belong to only one object at a time, consistent with the principle of disjoint allocation, although there are instances of duplex perception or coallocation, in which two sound objects share one sound element. Here we report an effect of "nonallocation," in which a sound element "disappears" when two ongoing objects compete for its ownership. When a target tone is presented either as one of a sequence of tones or simultaneously with a harmonic vowel complex, it is heard as part of the corresponding object. However, depending on the spatial configuration of the scene, if the target, the tones, and the vowel are all presented together, the target may not be perceived in either the tones or the vowel, even though it is not perceived as a separate entity. This finding suggests an asymmetry in the strength of the perceptual evidence required to reject vs. to include an element within the auditory foreground, a result with important implications for how we process complex auditory scenes containing ambiguous information.

PMID: 17615235 [PubMed - indexed for MEDLINE]

A further test of the linearity of temporal summation in forward masking.

J Acoust Soc Am. 2007 Oct;122(4):1880-3

Authors: Plack CJ, Carcagno S, Oxenham AJ

Abstract
An experiment tested the hypothesis that the masking effects of two nonoverlapping forward maskers are summed linearly over time. First, the levels of individual noise maskers required to mask a brief 4-kHz signal presented at 10-, 20-, 30-, or 40-dB sensation level (SL) were found. The hypothesis predicts that a combination of the first masker presented at the level required to mask the 10-dB SL signal and the second masker presented at the level required to mask the 20-dB SL signal, should produce the same amount of masking as the converse situation (i.e., the first masker presented at the level required to mask the 20-dB SL signal and the second masker presented at the level required to mask the 10-dB SL signal), and similarly for the 30- and 40-dB SL signals. The results were consistent with the predictions.
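
The linearity hypothesis tested above can be stated in a few lines of code: if each masker contributes an internal effect that is a compressive function of its level, and those effects add linearly over time, the combined masking is unchanged when the two maskers swap levels. The power-law exponent below is an illustrative assumption (loosely motivated by basilar-membrane compression), not a value from the paper.

```python
def internal_effect(level_db, exponent=0.25):
    """Compressive internal effect of one masker: a power law on intensity.
    The exponent is illustrative, not fitted to the data."""
    return (10 ** (level_db / 10)) ** exponent

def combined_effect(first_db, second_db):
    """Linear-summation hypothesis: the total masking effect is the simple
    sum of the two maskers' individually compressed effects."""
    return internal_effect(first_db) + internal_effect(second_db)

# The tested prediction: swapping which masker carries which level
# leaves the combined masking effect unchanged.
print(combined_effect(60, 70) == combined_effect(70, 60))  # True
```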

PMID: 17902824 [PubMed - indexed for MEDLINE]

Human cortical activity during streaming without spectral cues suggests a general neural substrate for auditory stream segregation.

J Neurosci. 2007 Nov 28;27(48):13074-81

Authors: Gutschalk A, Oxenham AJ, Micheyl C, Wilson EC, Melcher JR

Abstract
The brain continuously disentangles competing sounds, such as two people speaking, and assigns them to distinct streams. Neural mechanisms have been proposed for streaming based on gross spectral differences between sounds, but not for streaming based on other nonspectral features. Here, human listeners were presented with sequences of harmonic complex tones that had identical spectral envelopes, and unresolved spectral fine structure, but one of two fundamental frequencies (f0) and pitches. As the f0 difference between tones increased, listeners perceived the tones as being segregated into two streams (one stream for each f0) and cortical activity measured with functional magnetic resonance imaging and magnetoencephalography increased. This trend was seen in primary cortex of Heschl's gyrus and in surrounding nonprimary areas. The results strongly resemble those for pure tones. Both the present and pure tone results may reflect neuronal forward suppression that diminishes as one or more features of successive sounds become increasingly different. We hypothesize that feature-specific forward suppression subserves streaming based on diverse perceptual cues and results in explicit neural representations for auditory streams within auditory cortex.

PMID: 18045901 [PubMed - indexed for MEDLINE]

2006

Level dependence of auditory filters in nonsimultaneous masking as a function of frequency.

J Acoust Soc Am. 2006 Jan;119(1):444-53

Authors: Oxenham AJ, Simonson AM

Abstract
Auditory filter bandwidths were measured using nonsimultaneous masking, as a function of signal level between 10 and 35 dB SL for signal frequencies of 1, 2, 4, and 6 kHz. The brief sinusoidal signal was presented in a temporal gap within a spectrally notched noise. Two groups of normal-hearing subjects were tested, one using a fixed masker level and adaptively varying signal level, the other using a fixed signal level and adaptively varying masker level. In both cases, auditory filters were derived by assuming a constant filter shape for a given signal level. The filter parameters derived from the two paradigms were not significantly different. At 1 kHz, the equivalent rectangular bandwidth (ERB) decreased as the signal level increased from 10 to 20 dB SL, after which it remained roughly constant. In contrast, at 6 kHz, the ERB increased consistently with signal levels from 10 to 35 dB SL. The results at 2 and 4 kHz were intermediate, showing no consistent change in ERB with signal level. Overall, the results suggest changes in the level dependence of the auditory filters at frequencies above 1 kHz that are not currently incorporated in models of human auditory filter tuning.

PMID: 16454299 [PubMed - indexed for MEDLINE]

Effects of introducing unprocessed low-frequency information on the reception of envelope-vocoder processed speech.

J Acoust Soc Am. 2006 Apr;119(4):2417-26

Authors: Qin MK, Oxenham AJ

Abstract
This study investigated the benefits of adding unprocessed low-frequency information to acoustic simulations of cochlear-implant processing in normal-hearing listeners. Implant processing was simulated using an eight-channel noise-excited envelope vocoder, and low-frequency information was added by replacing the lower frequency channels of the processor with a low-pass-filtered version of the original stimulus. Experiment 1 measured sentence-level speech reception as a function of target-to-masker ratio, with either steady-state speech-shaped noise or single-talker maskers. Experiment 2 measured listeners' ability to identify two vowels presented simultaneously, as a function of the F0 difference between the two vowels. In both experiments low-frequency information was added below either 300 or 600 Hz. The introduction of the additional low-frequency information led to substantial and significant improvements in performance in both experiments, with a greater improvement observed for the higher (600 Hz) than for the lower (300 Hz) cutoff frequency. However, performance never equaled performance in the unprocessed conditions. The results confirm other recent demonstrations that added low-frequency information can provide significant benefits in intelligibility, which may at least in part be attributed to improvements in F0 representation. The findings provide further support for efforts to make use of residual acoustic hearing in cochlear-implant users.
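
A noise-excited envelope vocoder of the kind used in this study can be sketched as below: filter the input into a few bands, extract each band's temporal envelope, impose it on band-limited noise, and sum. The channel spacing, filter design, and envelope cutoff are illustrative assumptions, not the exact parameters of the study.

```python
import math, random

def biquad_bandpass(x, fs, fc, q=4.0):
    """RBJ-style band-pass biquad (constant 0 dB peak gain)."""
    w0 = 2 * math.pi * fc / fs
    alpha = math.sin(w0) / (2 * q)
    b0, b2 = alpha, -alpha
    a0, a1, a2 = 1 + alpha, -2 * math.cos(w0), 1 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        y.append(yn)
        x2, x1, y2, y1 = x1, xn, y1, yn
    return y

def envelope(x, fs, cutoff=50.0):
    """Half-wave rectify then one-pole low-pass: a simple envelope detector."""
    a = math.exp(-2 * math.pi * cutoff / fs)
    env, state = [], 0.0
    for xn in x:
        state = a * state + (1 - a) * max(xn, 0.0)
        env.append(state)
    return env

def noise_vocoder(x, fs, n_channels=8, f_lo=80.0, f_hi=6000.0):
    """Toy noise-excited envelope vocoder: per channel, multiply the band's
    envelope by band-limited noise, then sum the channels."""
    centers = [f_lo * (f_hi / f_lo) ** ((i + 0.5) / n_channels)
               for i in range(n_channels)]          # log-spaced centers
    rng = random.Random(0)
    noise = [rng.uniform(-1, 1) for _ in x]
    out = [0.0] * len(x)
    for fc in centers:
        env = envelope(biquad_bandpass(x, fs, fc), fs)
        carrier = biquad_bandpass(noise, fs, fc)
        for i, (e, c) in enumerate(zip(env, carrier)):
            out[i] += e * c
    return out
```

Replacing the lowest channels of such a processor with a low-pass-filtered copy of the original signal, as in this study, restores the resolved low-frequency harmonics that the envelope channels discard.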

PMID: 16642854 [PubMed - indexed for MEDLINE]

Influence of musical and psychoacoustical training on pitch discrimination.

Hear Res. 2006 Sep;219(1-2):36-47

Authors: Micheyl C, Delhommeau K, Perrot X, Oxenham AJ

Abstract
This study compared the influence of musical and psychoacoustical training on auditory pitch discrimination abilities. In a first experiment, pitch discrimination thresholds for pure and complex tones were measured in 30 classical musicians and 30 non-musicians, none of whom had prior psychoacoustical training. The non-musicians' mean thresholds were more than six times larger than those of the classical musicians initially, and still about four times larger after 2 h of training using an adaptive two-interval forced-choice procedure; this difference is two to three times larger than suggested by previous studies. The musicians' thresholds were close to those measured in earlier psychoacoustical studies using highly trained listeners, and showed little improvement with training; this suggests that classical musical training can lead to optimal or nearly optimal pitch discrimination performance. A second experiment was performed to determine how much additional training was required for the non-musicians to obtain thresholds as low as those of the classical musicians from experiment 1. Eight new non-musicians with no prior training practiced the frequency discrimination task for a total of 14 h. It took between 4 and 8 h of training for their thresholds to become as small as those measured in the classical musicians from experiment 1. These findings supplement and qualify earlier data in the literature regarding the respective influence of musical and psychoacoustical training on pitch discrimination performance.

PMID: 16839723 [PubMed - indexed for MEDLINE]

Masking by inaudible sounds and the linearity of temporal summation.

J Neurosci. 2006 Aug 23;26(34):8767-73

Authors: Plack CJ, Oxenham AJ, Drga V

Abstract
Many natural sounds, including speech and animal vocalizations, involve rapid sequences that vary in spectrum and amplitude. Each sound within a sequence has the potential to affect the audibility of subsequent sounds in a process known as forward masking. Little is known about the neural mechanisms underlying forward masking, particularly in more realistic situations in which multiple sounds follow each other in rapid succession. A parsimonious hypothesis is that the effects of consecutive sounds combine linearly, so that the total masking effect is a simple sum of the contributions from the individual maskers. The experiment reported here tests a counterintuitive prediction of this linear-summation hypothesis, namely that a sound that itself is inaudible should, under certain circumstances, affect the audibility of subsequent sounds. The results show that, when two forward maskers are combined, the second of the two maskers can continue to produce substantial masking, even when it is completely masked by the first masker. Thus, inaudible sounds can affect the perception of subsequent sounds. A model incorporating instantaneous compression (reflecting the nonlinear response of the basilar membrane in the cochlea), followed by linear summation of the effects of the maskers, provides a good account of the data. Despite the presence of multiple sources of nonlinearity in the auditory system, masking effects by sequential sounds combine in a manner that is well captured by a time-invariant linear system.

PMID: 16928865 [PubMed - indexed for MEDLINE]

Detection and F0 discrimination of harmonic complex tones in the presence of competing tones or noise.

J Acoust Soc Am. 2006 Sep;120(3):1493-505

Authors: Micheyl C, Bernstein JG, Oxenham AJ

Abstract
Normal-hearing listeners' ability to "hear out" the pitch of a target harmonic complex tone (HCT) was tested with simultaneous HCT or noise maskers, all bandpass-filtered into the same spectral region (1200-3600 Hz). Target-to-masker ratios (TMRs) necessary to discriminate fixed fundamental-frequency (F0) differences were measured for target F0s between 100 and 400 Hz. At high F0s (400 Hz), asynchronous gating of masker and signal, presenting the masker in a different F0 range, and reducing the F0 rove of the masker, all resulted in improved performance. At the low F0s (100 Hz), none of these manipulations improved performance significantly. The findings are generally consistent with the idea that the ability to segregate sounds based on cues such as F0 differences and onset/offset asynchronies can be strongly limited by peripheral harmonic resolvability. However, some cases were observed where perceptual segregation appeared possible, even when no peripherally resolved harmonics were present in the mixture of target and masker. A final experiment, comparing TMRs necessary for detection and F0 discrimination, showed that F0 discrimination of the target was possible with noise maskers at only a few decibels above detection threshold, whereas similar performance with HCT maskers was only possible 15-25 dB above detection threshold.

PMID: 17004471 [PubMed - indexed for MEDLINE]

The relationship between frequency selectivity and pitch discrimination: effects of stimulus level.

J Acoust Soc Am. 2006 Dec;120(6):3916-28

Authors: Bernstein JG, Oxenham AJ

Abstract
Three experiments tested the hypothesis that fundamental frequency (f0) discrimination depends on the resolvability of harmonics within a tone complex. Fundamental frequency difference limens (f0 DLs) were measured for random-phase harmonic complexes with eight f0's between 75 and 400 Hz, bandpass filtered between 1.5 and 3.5 kHz, and presented at 12.5-dB/component average sensation level in threshold equalizing noise with levels of 10, 40, and 65 dB SPL per equivalent rectangular auditory filter bandwidth. With increasing level, the transition from large (poor) to small (good) f0 DLs shifted to a higher f0. This shift corresponded to a decrease in harmonic resolvability, as estimated in the same listeners with excitation patterns derived from measures of auditory filter shape and with a more direct measure that involved hearing out individual harmonics. The results are consistent with the idea that resolved harmonics are necessary for good f0 discrimination. Additionally, f0 DLs for high f0's increased with stimulus level in the same way as pure-tone frequency DLs, suggesting that for this frequency range, the frequencies of harmonics are more poorly encoded at higher levels, even when harmonics are well resolved.

PMID: 17225419 [PubMed - indexed for MEDLINE]

The relationship between frequency selectivity and pitch discrimination: sensorineural hearing loss.

J Acoust Soc Am. 2006 Dec;120(6):3929-45

Authors: Bernstein JG, Oxenham AJ

Abstract
This study tested the relationship between frequency selectivity and the minimum spacing between harmonics necessary for accurate f0 discrimination. Fundamental frequency difference limens (f0 DLs) were measured for ten listeners with moderate sensorineural hearing loss (SNHL) and three normal-hearing listeners for sine- and random-phase harmonic complexes, bandpass filtered between 1500 and 3500 Hz, with f0's ranging from 75 to 500 Hz (or higher). All listeners showed a transition between small (good) f0 DLs at high f0's and large (poor) f0 DLs at low f0's, although the f0 at which this transition occurred (f0,tr) varied across listeners. Three measures thought to reflect frequency selectivity were significantly correlated to both the f0,tr and the minimum f0 DL achieved at high f0's: (1) the maximum f0 for which f0 DLs were phase dependent, (2) the maximum modulation frequency for which amplitude modulation and quasi-frequency modulation were discriminable, and (3) the equivalent rectangular bandwidth of the auditory filter, estimated using the notched-noise method. These results provide evidence of a relationship between f0 discrimination performance and frequency selectivity in listeners with SNHL, supporting "spectral" and "spectro-temporal" theories of pitch perception that rely on sharp tuning in the auditory periphery to accurately extract f0 information.

PMID: 17225420 [PubMed - indexed for MEDLINE]

2005

Estimates of auditory filter phase response at and below characteristic frequency.

J Acoust Soc Am. 2005 Apr;117(4 Pt 1):1713-6

Authors: Oxenham AJ, Ewert SD

PMID: 15898618 [PubMed - indexed for MEDLINE]

Neuromagnetic correlates of streaming in human auditory cortex.

J Neurosci. 2005 Jun 01;25(22):5382-8

Authors: Gutschalk A, Micheyl C, Melcher JR, Rupp A, Scherg M, Oxenham AJ

Abstract
The brain is constantly faced with the challenge of organizing acoustic input from multiple sound sources into meaningful auditory objects or perceptual streams. The present study examines the neural bases of auditory stream formation using neuromagnetic and behavioral measures. The stimuli were sequences of alternating pure tones, which can be perceived as either one or two streams. In the first experiment, physical stimulus parameters were varied between values that promoted the perceptual grouping of the tone sequence into one coherent stream and values that promoted its segregation into two streams. In the second experiment, an ambiguous tone sequence produced a bistable percept that switched spontaneously between one- and two-stream percepts. The first experiment demonstrated a strong correlation between listeners' perception and long-latency (>60 ms) activity that likely arises in nonprimary auditory cortex. The second demonstrated a covariation between this activity and listeners' perception in the absence of physical stimulus changes. Overall, the results indicate a tight coupling between auditory cortical activity and streaming perception, suggesting that an explicit representation of auditory streams may be maintained within nonprimary auditory areas.

PMID: 15930387 [PubMed - indexed for MEDLINE]

Comparing different estimates of cochlear compression in listeners with normal and impaired hearing.

J Acoust Soc Am. 2005 May;117(5):3028-41

Authors: Rosengard PS, Oxenham AJ, Braida LD

Abstract
A loss of cochlear compression may underlie many of the difficulties experienced by hearing-impaired listeners. Two behavioral forward-masking paradigms that have been used to estimate the magnitude of cochlear compression are growth of masking (GOM) and temporal masking (TM). The aim of this study was to determine whether these two measures produce within-subjects results that are consistent across a range of signal frequencies and, if so, to compare them in terms of reliability or efficiency. GOM and TM functions were measured in a group of five normal-hearing and five hearing-impaired listeners at signal frequencies of 1000, 2000, and 4000 Hz. Compression values were derived from the masking data and confidence intervals were constructed around these estimates. Both measures produced comparable estimates of compression, but both measures have distinct advantages and disadvantages, so that the more appropriate measure depends on factors such as the frequency region of interest and the degree of hearing loss. Because of the long testing times needed, neither measure is suitable for clinical use in its current form.

PMID: 15957772 [PubMed - indexed for MEDLINE]

An autocorrelation model with place dependence to account for the effect of harmonic number on fundamental frequency discrimination.

J Acoust Soc Am. 2005 Jun;117(6):3816-31

Authors: Bernstein JG, Oxenham AJ

Abstract
Fundamental frequency (f0) difference limens (DLs) were measured as a function of f0 for sine- and random-phase harmonic complexes, bandpass filtered with 3-dB cutoff frequencies of 2.5 and 3.5 kHz (low region) or 5 and 7 kHz (high region), and presented at an average 15 dB sensation level (approximately 48 dB SPL) per component in a wideband background noise. Fundamental frequencies ranged from 50 to 300 Hz and 100 to 600 Hz in the low and high spectral regions, respectively. In each spectral region, f0 DLs improved dramatically with increasing f0 as approximately the tenth harmonic appeared in the passband. Generally, f0 DLs for complexes with similar harmonic numbers were similar in the two spectral regions. The dependence of f0 discrimination on harmonic number presents a significant challenge to autocorrelation (AC) models of pitch, in which predictions generally depend more on spectral region than harmonic number. A modification involving a "lag window" is proposed and tested, restricting the AC representation to a limited range of lags relative to each channel's characteristic frequency. This modified unitary pitch model was able to account for the dependence of f0 DLs on harmonic number, although this correct behavior was not based on peripheral harmonic resolvability.

PMID: 16018484 [PubMed - indexed for MEDLINE]

Comparing F0 discrimination in sequential and simultaneous conditions.

J Acoust Soc Am. 2005 Jul;118(1):41-4

Authors: Micheyl C, Oxenham AJ

Abstract
In an influential study, Carlyon and Shackleton [J. Acoust. Soc. Am. 95, 3541-3554 (1994)] measured listeners' performance (d') in fundamental-frequency (F0) discrimination between harmonic complex tones (HCTs) presented simultaneously in different spectral regions and compared their performance with that found in a sequential-comparison task. In this Letter, it is suggested that Carlyon and Shackleton's analysis of the simultaneous-comparison data did not adequately reflect their assumption that listeners were effectively comparing F0's across regions. A reanalysis consistent with this assumption is described. The new results suggest that under the assumption that listeners were effectively comparing F0 across regions, their performance in this task was substantially higher than originally estimated by Carlyon and Shackleton, and in some conditions much higher than expected from the performances measured in a traditional F0-discrimination task with sequential HCTs. Possible explanations for this outcome, as well as alternative interpretations, are proposed.

PMID: 16119327 [PubMed - indexed for MEDLINE]

Effects of envelope-vocoder processing on F0 discrimination and concurrent-vowel identification.

Ear Hear. 2005 Oct;26(5):451-60

Authors: Qin MK, Oxenham AJ

Abstract
OBJECTIVE: The aim of this study was to examine the effects of envelope-vocoder sound processing on listeners' ability to discriminate changes in fundamental frequency (F0) in anechoic and reverberant conditions and on their ability to identify concurrent vowels based on differences in F0.
DESIGN: In the first experiment, F0 difference limens (F0DLs) were measured as a function of number of envelope-vocoder frequency channels (1, 4, 8, 24, and 40 channels, and unprocessed) in four normal-hearing listeners, with degree of simulated reverberation (no, mild, and severe reverberation) as a parameter. In the second experiment, vowel identification was measured as a function of the F0 difference between two simultaneous vowels in six normal-hearing listeners, with the number of vocoder channels (8 and 24 channels, and unprocessed) as a parameter.
RESULTS: Reverberation was detrimental to F0 discrimination in conditions with fewer numbers of vocoder channels. Despite the reasonable F0DLs (<1 semitone) with 24- and 8-channel vocoder processing, listeners were unable to benefit from F0 differences between the competing vowels in the concurrent-vowel paradigm.
CONCLUSIONS: The overall detrimental effects of vocoder processing are probably due to the poor spectral representation of the lower-order harmonics. The F0 information carried in the temporal envelope is weak, susceptible to reverberation, and may not suffice for source segregation. To the extent that vocoder processing simulates cochlear implant processing, users of current implant processing schemes are unlikely to benefit from F0 differences between competing talkers when listening to speech in complex environments. The results provide further incentive for finding a way to make the information from low-order, resolved harmonics available to cochlear implant users.

PMID: 16230895 [PubMed - indexed for MEDLINE]

Forward masking of amplitude modulation: basic characteristics.

J Acoust Soc Am. 2005 Nov;118(5):3198-210

Authors: Wojtczak M, Viemeister NF

Abstract
In this study we demonstrate an effect for amplitude modulation (AM) that is analogous to forward masking of audio frequencies, i.e., the modulation threshold for detection of AM (signal) is raised by preceding AM (masker). In the study we focused on the basic characteristics of the forward-masking effect. Functions representing recovery from AM forward masking, measured with a 150-ms, 40-Hz masker AM and a 50-ms signal AM of the same rate imposed on the same broadband-noise carrier, showed an exponential decay of forward masking with increasing delay from masker offset. Thresholds remained elevated by more than 2 dB over an interval of at least 150 ms following the masker. Masked-threshold patterns, measured with a fixed signal rate (20, 40, and 80 Hz) and a variable masker rate, showed tuning of the AM forward-masking effect. The tuning was approximately constant across the signal modulation rates used and consistent with the idea of modulation-rate selective channels. Combining two equally effective forward maskers of different frequencies did not lead to an increase in forward masking relative to that produced by either component alone. Overall, the results are consistent with modulation-rate selective neural channels that adapt and recover from the adaptation relatively quickly.

PMID: 16334900 [PubMed - indexed for MEDLINE]

2004

Correct tonotopic representation is necessary for complex pitch perception.

Proc Natl Acad Sci U S A. 2004 Feb 03;101(5):1421-5

Authors: Oxenham AJ, Bernstein JG, Penagos H

Abstract
The ability to extract a pitch from complex harmonic sounds, such as human speech, animal vocalizations, and musical instruments, is a fundamental attribute of hearing. Some theories of pitch rely on the frequency-to-place mapping, or tonotopy, in the inner ear (cochlea), but most current models are based solely on the relative timing of spikes in the auditory nerve. So far, it has proved to be difficult to distinguish between these two possible representations, primarily because temporal and place information usually covary in the cochlea. In this study, "transposed stimuli" were used to dissociate temporal from place information. By presenting the temporal information of low-frequency sinusoids to locations in the cochlea tuned to high frequencies, we found that human subjects displayed poor pitch perception for single tones. More importantly, none of the subjects was able to extract the fundamental frequency from multiple low-frequency harmonics presented to high-frequency regions of the cochlea. The experiments demonstrate that tonotopic representation is crucial to complex pitch perception and provide a new tool in the search for the neural basis of pitch.
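The "transposed stimuli" mentioned above follow a standard construction: half-wave rectify a low-frequency sinusoid, smooth it, and use the result to modulate a high-frequency carrier, so the low-frequency temporal pattern is delivered to a high-frequency cochlear place. A minimal sketch of that construction; the sample rate and frequencies here are illustrative, not taken from the paper:

```python
import numpy as np
from scipy.signal import butter, sosfilt

fs = 48000                       # sample rate (Hz); illustrative
dur = 0.5
t = np.arange(int(fs * dur)) / fs

f_env = 125.0                    # low frequency whose timing is conveyed
f_carrier = 4000.0               # high-frequency cochlear place

# Half-wave rectify the low-frequency sinusoid...
halfwave = np.maximum(np.sin(2 * np.pi * f_env * t), 0.0)

# ...low-pass filter to remove rectification harmonics near the carrier...
lp = butter(4, 0.2 * f_carrier, btype="low", fs=fs, output="sos")
envelope = sosfilt(lp, halfwave)

# ...and impose that envelope on the high-frequency carrier.
transposed = envelope * np.sin(2 * np.pi * f_carrier * t)
```

The resulting waveform excites the 4-kHz place while carrying the period of the 125-Hz tone in its envelope, which is what lets the experiment dissociate temporal from place information.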

PMID: 14718671 [PubMed - indexed for MEDLINE]

A neural representation of pitch salience in nonprimary human auditory cortex revealed with functional magnetic resonance imaging.

J Neurosci. 2004 Jul 28;24(30):6810-5

Authors: Penagos H, Melcher JR, Oxenham AJ

Abstract
Pitch, one of the primary auditory percepts, is related to the temporal regularity or periodicity of a sound. Previous functional brain imaging work in humans has shown that the level of population neural activity in centers throughout the auditory system is related to the temporal regularity of a sound, suggesting a possible relationship to pitch. In the current study, functional magnetic resonance imaging was used to measure activation in response to harmonic tone complexes whose temporal regularity was identical, but whose pitch salience (or perceptual pitch strength) differed, across conditions. Cochlear nucleus, inferior colliculus, and primary auditory cortex did not show significant differences in activation level between conditions. Instead, a correlate of pitch salience was found in the neural activity levels of a small, spatially localized region of nonprimary auditory cortex, overlapping the anterolateral end of Heschl's gyrus. The present data contribute to converging evidence that anterior areas of nonprimary auditory cortex play an important role in processing pitch.

PMID: 15282286 [PubMed - indexed for MEDLINE]

Masker phase effects in normal-hearing and hearing-impaired listeners: evidence for peripheral compression at low signal frequencies.

J Acoust Soc Am. 2004 Oct;116(4 Pt 1):2248-57

Authors: Oxenham AJ, Dau T

Abstract
The presence of cochlear-based compression at low frequencies was investigated by measuring phase effects in harmonic maskers. In normal-hearing listeners, the amount of masking produced depends strongly on the phase relationships between the individual masker components. This effect is thought to be determined primarily by properties of the cochlea, including the phase dispersion and compressive input-output function of the basilar membrane. Thresholds for signals of 250 and 1000 Hz were measured in harmonic maskers with fundamental frequencies of 12.5 and 100 Hz as a function of the masker phase curvature. Results from 12 listeners with sensorineural hearing loss showed reduced masker phase effects, when compared with data from normal-hearing listeners, at both 250- and 1000-Hz signal frequencies. The effects of hearing impairment on phase-related masking differences were not well simulated in normal-hearing listeners by an additive white noise, suggesting that the effects of hearing impairment are not simply due to reduced sensation level. Maximum differences in masked threshold were correlated with auditory filter bandwidths at the respective frequencies, suggesting that both measures are affected by a common underlying mechanism, presumably related to cochlear outer hair cell function. The results also suggest that normal peripheral compression remains strong even at 250 Hz.

PMID: 15532656 [PubMed - indexed for MEDLINE]

Sequential F0 comparisons between resolved and unresolved harmonics: no evidence for translation noise between two pitch mechanisms.

J Acoust Soc Am. 2004 Nov;116(5):3038-50

Authors: Micheyl C, Oxenham AJ

Abstract
Carlyon and Shackleton [J. Acoust. Soc. Am. 95, 3541-3554 (1994)] suggested that fundamental-frequency (F0) discrimination performance between resolved and unresolved harmonics is limited by an internal "translation" noise between the outputs of two distinct F0 encoding mechanisms, in addition to the encoding noise associated with each mechanism. To test this hypothesis further, F0 difference limens (DLF0s) were measured in six normal-hearing listeners using sequentially presented groups of harmonics. The two groups of harmonics presented on each trial were bandpass filtered into the same or different spectral regions, in such a way that both groups contained mainly resolved harmonics, both groups contained only unresolved harmonics, or one group contained mainly resolved and the other only unresolved harmonics. Three spectral regions (low: 600-1150 Hz, mid: 1400-2500 Hz, or high: 3000-5250 Hz) and two nominal F0s (100 and 200 Hz) were used. The DLF0s measured in across-region conditions were well accounted for by a model assuming only two sources of internal noise: the encoding noise estimated on the basis of the within-region results plus a constant noise associated with F0 comparisons across different spectral regions, independent of resolvability. No evidence for an across-pitch-mechanism translation noise was found. A reexamination of previous evidence for the existence of such noise suggests that the present negative outcome is unlikely to be explained by insufficient measurement sensitivity or an unusually large across-region comparison noise in the present study. While the results do not rule out the possibility of two separate pitch mechanisms, they indicate that the F0s of sequentially presented resolved and unresolved harmonics can be compared internally at no or negligible extra cost.

PMID: 15603149 [PubMed - indexed for MEDLINE]

2003

Pitch discrimination of diotic and dichotic tone complexes: harmonic resolvability or harmonic number?

J Acoust Soc Am. 2003 Jun;113(6):3323-34

Authors: Bernstein JG, Oxenham AJ

Abstract
Three experiments investigated the relationship between harmonic number, harmonic resolvability, and the perception of harmonic complexes. Complexes with successive equal-amplitude sine- or random-phase harmonic components of a 100- or 200-Hz fundamental frequency (f0) were presented dichotically, with even and odd components to opposite ears, or diotically, with all harmonics presented to both ears. Experiment 1 measured performance in discriminating a 3.5%-5% frequency difference between a component of a harmonic complex and a pure tone in isolation. Listeners achieved at least 75% correct for approximately the first 10 and 20 individual harmonics in the diotic and dichotic conditions, respectively, verifying that only processes before the binaural combination of information limit frequency selectivity. Experiment 2 measured fundamental frequency difference limens (f0 DLs) as a function of the average lowest harmonic number. Similar results at both f0's provide further evidence that harmonic number, not absolute frequency, underlies the order-of-magnitude increase observed in f0 DLs when only harmonics above about the 10th are presented. Similar results under diotic and dichotic conditions indicate that the auditory system, in performing f0 discrimination, is unable to utilize the additional peripherally resolved harmonics in the dichotic case. In experiment 3, dichotic complexes containing harmonics below the 12th, or only above the 15th, elicited pitches of the f0 and twice the f0, respectively. Together, experiments 2 and 3 suggest that harmonic number, regardless of peripheral resolvability, governs the transition between two different pitch percepts, one based on the frequencies of individual resolved harmonics and the other based on the periodicity of the temporal envelope.

PMID: 12822804 [PubMed - indexed for MEDLINE]

Intensity discrimination and increment detection in cochlear-implant users.

J Acoust Soc Am. 2003 Jul;114(1):396-407

Authors: Wojtczak M, Donaldson GS, Viemeister NF

Abstract
Intensity difference limens (DLs) were measured in users of the Nucleus 22 and Clarion v1.2 cochlear implants and in normal-hearing listeners to better understand mechanisms of intensity discrimination in electric and acoustic hearing and to evaluate the possible role of neural adaptation. Intensity DLs were measured for three modes of presentation: gated (intensity increments gated synchronously with the pedestal), fringe (intensity increments delayed 250 or 650 ms relative to the onset of the pedestal), and continuous (intensity increments occur in the presence of a pedestal that is played throughout the experimental run). Stimuli for cochlear-implant listeners were trains of biphasic pulses; stimuli for normal-hearing listeners were a 1-kHz tone and a wideband noise. Clarion cochlear-implant listeners showed level-dependent effects of presentation mode. At low pedestal levels, gated thresholds were generally similar to thresholds obtained in the fringe and continuous conditions. At higher pedestal levels, however, the fringe and continuous conditions produced smaller intensity DLs than the gated condition, similar to the gated-continuous difference in intensity DLs observed in acoustic hearing. Nucleus cochlear-implant listeners did not show consistent threshold differences for the gated and fringe conditions, and were not tested in the continuous condition. It is not clear why a difference between gated and fringe thresholds occurred for the Clarion but not the Nucleus subjects. Normal-hearing listeners showed improved thresholds for the continuous condition relative to the gated condition, but the effect was larger for the 1-kHz tonal carrier than for the noise carrier. Findings suggest that adaptation occurring central to the inner hair cell synapse mediates the gated-continuous difference observed in Clarion cochlear-implant listeners and may also contribute to the gated-continuous difference in acoustic hearing.

PMID: 12880051 [PubMed - indexed for MEDLINE]

Effects of simulated cochlear-implant processing on speech reception in fluctuating maskers.

J Acoust Soc Am. 2003 Jul;114(1):446-54

Authors: Qin MK, Oxenham AJ

Abstract
This study investigated the effects of simulated cochlear-implant processing on speech reception in a variety of complex masking situations. Speech recognition was measured as a function of target-to-masker ratio, processing condition (4, 8, 24 channels, and unprocessed) and masker type (speech-shaped noise, amplitude-modulated speech-shaped noise, single male talker, and single female talker). The results showed that simulated implant processing was more detrimental to speech reception in fluctuating interference than in steady-state noise. Performance in the 24-channel processing condition was substantially poorer than in the unprocessed condition, despite the comparable representation of the spectral envelope. The detrimental effects of simulated implant processing in fluctuating maskers, even with large numbers of channels, may be due to the reduction in the pitch cues used in sound source segregation, which are normally carried by the peripherally resolved low-frequency harmonics and the temporal fine structure. The results suggest that using steady-state noise to test speech intelligibility may underestimate the difficulties experienced by cochlear-implant users in fluctuating acoustic backgrounds.
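The "simulated cochlear-implant processing" in such studies is, in outline, a channel vocoder: filter the speech into frequency bands, extract each band's temporal envelope, and use those envelopes to modulate carriers that are summed back together. A minimal noise-excited sketch; the channel spacing, filter orders, and envelope cutoff are illustrative assumptions, not the study's parameters:

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocode(signal, fs, n_channels=8, f_lo=80.0, f_hi=6000.0, rng=None):
    """Noise-excited channel vocoder, a common CI simulation (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    # Log-spaced band edges (real simulations often use ERB or Greenwood spacing)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    out = np.zeros_like(signal, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfilt(band_sos, signal)
        env = np.abs(hilbert(band))            # envelope via the analytic signal
        # Smooth the envelope: this is the step that discards temporal fine
        # structure and the resolved low-frequency harmonics
        env = sosfilt(butter(2, 300.0, btype="low", fs=fs, output="sos"), env)
        carrier = rng.standard_normal(len(signal))
        out += sosfilt(band_sos, env * carrier)  # refilter modulated noise into band
    return out
```

Varying `n_channels` (4, 8, 24, as in the experiment) changes the spectral-envelope resolution, but even many channels leave the envelope-smoothing step intact, which is why fine-structure pitch cues remain absent.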

PMID: 12880055 [PubMed - indexed for MEDLINE]

Suprathreshold effects of adaptation produced by amplitude modulation.

J Acoust Soc Am. 2003 Aug;114(2):991-7

Authors: Wojtczak M, Viemeister NF

Abstract
This work extends the study of adaptation to amplitude modulation (AM) to the perception of highly detectable modulation. A fixed-level matching procedure was used to find perceptually equivalent modulation depths for 16-Hz modulation imposed on a 1-kHz standard and a 4-kHz comparison. The modulation depths in the two stimuli were compared before and after a 10-min exposure to a 1-kHz tone (adaptor) 100% modulated in amplitude at different rates. For modulation depths of 63% (20 log m = -4) and smaller, the perceived modulation depth was reduced after exposure to the adaptor that was modulated at the same rate as the standard. The size of this reduction expressed as a difference between the post- and pre-exposure AM depths was similar to the increase in AM-detection threshold observed after adaptation. Postexposure suprathreshold modulation depth was not appreciably reduced when the modulation depth of the standard was large (approached 100%). A much smaller or no reduction in the perceived modulation depth was also observed when the modulation rates of the adaptor and the standard tone were different. The tuning of the observed effect of the adaptor appears to be much sharper than the tuning shown by modulation-masking results.

PMID: 12942978 [PubMed - indexed for MEDLINE]

Informational masking and musical training.

J Acoust Soc Am. 2003 Sep;114(3):1543-9

Authors: Oxenham AJ, Fligor BJ, Mason CR, Kidd G

Abstract
The relationship between musical training and informational masking was studied for 24 young adult listeners with normal hearing. The listeners were divided into two groups based on musical training. In one group, the listeners had little or no musical training; the other group consisted of highly trained, currently active musicians. The hypothesis was that musicians may be less susceptible to informational masking, which is thought to reflect central, rather than peripheral, limitations on the processing of sound. Masked thresholds were measured in two conditions, similar to those used by Kidd et al. [J. Acoust. Soc. Am. 95, 3475-3480 (1994)]. In both conditions the signal consisted of a series of repeated tone bursts at 1 kHz. The masker consisted of a series of multitone bursts, gated with the signal. In one condition the frequencies of the masker were selected randomly for each burst; in the other condition the masker frequencies were selected randomly for the first burst of each interval and then remained constant throughout the interval. The difference in thresholds between the two conditions was taken as a measure of informational masking. Frequency selectivity, estimated using the notched-noise method, was also measured in the two groups. The results showed no difference in frequency selectivity between the two groups, but a large and significant difference in the amount of informational masking between musically trained and untrained listeners. This informational masking task, which requires no knowledge specific to musical training (such as note or interval names) and is generally not susceptible to systematic short- or medium-term training effects, may provide a basis for further studies of analytic listening abilities in different populations.

PMID: 14514207 [PubMed - indexed for MEDLINE]

Cochlear compression: perceptual measures and implications for normal and impaired hearing.

Ear Hear. 2003 Oct;24(5):352-66

Authors: Oxenham AJ, Bacon SP

Abstract
This article provides a review of recent developments in our understanding of how cochlear nonlinearity affects sound perception and how a loss of the nonlinearity associated with cochlear hearing impairment changes the way sounds are perceived. The response of the healthy mammalian basilar membrane (BM) to sound is sharply tuned, highly nonlinear, and compressive. Damage to the outer hair cells (OHCs) results in changes to all three attributes: in the case of total OHC loss, the response of the BM becomes broadly tuned and linear. Many of the differences in auditory perception and performance between normal-hearing and hearing-impaired listeners can be explained in terms of these changes in BM response. Effects that can be accounted for in this way include poorer audiometric thresholds, loudness recruitment, reduced frequency selectivity, and changes in apparent temporal processing. All these effects can influence the ability of hearing-impaired listeners to perceive speech, especially in complex acoustic backgrounds. A number of behavioral methods have been proposed to estimate cochlear nonlinearity in individual listeners. By separating the effects of cochlear nonlinearity from other aspects of hearing impairment, such methods may contribute towards identifying the different physiological mechanisms responsible for hearing loss in individual patients. This in turn may lead to more accurate diagnoses and more effective hearing-aid fitting for individual patients. A remaining challenge is to devise a behavioral measure that is sufficiently accurate and efficient to be used in a clinical setting.

PMID: 14534407 [PubMed - indexed for MEDLINE]

Estimates of human cochlear tuning at low levels using forward and simultaneous masking.

J Assoc Res Otolaryngol. 2003 Dec;4(4):541-54

Authors: Oxenham AJ, Shera CA

Abstract
Auditory filter shapes were derived from psychophysical measurements in eight normal-hearing listeners using a variant of the notched-noise method for brief signals in forward and simultaneous masking. Signal frequencies of 1, 2, 4, 6, and 8 kHz were tested. The signal level was fixed at 10 dB above absolute threshold in the forward-masking conditions and fixed at either 10 or 35 dB above absolute threshold in the simultaneous-masking conditions. The results show that filter equivalent rectangular bandwidths (ERBs) are substantially narrower in forward masking than has been found in previous studies using simultaneous masking. Furthermore, in contrast to earlier studies, the sharpness of tuning doubles over the range of frequencies tested, giving Q(ERB) values of about 10 and 20 at signal frequencies of 1 and 8 kHz, respectively. It is argued that the new estimates of auditory filter bandwidth provide a more accurate estimate of human cochlear tuning at low levels than earlier estimates using simultaneous masking at higher levels, and that they are therefore more suitable for comparison to cochlear tuning data from other species. The data may also prove helpful in defining the parameters for nonlinear models of human cochlear processing.
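The sharpness-of-tuning measure quoted here is simply the filter's center frequency divided by its equivalent rectangular bandwidth:

```latex
Q_{\mathrm{ERB}} = \frac{f_c}{\mathrm{ERB}(f_c)}
```

So the reported $Q_{\mathrm{ERB}}$ of about 20 at 8 kHz corresponds to an ERB of roughly 400 Hz, and the value of about 10 at 1 kHz to an ERB of roughly 100 Hz.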

PMID: 14716510 [PubMed - indexed for MEDLINE]

2002

Revised estimates of human cochlear tuning from otoacoustic and behavioral measurements.

Proc Natl Acad Sci U S A. 2002 Mar 05;99(5):3318-23

Authors: Shera CA, Guinan JJ, Oxenham AJ

Abstract
We develop an objective, noninvasive method for determining the frequency selectivity of cochlear tuning at low and moderate sound levels. Applicable in humans at frequencies of 1 kHz and above, the method is based on the measurement of stimulus-frequency otoacoustic emissions and, unlike previous noninvasive physiological methods, does not depend on the frequency selectivity of masking or suppression. The otoacoustic measurements indicate that at low sound levels human cochlear tuning is more than twice as sharp as implied by standard behavioral studies and has a different dependence on frequency. New behavioral measurements designed to minimize the influence of nonlinear effects such as suppression agree with the emission-based values. A comparison of cochlear tuning in cat, guinea pig, and human indicates that, contrary to common belief, tuning in the human cochlea is considerably sharper than that found in the other mammals. The sharper tuning may facilitate human speech communication.

PMID: 11867706 [PubMed - indexed for MEDLINE]

Chimaeric sounds reveal dichotomies in auditory perception.

Nature. 2002 Mar 07;416(6876):87-90

Authors: Smith ZM, Delgutte B, Oxenham AJ

Abstract
By Fourier's theorem, signals can be decomposed into a sum of sinusoids of different frequencies. This is especially relevant for hearing, because the inner ear performs a form of mechanical Fourier transform by mapping frequencies along the length of the cochlear partition. An alternative signal decomposition, originated by Hilbert, is to factor a signal into the product of a slowly varying envelope and a rapidly varying fine time structure. Neurons in the auditory brainstem sensitive to these features have been found in mammalian physiological studies. To investigate the relative perceptual importance of envelope and fine structure, we synthesized stimuli that we call 'auditory chimaeras', which have the envelope of one sound and the fine structure of another. Here we show that the envelope is most important for speech reception, and the fine structure is most important for pitch perception and sound localization. When the two features are in conflict, the sound of speech is heard at a location determined by the fine structure, but the words are identified according to the envelope. This finding reveals a possible acoustic basis for the hypothesized 'what' and 'where' pathways in the auditory cortex.
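The Hilbert factorization underlying the chimaeras can be sketched directly from the analytic signal: the envelope is its magnitude and the fine structure the cosine of its phase. A minimal single-band version (the study itself applied this within each band of a multiband filterbank, which this sketch omits):

```python
import numpy as np
from scipy.signal import hilbert

def envelope_and_fine_structure(x):
    """Hilbert decomposition: x(t) ~ env(t) * fine(t)."""
    analytic = hilbert(x)              # x + i * H[x]
    env = np.abs(analytic)             # slowly varying envelope
    fine = np.cos(np.angle(analytic))  # rapidly varying fine structure
    return env, fine

def single_band_chimera(a, b):
    """Envelope of `a` imposed on the fine structure of `b` (one band)."""
    env_a, _ = envelope_and_fine_structure(a)
    _, fine_b = envelope_and_fine_structure(b)
    return env_a * fine_b
```

Swapping the roles of `a` and `b` yields the complementary chimaera, which is what lets the experiment ask which factor the listener's percept follows.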

PMID: 11882898 [PubMed - indexed for MEDLINE]

2001

Forward masking: adaptation or integration?

J Acoust Soc Am. 2001 Feb;109(2):732-41

Authors: Oxenham AJ

Abstract
The aim of this study was to attempt to distinguish between neural adaptation and persistence (or temporal integration) as possible explanations of forward masking. Thresholds were measured for a sinusoidal signal as a function of signal duration for conditions where the delay between the masker offset and the signal offset (the offset-offset interval) was fixed. The masker was a 200-ms broadband noise, presented at a spectrum level of 40 dB (re: 20 microPa), and the signal was a 4-kHz sinusoid, gated with 2-ms ramps. The offset-offset interval was fixed at various durations between 4 and 102 ms, and signal thresholds were measured for a range of signal durations at each interval. A substantial decrease in thresholds was observed with increasing duration for signal durations up to about 20 ms. At short offset-offset intervals, the amount of temporal integration exceeded that normally found in quiet. The results were simulated using models of temporal integration (the temporal-window model) and adaptation. For both models, the inclusion of a peripheral nonlinearity, similar to that observed physiologically in studies of the basilar membrane, was essential in producing a good fit to the data. Both models were about equally successful in accounting for the present data. However, the temporal-window model provided a somewhat better account of similar data from a simultaneous-masking experiment, using the same parameters. This suggests that the linear, time-invariant properties of the temporal-window approach are appropriate for modeling forward masking. Overall the results confirm that forward masking can be described in terms of peripheral nonlinearity followed by linear temporal integration at higher levels in the auditory system. However, the difference in predictions between the adaptation and integration models is relatively small, meaning that an influence of adaptation cannot be ruled out.

PMID: 11248977 [PubMed - indexed for MEDLINE]

The effect of basilar-membrane nonlinearity on the shapes of masking period patterns in normal and impaired hearing.

J Acoust Soc Am. 2001 Apr;109(4):1571-86

Authors: Wojtczak M, Schroder AC, Kong YY, Nelson DA

Abstract
Masking period patterns (MPPs) were measured in listeners with normal and impaired hearing using amplitude-modulated tonal maskers and short tonal probes. The frequency of the masker was either the same as the frequency of the probe (on-frequency masking) or was one octave below the frequency of the probe (off-frequency masking). In experiment 1, MPPs were measured for listeners with normal hearing using different masker levels. Carrier frequencies of 3 and 6 kHz were used for the masker. The probe had a frequency of 6 kHz. For all masker levels, the off-frequency MPPs exhibited deeper and longer valleys compared with the on-frequency MPPs. Hearing-impaired listeners were tested in experiment 2. For some hearing-impaired subjects, masker frequencies of 1.5 kHz and 3 kHz were paired with a probe frequency of 3 kHz. MPPs measured for listeners with hearing loss had similar shapes for on- and off-frequency maskers. It was hypothesized that the shapes of MPPs reflect nonlinear processing at the level of the basilar membrane in normal hearing and more linear processing in impaired hearing. A model assuming different cochlear gains for normal versus impaired hearing and similar parameters of the temporal integrator for both groups of listeners successfully predicted the MPPs.

PMID: 11325128 [PubMed - indexed for MEDLINE]

Modulation detection interference: effects of concurrent and sequential streaming.

J Acoust Soc Am. 2001 Jul;110(1):402-8

Authors: Oxenham AJ, Dau T

Abstract
The presence of amplitude fluctuations in one frequency region can interfere with our ability to detect similar fluctuations in another (remote) frequency region. This effect is known as modulation detection interference (MDI). Gating the interfering and target sounds asynchronously is known to lead to a reduction in MDI, presumably because the two sounds become perceptually segregated. The first experiment examined the relative effects of carrier and modulator gating asynchrony in producing a release from MDI. The target carrier was a 900-ms, 4.3-kHz sinusoid, modulated in amplitude by a 500-ms, 16-Hz sinusoid, with 200-ms unmodulated fringes preceding and following the modulation. The interferer (masker) was a 1-kHz sinusoid, modulated by a narrowband noise with a 16-Hz bandwidth, centered around 16 Hz. Extending the masker carrier for 200 ms before and after the signal carrier reduced MDI, regardless of whether the target and masker modulators were gated synchronously or were gated with onset and offset asynchronies of 200 ms. Similarly, when the carriers were gated synchronously, asynchronous gating of the modulators did not produce a release from MDI. The second experiment measured MDI with a synchronous target and masker and investigated the effect of adding a series of precursor tones, which were designed to promote the forming of a perceptual stream with the masker, thereby leaving the target perceptually isolated. Four modulated or unmodulated precursor tones presented at the masker frequency were sufficient to completely eliminate MDI. The results support the idea that MDI is due to a perceptual grouping of the masker and target, and show that conditions promoting sufficient perceptual segregation of the masker and target can lead to a total elimination of MDI.

PMID: 11508965 [PubMed - indexed for MEDLINE]

Reconciling frequency selectivity and phase effects in masking.

J Acoust Soc Am. 2001 Sep;110(3 Pt 1):1525-38

Authors: Oxenham AJ, Dau T

Abstract
The effects of auditory frequency selectivity and phase response on masking were studied using harmonic tone complex maskers with a 100-Hz fundamental frequency. Positive and negative Schroeder-phase complexes (m+ and m-) were used as maskers, and the signal was a long-duration sinusoid. In the first experiment, thresholds for signal frequencies of 1 and 4 kHz were measured as a function of masker bandwidth and number of components. A large difference in thresholds between the m+ and m- complexes was found only when masker components were presented ipsilateral to the signal over a frequency range wider than the traditional critical band, regardless of the absolute number of components. In the second experiment, frequency selectivity was measured in harmonic tone complexes with fixed or random phases as well as in noise, using a variant of the notched-noise method with a fixed masker level. The data showed that frequency selectivity is not affected by masker type, indicating that the wide listening bandwidth suggested by the first experiment cannot be ascribed to broader effective filters in complex-tone maskers than in noise maskers. The third experiment employed a novel method of measuring frequency selectivity, which has the advantage that the overall level at the input and the output of the auditory filter remains roughly constant across all conditions. The auditory filter bandwidth measured using this method was wider than that measured in the second experiment, but may still be an underestimate, due to the effects of off-frequency listening. The data were modeled using a single-channel model with various initial filters. The main findings from the simulations were: (1) the magnitude response of the Gammatone filter is too narrow to account for the phase effects observed in the data; (2) none of the other filters currently used in auditory models can account for both frequency selectivity and phase effects in masking; (3) the Gammachirp filter can be made to provide a good account of the data by altering its phase response. The final conclusion suggests that masker phase effects can be accounted for with a single-channel model, while still remaining consistent with measures of frequency selectivity: effects that appear to involve broadband processing do not necessarily require across-channel mechanisms.

PMID: 11572363 [PubMed - indexed for MEDLINE]

A new procedure for measuring peripheral compression in normal-hearing and hearing-impaired listeners.

J Acoust Soc Am. 2001 Oct;110(4):2045-64

Authors: Nelson DA, Schroder AC, Wojtczak M

Abstract
Forward-masking growth functions for on-frequency (6-kHz) and off-frequency (3-kHz) sinusoidal maskers were measured in quiet and in a high-pass noise just above the 6-kHz probe frequency. The data show that estimates of response-growth rates obtained from those functions in quiet, which have been used to infer cochlear compression, are strongly dependent on the spread of probe excitation toward higher frequency regions. Therefore, an alternative procedure for measuring response-growth rates was proposed, one that employs a fixed low-level probe and avoids level-dependent spread of probe excitation. Fixed-probe-level temporal masking curves (TMCs) were obtained from normal-hearing listeners at a test frequency of 1 kHz, where the short 1-kHz probe was fixed in level at about 10 dB SL. The level of the preceding forward masker was adjusted to obtain masked threshold as a function of the time delay between masker and probe. The TMCs were obtained for an on-frequency masker (1 kHz) and for other maskers with frequencies both below and above the probe frequency. From these measurements, input/output response-growth curves were derived for individual ears. Response-growth slopes varied from >1.0 at low masker levels to <0.2 at mid masker levels. In three subjects, response growth increased again at high masker levels (>80 dB SPL). For the fixed-level probe, the TMC slopes changed very little in the presence of a high-pass noise masking upward spread of probe excitation. A greater effect on the TMCs was observed when a high-frequency cueing tone was used with the masking tone. In both cases, however, the net effects on the estimated rate of response growth were minimal.

PMID: 11681384 [PubMed - indexed for MEDLINE]

Towards a measure of auditory-filter phase response.

J Acoust Soc Am. 2001 Dec;110(6):3169-78

Authors: Oxenham AJ, Dau T

Abstract
This study investigates how the phase curvature of the auditory filters varies with center frequency (CF) and level. Harmonic tone complex maskers were used, with component phases adjusted using a variant of an equation proposed by Schroeder [IEEE Trans. Inf. Theory 16, 85-89 (1970)]. In experiment 1, the phase curvature of the masker was varied systematically and sinusoidal signal thresholds were measured at frequencies from 125 to 8000 Hz. At all signal frequencies, threshold differences of 20 dB or more were observed between the most effective and least effective masker phase curvature. In experiment 2, the effect of overall masker level on masker phase effects was studied using signal frequencies of 250, 1000, and 4000 Hz. The results were used to estimate the phase curvature of the auditory filters. The estimated relative phase curvature decreases dramatically with decreasing CF below 1000 Hz. At frequencies above 1000 Hz, relative auditory-filter phase curvature increases only slowly with increasing CF, or may remain constant. The phase curvature of the auditory filters seems to be broadly independent of overall level. Most aspects of the data are in qualitative agreement with peripheral physiological findings from other mammals, which suggests that the phase responses observed here are of peripheral origin. However, in contrast to the data reported in a cat auditory-nerve study [Carney et al., J. Acoust. Soc. Am. 105, 2384-2391 (1999)], no reversal in the sign of the phase curvature was observed at very low frequencies. Overall, the results provide a framework for mapping out the phase curvature of the auditory filters and provide constraints on future models of peripheral filtering in the human auditory system.

PMID: 11785818 [PubMed - indexed for MEDLINE]

2000

Basilar-membrane nonlinearity estimated by pulsation threshold.

J Acoust Soc Am. 2000 Jan;107(1):501-7

Authors: Plack CJ, Oxenham AJ

Abstract
The pulsation threshold technique was used to estimate the basilar-membrane (BM) response to a tone at characteristic frequency (CF). A pure-tone signal was alternated with a pure-tone masker. The frequency of the masker was 0.6 times that of the signal. For signal levels from around 20 dB above absolute threshold to 85 dB SPL, the masker level was varied to find the level at which a transition occurred between the signal being perceived as "pulsed" or "continuous" (the pulsation threshold). The transition is assumed to occur when the masker excitation is somewhat greater than the signal excitation at the place on the BM tuned to the signal. If it is assumed further that the response at this place to the lower-frequency masker is linear, then the shape of the masking function provides an estimate of the BM response to the signal. Signal frequencies of 0.25, 0.5, 1, 2, 4, and 8 kHz were tested. The mean slopes of the masking functions for signal levels between 50 and 80 dB SPL were 0.76, 0.50, 0.34, 0.32, 0.35, and 0.41, respectively. The results suggest that compression on the BM increases between CFs of 0.25 and 1 kHz and is roughly constant for frequencies of 1 kHz and above. Despite requiring a subjective criterion, the pulsation threshold measurements had a reasonably low variability. However, the estimated compression was less than in an earlier study using forward masking. The smaller amount of compression observed here may be due to the effects of off-frequency listening.

PMID: 10641658 [PubMed - indexed for MEDLINE]

Level discrimination of sinusoids as a function of duration and level for fixed-level, roving-level, and across-frequency conditions.

J Acoust Soc Am. 2000 Mar;107(3):1605-14

Authors: Oxenham AJ, Buus S

Abstract
The ability of listeners to detect level differences between two sinusoidal stimuli in a two-interval forced-choice procedure was measured as a function of duration and level in three conditions: (1) the pedestal was fixed in level and the stimuli in the two intervals had the same frequency of either 1 or 2 kHz (fixed-level condition); (2) the pedestal was roved in level over a 20-dB range from trial to trial, but the stimuli still had the same frequency of either 1 or 2 kHz (roving-level condition); and (3) the pedestal was roved in level over a 20-dB range and the two stimuli differed in frequency, such that one was around 1 kHz while the other was around 2 kHz (across-frequency condition). In the fixed-level conditions, difference limens decreased (improved) with both increasing duration and level, as found in previous studies. In the roving-level conditions, difference limens increased and the dependence on duration and level decreased. Difference limens in the across-frequency conditions were generally highest and showed very little dependence on either stimulus duration or level. The results may be understood in terms of different internal noise components with additive variances: In the fixed-level conditions, sensation noise, which is dependent on stimulus attributes such as duration and level, is dominant. In more difficult conditions, where trace-memory and/or across-channel comparisons are required, a more central, stimulus-independent noise dominates.

PMID: 10738814 [PubMed - indexed for MEDLINE]

Influence of spatial and temporal coding on auditory gap detection.

J Acoust Soc Am. 2000 Apr;107(4):2215-23

Authors: Oxenham AJ

Abstract
This study investigated the effect on gap detection of perceptual channels, hypothesized to be tuned to spatial location or fundamental frequency (f0). Thresholds were measured for the detection of a silent temporal gap between two markers. In the first experiment, the markers were broadband noise, presented either binaurally or monaurally. In the binaural conditions, the markers were either diotic, or had a 640-μs interaural time difference (ITD) or a 12-dB interaural level difference (ILD). Reversing the ITD across the two markers had no effect on gap detection relative to the diotic condition. Reversing the ILD across the two markers produced a marked deterioration in performance. However, the same deterioration was observed in the monaural conditions when a 12-dB level difference was introduced between the two markers. The results provide no evidence for the role of spatially tuned neural channels in gap detection. In the second experiment, the markers were harmonic tone complexes, filtered to contain only high, unresolved harmonics. Using complexes with a fixed spectral envelope, where the f0 (of 140 or 350 Hz) was different for the two markers, produced a deterioration in performance, relative to conditions where the f0 remained the same. A larger deterioration was observed when the two markers occupied different spectral regions but had the same f0. This supports the idea that peripheral coding is dominant in determining gap-detection thresholds when the two markers differ along any physical dimension. Higher-order neural coding mechanisms of f0 and spatial location seem to play a smaller role and no role, respectively.

PMID: 10790047 [PubMed - indexed for MEDLINE]

Effects of masker frequency and duration in forward masking: further evidence for the influence of peripheral nonlinearity.

Hear Res. 2000 Dec;150(1-2):258-66

Authors: Oxenham AJ, Plack CJ

Abstract
Forward masking has often been thought of in terms of neural adaptation, with nonlinearities in the growth and decay of forward masking being accounted for by the nonlinearities inherent in adaptation. In contrast, this study presents further evidence for the hypothesis that forward masking can be described as a linear process, once peripheral, mechanical nonlinearities are taken into account. The first experiment compares the growth of masking for on- and off-frequency maskers. Signal thresholds were measured as a function of masker level for three masker-signal intervals of 0, 10, and 30 ms. The brief 4-kHz sinusoidal signal was masked by a 200-ms sinusoidal forward masker which had a frequency of either 2.4 kHz (off-frequency) or 4 kHz (on-frequency). As in previous studies, for the on-frequency condition, the slope of the function relating signal threshold to masker level became shallower as the delay between the masker and signal was increased. In contrast, the slopes for the off-frequency condition were independent of masker-signal delay and had a value of around unity, indicating linear growth of masking for all masker-signal delays. In the second experiment, a broadband Gaussian noise forward masker was used to mask a brief 6-kHz sinusoidal signal. The spectrum level of the masker was either 0 or 40 dB (re: 20 microPa). The gap between the masker and signal was either 0 or 20 ms. Signal thresholds were measured for masker durations from 5 to 200 ms. The effect of masker duration was found to depend more on signal level than on gap duration or masker level. Overall, the results support the idea that forward masking can be modeled as a linear process, preceded by a static nonlinearity resembling that found on the basilar membrane.

PMID: 11077208 [PubMed - indexed for MEDLINE]

1999

Sequential stream segregation in the absence of spectral cues.

J Acoust Soc Am. 1999 Jan;105(1):339-46

Authors: Vliegen J, Oxenham AJ

Abstract
This paper investigates the cues used by the auditory system in the perceptual organization of sequential sounds. In particular, the ability to organize sounds in the absence of spectral cues is studied. In the first experiment listeners were presented with a tone sequence ABA ABA ..., where the fundamental frequency (f0) of tone A was fixed at 100 Hz and the f0 difference between tones A and B varied across trials between 1 and 11 semitones. Three spectral conditions were tested: pure tones, harmonic complexes filtered with a bandpass region between 500 and 2000 Hz, and harmonic complexes filtered with a bandpass region chosen so that only harmonics above the tenth would be passed by the filter, thus severely limiting spectral information. Listeners generally reported that they could segregate tones A and B into two separate perceptual streams when the f0 interval exceeded about four semitones. This was true for all conditions. The second experiment showed that most listeners were better able to recognize a short atonal melody interleaved with random distracting tones when the distracting tones were in an f0 region 11 semitones higher than the melody than when the distracting tones were in the same f0 region. The results were similar for both pure tones and complex tones comprising only high, unresolved harmonics. The results from both experiments show that spectral separation is not a necessary condition for perceptual stream segregation. This suggests that models of stream segregation that are based solely on spectral properties may require some revision.

PMID: 9921660 [PubMed - indexed for MEDLINE]

The role of spectral and periodicity cues in auditory stream segregation, measured using a temporal discrimination task.

J Acoust Soc Am. 1999 Aug;106(2):938-45

Authors: Vliegen J, Moore BC, Oxenham AJ

Abstract
In a previous paper, it was shown that sequential stream segregation could be based on both spectral information and periodicity information, if listeners were encouraged to hear segregation [Vliegen and Oxenham, J. Acoust. Soc. Am. 105, 339-346 (1999)]. The present paper investigates whether segregation based on periodicity information alone also occurs when the task requires integration. This addresses the question: Is segregation based on periodicity automatic and obligatory? A temporal discrimination task was used, as there is evidence that it is difficult to compare the timing of auditory events that are perceived as being in different perceptual streams. An ABA ABA ABA... sequence was used, in which tone B could be either exactly at the temporal midpoint between two successive tones A or slightly delayed. The tones A and B were of three types: (1) both pure tones; (2) both complex tones filtered through a fixed passband so as to contain only harmonics higher than the 10th, thereby eliminating detectable spectral differences, where only the fundamental frequency (f0) was varied between tones A and B; and (3) both complex tones with the same f0, but where the center frequency of the spectral passband varied between tones. Tone A had a fixed frequency of 300 Hz (when A and B were pure tones) or a fundamental frequency (f0) of 100 Hz (when A and B were complex tones). Five different intervals, ranging from 1 to 18 semitones, were used. The results for all three conditions showed that shift thresholds increased with increasing interval between tones A and B, but the effect was largest for the conditions where A and B differed in spectrum (i.e., the pure-tone and the variable-center-frequency conditions). The results suggest that spectral information is dominant in inducing (involuntary) segregation, but periodicity information can also play a role.

PMID: 10462799 [PubMed - indexed for MEDLINE]

Intensity discrimination and detection of amplitude modulation.

J Acoust Soc Am. 1999 Oct;106(4 Pt 1):1917-24

Authors: Wojtczak M, Viemeister NF

Abstract
Thresholds for detection of low-rate sinusoidal amplitude modulation and for detection of intensity increments were measured over a wide range of levels in an examination of the relationship between these fundamental aspects of intensity processing. As expected, thresholds measured with a continuous 1-kHz tone decrease with increasing carrier/pedestal level. For levels between 6 and 85 dB SPL the data are well described by 10 log(ΔI/I) = 0.44 × (20 log m) + D(fm), where ΔI/I is the Weber fraction for increment detection, m is the modulation index at threshold, and D(fm) depends on modulation rate (fm). The relationship between the psychometric functions for modulation and increment detection is also consistent with this equation. The data indicate a clear relationship between modulation and increment detection and thus provide an important additional consideration for models of modulation processing. No existing models provide an adequate account of this relationship.
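The reported relation lends itself to a quick numeric sketch: given a modulation-detection threshold m, it predicts the increment-detection Weber fraction in dB. The values of m and D(fm) below are placeholders for illustration, not estimates from the study.

```python
import math

def predicted_weber_fraction_db(m, d_fm):
    """Predict the increment-detection Weber fraction, 10*log10(dI/I),
    from a modulation index m at threshold, via the relation quoted
    above: 10 log(dI/I) = 0.44 * (20 log m) + D(fm)."""
    return 0.44 * (20.0 * math.log10(m)) + d_fm

# Hypothetical values, for illustration only.
m = 0.1     # modulation index at threshold
d_fm = 2.0  # rate-dependent offset D(fm), in dB
print(round(predicted_weber_fraction_db(m, d_fm), 2))  # -6.8
```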

PMID: 10530016 [PubMed - indexed for MEDLINE]

Inter-relationship between different psychoacoustic measures assumed to be related to the cochlear active mechanism.

J Acoust Soc Am. 1999 Nov;106(5):2761-78

Authors: Moore BC, Vickers DA, Plack CJ, Oxenham AJ

Abstract
The active mechanism in the cochlea is thought to depend on the integrity of the outer hair cells (OHCs). Cochlear hearing loss is usually associated with damage to both inner hair cells (IHCs) and OHCs, with the latter resulting in a reduction in or complete loss of the function of the active mechanism. It is believed that the active mechanism contributes to the sharpness of tuning on the basilar membrane (BM) and is also responsible for compressive input-output functions on the BM. Hence, one would expect a close relationship between measures of sharpness of tuning and measures of compression. This idea was tested by comparing three different measures of the status of the active mechanism, at center frequencies of 2, 4, and 6 kHz, using subjects with normal hearing, with unilateral or highly asymmetric cochlear hearing loss, and with bilateral loss. The first measure, HLOHC, was an indirect measure of the amount of the hearing loss attributable to OHC damage; this was based on loudness matches between the two ears of subjects with unilateral hearing loss and was derived using a loudness model. The second measure was the equivalent rectangular bandwidth (ERB) of the auditory filter, which was estimated using the notched-noise method. The third measure was based on the slopes of growth-of-masking functions obtained in forward masking. The ratio of slopes for a masker centered well below the signal frequency and a masker centered at the signal frequency gives a measure of BM compression at the place corresponding to the signal frequency; a ratio close to 1 indicates little or no compression, while ratios less than 1 indicate that compression is occurring at the signal place. Generally, the results showed the expected pattern. The ERB tended to increase with increasing HLOHC. The ratio of the forward-masking slopes increased from about 0.3 to about 1 as HLOHC increased from 0 to 55 dB. The ratio of the slopes was highly correlated with the ERB (r = 0.92), indicating that the sharpness of the auditory filter decreases as the compression on the BM decreases.

PMID: 10573892 [PubMed - indexed for MEDLINE]

1998

1998

Psychoacoustic consequences of compression in the peripheral auditory system.

Psychol Rev. 1998 Jan;105(1):108-24

Authors: Moore BC, Oxenham AJ

Abstract
Input-output functions on the basilar membrane of the cochlea show a strong compressive nonlinearity at midrange levels for frequencies close to the characteristic frequency of a given place. This article shows how many different phenomena can be explained as consequences of this nonlinearity, including the "excess" masking produced when 2 nonsimultaneous maskers are combined, the nonlinear growth of forward masking with masker level, the influence of component phase on the effectiveness of complex forward maskers, changes in the ability to detect increments and decrements with level, temporal integration, and the influence of component phase and level on the perception of vowellike sounds. Cochlear hearing loss causes basilar-membrane responses to become more linear. This can account for loudness recruitment, linear additivity of nonsimultaneous masking, linear growth of forward masking, reduced temporal resolution for sounds with fluctuating envelopes, and reduced temporal integration.

PMID: 9450373 [PubMed - indexed for MEDLINE]

Temporal integration at 6 kHz as a function of masker bandwidth.

J Acoust Soc Am. 1998 Feb;103(2):1033-42

Authors: Oxenham AJ

Abstract
Thresholds were measured for a 6-kHz sinusoidal signal presented within a 500-ms masker. The masker was either a bandpass Gaussian noise of varying bandwidth, or a sinusoid of the same frequency as the signal. The spectrum level of the noise masker was kept constant at 20 dB SPL, and the level of the sinusoidal masker was 40 dB SPL. Thresholds for signal durations between 2 and 300 ms were measured for masker bandwidths ranging from 60 to 12,000 Hz. The masker was spectrally centered around 6 kHz. For masker bandwidths less than 600 Hz, the slope of the temporal integration function decreased with decreasing masker bandwidth. The results are not consistent with current models of temporal integration or temporal resolution. It is suggested that the results at narrow bandwidths can be understood in terms of changes in the power spectrum of the stimulus envelope or modulation spectrum. According to this view, the onset and offset ramps of the signal introduce detectable high-frequency components into the modulation spectrum, which provide a salient cue in narrowband maskers. For broadband maskers, these high-frequency components are masked by the inherent rapid fluctuations in the masker envelope. Additionally, for signal durations between 7 and 80 ms, signal thresholds decreased by up to 5 dB as the masker bandwidth increased from 1200 to 12,000 Hz. The mechanisms underlying this effect are not yet fully understood.

PMID: 9479757 [PubMed - indexed for MEDLINE]

Basilar-membrane nonlinearity and the growth of forward masking.

J Acoust Soc Am. 1998 Mar;103(3):1598-608

Authors: Plack CJ, Oxenham AJ

Abstract
Forward masking growth functions were measured for pure-tone maskers and signals at 2 and 6 kHz as a function of the silent interval between the masker and signal. The inclusion of conditions involving short signals and short masker-signal intervals ensured that a wide range of signal thresholds were recorded. A consistent pattern was seen across all the results. When the signal level was below about 35 dB SPL the growth of masking was shallow, so that signal threshold increased at a much slower rate than masker level. When the signal level exceeded this value, the masking function steepened, approaching unity (linear growth) at the highest masker and signal levels. The results are inconsistent with an explanation for forward-masking growth in terms of saturating neural adaptation. Instead the data are well described by a model incorporating a simulation of the basilar-membrane response at characteristic frequency (which is almost linear at low levels and compressive at higher levels) followed by a sliding intensity integrator or temporal window. Taken together with previous results, the findings suggest that the principal nonlinearity in temporal masking may be the basilar-membrane response function, and that subsequent to this the auditory system behaves as if it were linear in the intensity domain.

PMID: 9514024 [PubMed - indexed for MEDLINE]

Suppression and the upward spread of masking.

J Acoust Soc Am. 1998 Dec;104(6):3500-10

Authors: Oxenham AJ, Plack CJ

Abstract
The purpose of this study is to clarify the role of suppression in the growth of masking when a signal is well above the masker in frequency (upward spread of masking). Classical psychophysical models assume that masking is primarily due to the spread of masker excitation, and that the nonlinear upward spread of masking reflects a differential growth in excitation between the masker and the signal at the signal frequency. In contrast, recent physiological studies have indicated that upward spread of masking in the auditory nerve is due to the increasing effect of suppression with increasing masker level. This study compares thresholds for signals between 2.4 and 5.6 kHz in simultaneous and nonsimultaneous masking for conditions in which the masker is either at or well below the signal frequency. Maximum differences between simultaneous and nonsimultaneous masking were small (< 6 dB) for the on-frequency conditions but larger for the off-frequency conditions (15-32 dB). The results suggest that suppression plays a major role in determining thresholds at high masker levels, when the masker is well below the signal in frequency. This is consistent with the conclusions of physiological studies. However, for signal levels higher than about 40 dB SPL, the growth of masking for signals above the masker frequency is nonlinear even in the nonsimultaneous-masking conditions, where suppression is not expected. This is consistent with an explanation based on the compressive response of the basilar membrane, and confirms that suppression is not necessary for nonlinear upward spread of masking.

PMID: 9857509 [PubMed - indexed for MEDLINE]

1997

A behavioral measure of basilar-membrane nonlinearity in listeners with normal and impaired hearing.

J Acoust Soc Am. 1997 Jun;101(6):3666-75

Authors: Oxenham AJ, Plack CJ

Abstract
This paper examines the possibility of estimating basilar-membrane (BM) nonlinearity using a psychophysical technique. The level of a forward masker required to mask a brief signal was measured for conditions where the masker was either at, or one octave below, the signal frequency. The level of the forward masker at masked threshold provided an indirect measure of the BM response to the signal, as follows. Consistent with physiological studies, it was assumed that the BM responds linearly to frequencies well below the characteristic frequency (CF). Thus the ratio of the slopes of the masking functions between a masker at the signal frequency and a masker well below the signal frequency should provide an estimate of BM compression at CF. Results obtained from normally hearing listeners were in quantitative agreement with physiological estimates of BM compression. Furthermore, differences between normally hearing listeners and listeners with cochlear hearing impairment were consistent with the physiological effects of damage to the cochlea. The results support the hypothesis that BM nonlinearity governs the nonlinear growth of the upward spread of masking, and suggest that this technique provides a straightforward method for estimating BM nonlinearity in humans.
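The slope-ratio estimate described above can be sketched in a few lines: fit the slope of each masking function (masker level at threshold versus signal level) and take their ratio, with the off-frequency masker assumed to be processed linearly at the signal place. The level values below are made-up illustrative data, not values from the paper.

```python
def ls_slope(x, y):
    """Least-squares slope of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

# Hypothetical masker levels at threshold (dB SPL) vs signal level (dB SPL).
signal_levels = [40.0, 50.0, 60.0, 70.0]
on_freq = [45.0, 55.0, 65.0, 75.0]   # on-frequency masker: slope ~1
off_freq = [60.0, 62.0, 64.0, 66.0]  # off-frequency masker: shallow slope

# Ratio of slopes (off/on) estimates BM compression at the signal place:
# ~1 implies little compression; well below 1 implies strong compression.
compression = ls_slope(signal_levels, off_freq) / ls_slope(signal_levels, on_freq)
print(round(compression, 2))  # 0.2 for these made-up data
```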

PMID: 9193054 [PubMed - indexed for MEDLINE]

Short-term temporal integration: evidence for the influence of peripheral compression.

J Acoust Soc Am. 1997 Jun;101(6):3676-87

Authors: Oxenham AJ, Moore BC, Vickers DA

Abstract
Thresholds for a 6.5-kHz sinusoidal signal, temporally centered in a 400-ms broadband-noise masker, were measured as a function of signal duration for normally hearing listeners and listeners with cochlear hearing loss over a range of masker levels. For the normally hearing listeners, the slope of the function relating signal threshold to signal duration (integration function) was steeper at medium masker levels than at low or high levels by a factor of nearly 2, for signal durations between 2 and 10 ms, while no significant effect of level was found for signal durations of 20 ms and more. No effect of stimulus level was found for the hearing-impaired listeners at any signal duration. For signal durations greater than 10 ms, consistent with many previous studies, the slope of the integration function was shallower for the hearing-impaired listeners than for the normally hearing listeners. However, for shorter durations, there was no significant difference in slope between the results from the hearing-impaired listeners and those from the normally hearing listeners in the high- and low-level masker conditions. A model incorporating a compressive nonlinearity, representing the effect of basilar-membrane (BM) compression, and a short-term temporal integrator, postulated to be a more central process, can account well for changes in the short-term integration function with level, if it is assumed that the compression is greater at medium levels than at low or high levels by a factor of about 4. This is in reasonable agreement with physiological measurements of BM compression, and with previous psychophysical estimates.

PMID: 9193055 [PubMed - indexed for MEDLINE]

Effects of fast-acting high-frequency compression on the intelligibility of speech in steady and fluctuating background sounds.

Br J Audiol. 1997 Aug;31(4):257-73

Authors: Stone MA, Moore BC, Wojtczak M, Gudgin E

Abstract
This study examines whether speech intelligibility in background sounds can be improved for persons with loudness recruitment by the use of fast-acting compression applied at high frequencies, when the overall level of the sounds is held constant by means of a slow-acting automatic gain control (AGC) system and when appropriate frequency-response shaping is applied. Two types of fast-acting compression were used in the high-frequency channel of a two-channel system: a compression limiter with a 10:1 compression ratio and with a compression threshold about 9 dB below the peak level of the signal in the high-frequency channel; and a wide dynamic range compressor with a 2:1 compression ratio and with the compression threshold about 24 dB below the peak level of the signal in the high-frequency channel. A condition with linear processing in the high-frequency channel was also used. Speech reception thresholds (SRTs) were measured for two background sounds: a steady speech-shaped noise and a single male talker. All subjects had moderate-to-severe sensorineural hearing loss. Three different types of speech material were used: the adaptive sentence lists (ASL), the Bamford-Kowal-Bench (BKB) sentence lists and the Boothroyd word lists. For the steady background noise, the compression generally led to poorer performance than for the linear condition, although the deleterious effect was only significant for the 10:1 compression ratio. For the background of a single talker, the compression had no significant effect except for the ASL sentences, where the 10:1 compression gave significantly better performance than the linear condition. Overall, the results did not show any clear benefits of the fast-acting compression, possibly because the slow-acting AGC allowed the use of gains in the linear condition that were markedly higher than would normally be used with linear hearing aids.

PMID: 9307821 [PubMed - indexed for MEDLINE]

Increment and decrement detection in sinusoids as a measure of temporal resolution.

J Acoust Soc Am. 1997 Sep;102(3):1779-90

Authors: Oxenham AJ

Abstract
Measuring thresholds for the detection of brief decrements in the level of a sinusoid is an established method of estimating auditory temporal resolution. Generally, a background noise is added to the stimulus to avoid the detection of the "spectral splatter" introduced by the decrement. Results are often described in terms of a temporal-window model, comprising a band-pass filter, a compressive nonlinearity, a sliding temporal integrator, and a decision device. In this study, thresholds for increments, as well as decrements, in the level of a 55 dB SPL, 4-kHz sinusoidal pedestal were measured as a function of increment and decrement duration in the presence of a broadband background noise ranging in spectrum level from -20 to +20 dB SPL. Thresholds were also measured using a 55-dB, 8-kHz pedestal in the absence of background noise. Thresholds for decrements, in terms of the dB change in level (ΔL), were found to be more dependent on duration than those for increments. Also, performance was found to be dependent on background-noise level over most levels tested. Neither finding is consistent with the predictions of the temporal-window model or other similar models of temporal resolution. The difference between increment and decrement detection was more successfully simulated by using a decision criterion based on the maximum slope of the temporal-window output.

PMID: 9301055 [PubMed - indexed for MEDLINE]

1995

Additivity of masking in normally hearing and hearing-impaired subjects.

J Acoust Soc Am. 1995 Oct;98(4):1921-34

Authors: Oxenham AJ, Moore BC

Abstract
The effects of combining two equally effective maskers were studied in normally hearing and elderly hearing-impaired subjects. The additivity of nonsimultaneous masking was investigated by measuring thresholds for a brief 4-kHz signal in the presence of a broadband-noise forward masker, a backward masker, and a combination of both. For the normally hearing subjects, combining two equally effective nonsimultaneous maskers resulted in up to a 15-dB greater increase in threshold than the 3 dB predicted by an energy-summation model ("excess masking"). However, the hearing-impaired subjects showed little or no excess masking. The difference between the two groups is consistent with a theory linking excess masking to the compressive transfer function measured on the basilar membrane (BM). In the hearing-impaired subjects the transfer function is more linear, accounting for the lack of excess masking. The additivity of simultaneous masking was investigated by measuring thresholds for a 100-ms 4-kHz signal in the presence of either a 400-ms broadband noise masker or a 400-ms sinusoidal masker at the same frequency as the signal, and then combining two equally effective maskers, a noise and a tone. The maximum amount of excess masking (3 to 4 dB) was similar across the two groups of subjects, consistent with an explanation based on the use of different detection cues for the tonal and noise maskers. It is argued that, while peripheral compression may underlie excess masking for pairs of nonsimultaneous maskers, it is unlikely that in simultaneous masking, where the maskers are close in frequency to the signal, the two maskers are compressed individually before their effects are combined. It is further suggested that BM nonlinearity may underlie the effects of the upward spread of masking and the nonlinear growth of forward masking, as well as accounting for the additivity of simultaneous masking when the masker frequencies are well below that of the signal.
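The link between compression and excess masking can be illustrated with simple arithmetic. Under an energy-summation model, adding a second equally effective masker doubles the internal masker effect; if the internal effect grows as a power c of stimulus intensity, the physical-level shift needed at threshold is 10·log10(2)/c dB. The exponent values below are illustrative, not fitted to the data.

```python
import math

def combined_masker_shift_db(c):
    """Threshold shift (dB) predicted when two equally effective maskers
    are combined, assuming internal effects add and the internal effect
    grows as a power c of stimulus intensity."""
    return 10.0 * math.log10(2.0) / c

# Linear processing (c = 1) gives the classic ~3-dB prediction;
# an illustrative compressive exponent of 0.2 inflates it to ~15 dB.
print(round(combined_masker_shift_db(1.0), 1))  # 3.0
print(round(combined_masker_shift_db(0.2), 1))  # 15.1
```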

PMID: 7593916 [PubMed - indexed for MEDLINE]

Principal Investigators

Andrew J. Oxenham, PhD

PubMed | Google Scholar | Expert@Minnesota | Profile

Magdalena Wojtczak, PhD

PubMed | Expert@Minnesota | Profile