Project 5: Perceptual coherence and the role of FPD in noise
This project aims to use computational techniques to build linguistic structures such as prosodic trees on the basis of ‘perceptual coherence’, that is, the grouping of sound components such as harmonics and formants into larger units. The Sheffield project will examine overlap in conversations. The Cluj project will examine other aspects of the relationship between fine phonetic detail (FPD) and auditory perceptual coherence, and will test the hypothesis that FPD-derived coherence contributes to speech processing in noise.
Perceptual experiments have shown that synthetic speech is more intelligible in noise when it contains FPD that mimics natural speech patterns. Knowledge of the FPD of one’s native language may also give native listeners an advantage over non-native listeners when perceiving speech in noise. Intriguingly, FPD that affects intelligibility in noise is often not easily noticeable in good listening conditions. One hypothesis is that FPD enables better grouping of the auditory signal, and thereby more efficient lexical access and understanding. Perceptually coherent signals should result in structures that are robust in everyday noise conditions (including competing speakers).
There will be three linguistic-phonetic foci in our investigations: (1) Short-term spectrotemporal changes, e.g. near segment boundaries, that may contribute low-level auditory coherence. (2) Systematic variation that spreads over several syllables and provides information about a single phoneme (e.g. /r/ resonance in English, vowel harmony in French) and/or about prosodic structures such as accent groups and intonational phrases (e.g. f0 contour; strengthening at phrase boundaries). (3) Phonetic variation due to morphological structure as described above. Pilot data analysed in Cambridge and Sheffield suggest that morphological differences may involve elements of (1) and interact with the prosodic strengthening in (2), so this third focus links low-level auditory processes with higher-order prosodic structure as well as grammar.
Three additional project options are available. (1) To study speech produced in the presence of N-talker babble for various N, to examine how FPD is affected by the changes in speech production that noise brings about. (2) To include audio-visual experiments, thus broadening to an ecologically valid, multi-sensory approach that addresses general perceptual, rather than just auditory, coherence. (3) To use other languages (particularly Romanian) as a testbed for a predictive account of coherence-inducing FPD.
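As an illustration of option (1), N-talker babble is commonly approximated by summing recordings of N talkers and then scaling the result to a chosen level relative to the target speech. The sketch below is not the project's procedure; the function names (`make_babble`, `scale_to_snr`, `mix`) and the use of plain lists of samples are assumptions made for the example.

```python
import math

def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def make_babble(talkers):
    """Sum N equal-length talker signals sample by sample (hypothetical helper)."""
    length = len(talkers[0])
    if any(len(t) != length for t in talkers):
        raise ValueError("all talker signals must have the same length")
    return [sum(samples) for samples in zip(*talkers)]

def scale_to_snr(speech, noise, snr_db):
    """Return the noise rescaled so that the speech-to-noise RMS ratio is snr_db."""
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise signal is silent")
    gain = rms(speech) / (noise_rms * 10 ** (snr_db / 20.0))
    return [gain * x for x in noise]

def mix(speech, noise):
    """Add speech and (already scaled) noise sample by sample."""
    return [s + n for s, n in zip(speech, noise)]

# Toy stand-ins for recordings: sinusoids at different rates.
talkers = [[math.sin(0.01 * k * (i + 1)) for k in range(1000)] for i in range(4)]
speech = [math.sin(0.05 * k) for k in range(1000)]

babble = make_babble(talkers)                    # N = 4 here
scaled = scale_to_snr(speech, babble, snr_db=-3.0)
mixture = mix(speech, scaled)
```

Varying the number of entries in `talkers` gives babble for different N, while `snr_db` sets the level of the babble relative to the speech.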
Methods Paired stimuli will be constructed that are identical except for the presence or absence of FPD, and their word intelligibility in noise will be assessed. Stimuli whose intelligibility is enhanced in noise will be used for the computational modelling. Intelligibility tests will follow standard procedures. Computational auditory scene analysis algorithms and simulations of auditory ‘glimpsing’ opportunities will be used to determine which parts of the speech signal are most salient in noise. Explanations for improvements in intelligibility will be sought with reference to auditorily salient information.
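One simple way to simulate glimpsing opportunities is to mark the spectro-temporal cells in which the speech energy exceeds the noise energy by some local SNR criterion, and to report the proportion of such cells. The sketch below assumes that speech and noise are available separately as time-frequency energy grids (lists of rows); the grid values, the function names and the 3 dB default threshold are illustrative assumptions, not parameters taken from this project.

```python
import math

def local_snr_db(speech_energy, noise_energy, floor=1e-12):
    """Local SNR in dB for one time-frequency cell (energies floored to avoid log of 0)."""
    return 10.0 * math.log10(max(speech_energy, floor) / max(noise_energy, floor))

def glimpse_mask(speech_grid, noise_grid, threshold_db=3.0):
    """True where speech exceeds noise by more than threshold_db (assumed criterion)."""
    return [
        [local_snr_db(s, n) > threshold_db for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_grid, noise_grid)
    ]

def glimpse_proportion(mask):
    """Fraction of time-frequency cells counted as glimpses."""
    cells = [cell for row in mask for cell in row]
    return sum(cells) / len(cells)

# Toy 2 x 3 energy grids (rows = frequency channels, columns = time frames).
speech_grid = [[4.0, 1.0, 0.5],
               [8.0, 3.0, 0.1]]
noise_grid = [[1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0]]

mask = glimpse_mask(speech_grid, noise_grid)
proportion = glimpse_proportion(mask)  # 3 of the 6 cells pass the criterion: 0.5
```

Comparing this proportion, or the location of the glimpsed cells, between the two members of a stimulus pair is one possible way to relate an intelligibility gain to auditorily salient regions of the signal.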
Young researchers One ESR (Gorisch) is based at Sheffield, and one ESR (Kabir) will be based at Cluj. They will visit Cambridge and/or York for FPD, and Leuven for noisy episodic representations. This project integrates computer science with phonetics & phonology.
Links Auditory coherence for prosodic structuring links with Theme III. Listening in noise will feed into Theme IV by addressing the issue of what kinds of episodic representations are useful in noise. Auditory grouping of simultaneous speech is linked to turn-taking in Project 1.
Working on this project: Prof Bill Wells, Dr Guy Brown, Prof Dr Mircea Giurgiu, Jan Gorisch, Ahsanul Kabir, Dr Jonas Beskow, Prof Rolf Carlson, Prof David House, Prof Björn Granström
