Lip-reading enables the brain to synthesize auditory features of unknown silent speech

Mathieu Bourguignon*, Martijn Baart, Efthymia Kapnoula, Nicola Molinaro

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review


Abstract

Lip-reading is crucial for understanding speech in challenging conditions. But how the brain extracts meaning from silent visual speech is still under debate. Lip-reading in silence activates the auditory cortices, but it is not known whether such activation reflects immediate synthesis of the corresponding auditory stimulus or imagery of unrelated sounds.

To disentangle these possibilities, we used magnetoencephalography to evaluate how cortical activity in 28 healthy adult humans (17 females) entrained to the auditory speech envelope and lip movements (mouth opening) when listening to a spoken story without visual input (audio-only), and when seeing a silent video of a speaker articulating another story (video-only).

In video-only, auditory cortical activity entrained to the absent auditory signal at frequencies below 1 Hz more than to the seen lip movements. This entrainment process was characterized by an auditory-speech-to-brain delay of ∼70 ms in the left hemisphere, compared to ∼20 ms in audio-only. Entrainment to mouth opening was found in the right angular gyrus at frequencies below 1 Hz, and in early visual cortices at 1-8 Hz.

These findings demonstrate that the brain can use a silent lip-read signal to synthesize a coarse-grained auditory speech representation in early auditory cortices. Our data indicate the following underlying oscillatory mechanism: Seeing lip movements first modulates neuronal activity in early visual cortices at frequencies that match articulatory lip movements; the right angular gyrus then extracts slower features of lip movements, mapping them onto the corresponding speech sound features; this information is fed to auditory cortices, most likely facilitating speech parsing.
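As a rough illustration of the entrainment measure described above, the sketch below computes coherence between a simulated slow speech envelope and a cortical signal that lags it by about 70 ms, and recovers that delay by cross-correlation. The sampling rate, the 0.4 Hz envelope model, and the use of scipy are illustrative assumptions; this is not the study's actual MEG source-analysis pipeline.

```python
# Minimal sketch (not the study's pipeline): quantify entrainment of a
# cortical signal to a slow speech envelope via coherence, and estimate the
# speech-to-brain delay via cross-correlation. All signals are simulated.
import numpy as np
from scipy.signal import coherence, correlate

fs = 1000                                  # assumed sampling rate (Hz)
t = np.arange(0, 60, 1 / fs)               # 60 s of simulated data

# Slow (<1 Hz) "speech envelope" and a "cortical" signal that follows it
# with a 70 ms lag plus noise (mimicking the reported left-hemisphere delay).
envelope = 0.5 + 0.5 * np.sin(2 * np.pi * 0.4 * t) + 0.1 * np.random.randn(t.size)
lag = int(0.070 * fs)
cortical = np.roll(envelope, lag) + 0.5 * np.random.randn(t.size)

# Entrainment shows up as elevated coherence at frequencies below 1 Hz.
f, coh = coherence(envelope, cortical, fs=fs, nperseg=8 * fs)
print("mean coherence below 1 Hz:", coh[(f > 0) & (f < 1)].mean())

# The cross-correlation peak gives the envelope-to-brain lag (about +70 ms here).
xcorr = correlate(cortical - cortical.mean(), envelope - envelope.mean(), mode="full")
lags = np.arange(-t.size + 1, t.size) / fs
print("estimated delay (ms):", 1000 * lags[np.argmax(xcorr)])
```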
Original language: English
Journal: Journal of Neuroscience
DOI: 10.1523/JNEUROSCI.1101-19.2019
Publication status: Accepted/In press - 2020

Fingerprint

Lipreading
Visual Cortex
Magnetoencephalography
Phonetics
Imagery (Psychotherapy)

Cite this

@article{99d5b8b7060e4b89986d45e6b0cb43e2,
title = "Lip-reading enables the brain to synthesize auditory features of unknown silent speech",
abstract = "Lip-reading is crucial for understanding speech in challenging conditions. But how the brain extracts meaning from silent visual speech is still under debate. Lip-reading in silence activates the auditory cortices, but it is not known whether such activation reflects immediate synthesis of the corresponding auditory stimulus or imagery of unrelated sounds. To disentangle these possibilities, we used magnetoencephalography to evaluate how cortical activity in 28 healthy adult humans (17 females) entrained to the auditory speech envelope and lip movements (mouth opening) when listening to a spoken story without visual input (audio-only), and when seeing a silent video of a speaker articulating another story (video-only). In video-only, auditory cortical activity entrained to the absent auditory signal at frequencies below 1 Hz more than to the seen lip movements. This entrainment process was characterized by an auditory-speech-to-brain delay of ∼70 ms in the left hemisphere, compared to ∼20 ms in audio-only. Entrainment to mouth opening was found in the right angular gyrus at frequencies below 1 Hz, and in early visual cortices at 1-8 Hz. These findings demonstrate that the brain can use a silent lip-read signal to synthesize a coarse-grained auditory speech representation in early auditory cortices. Our data indicate the following underlying oscillatory mechanism: Seeing lip movements first modulates neuronal activity in early visual cortices at frequencies that match articulatory lip movements; the right angular gyrus then extracts slower features of lip movements, mapping them onto the corresponding speech sound features; this information is fed to auditory cortices, most likely facilitating speech parsing.",
author = "Mathieu Bourguignon and Martijn Baart and Efthymia Kapnoula and Nicola Molinaro",
note = "Mathieu Bourguignon was supported by the Innoviris Attract program (grant 2015-BB2B-10), by the Spanish Ministry of Economy and Competitiveness (grant PSI2016-77175-P), and by the Marie Skłodowska-Curie Action of the European Commission (grant 743562). Martijn Baart was supported by the Netherlands Organization for Scientific Research (NWO, VENI grant 275-89-027). Efthymia C. Kapnoula was supported by the Spanish Ministry of Economy and Competitiveness, through the Juan de la Cierva-Formaci{\'o}n fellowship, and by the Spanish Ministry of Economy and Competitiveness (grant PSI2017-82563-P). Nicola Molinaro was supported by the Spanish Ministry of Science, Innovation and Universities (grant RTI2018-096311-B-I00), the Agencia Estatal de Investigaci{\'o}n (AEI), the Fondo Europeo de Desarrollo Regional (FEDER) and by the Basque government (grant PI_2016_1_0014). The authors acknowledge financial support from the Spanish Ministry of Economy and Competitiveness, through the “Severo Ochoa” Programme for Centres/Units of Excellence in R&D (SEV-2015-490) awarded to the BCBL.",
year = "2020",
doi = "10.1523/JNEUROSCI.1101-19.2019",
language = "English",
journal = "Journal of Neuroscience",
issn = "0270-6474",
publisher = "Society for Neuroscience",

}
