Corpus phonetics: past, present, and future

Semi-automatic analysis of digital speech collections is transforming the science of phonetics. Convenient search and analysis of large published bodies of recordings, transcripts, metadata, and annotations – as much as three or four orders of magnitude larger than a few decades ago – has created a trend towards “corpus phonetics,” whose benefits include greatly increased researcher productivity, better coverage of variation in speech patterns, and crucial support for reproducibility. This trend is still largely confined to acoustic phonetics, with articulatory and perceptual research lagging behind. Other remaining challenges include still-limited access to the necessary skills, and a lack of consistent standards. These changes coincide with the broader Open Data movement, but future solutions will also need to include more constrained forms of publication motivated by valid concerns for privacy, confidentiality, and intellectual property.

Cours

PDF