Here you will find details of a forced aligner to facilitate the creation of Welsh speech corpora.
If you would like an easy way of using the forced aligner in the Docker environment, the following resource is also available:
Standard metadata on the forced aligner is also stored in our language resources repository which is part of the European META-SHARE network:
Download 'Forced Aligner on techiaith's META-SHARE' resource from metashare.techiaith.cymru
Language-specific speech technologies such as speech recognition and text-to-speech depend on speech corpora: recordings of speakers, transcribed and annotated with the phonetics and stress pattern of each word.
Aligning the text with the sound recordings can be done manually. However, because some speech corpora are enormous, automatic methods are needed to align text with speech.
A forced aligner uses speech recognition software, transcriptions and pronunciation dictionaries to align the text with the audio and locate every word and phoneme within a sound file.
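To make the idea of "locating every word and phoneme" concrete, the following is a minimal Python sketch of how an alignment result could be represented. It does not reproduce this aligner's actual output format: the `Interval` class, the `phonemes_in_word` helper, the simplified phoneme labels and all timings are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interval:
    """One aligned unit: a word or phoneme with start/end times in seconds."""
    label: str
    start: float
    end: float


def phonemes_in_word(word: Interval, phonemes: List[Interval]) -> List[Interval]:
    """Return the phoneme intervals that lie entirely inside the word's time span."""
    return [p for p in phonemes if p.start >= word.start and p.end <= word.end]


# Illustrative (invented) alignment of the phrase "bore da" in one sound file.
words = [
    Interval("bore", 0.10, 0.52),
    Interval("da", 0.52, 0.85),
]
phonemes = [
    Interval("b", 0.10, 0.18),
    Interval("o", 0.18, 0.33),
    Interval("r", 0.33, 0.40),
    Interval("e", 0.40, 0.52),
    Interval("d", 0.52, 0.60),
    Interval("a", 0.60, 0.85),
]

for word in words:
    labels = [p.label for p in phonemes_in_word(word, phonemes)]
    print(f"{word.label}: {word.start:.2f}-{word.end:.2f} s, phonemes {labels}")
```

Each word and each phoneme gets its own time span, and the phoneme spans nest inside the word spans. Real aligners typically write this kind of two-tier structure to a file rather than holding it in memory.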
Once the transcriptions have been force-aligned with an entire speech corpus, it becomes possible to gain a better understanding of the corpus's quality, as well as to improve the training of the acoustic models.
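One simple way alignments could feed a quality check is sketched below, assuming that implausibly short segments hint at a mismatch between transcription and audio. This is only an illustrative heuristic, not a method described for this aligner: the `flag_short_segments` function, the 0.03 second threshold and the sample timings are all assumptions.

```python
from typing import List, Tuple

# A segment is (label, start_seconds, end_seconds); a hypothetical representation.
Segment = Tuple[str, float, float]


def flag_short_segments(segments: List[Segment], min_duration: float = 0.03) -> List[Segment]:
    """Return segments whose duration falls below min_duration (an assumed threshold)."""
    return [s for s in segments if (s[2] - s[1]) < min_duration]


# Invented example: the "d" segment has been squeezed to about 10 ms.
aligned = [
    ("b", 0.10, 0.18),
    ("o", 0.18, 0.33),
    ("d", 0.52, 0.53),
    ("a", 0.53, 0.85),
]

for label, start, end in flag_short_segments(aligned):
    print(f"suspect segment '{label}' at {start:.2f}-{end:.2f} s")
```

Recordings with many flagged segments could then be reviewed by hand before being used for acoustic model training.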