Trawsgrifiwr (Welsh Transcriber)

Trawsgrifiwr is a Windows-only program that transcribes Welsh speech to text. It does not yet recognise every word correctly: in simple tests, this first version gets about 30% of the words in a sentence wrong. Results appear in a text box, where you can correct them and copy them to the clipboard for pasting into any software on your PC.
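To make the "about 30% of words wrong" figure concrete, here is a minimal sketch of how a word error rate can be computed for one sentence. This is the standard edit-distance definition (substitutions, insertions and deletions divided by the number of reference words), and it is an assumption that it matches the simple tests mentioned above. The function name and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: what was said versus what was transcribed.
spoken = "mae hi yn bwrw glaw yn drwm iawn ym Mangor"
transcribed = "mae hi yn bwrw claf yn trwm iawn yn Mangor"
print(f"{word_error_rate(spoken, transcribed):.0%} of words wrong")
```

In this made-up example, three of the ten spoken words are transcribed incorrectly, which corresponds to the roughly 30% error level described above.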

Download the program from http://techiaith.cymru/trawsgrifiwr-windows/trawsgrifiwr-windows-1.0.2-setup.msi

Trawsgrifiwr was made possible by various projects and by cooperation between Mozilla, volunteers, and the Welsh Technologies Unit, Canolfan Bedwyr, Bangor University. Bangor University’s part of the work, including the development of Trawsgrifiwr and Ap Macsen (the Welsh personal digital assistant), was funded by the Welsh Government.

Trawsgrifiwr is mainly based on Mozilla’s DeepSpeech, a speech recognition engine that can be trained and easily included in other software. To learn more about DeepSpeech, go to https://github.com/mozilla/deepspeech. The code for Trawsgrifiwr is available from:

techiaith/trawsgrifiwr-windows

Collecting a very large number of recordings is crucial to training a speech recognition engine. We have done this mainly through Common Voice, Mozilla’s platform for collecting recordings of people reading specific sentences aloud. We are very grateful to Rhoslyn Prys (meddal.com), who ran many crowdsourcing campaigns as a volunteer, working with the Mentrau Iaith (Language Ventures), the Cyngor Gwynedd local authority, and the National Library of Wales. We also wish to thank the Welsh Government for their publicity campaign, and the many participants across Wales and beyond who have contributed their voices to the Welsh version of Common Voice.

We also thank the Centre Inria de Paris for the open-source OSCAR corpus, which includes a large collection of Welsh texts scraped from the web. We used the corpus to train models of Welsh vocabulary and phraseology, which help the recognition process and improve accuracy. For more information, go to https://traces1.inria.fr/oscar/.