Macsen – Welsh Language Digital Assistant

Macsen is an open source Welsh language digital assistant that can run on the Raspberry Pi.

All source code and other resources are available on GitHub so that anyone can join us and download, adapt and develop their own ‘Macsen’ system.

techiaith/macsen

Let us know if you are using Macsen in your projects or activities, whether as a software company, a coding club, a school or just as an enthusiast.

HOW ELSE CAN I HELP?

HELP US DEVELOP MACSEN AND WELSH LANGUAGE SPEECH RECOGNITION.
CONTRIBUTE YOUR VOICE THROUGH OUR CROWDSOURCING APP, ‘PALDARUO’

Apple App Store | Google Play

The Paldaruo app is used to crowdsource recordings of individuals speaking Welsh. The recordings are kept in the Paldaruo Speech Corpus. To date, the corpus contains 34 hours of recordings by nearly 500 individuals and has been used to train the speech recognition components within Macsen.

But we need more recordings, by more individuals, so that we can improve recognition accuracy and expand the range of texts and questions Macsen can recognize.

Here is a video on how you can use Paldaruo:

More information about Welsh language speech recognition resources is available on the Welsh National Language Technologies Portal.

Macsen Research Publications

BUILDING INTELLIGENT ASSISTANTS FOR SPEAKERS OF A LESSER-RESOURCED LANGUAGE, CCURL 2016 2nd Workshop on Collaboration and Computing for Under-Resourced Languages ‘Towards an Alliance for Digital Language Diversity’ LREC 2016, Portoroz, Slovenia.
– http://techiaith.bangor.ac.uk/posteri/#BuildingIntelligentAssistantsforSpeakersofaLesser-ResourcedLanguage