Towards a Welsh ‘Siri’…

It is increasingly possible to speak to devices such as your phone or computer, both to command and control applications and to receive intelligent, relevant answers to questions voiced in natural language.

Such capabilities are possible as a consequence of recent advancements in speech recognition, machine translation and natural language processing and understanding. As such they are the prime enablers for a disruptive change and a fundamental shift in how users and consumers engage with their devices and how they more widely use technology.

Seen in its wider historical context, this is simply the next step in the evolution of human-computer interaction: from keyboard, to mouse, to touch, to voice and language.

Four main commercial platforms are driving this change, namely Siri, Ok Google, Microsoft Cortana and Amazon Alexa, alongside some lesser-known open platforms.

To date, these provide their powerful capabilities in English and some other major languages, with little evidence that they are likely to extend their choice of languages to the ‘long tail’ of smaller languages, including Welsh, in the near future.

The Language Technologies Unit has therefore been sponsored by the Welsh Government, through its Welsh Language Technology and Digital Media Fund, and S4C to carry out the ‘Welsh Language Communications Infrastructure’ project, ensuring that users whose preferred language is Welsh are not left behind by such developments.

Our first deliverable as part of the project is a brief report on how we can achieve this. It concludes that the commercial offerings of the large companies do not currently provide any technical means of realising a Welsh language digital assistant. Only open alternatives, such as finer-grained online APIs and various open source software, allow us to make progress.

It is hoped that the project will lay the foundations for a range of Welsh language technologies to be used in such environments. This includes improving the work done to date on Welsh language speech recognition, as well as on machine translation for leveraging some of the capabilities provided by English-language technologies.

All of the software and resources developed by the project will be available here from the Welsh National Language Technologies Portal. The project will stimulate the development of new Welsh language software and services that could contribute to the mainstreaming of Welsh in the next phase of human-computer interaction.

In the meantime, we need your help! Please contribute your voice to our speech corpus via our Paldaruo app:

paldaruo

iTunes Google Play