Or the ability to create your own natural-sounding Welsh language synthetic voices…
As part of our work on the Macsen project, we’ve created tools that will enable you to create natural-sounding Welsh language synthetic voices. The tools make it easy for you to prepare recording scripts and record an individual’s voice. Then, using their knowledge of Welsh language pronunciation, they build a Welsh language synthetic voice that sounds very similar to the recorded individual.
Here are examples of the voices of two members of the techiaith team, synthesized with the new tools:
The team had the opportunity to demonstrate these tools at the recent SeneddLab 2017 event, where a new voice was created within one hour, named ‘RoboLlywydd’, and used to speak the answers to questions about the National Assembly for Wales. Although the ‘RoboLlywydd’ name was just for fun, it showed that it’s possible to create and use many different individual voices within your own personal digital assistants. The following video talks more about this (especially from about five and a half minutes in):
We used an existing open source system called MaryTTS, which can now be used to create Welsh voices using the resources at the following GitHub repository:
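As a rough illustration of how a voice built this way might be used, the sketch below builds a request URL for the HTTP interface of a locally running MaryTTS server. It is a minimal sketch under assumptions: the parameter names follow MaryTTS's HTTP `/process` interface as we understand it, the host and port are the usual local defaults, and the `cy` locale code and any voice name you pass in are placeholders that depend on how your voice was built and installed.

```python
from urllib.parse import urlencode


def marytts_request_url(text, locale="cy", voice=None,
                        host="localhost", port=59125):
    """Build a URL asking a MaryTTS server to synthesize `text` as WAV audio.

    Assumptions: a MaryTTS server is listening on host:port, and a voice
    for `locale` is installed. The default locale "cy" is a placeholder.
    """
    params = {
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": locale,
    }
    if voice is not None:
        # Hypothetical voice name supplied by the caller.
        params["VOICE"] = voice
    return f"http://{host}:{port}/process?{urlencode(params)}"


url = marytts_request_url("Bore da")
print(url)
```

Fetching that URL (for example with `urllib.request.urlopen`) should return audio data if the server and voice are set up as assumed. The function itself only builds the URL, so it can be tried without a running server.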
This is a technology which is becoming increasingly prevalent as the human voice is used more and more for question and answer systems on mobile phones and tablets, and voice control for such things as television sets, robots and dictation systems. If Welsh cannot be used in these environments, then the language will be excluded from the digital world and Welsh speakers will have no choice but to speak English with these devices.
In order to pave the way for new Welsh medium technologies we have produced a Welsh question and answer prototype, where a personal assistant called “Macsen” is able to answer questions such as what is the news or weather.
Here is a video that introduces Macsen and demonstrates it at work on a small Raspberry Pi computer:
All of Macsen’s code and resources are available on GitHub so that anyone can expand its capabilities and develop their own Macsen. Macsen’s homepage on the web, which is the best place to begin, is:
We will continue to work on speech recognition and other open resources for Macsen. Get in touch with us if you’re a software company, coding club, school or a hacker with an interest in incorporating Macsen into your own software projects.
We are developing Welsh language speech recognition as part of our Welsh Language Communications Infrastructure, sharing it here on the Welsh National Language Technologies Portal with other developers of Welsh language software and apps.
Today we are pleased to share the first version of a Welsh language speech recognition system
Julius Cymraeg (julius-cy)
This project is based on Julius, an open source large vocabulary continuous speech recognition (LVCSR) system, and provides the files and scripts required to adapt it to recognize Welsh language speech rather than English or Japanese.
The first release allows julius-cy to recognize very simple questions and commands in Welsh concerning the weather, news, time, music as well as asking for a joke or a proverb. This means that julius-cy is limited to recognising specific sentences and vocabulary:
You can try adding your own texts and questions for julius-cy to recognize after reading this!
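To make the idea of a recognizer limited to specific sentences and vocabulary more concrete, here is a small sketch of a closed-sentence check. It is not how Julius itself is configured, and the three Welsh questions are illustrative examples only, not the actual julius-cy sentence list.

```python
import re

# Illustrative sentences only; NOT the actual julius-cy sentence list.
ALLOWED_SENTENCES = [
    "beth yw'r tywydd",       # what is the weather
    "beth yw'r newyddion",    # what is the news
    "faint o'r gloch yw hi",  # what time is it
]

# Every word that appears in at least one allowed sentence.
VOCABULARY = {word for s in ALLOWED_SENTENCES for word in s.split()}


def normalise(utterance):
    """Lowercase, unify apostrophes, drop punctuation, collapse spaces."""
    text = utterance.lower().replace("’", "'")
    text = re.sub(r"[^\w' ]", " ", text)
    return " ".join(text.split())


def out_of_vocabulary(utterance):
    """Return the words in `utterance` that the closed vocabulary lacks."""
    return [w for w in normalise(utterance).split() if w not in VOCABULARY]


def is_recognisable(utterance):
    """True only if the whole utterance is one of the allowed sentences."""
    return normalise(utterance) in ALLOWED_SENTENCES
```

With these definitions, `is_recognisable("Beth yw'r tywydd?")` is true, while `out_of_vocabulary("Beth yw'r amser?")` reports `amser` as a word the system has never been told about. Adding your own questions amounts to extending both the sentence list and the vocabulary.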
Hmm. It doesn’t work very well for me. How can I help?
We are using very early acoustic models in julius-cy, so it may not recognize everyone’s speech successfully.
If this is the case, and you have not already contributed your voice to our Paldaruo Speech Corpus, then please use our Paldaruo app (http://techiaith.bangor.ac.uk/paldaruo) on any iOS or Android device so that we can improve the acoustic models with your voice.
It is increasingly possible for you to speak with devices such as your phone or computer in order to command and control applications and devices as well as to receive intelligent and relevant answers to questions voiced in natural language.
Such capabilities are possible as a consequence of recent advancements in speech recognition, machine translation and natural language processing and understanding. As such they are the prime enablers for a disruptive change and a fundamental shift in how users and consumers engage with their devices and how they more widely use technology.
Looked at in its wider historical context, this is only the next step in the evolution of human-computer interaction: from keyboard, to mouse, to touch, to voice and language.
There are four main commercial platforms driving this change, namely Siri, Ok Google, Microsoft Cortana and Amazon Alexa, as well as some lesser known open platforms.
To date, these provide their powerful capabilities in English and some other major languages, with little evidence that they are likely to extend their choice of languages to the ‘long tail’ of smaller languages, including Welsh, in the near future.
The Language Technologies Unit has therefore been sponsored by the Welsh Government, through its Welsh Language Technology and Digital Media Fund, and by S4C to deliver the ‘Welsh Language Communications Infrastructure’ project, ensuring that users whose preferred language is Welsh are not left behind in such developments.
Our first deliverable as part of the project is a brief report on how we can achieve this. It concludes that the commercial offerings by the large companies do not provide any technical means at the moment for realising a Welsh language digital assistant. Thus only open alternatives such as finer grained online APIs and various open source software allow us to progress.
It is hoped that the project will lay the foundations for a range of Welsh language technologies to be used in such environments, including improving the work done to date on Welsh language speech recognition as well as machine translation for leveraging some of the capabilities provided via English language based technologies.
All of the software and resources developed by the project will be available here from the Welsh National Language Technologies Portal. The project will stimulate the development of new Welsh language software and services that could contribute to the mainstreaming of Welsh in the next phase of human-computer interaction.
In the meantime, we need your help! Please contribute your voice to our speech corpus via our Paldaruo app:
Since its launch in March, a few coders and companies have been using the cloud based Welsh language text-to-speech API service.
Very often, however, developers, particularly those working for companies, want Welsh language text-to-speech that is available offline and in Microsoft Windows based environments. From time to time we also receive e-mails from text-to-speech developers of other lesser resourced languages asking for help with using their own voices in Microsoft Windows.
Our Welsh language text-to-speech voice is possible thanks to the superb Festival Speech Synthesis System. However, Festival, as its developers openly admit, does not support Microsoft Windows very well at all.
We think that Festival and its Welsh voice should be usable in Microsoft Windows. Therefore, we’ve published on GitHub the speech data that makes Festival talk Welsh and, as a side project, hacked together a Visual Studio solution that makes Festival run natively on Windows with a very basic COM and .NET interface.
Without these resources there are very few, if any, options for Welsh or any Festival voice to be usable on Windows. We hope that these contributions are of great help and can be improved upon with the aid of Welsh language and international open source communities.
Text to speech technologies are now commonly used in mobile apps, websites and desktop applications to improve user experience and understanding. Today we are pleased to launch an API service that will make it possible for anybody to insert Welsh text to speech technologies into their websites and software.
Using the open source Festival Speech Synthesis System and a Welsh language speech model we previously created, our new web API makes it easy to automatically convert any Welsh text into audio in real time. This cloud service needs no setup on the user’s side, making it instantly and widely accessible to all.
Below, you can find an example of how this voice could be inserted into this page, with only one line of code!
You can get started with the API today by signing up to our API Centre and creating your API key.
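For readers who prefer to see the general shape of such an integration, here is a hedged sketch of generating an HTML audio element that points at a text-to-speech web service. The endpoint URL and the `api_key` and `text` parameter names are hypothetical placeholders, not the documented interface of our API; check the API Centre for the real details.

```python
from html import escape
from urllib.parse import urlencode


def audio_tag(endpoint, api_key, text):
    """Return an HTML <audio> element whose source is a TTS request URL.

    `endpoint` and the query parameter names ("api_key", "text") are
    hypothetical; substitute the values given in the real documentation.
    """
    query = urlencode({"api_key": api_key, "text": text})
    # Escape the URL so it is safe inside an HTML attribute value.
    src = escape(f"{endpoint}?{query}", quote=True)
    return f'<audio controls src="{src}"></audio>'


tag = audio_tag("https://tts.example.org/speak", "YOUR_API_KEY", "Bore da")
print(tag)
```

Because the request is an ordinary URL, the same idea works from any language or directly in hand-written HTML.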
To learn more see our Speech Technologies pages.