Category Archives: Speech

Bangor University has just developed new training scripts and models that bring together the various features of DeepSpeech and CommonVoice data to provide a complete solution for producing models and scorers for Welsh language speech recognition. They may be of interest to other DeepSpeech users who are working with a similarly less-resourced language.

The scripts:

are based on DeepSpeech 0.7.4
make use of DeepSpeech’s Dockerfiles (so setup and installation are easier)
train with CommonVoice data
utilize transfer learning
produce optimized scorers/language models for various applications, with the help of some additional test sets and corpora
export models with metadata

The initial README describes how to get started.

We would also like to share the models produced by these scripts, which can be found at https://github.com/techiaith/docker-deepspeech-cy/releases/tag/20.06

At the moment these models are used in two prototype applications which the Welsh speaking community can install and try, namely a Windows/C# based transcriber and an Android/iOS voice assistant app called Macsen. Source code for these applications using DeepSpeech can also be found on GitHub.

We are immensely grateful to Mozilla for creating the Common Voice and DeepSpeech projects.

Mozilla CommonVoice, Paldaruo and Welsh Speech Recognition

June 7, 2018

techiaith

Mozilla, the international company from California responsible for the Firefox web browser, has just launched its multilingual CommonVoice initiative. After starting with English last year, three new languages are now being added: Welsh, German and French. Welsh made it to the front of the queue thanks to help from the Language Technologies Unit at Canolfan Bedwyr, Bangor University.

More Voices for Common Voice
https://blog.mozilla.org/press-uk/2018/06/07/more-common-voices/#Cymraeg

We are extremely proud of CommonVoice Cymraeg and very keen for hundreds and thousands of Welsh speakers to contribute their voices through the website or the app.

But what about Paldaruo? Since 2014 our crowdsourcing app has collected up to 38 hours of speech data from over 500 individuals, and it has helped to realise open source Welsh language digital personal assistant software such as Macsen. The Unit has used the Paldaruo work to help Mozilla provide CommonVoice for Welsh and for other less-resourced languages.

One of the challenges is finding and providing texts that are easy to read but that cover a wide and balanced range of the language’s phonemes. For the launch, the Unit has supplied 1200 prompts within CommonVoice Cymraeg, but more will be needed. As we and the Welsh speaking community contribute more texts and recordings to CommonVoice Cymraeg, we expect the corpus to be a significant boost to Welsh speech recognition research and development by the Unit and by others.
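The coverage problem described above can be approached greedily: keep choosing the sentence that adds the most units not yet covered. The following is only a minimal sketch under stated assumptions, not the method used by the Unit or by CommonVoice. The function names and the candidate sentences are hypothetical, and plain letters are used as a crude stand-in for phonemes (a real selection would rely on a pronunciation lexicon, and would also need to address balance, which this sketch ignores).

```python
def letters(sentence):
    """Crude stand-in for a phoneme inventory: the set of letters used."""
    return {ch for ch in sentence.lower() if ch.isalpha()}


def select_prompts(candidates, units_of, max_prompts):
    """Greedily pick prompts that each add the most not-yet-covered units."""
    covered = set()
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < max_prompts:
        # Pick the candidate that contributes the most new units.
        best = max(remaining, key=lambda s: len(units_of(s) - covered))
        gain = units_of(best) - covered
        if not gain:
            break  # no remaining candidate improves coverage
        chosen.append(best)
        covered |= gain
        remaining.remove(best)
    return chosen, covered


if __name__ == "__main__":
    # Hypothetical candidate pool, for illustration only.
    pool = [
        "Beth yw'r newyddion?",
        "Faint o'r gloch ydy hi?",
        "Beth yw tywydd yfory?",
    ]
    chosen, covered = select_prompts(pool, letters, max_prompts=2)
    print(chosen, len(covered))
```

Passing `units_of` as a parameter means the letter-based stand-in could later be swapped for a lexicon-based phoneme lookup without changing the selection loop.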

The hope is that the partnership between Mozilla and Bangor University will grow, and that this activity will also spur other large companies to include Welsh and other less-resourced languages in their international plans.

The website address is https://voice.mozilla.org/cy and the app is available from https://itunes.apple.com/us/app/project-common-voice-by-mozilla/id1240588326

Introducing Lleisiwr – Welsh Open Source Voice Banking and Text to Speech

April 12, 2018

techiaith

Coding, Resources, Speech, Text to Speech

In November 2017, the Language Technologies Unit received a small grant from the Welsh Government’s Technology and the Welsh Language Fund to work in partnership with the NHS on a project that allows patients on the brink of losing their voice to bank it and then generate a personal digital synthetic voice. This had never before been available for Welsh speakers, and is a great step forward for Welsh speaking patients.

More information about this service can be found here, including details for software developers about the package’s source code.

Here is a short video that shows you how to register for the service:

There has been quite a favourable initial response on social media:

#Lleisiwr.techiaith.cymru.. successfully banked my voice in Welsh language!! Mae o'n gwych! [It's great!] Incredible technology! Can't wait to support patients to use this @techiaith #prosiectlleisiwr @RCSLTWales @BCUHB #voicebanking pic.twitter.com/drEND7ptnN

— Anna Rhiannon (@Anna24127661) 28 March 2018

New Speech Resources

December 20, 2017

techiaith

Resources, Speech, Speech Recognition

New speech resources have just been published by us under the Macsen project, funded by the Welsh Government. See details below. Enjoy!

HTK Acoustic Model

http://techiaith.cymru/htk/paldaruo-16kHz-2017-12-08.tar.gz

Lexicon

http://techiaith.cymru/htk/lexicon-2017-12-08.tar.gz

Prosodylab Aligner

There are also new HTK acoustic models included in the Welsh Prosodylab Aligner:

https://github.com/techiaith/Prosodylab-Aligner/tree/v2.0_paldaruo_4

Kaldi Acoustic Model

http://techiaith.cymru/kaldi/decoders/paldaruo_macsen/tri3-2017-12-18.tar.gz

Training code in GitHub

https://github.com/techiaith/kaldi-cy

RoboLlywydd

April 7, 2017

techiaith

Speech, Text to Speech

Or the ability to create your own natural-sounding Welsh language synthetic voices…

As part of our work on the Macsen project, we’ve created tools that enable you to create natural-sounding Welsh language synthetic voices. The tools make it easy to prepare recording scripts and to record an individual’s voice. Then, using their built-in knowledge of Welsh pronunciation, they build a Welsh language synthetic voice that sounds very similar to the recorded individual.

Here are examples of the voices of two members of the techiaith team having been synthesized with the new tools:

Male:

Female:

The team had the opportunity to demonstrate these tools at the recent SeneddLab 2017 event, where a new voice was created within one hour, named ‘RoboLlywydd’, and used to speak the answers to questions about the National Assembly for Wales. Although the ‘RoboLlywydd’ name was just for fun, it showed that it’s possible to create and use many different individual voices within your own personal digital assistants. The following video talks more about this (especially from about five and a half minutes in):

We used an existing open source system called MaryTTS, which can now be used to create Welsh voices using the resources at the following GitHub repository:

techiaith/docker-marytts

Gwynedd Scientific Society Lecture

November 8, 2016

techiaith

Speech Recognition, Speech

Dewi Bryn Jones of the Language Technologies Unit, Canolfan Bedwyr, Bangor University will give a lecture on the subject:

Developing Speech Recognition for Welsh.

It is increasingly possible to talk to devices such as your phone or computer in order to make apps and websites easier to use, and to receive intelligent and relevant answers to questions asked in natural language. Apple Siri, Microsoft Cortana, Amazon Alexa and Google Assistant are some of the popular commercial products and services driving this change in English.

In this lecture, Dewi Bryn Jones will present the work under way at Bangor on developing speech recognition, as a first step towards offering the same provision to Welsh language users. The function of speech recognition is to convert the sound of a person’s speech into text, so Dewi will explain the methods and data used, as well as presenting the latest results.

The meeting will be held at 7.30 on the evening of Monday, November 14th in room 1.07 (first floor), Canolfan Bedwyr, Y Ganolfan Reolaeth, Ffordd y Coleg, Bangor.

Introducing Macsen

May 3, 2016

techiaith

Coding, Raspberry Pi, Speech, Speech Recognition

During 2015-2016 we have been developing new resources that enable you to talk in Welsh with computers. See “Start speaking Welsh to your computer” and “Towards a Welsh ‘Siri’”.

This is a technology which is becoming increasingly prevalent as the human voice is used more and more for question and answer systems on mobile phones and tablets, and voice control for such things as television sets, robots and dictation systems. If Welsh cannot be used in these environments, then the language will be excluded from the digital world and Welsh speakers will have no choice but to speak English with these devices.

In order to pave the way for new Welsh medium technologies we have produced a Welsh question and answer prototype, where a personal assistant called “Macsen” is able to answer questions such as what is the news or weather.

Here is a video that introduces Macsen and demonstrates it at work on a small Raspberry Pi computer:

All of Macsen’s code and resources are available on GitHub so that anyone can expand its capabilities and develop their own Macsen. Macsen’s homepage, which explains where to begin, is:

http://techiaith.cymru/macsen

We will continue to work on speech recognition and other open resources for Macsen. Get in touch with us if you’re a software company, coding club, school or a hacker with an interest in incorporating Macsen into your own software projects.

‘Macsen’ was developed within the ‘Welsh Language Communications Infrastructure’ project which was funded by the Welsh Government and S4C.

Start speaking Welsh to your computer

January 19, 2016

techiaith

Speech, Speech Recognition

We are developing Welsh language speech recognition as part of our Welsh Language Communications Infrastructure, sharing it here on the Welsh National Language Technologies Portal with other developers of Welsh language software and apps.

Today we are pleased to share the first version of a Welsh language speech recognition system:

Julius Cymraeg (julius-cy)

This project is based on Julius, an open source large vocabulary continuous speech recognition (LVCSR) system, and provides the files and scripts required to adapt it to recognize Welsh language speech rather than English or Japanese.

http://julius.osdn.jp/en_index.php

The first release allows julius-cy to recognize very simple questions and commands in Welsh concerning the weather, news, time, music as well as asking for a joke or a proverb. This means that julius-cy is limited to recognising specific sentences and vocabulary:

“BETH YDY’R TYWYDD HEDDIW?” ( “What’s today’s weather?” )
“BETH YW TYWYDD YFORY?” ( “What’s tomorrow’s weather?” )
“BETH YW’R NEWYDDION?” ( “What’s the news?” )
“FAINT O’R GLOCH YDY HI?” ( “What time is it?” )
“CHWARAEA GERDDORIAETH CYMRAEG” ( “Play Welsh music” )
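Because the first release recognises only a fixed set of sentences, an application can treat the recogniser’s text output as a key into a small lookup table. The sketch below illustrates that idea only; it is not how julius-cy itself is implemented. The intent labels and the function names `normalise` and `intent_for` are hypothetical, and the sentences are the examples listed above.

```python
# Hypothetical intent labels for the example sentences listed above.
INTENTS = {
    "BETH YDY'R TYWYDD HEDDIW": "weather_today",
    "BETH YW TYWYDD YFORY": "weather_tomorrow",
    "BETH YW'R NEWYDDION": "news",
    "FAINT O'R GLOCH YDY HI": "time",
    "CHWARAEA GERDDORIAETH CYMRAEG": "play_music",
}


def normalise(text):
    """Upper-case the text, unify apostrophes and drop other punctuation."""
    text = text.replace("\u2019", "'").upper()
    kept = [ch for ch in text if ch.isalnum() or ch in " '"]
    # Collapse any repeated whitespace left behind by removed punctuation.
    return " ".join("".join(kept).split())


def intent_for(text):
    """Return the intent label for a recognised sentence, or None."""
    return INTENTS.get(normalise(text))


if __name__ == "__main__":
    print(intent_for("Beth yw\u2019r newyddion?"))
```

Normalising before the lookup means differences in case, punctuation or apostrophe style do not cause a known sentence to be missed.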

Future versions of julius-cy will attempt to support recognising dictation and more varied speech.

Everything you need to easily get started is available with very liberal licensing on GitHub.

Go to:

https://github.com/techiaith/julius-cy

This is amazing! How does it work?

The background page explains more about the internals of the first release:

https://github.com/techiaith/julius-cy/blob/master/CEFNDIR.md

You can try adding your own texts and questions for julius-cy to recognize after reading this!

Hmm. It doesn’t work very well for me. How can I help?

The acoustic models in julius-cy are still at a very early stage, so julius-cy may not be able to recognize everyone’s speech successfully.

If this is the case, and you have not already contributed your voice to our Paldaruo Speech Corpus, then please use our Paldaruo app (http://techiaith.bangor.ac.uk/paldaruo) on any iOS or Android device so that we can improve the acoustic models with your voice.

Towards a Welsh ‘Siri’…..

November 4, 2015

techiaith

Machine Translation, Resources, Speech, Speech Recognition

It is increasingly possible for you to speak with devices such as your phone or computer in order to command and control applications and devices as well as to receive intelligent and relevant answers to questions voiced in natural language.

Such capabilities are possible as a consequence of recent advancements in speech recognition, machine translation and natural language processing and understanding. As such they are the prime enablers for a disruptive change and a fundamental shift in how users and consumers engage with their devices and how they more widely use technology.

If looked at in its wider historical context, this is only the next step in the evolution of human computer interaction; from keyboard, to mouse, to touch, to voice and language.

There are four main commercial platforms driving this change, namely Siri, Ok Google, Microsoft Cortana and Amazon Alexa, as well as some lesser known open platforms.

To date, these provide their powerful capabilities in English and some other major languages, with little evidence that they are likely to extend their choice of languages to the ‘long tail’ of smaller languages, including Welsh, in the near future.

The Language Technologies Unit has therefore been sponsored by the Welsh Government, through its Welsh Language Technology and Digital Media Fund, and by S4C to deliver the ‘Welsh Language Communications Infrastructure’ project, ensuring that users whose preferred language is Welsh are not left behind in such developments.

Our first deliverable as part of the project is a brief report on how we can achieve this. It concludes that the commercial offerings by the large companies do not provide any technical means at the moment for realising a Welsh language digital assistant. Thus only open alternatives such as finer grained online APIs and various open source software allow us to progress.

It is hoped that the project will lay the foundations for a range of Welsh language technologies to be used in such environments, including improving the work done to date on Welsh language speech recognition, as well as machine translation for leveraging some of the capabilities provided via English language based technologies.

All of the software and resources developed by the project will be available here from the Welsh National Language Technologies Portal. The project will stimulate the development of new Welsh language software and services that could contribute to the mainstreaming of Welsh in the next phase of human-computer interaction.

In the meantime, we need your help! Please contribute your voice to our speech corpus via our Paldaruo app:

paldaruo

iTunes Google Play

Raspberry Pi Project: Move a robot arm with your voice

October 8, 2015

techiaith

Speech Recognition, Coding, Speech, Raspberry Pi

At recent Eisteddfodau and Hacio’r Iaith events, we have been exhibiting our robot arms, which are connected to Raspberry Pis and respond to instructions in Welsh.

Here is a video of three arms together:

It is a very simple speech recognition system and now, for those feeling adventurous, here are instructions on how you can install the demo on your own Raspberry Pi.

You will need the following equipment:

A standard Raspberry Pi setup running the latest version of Raspbian, with a keyboard and mouse attached.
A robot arm (such as the Robot Arm Edge: http://www.maplin.co.uk/p/usb-controlled-robotic-arm-kit-a37jn)
A microphone, such as the Kinobo “Akiro” USB or any of those recommended here: http://elinux.org/RPi_VerifiedPeripherals

If you are using an older Raspberry Pi with only two USB ports, you will need a USB hub, such as http://www.modmypi.com/raspberry-pi/accessories/usb-hubs/pihub-official-4-port-raspberry-pi-usb-hub-eu-plug-5v-3a, in order to connect everything.

The demo uses an open source speech recognition engine called ‘Julius’. It also uses acoustic models that we produced from recordings of 20 individuals reading special prompts.

Type the following at the command line on your Raspberry Pi to install the ‘Julius’ system:

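# Install the build dependencies
$ sudo apt-get update
$ sudo apt-get install alsa-tools alsa-oss flex zlib1g-dev libc-bin libc-dev-bin python-pexpect libasound2 libasound2-dev cvs
# Fetch the Julius source and move into the checked-out directory
$ cvs -z3 -d:pserver:anonymous@cvs.sourceforge.jp:/cvsroot/julius co julius4
$ cd julius4
# Build with ARM-specific compiler flags and ALSA microphone support
$ export CFLAGS="-O2 -mcpu=arm1176jzf-s -mfpu=vfp -mfloat-abi=hard -pipe -fomit-frame-pointer"
$ ./configure --with-mictype=alsa
$ make
$ sudo make install
# Use ALSA card 1, device 0 for audio capture, then start Julius
$ export ALSADEV="plughw:1,0"
$ julius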

If the last line causes the following to appear, you have installed Julius successfully!

Julius rev.4.3.1 - based on
JuliusLib rev.4.3.1 (fast) built for x86_64-unknown-linux-gnu

Copyright (c) 1991-2013 Kawahara Lab., Kyoto University
Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
Copyright (c) 2005-2013 Julius project team, Nagoya Institute of Technology

Try '-setting' for built-in engine configuration.
Try '-help' for run time options.

Next, you need to download our robot arm speech recognition files from the Language Technologies Portal for use with Julius.

$ mkdir robot
$ cd robot
$ wget http://techiaith.cymru/gallu/braichrobot.tar.gz
$ tar -zxvf braichrobot.tar.gz

Then, to get the Raspberry Pi and the robot arm to respond to spoken commands, type:

$ cd braichrobot
$ sudo python robotarm_voicectl.py

The word ‘siaradwch’ (‘speak’) should appear. Here is what you will now be able to say to the arm:

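ysgwydd i fyny ( “shoulder up” )
ysgwydd i lawr ( “shoulder down” )
penelin i fyny ( “elbow up” )
penelin i lawr ( “elbow down” )
arddwrn i fyny ( “wrist up” )
arddwrn i lawr ( “wrist down” )
gafael agor ( “grip open” )
gafael cau ( “grip close” )
troi i’r chwith ( “turn left” )
troi i’r dde ( “turn right” )
golau ymlaen ( “light on” )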
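Each command in this list has the same shape: the first word names a part of the arm and the remaining words name an action. The sketch below shows one way recognised text of that shape could be turned into a (part, action) pair. It is an illustration under stated assumptions and is not taken from robotarm_voicectl.py, whose internals are not described here. The English labels, the tables and the function name `parse_command` are all hypothetical.

```python
# Hypothetical English labels for the Welsh command words.
PARTS = {
    "ysgwydd": "shoulder",
    "penelin": "elbow",
    "arddwrn": "wrist",
    "gafael": "grip",
    "troi": "turn",
    "golau": "light",
}

ACTIONS = {
    "i fyny": "up",
    "i lawr": "down",
    "agor": "open",
    "cau": "close",
    "i'r chwith": "left",
    "i'r dde": "right",
    "ymlaen": "on",
}

# Only the combinations that appear in the command list are accepted.
ALLOWED = {
    "shoulder": {"up", "down"},
    "elbow": {"up", "down"},
    "wrist": {"up", "down"},
    "grip": {"open", "close"},
    "turn": {"left", "right"},
    "light": {"on"},
}


def parse_command(utterance):
    """Return (part, action) for a recognised command, or None."""
    words = utterance.lower().replace("\u2019", "'").split()
    if not words:
        return None
    part = PARTS.get(words[0])
    action = ACTIONS.get(" ".join(words[1:]))
    if part is None or action not in ALLOWED.get(part, ()):
        return None
    return part, action


if __name__ == "__main__":
    print(parse_command("penelin i lawr"))
```

Checking the pair against `ALLOWED` rejects combinations such as “golau i fyny” that are built from known words but are not in the command list.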

We hope this little project will be fun, especially for the pupils of Ysgol Pont y Gof, Botwnnog, who won one of our robot arms in a coding competition at Coleg Meirion Dwyfor in Pwllheli during the summer:

Ysgol Pont y Gof were the winners of @CodeClub at the Pwllheli campus @rondomedia @GaiaTechologies @delyth @prifysgolbangor #code pic.twitter.com/zUddmUITgl

— Coleg Meirion-Dwyfor (@meiriondwyfor) June 26, 2015

In the meantime, thanks to sponsorship from the Welsh Government and S4C, we are continuing to develop Welsh speech recognition and to offer it free of charge within the Language Technologies Portal. Our aim is to develop more sophisticated and more useful systems.

But we need your help! Contribute your voice through our Paldaruo app:

paldaruo

iTunes Google Play