Machine Translation Demo

The box below lets you try out the available machine translation engines, so that you can assess the strengths and weaknesses of each. As with all translation engines, the accuracy and suitability of the output depend to a large extent on the nature of the input material and on how similar it is to the data the engine was trained on.

At present there is a choice of three translation engines, each trained on a different parallel corpus and each generating translations statistically with the Moses SMT system. The Assembly Record engine (‘CofnodYCynulliad’) works from Welsh to English and from English to Welsh. The other two engines, one trained on legislation (‘Deddfwriaeth’) and the other on software localisations (‘Meddalwedd’), currently work only from English to Welsh.
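As a minimal sketch of the engine and direction choices just described, the table below records which language pairs each engine supports. The engine names and directions come from this page; the dictionary layout, the `cy`/`en` language codes and the `engines_for` helper are illustrative assumptions, not part of the demo itself.

```python
# Engine names and supported directions as described on this page.
# The structure and the helper name are hypothetical, for illustration only.
ENGINE_DIRECTIONS = {
    "CofnodYCynulliad": {("cy", "en"), ("en", "cy")},
    "Deddfwriaeth": {("en", "cy")},
    "Meddalwedd": {("en", "cy")},
}


def engines_for(source: str, target: str) -> list[str]:
    """Return the engines that support translating source -> target."""
    return sorted(
        name
        for name, pairs in ENGINE_DIRECTIONS.items()
        if (source, target) in pairs
    )


if __name__ == "__main__":
    print(engines_for("en", "cy"))
    print(engines_for("cy", "en"))
```

With this layout, a Welsh to English request can only be served by the Assembly Record engine, while an English to Welsh request can go to whichever engine best matches the domain of the input text.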

Machine Translation is not acceptable as a substitute for human translators where the translation is intended for official or public consumption.

It is acceptable to use MT hand in hand with human post-editing, and this may be incorporated into a workflow built around a translation memory system such as Trados, Wordfast or CyfieithuCymru.

A rolling training programme is recommended to update translators’ knowledge of new technological advances as part of their professional development, and this will help in getting the most out of new technology.

For more, see: Note on Machine Translation and Translation Tools for Translation Managers and Commissioners

More on translation resources: Translation – Introduction


Demo

Individual Engine Details

CofnodYCynulliad (Assembly Record)

Trained mainly on data scraped from the Assembly Record website. A total of 756894 aligned parallel segments are available for download from:

http://techiaith.cymru/corpws/Moses/CofnodYCynulliad/CofnodYCynulliad.tar.gz

Crown Copyright Data.
The Record of Proceedings of the National Assembly for Wales is Crown copyright. Data from the Record is reproduced under the policy guideline terms for Crown copyright published by HMSO and the National Assembly.

Deddfwriaeth (Legislation)

Based on data from www.legislation.gov.uk, Crown copyright, prepared under the terms of the Open Government Licence version 3: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/. A total of 243692 aligned parallel segments are available for download from:

http://techiaith.cymru/corpws/Moses/Deddfwriaeth/Deddfwriaeth.tar.gz

Data Copyright David Chan 2014. This work is provided under the Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0/

Meddalwedd (Software)

Suitable for facilitating the translation of software into Welsh.

Based on the translation of Mozilla Firefox, WordPress, LibreOffice and Linux open source software by Rhoslyn Prys and others. Credits for the translations are available on each project’s website:

  • Mozilla – https://www.mozilla.org
  • LibreOffice – https://www.documentfoundation.org
  • WordPress – https://www.wordpress.org
  • Linux Mint – https://www.linuxmint.org

A total of 47187 aligned parallel segments are available for download from:

http://techiaith.cymru/corpws/Moses/Meddalwedd/Meddalwedd.tar.gz

This work is provided under the GNU General Public License: http://www.gnu.org/licenses/gpl.html
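The three download links above follow the same pattern, so fetching and inspecting a corpus can be sketched as below. The base URL and segment counts are taken from this page; the function names are hypothetical, and nothing is assumed about the files inside each archive, which is why the sketch only lists member names rather than parsing them.

```python
import tarfile
import urllib.request
from pathlib import Path

# Base URL and segment counts as quoted on this page.
BASE_URL = "http://techiaith.cymru/corpws/Moses"
SEGMENT_COUNTS = {
    "CofnodYCynulliad": 756894,
    "Deddfwriaeth": 243692,
    "Meddalwedd": 47187,
}


def corpus_url(engine: str) -> str:
    """Build the download URL for an engine's training corpus."""
    return f"{BASE_URL}/{engine}/{engine}.tar.gz"


def download_corpus(engine: str, dest_dir: Path) -> Path:
    """Download the corpus archive into dest_dir and return its path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / f"{engine}.tar.gz"
    urllib.request.urlretrieve(corpus_url(engine), str(archive))
    return archive


def list_archive(archive: Path) -> list[str]:
    """List member names without extracting (the layout is not assumed)."""
    with tarfile.open(archive, "r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    # Print the URLs only; call download_corpus() to actually fetch a corpus.
    for name, count in SEGMENT_COUNTS.items():
        print(f"{name}: {count} segments, {corpus_url(name)}")
```

Listing the archive before extracting it is a cautious first step, since the file names and directory layout inside each archive are not documented on this page.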