Since we’ve already released our machine translation system in Docker, it’s easy enough to get it running on an OS X system!
First, you will need to install a few pieces of software on your computer. This tutorial uses Homebrew to install the packages.
(You can look again at the original tutorial if you like.)
- Docker needs VirtualBox on OS X (and Windows) to run the Linux virtual machine. Download VirtualBox from the VirtualBox website.
Installing boot2docker and docker
We will use Homebrew to install these. Open Terminal and run the following command:
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
This will install Homebrew on your computer.
- Next, install boot2docker and docker with the following commands:
brew install boot2docker
brew install docker
- Start boot2docker (so that you can download the virtual engine) like this:
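As a minimal sketch, assuming the standard boot2docker command-line tool, creating the virtual machine for the first time would look something like the following. The helper name `init_vm` is made up for illustration, and the guard simply avoids a confusing error if boot2docker is not on your PATH yet.

```shell
# Create the boot2docker virtual machine.
# Assumes the `boot2docker init` subcommand of the standard boot2docker CLI.
init_vm() {
  if command -v boot2docker >/dev/null 2>&1; then
    boot2docker init
  else
    # Nothing to initialise yet: point back at the install step.
    echo "boot2docker not found; run 'brew install boot2docker' first" >&2
    return 1
  fi
}
```

Running `init_vm` once is enough; the virtual machine only has to be created a single time.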
Increasing VirtualBox’s disk space
VirtualBox’s virtual disk will be created with a size limit of 20GB. The machine translation system (Moses SMT), including the language model file, needs more disk space than this, so the disk size will need to be increased. This is unfortunately quite a long process, but the good news is that Docker have written a very simple tutorial on how to do it!
We recommend that you increase the disk size to 30GB (although the machine translation system only needs around 21GB).
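VirtualBox's command-line tool takes disk sizes in megabytes, so the 30GB figure has to be converted first. The sketch below shows that arithmetic and the resize call itself. The helper names and the disk path are assumptions for illustration. Resizing generally only works on VDI images, so a disk in another format would have to be cloned to VDI beforehand, and the partition inside the disk still has to be grown afterwards, as Docker's tutorial describes.

```shell
# Convert a size in gigabytes to the megabyte value VBoxManage expects.
gb_to_mb() {
  echo $(( $1 * 1024 ))
}

# Hypothetical helper: resize a VDI disk image to the given size in GB.
# $1 is the path to the VDI file (an assumption; use your own path),
# $2 is the target size in gigabytes.
resize_vdi() {
  VBoxManage modifyhd "$1" --resize "$(gb_to_mb "$2")"
}

gb_to_mb 30   # prints 30720
```

With the virtual machine powered off, `resize_vdi /path/to/disk.vdi 30` would then request the recommended 30GB.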
Downloading and installing the translation system
Once you’ve increased the disk size in VirtualBox, you will need to start the boot2docker engine. Go back to Terminal, and write:
Make a note of what is printed on the screen at the end of this command. This is important because you will need it to communicate with Docker. It should look something like this:
The last three lines are particularly important. Copy them, and then paste them into your Terminal window so that you can run the export commands.
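Rather than copying the export lines by hand, the same variables can usually be loaded in one step. This sketch assumes a boot2docker release that includes the `shellinit` subcommand; the helper names `load_docker_env` and `docker_env_ready` are made up for illustration.

```shell
# Load the export lines that boot2docker prints straight into the current shell.
# Assumes `boot2docker shellinit` is available in the installed release.
load_docker_env() {
  eval "$(boot2docker shellinit)"
}

# Succeeds only if the variable the docker client uses to find the engine is set.
docker_env_ready() {
  [ -n "${DOCKER_HOST:-}" ]
}
```

Once the virtual machine is running, call `load_docker_env`; `docker_env_ready` then gives a quick check before you run any `docker` command. The variables only apply to the current Terminal window, so a new window needs them loaded again.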
Docker is ready
Now, after all this work, Docker should be ready!
Download the machine translation image using the following command:
docker pull techiaith/moses-smt
And then start the engine with:
docker run --name moses-smt-cofnodycynulliad-en-cy -p 8008:8008 -p 8080:8080 techiaith/moses-smt start -e CofnodYCynulliad -s en -t cy
Note: this command downloads a translation model which is based on the Proceedings of the National Assembly for Wales corpus. You can change the name ‘CofnodYCynulliad’ after the ‘start’ command to any one of the three below:
- CofnodYCynulliad (en-cy and cy-en) – two large models which are based on the Proceedings of the National Assembly for Wales. One is specifically for translation from English to Welsh (en-cy), and the other is for translation from Welsh to English (cy-en). Size: ~3.7GB each.
- CofnodBachYCynulliad – a much smaller model of the proceedings corpus which is based on a subset of the data (we recommend this if you just want to experiment quickly). Size: ~65MB
- Deddfwriaeth – this engine was trained with data from the Legislation corpus. Size: ~900MB
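To make the substitution concrete, here is a small sketch that builds the `docker run` command for a chosen engine and language direction. The helper `moses_run_cmd` is hypothetical, and the container name simply follows the pattern of the command shown earlier. It prints the command rather than running it, so you can check it first.

```shell
# Build the docker run command for a given engine and language direction.
# $1 = engine name, $2 = source language, $3 = target language.
moses_run_cmd() {
  # Container name follows the tutorial's pattern: lowercased engine plus direction.
  name="moses-smt-$(echo "$1" | tr '[:upper:]' '[:lower:]')-$2-$3"
  echo "docker run --name $name -p 8008:8008 -p 8080:8080 techiaith/moses-smt start -e $1 -s $2 -t $3"
}

# Welsh to English with the large proceedings model.
moses_run_cmd CofnodYCynulliad cy en
```

Because every variant publishes the same host ports, only one of these containers can be running at a time unless the `-p` mappings are changed.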
These three language models are also available for download from techiaith.org. See http://techiaith.org/moses/
It’s also important to note that you can use your own language model for this step (if you’ve already trained one)! Remember that the data we provide is a basis only, and it’s fairly simple to train your own language model. See the docs for more information on how to do this.
See Moses working
The final ‘docker run’ command creates a server on port 8008. Because Docker is running inside the VirtualBox virtual machine, you will need to forward ports in VirtualBox before you can reach this server from your own computer. Open the ‘VirtualBox.app’ program (in your ‘Applications’ folder), click on ‘Settings’, and then on the ‘Network’ tab. There is a button at the bottom of the screen called ‘Port Forwarding’. Add rules that forward ports 8008 and 8080 on the host to the same ports on the guest.
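The same forwarding rules can also be added from Terminal with VirtualBox's `VBoxManage` tool. This is a sketch under a few assumptions: the virtual machine is called `boot2docker-vm` (boot2docker's usual default, so check the name in VirtualBox), it is currently running, and the rule names and helper functions are made up for illustration.

```shell
VM_NAME="boot2docker-vm"   # assumption: check the name shown in VirtualBox

# Build a NAT port-forwarding rule in VBoxManage's format:
# name,protocol,host ip,host port,guest ip,guest port
natpf_rule() {
  echo "moses-$1,tcp,127.0.0.1,$1,,$1"
}

# Add the rule for one port to the first network adapter of a running VM.
forward_port() {
  VBoxManage controlvm "$VM_NAME" natpf1 "$(natpf_rule "$1")"
}
```

Calling `forward_port 8008` and `forward_port 8080` covers both ports published by the `docker run` command. For a powered-off machine, `VBoxManage modifyvm` with `--natpf1` takes the same rule string.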
Go to http://127.0.0.1:8008 in your browser and start translating!
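If the page does not load, a quick check from Terminal can show whether anything is answering on that port. The sketch below uses `curl`; the helper name `moses_status` is made up for illustration, and it only reports the HTTP status code, not whether translation itself works.

```shell
MOSES_URL="http://127.0.0.1:8008"

# Print the HTTP status code returned by the server.
# curl reports 000 when no connection could be made within the time limit.
moses_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}
```

Running `moses_status "$MOSES_URL"` and getting `000` suggests that the container is not running or the port forwarding rules are missing.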