I am new to Moses and have trained it on a few sample data sets provided on websites. I am looking for more data sets to train the system. Are any available online? What should I search for on Google?

user2800040

1 Answer

You can find several corpora at: http://opus.lingfil.uu.se

Also, some open-source applications ship their bilingual PO (gettext translation) files, which can be used as parallel data, but you have to check the license first.

My advice is to build a vertical (i.e. domain-specific) MT system rather than a generic one, since it usually gives better results. That decision will determine which corpora you should choose.

I hope this helps!

Mina