I am trying to build a Tamil-English Translation System using Moses. https://github.com/joshua-decoder/indian-parallel-corpora/tree/master/ta-en is my data source for the parallel corpus. The dict files are approx 70k lines long, the others are in…
I am using Arch Linux with the following packages:
gcc-multilib 4.8.2-4
boost 1.54.0-4
xmlrpc-c 1:1.36.00-1
giza-pp 1.0.7-2
irstlm 5.80.03-6
moses-git 20121023-1 (which is mosesdecoder v1.0)
I am using phrase tables, reordering models and language models…
I am using the WordPunct Tokenizer to tokenize this sentence:
في_بيتنا كل شي لما تحتاجه يضيع ...ادور على شاحن فجأة يختفي ..لدرجة اني اسوي نفسي ادور شيء
My code is:
import re
import nltk
sentence= " في_بيتنا كل شي لما تحتاجه يضيع ...ادور على شاحن…
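For reference, a minimal stdlib-only sketch of what NLTK's WordPunctTokenizer does, assuming it is equivalent to splitting on the pattern \w+|[^\w\s]+ (runs of word characters, or runs of punctuation). The helper name word_punct_tokenize is hypothetical, and the sample string is a shortened piece of the sentence above:

```python
import re

# Assumed pattern: a run of word characters, or a run of characters
# that are neither word characters nor whitespace.
WORD_PUNCT = re.compile(r"\w+|[^\w\s]+")

def word_punct_tokenize(text):
    """Return word and punctuation tokens in order of appearance."""
    return WORD_PUNCT.findall(text)

sentence = "في_بيتنا كل شي ...ادور"
for token in word_punct_tokenize(sentence):
    print(token)
```

Under this pattern the underscore counts as a word character, so في_بيتنا stays one token, while the leading dots in ...ادور are split off as their own token. Combining marks such as Arabic diacritics are not matched by \w in Python, so vocalized text may split differently.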
I am installing GIZA++ from here:
wget http://giza-pp.googlecode.com/files/giza-pp-v1.0.2.tar.gz
After I unzip and run "make" I get the following error:
Pointer.h:27:20: fatal error: stream.h: No such file or directory
compilation…
I'm using Moses to build a language model.
I followed the instructions from this link: Baseline System: Moses
I have a Google 1-gram file that looks like:
95119665584
95119665584
, 30578667846
. 22077031422
…
I have to download a particular version of software (Moses) to run another piece of software.
The install script attempts to run
svn co https://mosesdecoder.svn.sourceforge.net/svnroot/mosesdecoder/trunk moses -r 3284
However Moses has…
We're looking to integrate Moses into our localization workflow. Our application is written in Java, and we want to access Moses functionality through XML-RPC calls.
Specifically, we're looking at APIs for:
Incremental training (i.e. Avoid…
Context: I'm trying to port Perl code to Python from https://github.com/moses-smt/mosesdecoder/blob/master/scripts/tokenizer/normalize-punctuation.perl#L87, and it contains this Perl regex:
s/(\d) (\d)/$1.$2/g;
If I try it with the Perl…
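A minimal sketch of the direct Python counterpart of that substitution, assuming the separator in the pattern is a plain space as shown above. The function name join_digits is hypothetical:

```python
import re

def join_digits(text):
    """Python counterpart of Perl's s/(\\d) (\\d)/$1.$2/g."""
    # re.sub replaces every non-overlapping match, like Perl's /g flag;
    # \1 and \2 play the role of $1 and $2.
    return re.sub(r"(\d) (\d)", r"\1.\2", text)

print(join_digits("about 3 5 percent"))
```

Matches do not overlap in either language, so in a run such as "1 2 3" only the first pair is joined in a single pass. In Python 3, \d on str patterns also matches non-ASCII decimal digits; passing flags=re.ASCII restricts it to 0-9 if that is what the port needs.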