The application's purpose is to translate lemmas of words present in the sentence from Russian to English. I'm doing it with help of sdict formatted vocabulary, which is queried by python script which is called by c++ program.
My purpose is to get the following output :
Выставка/exhibition::1 конгресс/congress::2 организаторами/organizer::3 которой/ which::4 являются/appear::5 РАО/NONE::6 ЕЭС/NONE::7 России/NONE::8 EESR/NONE::9 нефтяная/oil::10 компания/company::11 ЮКОС/NONE::12 YUKOS/NONE::13 и/and::14 администрация/administration::15 Томской/NONE::16 области/region::17 продлится/last::18 четыре/four::19 дня/day::20
The following code succeeded for the sentence, however for the second sentence and so on I get a wrong output:
Егор/NONE::1 Гайдар/NONE::2 возглавлял/NONE::3 первое/head::4 российское/first::5 правительство/NONE::6 которое/government::7 называли/which::8 правительством/call::9 камикадзе/government::10
Note: NONE
is used for words lacking translation.
I'm running the following C++ code excerpt which actually calls PyRun_SimpleString
:
for (unsigned int i = 0; i < theSentenceRows->size(); i++){
stringstream ss;
ss << (i + 1);
parsedFormattedOutput << theSentenceRows->at(i)[FORMINDEX] << "/";
getline(lemmaOutFileForTranslation, lemma);
PyObject *main_module, *main_dict;
PyObject *toTranslate_obj, *translation, *emptyString;
/* Setup the __main__ module for us to use */
main_module = PyImport_ImportModule("__main__");
main_dict = PyModule_GetDict(main_module);
/* Inject a variable into __main__, in this case toTranslate */
toTranslate_obj = PyString_FromString(lemma.c_str());
PyDict_SetItemString(main_dict, "start_word", toTranslate_obj);
/* Run the code snippet above in the current environment */
PyRun_SimpleString(pycode);
**usleep(2);**
translation = PyDict_GetItemString(main_dict, "translation");
Py_XDECREF(toTranslate_obj);
/* writing results */
parsedFormattedOutput << PyString_AsString(translation) << "::" << ss.str() << " ";
Where pycode is defined as:
const char *pycode =
"import sys\n"
"import re\n"
"import sdictviewer.formats.dct.sdict as sdict\n"
"import sdictviewer.dictutil\n"
"dictionary = sdict.SDictionary( 'rus_eng_full2.dct' )\n"
"dictionary.load()\n"
"translation = \"*NONE*\"\n"
"p = re.compile('( )([a-z]+)(.*?)( )')\n"
"for item in dictionary.get_word_list_iter(start_word):\n"
" try:\n"
" if start_word == str(item):\n"
" instance, definition = item.read_articles()[0]\n"
" translation = p.findall(definition)[0][1]\n"
" except:\n"
" continue\n";
I've noticed some delay in the second sentence's output, so I added the usleep(2); to C++ while thinking that it happens because calling PyRun_SimpleString
is not synchronous. It didn't help, however and I'm not sure that this is the reason. The delay bug happens for sentences that follow and increases.
So, is the call to PyRun_SimpleString
synchronous? Maybe, sharing of variable values between C++ and Python is not right?
Thank you in advance.