I am currently trying to fill an OrientDB graph database using pyorient. In general, everything works well, but I have run into a parsing issue with one of my commands. If I run the following code in Python:
>>> ab = 'UPDATE Patent SET primary_id = 676, original_abstract = set(original_abstract, "<p num=\\"0000\\">The present invention relates to compounds of the general formula (I) wherein\\n\\nR<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic\\n\\nheterocycle, or is OR\' or N(R\\")<sub>2</sub>; R\' is lower alkyl,\\n\\nlower alkyl substituted by halogen or -(CH<sub>2</sub>)<sub>n</sub>-cycloalkyl;\\n\\nR\\" is lower alkyl; R<sup>3</sup> is NO<sub>2</sub>, CN or SO<sub>2</sub>R\';\\n\\nR<sup>4 </sup>is hydrogen, hydroxy, halogen, NO<sub>2</sub>, lower alkyl, lower\\n\\nalkyl, substituted by halogen, lower alkoxy, SO<sub>2</sub>R\' or C(O)OR\\";\\n\\nR<sup>5</sup>/R<sup>6</sup>/R<sup>7</sup> are hydrogen, halogen, lower alkyl\\n\\nor lower alkyl, substituted by halogen; X<sup>1</sup>/X<sup>1\\u00bf</sup>\\n\\nare CH or N, with the proviso that X<sup>1</sup>/X<sup>1\\u00bf</sup> are not simultaneously\\n\\nCH; X<sup>2</sup> is O, S, NH or N(lower alkyl); n is 0, l or 2; and to pharmaceutically\\n\\nactive acid addition salts and to their use in the treatment of neurological and\\n\\nneuropsychiatric disorders.</p>") UPSERT WHERE primary_id = 676'
>>> client.batch(ab)
I get the following errors:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/orient.py", line 402, in batch
    .prepare(( QUERY_SCRIPT, ) + args).send().fetch_response()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/commands.py", line 145, in fetch_response
    super( CommandMessage, self ).fetch_response()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 256, in fetch_response
    self._decode_all()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 240, in _decode_all
    self._decode_header()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 192, in _decode_header
    [ exception_message.decode( 'utf8' ) ]
pyorient.exceptions.PyOrientCommandException: com.orientechnologies.orient.core.sql.parser.TokenMgrError - Lexical error at line 1, column 311. Encountered: <EOF> after : "\"<p num=\\\"0000\\\">The present invention relates to compounds of the general formula (I) wherein\\n\\nR<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic\\n\\nheterocycle, or is OR\' or N(R\\\")<sub>2</sub>"
If I run this using client.command, I also get an error:
>>> client.command(ab)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/orient.py", line 398, in command
    .prepare(( QUERY_CMD, ) + args).send().fetch_response()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/commands.py", line 145, in fetch_response
    super( CommandMessage, self ).fetch_response()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 256, in fetch_response
    self._decode_all()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 240, in _decode_all
    self._decode_header()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 192, in _decode_header
    [ exception_message.decode( 'utf8' ) ]
pyorient.exceptions.PyOrientCommandException: com.orientechnologies.orient.core.sql.OCommandSQLParsingExceptioncom.orientechnologies.orient.core.exception.OSerializationException - Error on parsing command at position #0: Error on reading parameters in: set(original_abstract, "<p num="0000">The present invention relates to compounds of the general formula (I) wherein
R<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic
heterocycle, or is OR' or N(R")<sub>2</sub>; R' is lower alkyl,
lower alkyl substituted by halogen or -(CH<sub>2</sub>)<sub>n</sub>-cycloalkyl;
R" is lower alkyl; R<sup>3</sup> is NO<sub>2</sub>, CN or SO<sub>2</sub>R';
R<sup>4 </sup>is hydrogen, hydroxy, halogen, NO<sub>2</sub>, lower alkyl, lower
alkyl, substituted by halogen, lower alkoxy, SO<sub>2</sub>R' or C(O)OR";
R<sup>5</sup>/R<sup>6</sup>/R<sup>7</sup> are hydrogen, halogen, lower alkyl
or lower alkyl, substituted by halogen; X<sup>1</sup>/X<sup>1¿</sup>
are CH or N, with the proviso that X<sup>1</sup>/X<sup>1¿</sup> are not simultaneously
CH; X<sup>2</sup> is O, S, NH or N(lower alkyl); n is 0, l or 2; and to pharmaceutically
active acid addition salts and to their use in the treatment of neurological and
neuropsychiatric disordersFound invalid ) character at position 229 of text original_abstract, "<p num="0000">The present invention relates to compounds of the general formula (I) wherein
R<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic
heterocycle, or is OR' or N(R")<sub>2</sub>. Ensure it is opened and closed correctly.
However, I have found that if I assign the value to original_abstract directly instead of wrapping it in set(), it works with client.command:
>>> aba = 'UPDATE Patent SET primary_id = 676, original_abstract = "<p num=\\"0000\\">The present invention relates to compounds of the general formula (I) wherein\\n\\nR<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic\\n\\nheterocycle, or is OR\' or N(R\\")<sub>2</sub>; R\' is lower alkyl,\\n\\nlower alkyl substituted by halogen or -(CH<sub>2</sub>)<sub>n</sub>-cycloalkyl;\\n\\nR\\" is lower alkyl; R<sup>3</sup> is NO<sub>2</sub>, CN or SO<sub>2</sub>R\';\\n\\nR<sup>4 </sup>is hydrogen, hydroxy, halogen, NO<sub>2</sub>, lower alkyl, lower\\n\\nalkyl, substituted by halogen, lower alkoxy, SO<sub>2</sub>R\' or C(O)OR\\";\\n\\nR<sup>5</sup>/R<sup>6</sup>/R<sup>7</sup> are hydrogen, halogen, lower alkyl\\n\\nor lower alkyl, substituted by halogen; X<sup>1</sup>/X<sup>1\\u00bf</sup>\\n\\nare CH or N, with the proviso that X<sup>1</sup>/X<sup>1\\u00bf</sup> are not simultaneously\\n\\nCH; X<sup>2</sup> is O, S, NH or N(lower alkyl); n is 0, l or 2; and to pharmaceutically\\n\\nactive acid addition salts and to their use in the treatment of neurological and\\n\\nneuropsychiatric disorders.</p>" UPSERT WHERE primary_id = 676'
>>> client.command(aba)
['1']
However, client.batch still cannot handle this correctly:
>>> client.batch(aba)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/orient.py", line 402, in batch
    .prepare(( QUERY_SCRIPT, ) + args).send().fetch_response()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/commands.py", line 145, in fetch_response
    super( CommandMessage, self ).fetch_response()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 256, in fetch_response
    self._decode_all()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 240, in _decode_all
    self._decode_header()
  File "/Users/shaungupta/anaconda/lib/python2.7/site-packages/pyorient/messages/base.py", line 192, in _decode_header
    [ exception_message.decode( 'utf8' ) ]
pyorient.exceptions.PyOrientCommandException: com.orientechnologies.orient.core.exception.OSerializationException - Found invalid ) character at position 274 of text UPDATE Patent SET primary_id = 676, original_abstract = "<p num=\"0000\">The present invention relates to compounds of the general formula (I) wherein\n\nR<sup>1</sup> is the group (A) or (B) or (C) or (D); R<sup>2</sup> is a non aromatic\n\nheterocycle, or is OR' or N(R\")<sub>2</sub>; R' is lower alkyl,\n\nlower alkyl substituted by halogen or -(CH<sub>2</sub>)<sub>n</sub>-cycloalkyl;\n\nR\" is lower alkyl; R<sup>3</sup> is NO<sub>2</sub>, CN or SO<sub>2</sub>R';\n\nR<sup>4 </sup>is hydrogen, hydroxy, halogen, NO<sub>2</sub>, lower alkyl, lower\n\nalkyl, substituted by halogen, lower alkoxy, SO<sub>2</sub>R' or C(O)OR\";\n\nR<sup>5</sup>/R<sup>6</sup>/R<sup>7</sup> are hydrogen, halogen, lower alkyl\n\nor lower alkyl, substituted by halogen; X<sup>1</sup>/X<sup>1\u00bf</sup>\n\nare CH or N, with the proviso that X<sup>1</sup>/X<sup>1\u00bf</sup> are not simultaneously\n\nCH; X<sup>2</sup> is O, S, NH or N(lower alkyl); n is 0, l or 2; and to pharmaceutically\n\nactive acid addition salts and to their use in the treatment of neurological and\n\nneuropsychiatric disorders.</p>" UPSERT WHERE primary_id = 676. Ensure it is opened and closed correctly.
Running string ab directly in the OrientDB console gives the same errors. Running string aba in the console works fine, though, and it also succeeds there as part of a begin/commit transaction, so I cannot understand why aba does not work with pyorient's batch command.
Does anybody understand why this command works via client.command but not via client.batch? I need to run it as part of a batch of commands, so I need to find a fix. Ideally, I would like the command in string ab to work, where original_abstract is added as a set, because I need to keep track of any new information for matching nodes.
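For context, a batch of this kind could be assembled along the following lines. This is only a sketch: build_batch_script is a hypothetical helper, not part of pyorient, and it assumes the batch script puts one statement per line between begin and commit. The two UPDATE statements are shortened placeholders, and the final client.batch call is left commented out because it needs a live connection.

```python
def build_batch_script(statements):
    """Join SQL statements into one begin/commit script (hypothetical helper)."""
    lines = ["begin"]
    # Drop surrounding whitespace and a trailing semicolon so that each
    # statement sits on its own line of the script.
    lines.extend(s.strip().rstrip(";") for s in statements)
    lines.append("commit")
    return "\n".join(lines)


# Shortened placeholder statements, not the full abstract update.
script = build_batch_script([
    "UPDATE Patent SET primary_id = 676 UPSERT WHERE primary_id = 676",
    "UPDATE Patent SET primary_id = 677 UPSERT WHERE primary_id = 677;",
])
print(script)

# client.batch(script)  # requires a connected pyorient client
```

Because the statements are separated by real newlines here, the literal backslash-n sequences inside the abstract text are not affected by the joining step.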
From what I can see, this is a parsing limitation of the command executor, but please tell me if I am doing something wrong here.
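One way to rule out the Python-side escaping is to look at the exact characters that are handed to pyorient. The sketch below uses a shortened stand-in for the abstract (the name ab_short is made up for illustration) with the same escaping pattern as the full command. It only inspects the string and does not contact the server.

```python
# Shortened stand-in for the full command; same escaping pattern as string aba.
ab_short = 'UPDATE Patent SET original_abstract = "<p num=\\"0000\\">OR\' or N(R\\")<sub>2</sub>; R\' is lower alkyl\\n\\n</p>" UPSERT WHERE primary_id = 676'

# In a Python string literal, \\ becomes a single backslash and \' becomes a
# bare apostrophe. The text therefore contains backslash-quote pairs, unescaped
# apostrophes, and a literal backslash followed by n (not a real newline).
print(ab_short)

# Count the backslashes that survive into the text sent to the server.
backslash_count = ab_short.count("\\")
has_real_newline = "\n" in ab_short
print(backslash_count, has_real_newline)
```

The printed text has the same form as the statement echoed back in the client.batch error message: the double quotes inside the value are preceded by a backslash, while the apostrophes are not escaped at all.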
Thanks!!