I have a CSV file with the following structure:

h1_h2,hashtag1,hashtag2,coccurrence
39108234088393,9230981401776738405,11889764071793228909,2
48887306406636,2844752706633868157,14936885980370043276,2
...

There are 1028112 lines in the file. I tried to import it into a collection via arangoimp:

arangoimp --file E:\current_crawler\Data\edges\edges_for_graph_ep_83.csv --collection edges_temp4 --create-collection true 
--create-collection-type edge --type csv --translate "hashtag1=_from" --translate "hashtag2=_to" --from-collection-prefix hashtags 
--to-collection-prefix hashtags --translate "h1_h2=_key" --server.database "newDB" --server.username username --server.password password

and got the error:

Connected to ArangoDB 'http+tcp://127.0.0.1:8529', version 3.3.4, database: 'newDB', username: 'svm'
----------------------------------------
database:               newDB
collection:             edges_temp4
from collection prefix: hashtags
to collection prefix:   hashtags
create:                 yes
source filename:        E:\current_crawler\Data\edges\edges_for_graph_ep_83.csv
file type:              csv
quote:                  "
separator:
threads:                2
connect timeout:        5
request timeout:        1200
----------------------------------------
Starting CSV import...
2018-08-22T16:49:21Z [3012] INFO processed 1998848 bytes (3%) of input file
2018-08-22T16:49:21Z [3012] INFO processed 3964928 bytes (6%) of input file
2018-08-22T16:49:21Z [3012] INFO processed 5963776 bytes (9%) of input file
2018-08-22T16:49:22Z [3012] INFO processed 7929856 bytes (12%) of input file
2018-08-22T16:49:22Z [3012] INFO processed 9928704 bytes (15%) of input file
2018-08-22T16:49:22Z [3012] INFO processed 11894784 bytes (18%) of input file
2018-08-22T16:49:22Z [3012] INFO processed 13893632 bytes (21%) of input file
2018-08-22T17:09:23Z [3012] ERROR Caught exception Expecting item during import

What does that error mean? The file is OK: it contains no empty lines and no duplicated _keys. Moreover, when I rebooted the system and tried again, the error did not occur and the import completed successfully.
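A check of this kind (no empty lines, no duplicated `_key` values, four fields per row) can be sketched in Python as follows. This is only a minimal sketch under assumptions: the function name `check_edges_csv` and the shape of the returned dictionary are hypothetical, and it assumes the first column (`h1_h2`) is the one translated to `_key`, as in the command above.

```python
import csv
from collections import Counter

# Header taken from the sample file shown in the question.
EXPECTED_HEADER = ["h1_h2", "hashtag1", "hashtag2", "coccurrence"]


def check_edges_csv(lines):
    """Scan an iterable of CSV text lines and report basic problems.

    Hypothetical helper: reports a wrong header, blank lines, rows with an
    unexpected number of fields, and values of the first column (used as
    _key) that occur more than once.
    """
    problems = {
        "bad_header": None,
        "blank_lines": [],
        "bad_field_count": [],
        "duplicate_keys": [],
    }
    key_counts = Counter()
    reader = csv.reader(lines)

    header = next(reader, None)
    if header != EXPECTED_HEADER:
        problems["bad_header"] = header

    for row in reader:
        line_no = reader.line_num
        # csv.reader yields an empty list for a completely blank line.
        if not row or all(not field.strip() for field in row):
            problems["blank_lines"].append(line_no)
            continue
        if len(row) != len(EXPECTED_HEADER):
            problems["bad_field_count"].append(line_no)
            continue
        key_counts[row[0]] += 1

    problems["duplicate_keys"] = sorted(k for k, n in key_counts.items() if n > 1)
    return problems


# Usage (path as in the question):
# with open(r"E:\current_crawler\Data\edges\edges_for_graph_ep_83.csv", newline="") as f:
#     print(check_edges_csv(f))
```

If every list in the result is empty and `bad_header` is `None`, the file passes these particular checks. That says nothing about server-side problems, which such a check cannot detect.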

I'd appreciate all the help I can get. Environment:

Storage Engine: RocksDB

Deployment Mode: Single Server

Configuration: Intel Xeon X5650 x2, 32GB RAM

Operating System: Windows 10

  • Can you please provide a bit of context about your setup, e.g. did you import into a single instance or into a cluster, what was the exact command line used to invoke arangoimp, and which storage engine, replication factor and number of shards did you use for the target collection? Having that info here may help. By the way, I see you are using ArangoDB 3.3.4, which was released more than 5 months ago. You may retry with ArangoDB 3.3.14, which is the latest release, to see if this is an already fixed problem. Thanks! – stj Aug 22 '18 at 18:21
  • Thank you for your answer! I added some information about the setup. Exact command-line is in the post, the second 'code' fragment. – elfinorr Aug 22 '18 at 18:51
  • Thanks for the update! It's really hard to tell what caused this, because the error message comes from a global exception catch point in arangosh. It can be caused by some processing error inside arangosh, but could also be caused by some server-side error. If it were caused by a processing error in arangosh, it should be reproducible, however, and you wrote that it isn't. So it's probably some server-side error. Have you found any issues (errors/warnings) in the server logs around the date/time the import was running? Is it reproducible with 3.3.14? – stj Aug 23 '18 at 19:13
  • Still wondering if this is reproducible with a newer version of ArangoDB...? – stj Dec 05 '18 at 16:42
