Error while importing beers database using CSV

Question

I have the latest community edition 2.2.17. While importing the beers database using csv I am getting error while importing the beers. (categories, styles etc. all got imported fine). The errors are like:

    OrientDB etl v.2.2.17 (build 2.2.x@rd9bace82ea8437117fd48114fc255e791056014b; 2017-02-16 17:20:27+0000) www.orientdb.com
[csv] INFO column types: {last_mod=ANY, abv=ANY, filepath=ANY, name=ANY, cat_id=ANY, upc=ANY, id=ANY, brewery_id=ANY, style_id=ANY, descript=ANY, ibu=ANY, srm=ANY}
BEGIN ETL PROCESSOR
[file] INFO Reading from file C:/Database_studies/nosql/orientdb/Import/OrientDB_self_study_files/beerdb/openbeerdb_csv/beers.csv with encoding UTF-8
Started execution with 1 worker threads
[csv] ERROR Error on converting row 1 field 'last_mod' , value '2010-07-22 20:00:20' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'abv' , value '4.5' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'filepath' , value '' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'name' , value 'Hocus Pocus' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'cat_id' , value '11' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'upc' , value '0' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'id' , value '1' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'brewery_id' , value '812' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'style_id' , value '116' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'descript' , value 'Our take on a classic summer ale.  A toast to weeds, rays, and summer haze.  A light, crisp ale for mowing lawns, hitting lazy fly balls, and communing with nature, Hocus Pocus is offered up as a summer sacrifice to clodless days.

Its malty sweetness finishes tart and crisp and is best apprediated with a wedge of orange.' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'ibu' , value '0' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 1 field 'srm' , value '0' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'last_mod' , value '2010-07-22 20:00:20' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'abv' , value '6.7' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'filepath' , value '' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'name' , value 'Grimbergen Blonde' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'cat_id' , value '-1' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'upc' , value '0' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'id' , value '2' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'brewery_id' , value '264' (class:java.lang.String) to type: ANY
[csv] ERROR Error on converting row 2 field 'style_id' , value '-1' (class:java.lang.String) to type: ANY

The command I used to import is same as given in the documentation: ./oetl.sh /temp/openbeer/beers.json

(with the directory name changed to the actual one in my system).

Can someone please suggest.

Here is my beers.json file:

    {
"config" : { "haltOnError": false },
"source": { "file": { "path": "C:/Database_studies/nosql/orientdb/Import/OrientDB_self_study_files/beerdb/openbeerdb_csv/beers.csv" } },
"extractor": { "csv": { "columns": ["id","brewery_id","name","cat_id","style_id","abv","ibu","srm","upc","filepath","descript","last_mod"],
"columnsOnFirstLine": true } },
"transformers": [
{ "vertex": { "class": "Beer" } },
{ "edge": { "class": "HasCategory", "joinFieldName": "cat_id", "lookup": "Category.id" } },
{ "edge": { "class": "HasBrewery", "joinFieldName": "brewery_id", "lookup": "Brewery.id" } },
{ "edge": { "class": "HasStyle", "joinFieldName": "style_id", "lookup": "Style.id" } }
],
"loader": {
"orientdb": {
"dbURL": "plocal:C:/orientdb_install_031217/orientdb-community-2.2.17/databases/openbeerdb",
"dbType": "graph",
"classes": [
{"name": "Beer", "extends": "V"},
{"name": "HasCategory", "extends": "E"},
{"name": "HasStyle", "extends": "E"},
{"name": "HasBrewery", "extends": "E"}
], "indexes": [
{"class":"Beer", "fields":["id:integer"], "type":"UNIQUE" }
]
}
}
}

Thanks, DBuserN

score 0 · Answer 1 · answered Mar 12 '17 at 15:40

0

My suggestion is to explicate the types for each column

"extractor": { "csv": { "columns": ["id:integer","brewery_id:integer","name:string","cat_id:integer","style_id:integer","abv:integer","ibu:integer","srm:integer","upc:integer","filepath:string","descript:string","last_mod:dateTime"]

Check the CSV extractor documentation:

http://orientdb.com/docs/last/Extractor.html

And be sure the default dateTimeFormat is right for your input file.

answered Mar 12 '17 at 15:40

Roberto Franchini

1,016
7
13

Hi Roberto, Thanks a lot...I will try out and update tomorrow....this looks like the exact reason for the issue....also note that the datetime I have is **not** the default one ...it is like this: **7/22/2010 8:00:20** so what format should I used for this purpose? Can you please suggest. (I don't know how to line break here so all in same line, sorry). – dbusern Mar 12 '17 at 16:25
Sorry to correct myself..the last_mod is stored exactly like this: **"2010-07-22 20:00:20"**. Can you pls. suggest how it should be handled for the ETL? – dbusern Mar 12 '17 at 16:48
The default format is yyyy-MM-dd HH:mm, you should provide yyyy-MM-dd HH:mm:ss – Roberto Franchini Mar 12 '17 at 19:58

Error while importing beers database using CSV

1 Answers1