0

I am loading from a mongodb table to mysql table using kettle 4.4 one of the column at some row gives an error : Incorrect string value: '\xF0\x9F\x98\x8D' for column. I checked some of the forums and modified the column to utf8mb4_general_ci , but still it gives the same error.

The table has 12 lakhs + records and I am not able to identify which row is giving the error. In the table there is _id which is object id. I tried to load with order by on _id column and when it stopped with the error, then tried to load with > max(_id) from my previous load. With this it gives duplicates and am not able to identify which row is giving the error.

1) Is there a way to identify which row is giving the error or to solve this error?

2) In the table output step, there is a option called insert ignore errors, i checked it and tried to run. Now it does not give me any errors, but on kettle it shows no. of records written as 1192628 but the table has only 1192626 records. So 2 records are missing. Does ignore insert errors option ignores the lines with errors? If so where are these errors logged ? I tried to read some documents but none of them have this info.

It would be great if anyone of you could help me.

Thank you, Deepthi

Deepthi
  • 79
  • 1
  • 7

2 Answers2

2

In your case it looks like you can re-run the load as often as you want. In a Table Output step, checking 'Ignore Insert Errors' does exactly that. The errors are ignored and not logged.

What I would do in your case is uncheck 'Ignore Insert Errors', and connect an error output from your Table Output and run it to a text file or xml file or another table output. Rerun the load, and the two rows causing the error will be in your error output.

Brian.D.Myers
  • 2,448
  • 2
  • 20
  • 17
  • Thank you it helped .. any solution for the error Incorrect string value: '\xF0\x9F\x98\x8D' for column.? – Deepthi Jul 30 '13 at 08:17
0

it seems you are getting binary data, try this to convert to String: add a new select/rename values before the load in db table output. Click on select/rename step, go to the tab Metadata, select your input field, type: String, Binary to Normal: Y, Encoding: UTF-8 (in my case) Cross fingers!

jacktrade
  • 3,125
  • 2
  • 36
  • 50