0

I get an error when bulk-inserting rows with Pentaho Data Integration. I am using PostgreSQL.

ERROR: invalid byte sequence for encoding "UTF8": 0x00 
Md Sirajus Salayhin

3 Answers

2

In UTF8, 0x00 is the null character. You can use a "Modified Javascript" step (listed in PDI as "Modified Java Script Value") and strip it with a regular-expression replacement, as follows:

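// Remove every null character (U+0000) from a string field.
function removeNull(e) {
    if (e != null)
        return e.replace(/\0/g, '');
    else
        return '';  // treat a missing value as an empty string
}

// fieldToRemoveNullChars is the incoming field that contains the null characters.
var replacedString = removeNull(fieldToRemoveNullChars);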

Select the new field as the output of the Modified Javascript step, and voilà! I used to have this problem with incoming AS400 data.
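If several fields can carry null characters, the same replacement can be applied to each of them in one pass. The snippet below is only a sketch in plain modern JavaScript (tested outside PDI, for example in Node.js; the script engine inside the step may not accept `const` or `Object.entries`). The helper name `stripNullsFromRow` and the sample row are hypothetical and used for illustration only.

```javascript
// Hypothetical helper: return a copy of the row in which every string value
// has its null characters (U+0000) removed. Non-string values are kept as-is.
function stripNullsFromRow(row) {
  const cleaned = {};
  for (const [name, value] of Object.entries(row)) {
    cleaned[name] = typeof value === 'string' ? value.replace(/\0/g, '') : value;
  }
  return cleaned;
}

// Made-up sample row with two trailing null characters in one field.
const row = { id: 7, name: 'ACME\u0000\u0000', city: 'Dhaka' };
const cleanedRow = stripNullsFromRow(row);
console.log(JSON.stringify(cleanedRow)); // {"id":7,"name":"ACME","city":"Dhaka"}
```

Building a new object instead of changing the row in place keeps the original values available for comparison while debugging.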

ChoCho
0

PostgreSQL is very strict about the content of text fields and doesn't allow 0x00 in UTF8-encoded fields. You should fix your input data.

One possible solution: https://superuser.com/questions/287997/how-to-use-sed-to-remove-null-bytes
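If the input is a file and sed is not at hand, the same cleanup can be sketched in JavaScript. This assumes the file really is UTF-8 (or another single-byte-compatible encoding), where a 0x00 byte is never part of a multi-byte character; on UTF-16 data, dropping zero bytes would corrupt the text. The helper name `stripNullBytes` and the file names in the comment are hypothetical.

```javascript
// Hypothetical helper: return a copy of `bytes` with every 0x00 byte dropped.
// Accepts anything iterable over byte values (Uint8Array, Node.js Buffer, array).
function stripNullBytes(bytes) {
  return Uint8Array.from(Array.from(bytes).filter((b) => b !== 0));
}

// Made-up sample: the bytes of "a", NUL, "b", NUL.
const sample = Uint8Array.from([0x61, 0x00, 0x62, 0x00]);
const cleanedBytes = stripNullBytes(sample);
console.log(Array.from(cleanedBytes)); // [ 97, 98 ]

// Possible use in Node.js (file names are placeholders):
// const fs = require('fs');
// fs.writeFileSync('clean.csv', stripNullBytes(fs.readFileSync('input.csv')));
```

This reads the whole file into memory, so it is only reasonable for inputs of moderate size.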

Pavel Stehule
0

Finally I got the solution:

  • In Table Input, check the "Enable lazy conversion" option
  • In the "Select Values" step, select all fields on the "Metadata" tab and set the encoding to "UTF-8" for all of them.
Md Sirajus Salayhin