3

I'm building an AWS pipeline to insert CSV files from S3 to an RDS MySQL DB. The problem I'm facing is that when it attempts to load the file, it treats blanks as empty strings instead of NULLs. For example, Line 1 of the CSV is:

"3","John","Doe",""

Where the value is an integer in the MySQL table, and of course the error in the pipeline is:

Incorrect integer value: '' for column 'col4' at row 1

I was researching the jdbc MySQL paramaters to modify the connection string:

jdbc:mysql://my-rds-endpoint:3306/my_db_name?    
jdbcCompliantTruncation=false

jdbcCompliantTruncationis is just an example, is there any of these parameters that can help me insert those blanks as nulls?

Thanks!

EDIT: A little context, the CSV files are UNLOADS from redshift, so the blanks are originally NULLs when I put them in S3.

Prateek Naik
  • 2,522
  • 4
  • 18
  • 38
rodrigocf
  • 1,951
  • 13
  • 39
  • 62

3 Answers3

1

the csv files are UNLOADS from redshift

Then look at the documentation for the Redshift UNLOAD command and add the NULL AS option. For example:

NULL AS 'NULL'

Mark B
  • 183,023
  • 24
  • 297
  • 295
  • Hi! I tried this and the problem is that in the CSV I'll have NULL but the pipeline interprets it as 'NULL' when doing the insert so I keep receiving the same error – rodrigocf Apr 06 '16 at 21:35
  • Incorrect integer value: 'NULL' for column 'col4' at row 1 – rodrigocf Apr 06 '16 at 21:36
  • Right, that was just an example. You need to figure out what the pipeline considers null, and use that in the unload command. It might be some version of the Null character: https://en.wikipedia.org/wiki/Null_character – Mark B Apr 06 '16 at 22:18
1

use null as '\N' converts blank to null

unload ('SELECT * FROM table')
to 's3://path' credentials 
'aws_access_key_id=sdfsdhgfdsjfhgdsjfhgdsjfh;aws_secret_access_key=dsjfhsdjkfhsdjfksdhjkfsdhfjkdshfs'
delimiter '|' null as '\\N' ;
rahul
  • 880
  • 3
  • 14
  • 25
0

I resolve this issue using the NULLIF function:

insert into table values (NULLIF(?,''),NULLIF(?,''),NULLIF(?,''),NULLIF(?,''))