
I am using AWS Data Pipeline to copy SQL data to a CSV file in AWS S3. Some of the data contains a comma inside a quoted string, e.g.:

{"id":123455,"user": "some,user" .... }

While importing this CSV data into DynamoDB, the import treats that comma as the end of the field value. This results in errors, because the data given in the mapping no longer matches the schema we have provided.

My solution is to separate the CSV fields with a ; (semicolon) while copying the data from SQL to the S3 bucket. That way a quoted value containing a comma is kept as a single field, and the data would look like this (note the blank space inside the quoted string after the comma):

{"id" : 12345; "user": "some, user";....}

My stack looks like this:

  - database_to_s3:
      name: data-to-s3
      description: Dumps data to s3.
      dbRef: xxx
      selectQuery: >
        select * FROM USER;
      s3Url: '#{myS3Bucket}/xxxx-xxx/'
      format: csv

Is there any way I can specify the delimiter so that the fields are separated with a ; (semicolon)?

Thank you!

Farukh Khan
  • I have edited your question to make it more readable – maslick Dec 13 '21 at 08:10
  • 1
    The question mentions CSV but data samples like `{"id" : 12345; "user": "some, user";....}` are all JSON and not CSV. – ElmoVanKielmo Dec 13 '21 at 08:13
  • @ElmoVanKielmo true, I am thinking of a better title... – maslick Dec 13 '21 at 08:17
  • @Khan, what's your stack snippet referring to? Is it a CloudFormation template (e.g. AWS::DataPipeline::Pipeline PipelineObject) or something else? Please elaborate – maslick Dec 13 '21 at 10:02
  • @Khan another question: are you exporting from RDS into S3? Or into DynamoDB? Or do you first export data from RDS into S3, and then from S3 into DynamoDB? – maslick Dec 13 '21 at 10:05

1 Answer


Give AWS Glue a try: it lets you marshal (transform) your data before inserting it into DynamoDB.
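
A minimal PySpark sketch of such a Glue job could look roughly like this (the S3 path and DynamoDB table name are placeholders, not values from the question):

    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Read the CSV export from S3. quoteChar tells the reader that a quoted
    # value like "some, user" is one field even though it contains a comma.
    users = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": ["s3://my-bucket/xxxx-xxx/"]},  # placeholder path
        format="csv",
        format_options={"withHeader": True, "separator": ",", "quoteChar": '"'},
    )

    # Write the parsed records into DynamoDB.
    glue_context.write_dynamic_frame_from_options(
        frame=users,
        connection_type="dynamodb",
        connection_options={
            "dynamodb.output.tableName": "User",  # placeholder table name
            "dynamodb.throughput.write.percent": "1.0",
        },
    )

Since Glue's CSV reader honours the quote character, the comma delimiter can stay as it is; if the export really does use semicolons, changing "separator" to ";" would be the only adjustment.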

MrOverflow