
I am working on a requirement where I am doing a multipart upload of a CSV file from an on-prem server to an S3 bucket.

To achieve this, I use AWS Lambda to create a presigned URL and upload the CSV file through that URL. Now, once the file is in AWS S3, I want it moved into an AWS RDS Oracle DB. Initially I was planning to use AWS Lambda for this as well.
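A minimal sketch of the presigned-URL part (the bucket and key names here are placeholders, not my real ones):

```python
import json
import boto3

s3 = boto3.client("s3")

def lambda_handler(event, context):
    # Placeholder bucket/key; the real names come from my configuration.
    bucket = "my-upload-bucket"
    key = event.get("filename", "upload.csv")

    # Presigned URL the on-prem server can PUT the CSV to (valid for 1 hour).
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=3600,
    )
    return {"statusCode": 200, "body": json.dumps({"upload_url": url})}
```

(A single presigned PUT works for an object of this size; a true multipart upload would need a separate presigned URL per part.)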

So once the file is in S3, an S3 event triggers a Lambda function, and that Lambda pushes the file to RDS. But the issue with this approach is the file size (600 MB), which is too large for Lambda to handle comfortably.

I am looking for some other way: whenever a file is uploaded to S3, it should trigger some AWS service, and that service should push the CSV file to RDS. I have gone through AWS DMS/Data Pipeline, but I am not able to find any way to automate this migration.

I need to automate this migration on every S3 upload, in a way that is also cost-effective.

Nimmo

1 Answer


Set up S3 integration and build SPROCs to help automate the load. Details found here.
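As a rough sketch of the integration step (the instance identifier and role ARN below are placeholders), you associate an IAM role that can read your bucket with the DB instance under the S3_INTEGRATION feature; the instance's option group also needs the S3_INTEGRATION option added:

```python
import boto3

rds = boto3.client("rds")

# Placeholder identifiers: use your own instance name and an IAM role
# that grants access to the bucket holding the CSV files.
rds.add_role_to_db_instance(
    DBInstanceIdentifier="my-oracle-instance",
    RoleArn="arn:aws:iam::123456789012:role/rds-s3-integration-role",
    FeatureName="S3_INTEGRATION",
)
```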

UPDATE:

Looks like you don't even need to create a SPROC. You can just use the RDS procedure as outlined here. You would then create an event-driven Lambda function that is triggered on a given S3 event (e.g. on object PUT, POST, COPY, etc.) and receives the S3 metadata required to access the object. Here is a simple Python example of what that Lambda and config might look like. You would then use the metadata passed on the trigger event, as outlined in the Python example, to dynamically build your procedure call and execute it. You can also add whatever ensuing workflow logic meets your requirements (TASK_ID fetch and operational handling, monitoring, etc.) to the same Lambda function, or separate those concerns by adding additional Lambdas. Hope this helps!
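As a rough sketch (the connection details, environment variables, and the packaged Oracle client are assumptions on my part, not something RDS provides out of the box), the event-driven Lambda could look like this:

```python
import os
import cx_Oracle  # requires Oracle Instant Client packaged in a Lambda layer

def lambda_handler(event, context):
    # S3 metadata for the object that fired the trigger (PUT/POST/COPY).
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = record["s3"]["object"]["key"]

    # Placeholder connection details supplied via environment variables.
    dsn = cx_Oracle.makedsn(
        os.environ["DB_HOST"], 1521, service_name=os.environ["DB_SERVICE"]
    )
    conn = cx_Oracle.connect(os.environ["DB_USER"], os.environ["DB_PASSWORD"], dsn)
    try:
        cur = conn.cursor()
        # Ask RDS to pull the uploaded object into the DATA_PUMP_DIR directory.
        cur.execute(
            """SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(
                   p_bucket_name    => :p_bucket,
                   p_s3_prefix      => :p_key,
                   p_directory_name => 'DATA_PUMP_DIR')
               FROM DUAL""",
            p_bucket=bucket,
            p_key=key,
        )
        task_id = cur.fetchone()[0]
        print(f"Started S3 download task {task_id} for s3://{bucket}/{key}")
    finally:
        conn.close()
```

The returned TASK_ID is what you would use for the follow-up handling and monitoring mentioned above (e.g. by reading the task log with `rdsadmin.rds_file_util.read_text_file`).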

benfarr
  • Can I add an S3 event to this to trigger the migration on every single upload? – Nimmo Jun 10 '20 at 09:23
  • Also, where should I place the SPROCs to automate this? – Nimmo Jun 10 '20 at 09:42
  • 1
    Thanks @benfarr . I tried this, it worked. But still i need to call the stored Proc manually to migrate the data. – Nimmo Jun 10 '20 at 11:59
  • No problem, @Nimmo. I just updated the response, which should help you work through the rest. If you need to add a custom SPROC to complete additional work, you can trigger that custom SPROC from your Lambda function as well. – benfarr Jun 10 '20 at 12:04
  • Thank you @benfarr, it's working. I am getting the task id in the response while running the "SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(p_bucket_name =>'destintion', p_directory_name => 'DATA_PUMP_DIR') AS TASK_ID FROM DUAL" query. Can I dump this data into an RDS table? Also, I am not able to see the data in SQL; I am just getting the task id. – Nimmo Jun 10 '20 at 13:05
  • You're welcome, @Nimmo. Yes, you can load the contents of the file into a table. Depending on the export format, there are a number of ways of accomplishing that. You can read the contents of the file with the following (maybe not the best idea with this file size): `SELECT text FROM table(rdsadmin.rds_file_util.read_text_file('BDUMP','dbtask-.log'));` `DBMS_DATAPUMP` allows you to import the file into the target instance schema. Lots of options with this package, as outlined [here](https://docs.oracle.com/database/121/ARPLS/d_datpmp.htm#ARPLS356). – benfarr Jun 10 '20 at 14:40
  • @Nimmo if these are all standard .csv files, [this](https://docs.oracle.com/cd/E17781_01/server.112/e18804/impexp.htm#BABCJCBD) might be a better/less convoluted option for your use case (a rough sketch of that kind of approach follows this thread). – benfarr Jun 10 '20 at 15:29
  • Is there a procedure like "SELECT rdsadmin.rdsadmin_s3_tasks.download_from_s3(p_bucket_name =>'my-s3-bucket', p_directory_name => 'DATA_PUMP_DIR') AS TASK_ID FROM DUAL", but where the destination is the table that should hold my source CSV data? I don't want to create a directory in RDS. – Nimmo Jun 11 '20 at 10:44
  • I am still facing this issue. I need to copy the data from the file into the table, not the file itself. This procedure creates a file in a directory; I want the data in an Oracle table. – Nimmo Jun 16 '20 at 11:27
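Following up on the last few comments: `download_from_s3` only copies the file into a directory on the instance. One common way to get the rows into a table from there is an external table over that file, followed by an `INSERT ... SELECT` into the real target. A rough sketch, with hypothetical table, column, and file names:

```python
import os
import cx_Oracle

# Placeholder connection details and object names; adjust to your schema.
dsn = cx_Oracle.makedsn(
    os.environ["DB_HOST"], 1521, service_name=os.environ["DB_SERVICE"]
)
conn = cx_Oracle.connect(os.environ["DB_USER"], os.environ["DB_PASSWORD"], dsn)
cur = conn.cursor()

# 1. External table that reads the CSV sitting in DATA_PUMP_DIR.
cur.execute("""
    CREATE TABLE staging_csv_ext (
        id    NUMBER,
        name  VARCHAR2(100),
        value NUMBER
    )
    ORGANIZATION EXTERNAL (
        TYPE ORACLE_LOADER
        DEFAULT DIRECTORY DATA_PUMP_DIR
        ACCESS PARAMETERS (
            RECORDS DELIMITED BY NEWLINE
            SKIP 1
            FIELDS TERMINATED BY ','
            MISSING FIELD VALUES ARE NULL
        )
        LOCATION ('myfile.csv')
    )
    REJECT LIMIT UNLIMITED""")

# 2. Copy the rows from the external table into the real target table.
cur.execute("INSERT INTO target_table SELECT * FROM staging_csv_ext")
conn.commit()

cur.close()
conn.close()
```

The same two statements could live in the Lambda above (or in a stored procedure it calls) once the download task has finished, so the whole path from S3 upload to populated table stays event-driven.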