I am migrating data from an RDS Postgres DB to S3 using AWS DMS.
The DMS task type is full load plus CDC. Say I have some data in a Postgres table named employee, e.g.:
| emp_id | emp_name |
|---|---|
| 1 | John |
| 2 | Angel |
When the task is initially created, a full load runs and a LOAD00000____.parquet file is created in the S3 target location. I then insert another row into the table:
| emp_id | emp_name |
|---|---|
| 3 | Ram |
A CDC action then happens, and a date folder (20220101/) with a parquet file in it gets created.
What I am actually trying to do is retain the table data in the target despite a truncate/drop operation in Postgres followed by a table reload.
"ChangeProcessingDdlHandlingPolicy": {
"HandleSourceTableDropped": false,
"HandleSourceTableTruncated": false,
"HandleSourceTableAltered": false
}
I have this configuration in my task settings.
I expected that when I truncate/drop the table in Postgres and then do a reload, the target data would not be truncated/dropped. However, irrespective of the values I give to HandleSourceTableDropped and HandleSourceTableTruncated, the target folders get deleted.
My task_setting.json file also has:
"TargetTablePrepMode": "TRUNCATE_BEFORE_LOAD",
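Put together, the two fragments quoted above would sit in the task settings roughly as follows. This is only a minimal sketch: it assumes the usual DMS task settings layout, in which TargetTablePrepMode is nested under FullLoadSettings, and it omits every other key. The names task_settings and settings_json are illustrative.

```python
import json

# Sketch of the relevant part of task_setting.json.
# Assumption: TargetTablePrepMode lives under "FullLoadSettings";
# all other task settings are left out.
task_settings = {
    "FullLoadSettings": {
        "TargetTablePrepMode": "TRUNCATE_BEFORE_LOAD",
    },
    "ChangeProcessingDdlHandlingPolicy": {
        "HandleSourceTableDropped": False,
        "HandleSourceTableTruncated": False,
        "HandleSourceTableAltered": False,
    },
}

# Serialize to the JSON form that would be saved in the settings file.
settings_json = json.dumps(task_settings, indent=2)
print(settings_json)
```

Laid out this way, the full-load setting and the DDL handling policy are separate parts of the file.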
Questions:
- Why does the S3 folder get deleted on reload, irrespective of the values (true/false) that I provide for the keys in ChangeProcessingDdlHandlingPolicy?
- What does the ChangeProcessingDdlHandlingPolicy configuration object mean?