5

I am trying to delete two AWS DMS database migration tasks that are in the failed state. I have tried more than 30 times from the console: each time it shows a green banner saying the tasks are deleted, but the tasks are still there. Deleting the DMS tasks with the AWS CLI gives the same result. The event log for DMS shows the following:

dms-copy-task   replication-task    May 3, 2021, 22:48:37 (UTC-04:00)   Failed to clean task resources for task dms-copy-task during task deletion
dms-copy-task   replication-task    May 3, 2021, 22:47:06 (UTC-04:00)   Replication task has been deleted. 

Could someone please tell me how I can get rid of these tasks at this point? My objective is to clean up the entire DMS replication instance, but I cannot do that without deleting the tasks first.

drobin
  • my Google search came to this. I'm experiencing the same issue. Any luck for you @drobin? – Paul Lam May 10 '21 at 03:26
  • 2
    Hi Paul, I believe this is a bug with AWS DMS. I was able to clean up my environment by creating two new dummy RDS instances. I then reconfigured the DMS endpoints for the affected DMS tasks, tested the endpoints to make sure they worked, restarted the DMS tasks, and waited for them to reach a running state. At that point, I was able to stop the tasks and then delete them. Hope this works for you also… it was painful. – drobin May 10 '21 at 21:33

2 Answers

8

The DMS task cannot be deleted because the source PostgreSQL database is unreachable. This happens when a DMS task with a PostgreSQL-compatible source tries to delete the replication slots on the source and fails. To work around it, change the following DMS task setting:

Change:
    > "FailTaskWhenCleanTaskResourceFailed": true
to
    > "FailTaskWhenCleanTaskResourceFailed": false

Note that "FailTaskWhenCleanTaskResourceFailed" is set to true by default so that a replication slot is not left active on the source database (Aurora PostgreSQL), which would cause WAL files to grow. It is therefore recommended to check the replication slots on the source manually and verify that they have been removed, so that unused replication slots do not consume space on the source RDS instance.
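
If the source is still reachable, the manual slot check could look roughly like the sketch below. It assumes you run the listing query yourself (psql or any PostgreSQL client) and pass the resulting rows in; the slot names in the usage note are hypothetical. `pg_replication_slots` and `pg_drop_replication_slot` are standard PostgreSQL, but which slots belong to DMS is something you have to confirm before dropping anything.

```python
# Query to run on the source database to list replication slots.
LIST_SLOTS_SQL = "SELECT slot_name, active FROM pg_replication_slots;"


def drop_statements_for_inactive_slots(rows):
    """Build DROP statements for slots that are not active.

    rows: iterable of (slot_name, active) tuples, i.e. the result of
    LIST_SLOTS_SQL. Slot names may only contain lowercase letters,
    digits and underscores, so plain quoting is sufficient here.
    Review the output before running it: this sketch does not know
    which slots were created by DMS.
    """
    return [
        f"SELECT pg_drop_replication_slot('{name}');"
        for name, active in rows
        if not active
    ]
```

For example, `drop_statements_for_inactive_slots([("old_dms_slot", False), ("live_slot", True)])` yields a single statement for `old_dms_slot` and leaves the active slot alone.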

Once that is taken care of, set "FailTaskWhenCleanTaskResourceFailed" to false by following the steps below. This causes DMS to evaluate the delete process differently, allowing the DeleteReplicationTask API call to complete without encountering the error "Drop the slot manually".

Modifying the Task JSON:

1. Select the DMS Task
2. Choose Modify
3. Scroll to the "Task settings" section
4. Choose the "JSON editor"
5. Scroll towards the bottom and set the following:
    "FailTaskWhenCleanTaskResourceFailed": false
6. Save the Task settings
7. Delete the DMS task.
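
The same steps can probably be scripted instead of clicked through. The following is a minimal sketch, not a tested procedure: it assumes boto3 is installed and configured, that the setting lives under the "ErrorBehavior" section of the task settings JSON (an assumption based on where the other error-handling settings sit), and that `task_arn` is the ARN of your stuck task.

```python
import json


def disable_fail_on_cleanup(settings_json: str) -> str:
    """Return the task settings JSON with the flag set to false.

    Assumption: the flag belongs under "ErrorBehavior"; other
    sections of the settings are left untouched.
    """
    settings = json.loads(settings_json)
    error_behavior = settings.setdefault("ErrorBehavior", {})
    error_behavior["FailTaskWhenCleanTaskResourceFailed"] = False
    return json.dumps(settings)


def modify_task_settings(task_arn: str) -> None:
    """Fetch the current settings of one task and push the modified copy."""
    import boto3  # imported lazily so the helper above works without it

    dms = boto3.client("dms")
    tasks = dms.describe_replication_tasks(
        Filters=[{"Name": "replication-task-arn", "Values": [task_arn]}]
    )["ReplicationTasks"]
    new_settings = disable_fail_on_cleanup(tasks[0]["ReplicationTaskSettings"])
    dms.modify_replication_task(
        ReplicationTaskArn=task_arn,
        ReplicationTaskSettings=new_settings,
    )


def delete_task(task_arn: str) -> None:
    """Delete the task; call this only after the modification has finished."""
    import boto3

    boto3.client("dms").delete_replication_task(ReplicationTaskArn=task_arn)
```

The modification is asynchronous, so wait until the task has left the "modifying" status before calling `delete_task`, just as you would wait for the save to finish in the console.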

Source & Credit:

AWS Tech Support Team

Arjun Mohnot
  • Thanks for posting this, Arjun. In my case, the database was gone, so there were no write ahead logs (WAL) or pglogical replication slots to worry about. Just stuck DMS tasks. – drobin Jul 10 '21 at 14:00
  • I got the same issue recently, and I can confirm that is the right answer – Francisco López Jul 23 '21 at 05:07
1

See comments above. I was able to clean up DMS by moving the DMS tasks back to a working state through re-configuration, then stopping them and then deleting them.

drobin