
I'm running a task that migrates all data from PostgreSQL 10.4 to RDS PostgreSQL 10.4. I am not able to migrate tables that have a jsonb column; after the error, the whole table is suspended. The table contains only 449 rows.

I have set the following error policy, yet the whole table is still suspended:

"DataErrorPolicy": "IGNORE_RECORD",
"DataTruncationErrorPolicy": "IGNORE_RECORD",
"DataErrorEscalationPolicy": "SUSPEND_TABLE",
"DataErrorEscalationCount": 1000,

My expectation is that the whole table should be transferred, and that any record with bad JSON can be ignored. I don't know why it's giving the error 'invalid input syntax for type json'; I have checked all the JSON values and they are all valid.
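One pitfall when hand-checking JSON: many validators, including Python's default `json.loads`, accept non-standard tokens such as NaN and Infinity, which PostgreSQL rejects with exactly this 'invalid input syntax for type json' error. A quick stdlib sketch of a stricter check (fetching the actual rows from the database is not shown):

```python
import json

def strict_json_ok(text):
    """Return True only if text is JSON that a strict parser would accept.

    Python's json.loads is laxer than PostgreSQL's json parser: by default
    it accepts NaN, Infinity, and -Infinity, which Postgres rejects.
    Rejecting those constants makes the check closer to what jsonb enforces.
    """
    def reject_constant(name):
        raise ValueError(f"non-standard JSON constant: {name}")

    try:
        json.loads(text, parse_constant=reject_constant)
        return True
    except ValueError:  # JSONDecodeError is a subclass of ValueError
        return False

# Passes an ordinary validator AND strict parsing:
print(strict_json_ok('{"a": 1}'))    # True
# Passes json.loads by default, but would fail in PostgreSQL:
print(strict_json_ok('{"a": NaN}'))  # False
```

If every value survives a check like this, the failure is more likely in how DMS handles the column (LOB treatment, see the answers below) than in the data itself.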

After debugging further, I found this error is being treated as a TABLE error, but why? That is why the table got suspended, since TableErrorPolicy is 'SUSPEND_TABLE'. Why is this error considered a table error instead of a record error?

Is the JSONB column not supported by DMS, and is that why we are getting the error below?

Logs:

2020-09-01T12:10:04 I: Next table to load 'public'.'TEMP_TABLE' ID = 1, order = 0 (tasktablesmanager.c:1817)
2020-09-01T12:10:04 I: Start loading table 'public'.'TEMP_TABLE' (Id = 1) by subtask 1. Start load timestamp 0005AE3F66381F0F (replicationtask_util.c:755)
2020-09-01T12:10:04 I: REPLICA IDENTITY information for table 'public'.'TEMP_TABLE': Query status='Success' Type='DEFAULT' Description='Old values of the Primary Key columns (if any) will be captured.' (postgres_endpoint_unload.c:191)
2020-09-01T12:10:04 I: Unload finished for table 'public'.'TEMP_TABLE' (Id = 1). 449 rows sent. (streamcomponent.c:3485)
2020-09-01T12:10:04 I: Table 'public'.'TEMP_TABLE' contains LOB columns, change working mode to default mode (odbc_endpoint_imp.c:4775)
2020-09-01T12:10:04 I: Table 'public'.'TEMP_TABLE' has Non-Optimized Full LOB Support (odbc_endpoint_imp.c:4788)
2020-09-01T12:10:04 I: Load finished for table 'public'.'TEMP_TABLE' (Id = 1). 449 rows received. 0 rows skipped. Volume transferred 190376. (streamcomponent.c:3770)
2020-09-01T12:10:04 E: RetCode: SQL_ERROR SqlState: 22P02 NativeError: 1 Message: ERROR: invalid input syntax for type json; Error while executing the query (ar_odbc_stmt.c:2648)
2020-09-01T12:10:04 W: Table 'public'.'TEMP_TABLE' (subtask 1 thread 1) is suspended (replicationtask.c:2471)


Vijay

3 Answers


The JSONB column must be nullable in the target DB.

Note: in my case, the error disappeared after I made the JSONB column nullable.

As mentioned in the AWS documentation:

In this case, AWS DMS treats JSONB data as if it were a LOB column. During the full load phase of a migration, the target column must be nullable.

https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.PostgreSQL.html#CHAP_Source.PostgreSQL.Prerequisites

https://aws.amazon.com/premiumsupport/knowledge-center/dms-error-null-value-column/
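If the target column carries a NOT NULL constraint, one workaround is to relax it for the duration of the full load and restore it afterwards. A sketch (the table name is taken from the logs above; the column name payload is hypothetical, so substitute your own):

```sql
-- On the target (RDS) database, before the full load:
ALTER TABLE public."TEMP_TABLE" ALTER COLUMN payload DROP NOT NULL;

-- After the full load completes, if the constraint is still wanted:
ALTER TABLE public."TEMP_TABLE" ALTER COLUMN payload SET NOT NULL;
```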

Vijay
  • Shouldn't this be mode-dependent? I can see how this would be required in Inline LOB mode, should the LOB exceed the size limit. But why would it be required in Full LOB mode? – Cpt. Senkfuss Apr 26 '23 at 07:54

AWS DMS treats the JSON data type in PostgreSQL as a LOB data type column. This means that the LOB size limitation when you use limited LOB mode applies to JSON data. For example, suppose that limited LOB mode is set to 4,096 KB. In this case, any JSON data larger than 4,096 KB is truncated at the 4,096 KB limit and fails the validation test in PostgreSQL.

Reference: AWS DMS - JSON data types being truncated

Update: You can tweak the error-handling task settings to skip erroneous rows by setting DataErrorPolicy to IGNORE_RECORD. This setting determines the action AWS DMS takes when there is an error related to data processing at the record level, such as a conversion error, an error in transformation, or bad data. The default is LOG_ERROR. With IGNORE_RECORD, the task continues and the data for that record is ignored.

Reference: AWS DMS - Error handling task settings
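For reference, these settings live in the ErrorBehavior block of the task settings JSON. Note the distinction that matters in this thread: DataErrorPolicy only covers record-level errors; once DMS classifies an error as a table-level error, TableErrorPolicy applies instead, which is why the table can still be suspended even with IGNORE_RECORD set. A fragment (the escalation counts are illustrative, not recommendations):

```json
"ErrorBehavior": {
    "DataErrorPolicy": "IGNORE_RECORD",
    "DataTruncationErrorPolicy": "IGNORE_RECORD",
    "DataErrorEscalationPolicy": "SUSPEND_TABLE",
    "DataErrorEscalationCount": 50,
    "TableErrorPolicy": "SUSPEND_TABLE",
    "TableErrorEscalationPolicy": "STOP_TASK",
    "TableErrorEscalationCount": 50
}
```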

Abdullah Khawer
    I am using full lob mode. Not using limited lob mode. – Vijay Sep 01 '20 at 05:51
  • Should I try increasing lob size with full lob mode ? – Vijay Sep 01 '20 at 08:29
  • @OpsterElasticsearchPro-Vijay, yeah. Also, to make sure that it is being caused due to LOB size limitation, try to import all the tables except the ones having JSON columns to confirm and load a single table having JSON columns with different configurations to save time and confirm the workaround. – Abdullah Khawer Sep 01 '20 at 10:09
  • i have increased size of lob with full lob mode. Still same issue. We are able to migrate non jsonb table. – Vijay Sep 01 '20 at 10:44
  • It looks like some row has wrong JSON data, and that's why it skips the whole table. Can we skip only the wrong row instead of the whole table? Thanks. – Vijay Sep 01 '20 at 11:35
  • @OpsterElasticsearchPro-Vijay, You can tweak the error handling task settings to skip erroneous rows by setting the value for DataErrorPolicy to IGNORE_RECORD which determines the action AWS DMS takes when there is an error related to data processing at the record level. Some examples of data processing errors include conversion errors, errors in transformation, and bad data. Default is LOG_ERROR. IGNORE_RECORD, the task continues and the data for that record is ignored. Reference: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Tasks.CustomizingTasks.TaskSettings.ErrorHandling.html – Abdullah Khawer Sep 01 '20 at 13:33
  • I have tried with DataErrorPolicy set to IGNORE_RECORD. Still the same issue. I have checked that the JSON data in the JSONB column is correct. Still the whole table is suspended. Don't know why? – Vijay Sep 02 '20 at 03:50
  • Raised question in aws forum, https://forums.aws.amazon.com/thread.jspa?threadID=327213 . You can see more details there. – Vijay Sep 02 '20 at 03:51
  • After debugging more, this error has been considered as TABLE error , but why ? thats why table suspended. – Vijay Sep 02 '20 at 08:06
  • @OpsterElasticsearchPro-Vijay, If it is a table error, then a row cannot be skipped. The whole table will be skipped. – Abdullah Khawer Sep 02 '20 at 08:18
  • Yes but why this error is considered as table error,ideally it should be record error. thanks – Vijay Sep 02 '20 at 08:45
  • @OpsterElasticsearchPro-Vijay, Agreed. I would suggest opening a ticket on the AWS support. They will surely help you and then you can answer the solution here by editing my answer or by posting another answer. – Abdullah Khawer Sep 02 '20 at 08:51
  • @OpsterElasticsearchPro-Vijay, I'm talking about the AWS Support service on the AWS Console, not the AWS forum. – Abdullah Khawer Sep 02 '20 at 09:34
  • You saved my butt, thank you so much for pointing out the fact that `AWS DMS treats the JSON data type in PostgreSQL as a LOB data type column.`. Thank you! – Frederik Nygaard Svendsen Feb 21 '22 at 16:32

You mentioned that you're migrating from PostgreSQL to PostgreSQL. Is there a specific reason to use AWS DMS?

AWS Docs: https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.PostgreSQL.html#CHAP_Source.PostgreSQL.Homogeneous

When you migrate from a database engine other than PostgreSQL to a PostgreSQL database, AWS DMS is almost always the best migration tool to use. But when you are migrating from a PostgreSQL database to a PostgreSQL database, PostgreSQL tools can be more effective.

...

We recommend that you use PostgreSQL database migration tools such as pg_dump under the following conditions:

  • You have a homogeneous migration, where you are migrating from a source PostgreSQL database to a target PostgreSQL database.
  • You are migrating an entire database.
  • The native tools allow you to migrate your data with minimal downtime.
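For a homogeneous migration like this one, the native-tools path the docs recommend boils down to a dump and restore. An illustrative sketch only: the hostnames, user, and database names below are placeholders, and jsonb columns round-trip through pg_dump without any LOB handling at all:

```shell
# Dump the source database in custom format (compressed, restorable in parallel).
pg_dump -Fc -h source-host -U app_user -d app_db -f app_db.dump

# Restore into the RDS instance; --no-owner avoids errors when the
# RDS master user differs from the original object owner.
pg_restore --no-owner -h my-rds-instance.abc123.us-east-1.rds.amazonaws.com \
           -U app_user -d app_db app_db.dump
```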
Ivan Rubinson