Rows insertion during an initial load leads to synchronization failure. How to prevent it?

Question

I am using SymmetricDS version 3.4.8 to synchronize 2 PostgreSQL DBs.

I have encountered the following problem: I have several dependent tables, e.g. 'node' and 'device' tables where 'device' table has FK to 'node' table; if new rows were added to these tables during the initial sync when 'node' table was already synchronized but 'device' not (so, between 'node' and 'device' tables initial load), processing of 'device' table on slave node fails on trying to insert a new row as it violates foreign key constraints as key isn't present in 'node' table. How can I handle this issue?

Thanks

score 0 · Accepted Answer · answered Mar 03 '14 at 22:00

0

SymmetricDS uses a separate channel for initial load batches by default. When initial loading in a active environment, these FK violations can occur temporarily by design.

When the initial load channel runs into a FK violation the channel is suspended. Your data channel containing the missing PK will start to sync. Once the data channel is complete, the initial load channel will attempt again to sync. This time there won't be an error because the required PK is now available on the target.

answered Mar 03 '14 at 22:00

Austin Brougher

526
2
9

Two parallel channels (initial load and my data update channel) reached race condition: – Yulia Mar 04 '14 at 16:19
Two parallel channels (initial load and my data update channel) reached race condition: - the current load batch contains rows of 'device' table which was added after 'node' table synchronization (the new rows should be synchronized by update channel); - the current update parallel batch contains rows which depend on table which is not bean initial loaded yet;As result channels is in dead lock. – Yulia Mar 04 '14 at 16:27
The initial load channel will fail when the parent record is not available. Your saying that your update channel runs and before updating the missing parent record the init channel needs it will run into it's own missing pk record that the init channel has not inserted yet? Are both your node and device changes made on the data update channel? – Austin Brougher Mar 04 '14 at 16:49
Yes, you are absolutely right. This is the situation. Update channel failed on some additional pair of tables, I see rows which is neede to initial channel in the same batch of update channel but after 'bad' data. Node and device changes made on the data update channel, in additional, initial channel select the same rows from device table which are included in batches of update channel. – Yulia Mar 04 '14 at 17:07
I understand now. There are two ways to handle this. Wait for a quiet time to do initial loads, or drop and rebuild all of your fk relationships with a custom SDS IReloadListener extension. – Austin Brougher Mar 04 '14 at 18:01
Thank you very much for your help. Is there a way to get initial data as snapshot? I mean, even if there are changes in DB, all initial data will be collected as it doesn't change from the beggining on initial load; all updates will be collected but not send and update channel will start work when initial load will be completed. – Yulia Mar 04 '14 at 18:52
Initial load data isn't copied into another table like sym_data. This would cripple a large database due to size and processor constraints. It would also be slow to build up a list referencing each table record. We haven't seen this issue come up very often but it does happen. We are always open to suggestions. – Austin Brougher Mar 04 '14 at 19:07
It is a great honor for me to suggest an additional/alternative behavior. I think about the following - to surround an initial load by DB transaction, in other words, to isolate data collecting from current DB changes. Update channel should be created in "suspend" mode (updates collecting without sending); when initial load will be completed, the update channel should be turned to "active" mode by IReloadListener to complete DB syncronization. – Yulia Mar 05 '14 at 09:44
The tricky part is the isolating of the initial load from new entries. How do you know what records are new when you have a table with millions of records. – Austin Brougher Mar 05 '14 at 13:20
One DB transaction should be isolated from the data changes happen simultaneously in other DB transactions, shouldn't it? Sorry if I'm missing something obvious. – Yulia Mar 05 '14 at 14:06
The problem you are running into is because your application is performing an insert during the initial load. Using a transaction to select entire tables would lock all your tables and prevent your application from performing inserts. – Austin Brougher Mar 20 '14 at 16:15

Rows insertion during an initial load leads to synchronization failure. How to prevent it?

1 Answers1