
An old system whose source code I can't access is inserting data into a huge table (millions of records have already been inserted). A SELECT command exceeds the tolerable time. So, if possible, I want to catch the new records and send them to a new MQ mechanism, so that I can process them from the queue easily.

People have suggested I use "Change Data Capture" or "SQL triggers", but I don't think those could handle frequently added rows (maybe 1k new records every five minutes).

What do you suggest for this situation?

(SQL Server 2016 - C# consumer)

Thanks in advance!

user3565914
    Millions / 300k rows per day is really not all that big in real terms. Since you seem to have access to the RDBMS, would simply adding indexing or partitioning aligned to the select query's needs help? And maybe a scheduled job that purges and/or moves older data? – Marc Gravell Feb 22 '20 at 07:59
  • You do not trust the suggested solutions you already got. Why do you want to trust our suggestions? – Sir Rufo Feb 22 '20 at 08:00
  • 1
1k rows per minute is not frequent. And if technologies like triggers or CDC could not handle that, they would be useless. Utterly, totally useless. You are literally asking which car can go faster than 20 km/h; that is how low your requirements are. – TomTom Feb 22 '20 at 08:08
@SirRufo Because there are many alternative perspectives that may solve this much better. Sometimes people look out of their own window; maybe you are looking from the front and can't see the back garden and its awesome view. – user3565914 Feb 22 '20 at 09:26
Also, I tried enabling CDC on the table, but it took too long, so I cancelled it. Now I'll research it more, and I know I won't be wasting my time for nothing. Thanks all! – user3565914 Feb 22 '20 at 09:28

1 Answer


Solution 1: Use better indexing

I can think of another way: create an index and poll the table from your code, always keeping track of the last id you have processed (in some persistent way, so that you can recover it even if your application fails).
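
For illustration, a minimal polling sketch (assuming the table has an ever-increasing identity column called Id; the table, column and index names here are placeholders, not taken from your schema):

CREATE INDEX IX_MyTableName_Id ON dbo.MyTableName (Id);
-- not needed if Id is already the clustered primary key

DECLARE @LastProcessedId BIGINT = 12345;   -- loaded from your persistent store

SELECT TOP (1000) Id, Col1, Col2           -- the columns you want to push to MQ
FROM dbo.MyTableName
WHERE Id > @LastProcessedId
ORDER BY Id;

After each batch is processed, persist the highest Id you handled and repeat.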

Solution 2: Change Data Capture

This is the safest way to go. The change tables are populated from the transaction log, with minimal to no impact on performance or on the insert transaction.

You can also add automatic cleanup so that the change table does not get bloated. Your throughput is not that high, and CDC will handle it easily with sensible cleanup intervals.

The downside is that with a low enough retention interval you can lose data.
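
A rough sketch of what enabling and consuming CDC could look like (assuming the table is dbo.MyTableName, SQL Server Agent is running, and you have the required permissions; the names are placeholders):

EXEC sys.sp_cdc_enable_db;

EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'MyTableName',
    @role_name     = NULL;   -- no gating role

-- Consumer side: read all changes between the last LSN you processed and now
DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_MyTableName');
DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

SELECT *
FROM cdc.fn_cdc_get_all_changes_dbo_MyTableName(@from_lsn, @to_lsn, N'all');

-- Optional: shorten retention so the change table stays small
-- (retention is in minutes; the default is 4320 = 3 days)
EXEC sys.sp_cdc_change_job @job_type = N'cleanup', @retention = 1440;

In practice you would persist the last LSN you processed and pass that as @from_lsn on the next poll instead of calling sys.fn_cdc_get_min_lsn every time.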

Solution 3: Triggers

This is by far the best option in my opinion. Create a duplicate table (with the same columns), e.g. MyTableName_LiveData.

Create a trigger on the original table that inserts the same row into the new table (see Sql Server trigger insert values from new row into another table):

CREATE TRIGGER yourNewTrigger ON yourSourceTable
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;  -- don't add extra row-count messages to the original insert

    -- Copy every newly inserted row into the duplicate table
    INSERT INTO yourDestinationTable (col1, col2, col3, user_id, user_name)
    SELECT col1, col2, col3, user_id, user_name
    FROM inserted;
END
GO

When you have read and processed a row, delete it from the duplicate table so that it does not itself become unusably large.
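
One possible way to do that read-and-delete from the consumer side, as a sketch (dbo.MyTableName_LiveData is the duplicate table from above; the batch size and hints are up to you):

DELETE TOP (1000)
FROM dbo.MyTableName_LiveData WITH (READPAST)   -- skip rows still locked by the inserting transaction
OUTPUT deleted.*;                               -- the removed rows come back as a result set
                                                -- that your C# consumer can push to the queue

Note that TOP in a DELETE does not guarantee any particular order; if ordering matters, select the keys first and delete by key instead.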

The downsides are:

  1. If the trigger fails, the insertion to the original table fails too
  2. Any changes to schema must be applied to both tables
  3. The inserts will take a bit more time to happen (negligible but I had to mention it).
Athanasios Kataras