We have a requirement to insert a large number of records (about 2 to 3 million) into a table. However, we also need to validate and segregate invalid records (primary key, foreign key, and NOT NULL violations) into a separate error table for later reference. From my research, BULK INSERT in SQL Server works well for the loading itself, but I'm not able to figure out the best way to filter out the bad records. Does having a staging table in between help? We could check for violations with queries against the staging table, but we would then have to load the good records into the actual table with a second insert (either INSERT...SELECT or MERGE). Is that an efficient approach? I'm concerned that it amounts to doing the insert twice. A rough sketch of what I have in mind is below.
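For concreteness, here is roughly the staging pattern I'm considering. All table, column, and file names (StagingOrders, Orders, Customers, OrderErrors, orders.dat) are placeholders, not our real schema:

```sql
-- Staging table: a heap with no constraints, so the raw load never fails
CREATE TABLE dbo.StagingOrders (
    OrderId    INT           NULL,
    CustomerId INT           NULL,
    Amount     DECIMAL(18,2) NULL
);

-- Error table: rejected rows plus the reason they were rejected
CREATE TABLE dbo.OrderErrors (
    OrderId    INT           NULL,
    CustomerId INT           NULL,
    Amount     DECIMAL(18,2) NULL,
    Reason     VARCHAR(50)   NULL
);

-- 1) Bulk load the file into staging (SqlBulkCopy would target this table too)
BULK INSERT dbo.StagingOrders
FROM 'C:\data\orders.dat'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);

-- 2) Segregate rows that would violate a constraint on the real table
INSERT INTO dbo.OrderErrors (OrderId, CustomerId, Amount, Reason)
SELECT s.OrderId, s.CustomerId, s.Amount,
       CASE
           WHEN s.OrderId IS NULL OR s.CustomerId IS NULL
               THEN 'NOT NULL violation'
           WHEN EXISTS (SELECT 1 FROM dbo.Orders o WHERE o.OrderId = s.OrderId)
               THEN 'PK violation'
           ELSE 'FK violation'
       END
FROM dbo.StagingOrders s
WHERE s.OrderId IS NULL OR s.CustomerId IS NULL
   OR EXISTS (SELECT 1 FROM dbo.Orders o WHERE o.OrderId = s.OrderId)
   OR NOT EXISTS (SELECT 1 FROM dbo.Customers c WHERE c.CustomerId = s.CustomerId);

-- 3) The second insert: load the remaining good rows into the real table
--    (NOTE: duplicates within the staged data itself would still need
--    handling, e.g. ROW_NUMBER() OVER (PARTITION BY OrderId ...))
INSERT INTO dbo.Orders (OrderId, CustomerId, Amount)
SELECT s.OrderId, s.CustomerId, s.Amount
FROM dbo.StagingOrders s
WHERE s.OrderId IS NOT NULL AND s.CustomerId IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM dbo.Orders o WHERE o.OrderId = s.OrderId)
  AND EXISTS (SELECT 1 FROM dbo.Customers c WHERE c.CustomerId = s.CustomerId);
```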
I'm planning to use .NET's SqlBulkCopy for the bulk insertion, and it doesn't offer clear error reporting either.
Can somebody point me to a more efficient solution?
EDIT: If this approach is the only solution, which method do you think is best for the second insert: INSERT...SELECT or MERGE? Would either of them match the efficiency and speed of BULK INSERT, or is there a better alternative? My current thinking is sketched below.
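From what I've read, INSERT...SELECT can be minimally logged when the target is locked with TABLOCK and the database is in the SIMPLE or BULK_LOGGED recovery model, which should put it close to BULK INSERT, at least for a heap target. Since we're only inserting (never updating), plain INSERT...SELECT also seems simpler than MERGE. Using the same placeholder tables as above:

```sql
-- Under SIMPLE or BULK_LOGGED recovery, TABLOCK allows this insert to be
-- minimally logged, similar to BULK INSERT (exact behaviour depends on the
-- SQL Server version and on whether dbo.Orders has indexes).
INSERT INTO dbo.Orders WITH (TABLOCK) (OrderId, CustomerId, Amount)
SELECT s.OrderId, s.CustomerId, s.Amount
FROM dbo.StagingOrders s
WHERE s.OrderId IS NOT NULL AND s.CustomerId IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM dbo.Orders o WHERE o.OrderId = s.OrderId)
  AND EXISTS (SELECT 1 FROM dbo.Customers c WHERE c.CustomerId = s.CustomerId);
```

Is that reasoning correct, or does MERGE have some advantage here that I'm missing?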
Thanks!