I have a package that is used for DQS cleansing. I have roughly 650,000 records to clean, but after about 350,000 records are processed I hit a problem that hangs my project. For example, I close my Visual Studio project/solution, but when I try to reopen the project I get the message "Visual Studio is waiting for an internal operation" in the lower right-hand corner. Once this happens I can't click or scroll anywhere in my project.

I am using SQL Server 2012 to move data from one table to another table in a different database on the same SQL Server 2012 instance. In addition, I'm using the DQS client to clean and validate data for last name, state, and country. My Visual Studio version is 2010. I'm running all of this on a virtual machine with 8 GB of RAM and 4 cores, and I have the cumulative update installed for SQL Server 2012.

At this point I have to kill VS2010 in Task Manager, but then I can't work in my SSIS project any more. Only after I delete all the records in my destination table can I get into my project again. Thanks for any help or ideas, Michael
- Why are you processing such a large number of records in a development environment? Why not develop the package using a smaller number of test records, deploy it to your server, and then let it process the complete data set? – Pondlife Jun 27 '12 at 16:27
- Thanks Pondlife. I'm taking the procedure you mention into serious consideration. In development I'm processing a small number of records and slowly building up to the 650,000. I just lock up after about 350,000 records and would like to figure this out before a production deployment. – M_devera Jun 27 '12 at 17:07
- OK, but I'm not really sure what you're trying to test. The performance of an SSIS package running in the BIDS environment will probably be completely different from the performance when it's run with dtexec or as a scheduled job. Are you just trying to ensure that your package can process 650,000 rows? If so, why do you feel you need to test it in BIDS? – Pondlife Jun 27 '12 at 18:51
1 Answer
DQS cleansing is a VERY resource-intensive task. According to the Data Quality Services Performance Best Practices Guide, even when adhering to the hardware recommendations and best practices, DQS cleansing of 1 million rows can take between 2 and 3.5 hours.
Also, I agree with Pondlife's comments about running in BIDS vs. dtexec. BIDS/SSDT is a 32-bit process (limiting it to roughly 2-3 GB of memory), while dtexec has a 64-bit version that can use far more memory.
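To test the full 650,000-row run outside the designer, you can execute the package directly with the 64-bit dtexec. A minimal sketch is below; the install path shown is the default 64-bit location for SQL Server 2012, and the package path and file name are illustrative placeholders, not from the question:

```shell
REM Run the package with 64-bit dtexec instead of debugging in BIDS/SSDT.
REM The package path below is a placeholder - point it at your own .dtsx file.
"C:\Program Files\Microsoft SQL Server\110\DTS\Binn\DTExec.exe" ^
  /File "C:\SSIS\DQSCleansingPackage.dtsx" ^
  /Reporting E
```

`/Reporting E` limits console output to errors, which keeps long runs readable; note that the copy of dtexec under `Program Files (x86)` is the 32-bit version and would have the same memory ceiling as BIDS.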

Bill Anton