I need to import about 150,000 products with categories, manufacturers, images, and attributes into Nop Commerce. I wrote a plugin based on Nop.Services.ExportImport.ImportManager. The more products that are imported, the slower the import process runs. The first 100 products will import in 1.5 minutes. By about 2500 products it is taking over 15 to 20 minutes to import another 100 products. Importing more than 2500 products at a time causes Out of Memory exceptions. Things even run slower after stopping at 2500 and then doing a second import.
Here is the process the plugin uses:
Create a new product entity, populate the details, then insert the product using Nop.Services.Catalog.ProductService.InsertProduct. It is necessary to use the ProductService because we need the value of Product.ID for adding child objects to the product.
Add product manufacturer by first searching if the manufacturer exists. If the manufacturer is not found, then create the manufacturer and add using ManufacturerService. Then add a ProductManufactuer entity to assign the product to the manufacturer. This entity is added to a list object.
Do the same routine for Categories, Specification Attributes, URL Records, and Product Tags.
Download the first image and add using PictureService. PictureService inserts a Picture entity to the database and then saves the image to the images directory. Then a ProductPicture entity is created and added to a list.
Every 100 products, I use SQLBulkCopy/EFBulkInsert to insert the lists of ProductManufacturers, ProductCategories, ProductImages, etc to the database. Then I clear the lists and start again.
So through the whole process we are using Entity Framework to add Products, Categories, Manufacturers, Product Tags, and Specification Attributes to the database one by one. This is necessary because we need to get the ID's of those items so we can assign them to a product by inserting records into the mapping tables. The records that get added to the mapping tables can get added to a list and then bulk inserted to the database using SQLBulkCopy/EFBulkInsert.
Every time a product and its properties are added the whole process goes slower and slower. It also seems to use more and more memory as time goes on. I'm not sure what to do next.
I'm not sure if Entity Framework is causing the problem or not. I've read that Entity Framework tends to use more and more memory as time goes on and that it is a good idea to reset your DBContext every so often. The problem with that is Nop manages the DBContext through it's EFRepository Layer in Nop.Data.
Using EFBulkInsert/SQLBulkCopy to add the product property mapping records to the database seems to speed things up a bit. I thought about maybe going through and using that to Insert all the products first, then going back through a second time and importing the properties. I will still run into the problem where I can't bulk import the Picture entities to the database because I need the PictureID plus the ProductID to associate the picture with the product.
I've also thought about using ADO.NET and stored procedures to do the one by one inserts but am not really sure if that would improve things. Does anyone have any suggestions?