I am building a custom import tool and am wondering about the best practice for importing a large amount of data. I have the following JSON data structure, with a minimum of 500 products across 30 days per import.
"rows": [{
"product_uid": "k110",
"sale_date": "2018-06-06",
"amount": 15
}, {
"product_uid": "k111",
"sale_date": "2018-06-06",
"amount": 22
}, {
"product_uid": "k110",
"sale_date": "2018-06-07",
"amount": 30
}
]
The schema for the table is as follows:
daily_sales_id - product_uid - sale_date - amount
I am using node.js with the mysql package to execute multiple SQL statements on a single connection. It works well for inserting the rows the first time, but on subsequent runs it inserts duplicate rows. I could truncate the table before inserting, but that will fail if the user decides to import a delta snapshot instead of the entire record set.
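For reference, this is roughly what the insert currently looks like, reduced to a single bulk INSERT for brevity (connection settings and the daily_sales table name are placeholders):

const mysql = require('mysql');

// Placeholder connection settings
const connection = mysql.createConnection({
  host: 'localhost',
  user: 'import_user',
  password: 'secret',
  database: 'sales'
});

// rows is the parsed "rows" array from the JSON payload above
const rows = [
  { product_uid: 'k110', sale_date: '2018-06-06', amount: 15 },
  { product_uid: 'k111', sale_date: '2018-06-06', amount: 22 },
  { product_uid: 'k110', sale_date: '2018-06-07', amount: 30 }
];

// Flatten each object into a [product_uid, sale_date, amount] tuple;
// the mysql package expands the nested array when used with VALUES ?
const values = rows.map(r => [r.product_uid, r.sale_date, r.amount]);

connection.query(
  'INSERT INTO daily_sales (product_uid, sale_date, amount) VALUES ?',
  [values],
  (err, result) => {
    if (err) throw err;
    console.log('Inserted rows:', result.affectedRows);
  }
);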
While I could use a for-loop to check whether each record exists and do an update instead of an insert, looping through 15,000+ records and issuing 15,000+ SELECT queries doesn't seem like a good idea.
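The check-then-update version I want to avoid would look something like this (same connection and rows as in the snippet above):

// One SELECT per row, followed by an UPDATE or INSERT: 15,000+ round trips
rows.forEach(row => {
  connection.query(
    'SELECT daily_sales_id FROM daily_sales WHERE product_uid = ? AND sale_date = ?',
    [row.product_uid, row.sale_date],
    (err, results) => {
      if (err) throw err;
      if (results.length > 0) {
        // Record exists: update the amount
        connection.query(
          'UPDATE daily_sales SET amount = ? WHERE daily_sales_id = ?',
          [row.amount, results[0].daily_sales_id]
        );
      } else {
        // Record missing: insert it
        connection.query(
          'INSERT INTO daily_sales (product_uid, sale_date, amount) VALUES (?, ?, ?)',
          [row.product_uid, row.sale_date, row.amount]
        );
      }
    }
  );
});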
Are there any other alternatives where I can keep the data structure and perform an update/insert without looping through 15,000+ records? The import CSV file doesn't include the daily_sales_id.