0

I have a table in MySQL that is updated every night with a cron task. Every row is altered, because I'm getting the data from a XLS file and have to update all values.

At first my question was if it worth updating each row or on the other hand just delete all and insert it again. I stick to the UPSERT method as I read that was more adequate, fast and even elegant that throwing out all data, but please correct me if I'm wrong.

WELL, so now my question is about rows that are not new and are not updated each time the cron runs. The XLS maybe has not a fixed length, so one day could be a few lines less... or more.

How can I delete those rows in the same "UPSERT" statement?

Edit: I realized that maybe in this case it is just better to drop old data and re add them, although at first my target was avoiding that.

Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
Alberto
  • 367
  • 3
  • 11

2 Answers2

1

Since all rows are updated every time & you have no way to check the no-change rows? How about doing simply truncate & bulk insert all ?

Santosh Kangane
  • 377
  • 1
  • 3
  • 12
  • Thanks for reply. Do you mean deleting entire table and re-insert all data each time? As I said in the question, that was my first intention but seems not reliable for a data-availability and professional / good practice point of view. I wish to avoid that and implement a cleaner solution. I was thinking if there already are a mechanism for doing what I ask, since may be a common scenario for many people. – Alberto Feb 02 '15 at 12:59
  • Its not a common scenario, If you don't want old data then why you are worrying about data - availability. There are no standard ways in which you can identify not matching data in a table from single a file while doing upsert operation. If you dont need old data (not present in given file) then what's the point in keeping that data. – Nachiket Kate Feb 02 '15 at 15:13
  • I don't want to keep old data (my question is about how to delete it). So in brief do you think that in this case a delete-all / insert-all operation would be better? – Alberto Feb 02 '15 at 21:42
  • 1
    Ok.. How about : 1) Create CopyTable & insert all new file data in this table. 2) Then drop old table & rename CopyTable to Actual table name. That way you have data unavailability only when you do drop & rename. which should be very fast oration... a msec job. – Santosh Kangane Feb 03 '15 at 05:23
  • Well that could be a solution. All that process is done at late night so the time spent is not a critical thing. So I conclude that my original question has no answer, at least not a trivial one. I'll just bulk delete and bulk insert for this case. Thank you for helping. – Alberto Feb 03 '15 at 10:56
0

Even better than DROPping or TRUNCATEing is the following. This is because there is zero downtime.

CREATE TABLE new LIKE real;
LOAD DATA or otherwise do bulk INSERT into `new`
RENAME TABLE real TO old, new TO real;
DROP TABLE old;

I like to hand onto old until I have a chance to verify that the bulk load did not somehow goof and loose everything.

The only time the table is blocked is during the RENAME, but this is so fast, I say "zero downtime".

Rick James
  • 135,179
  • 13
  • 127
  • 222