0

I know this problem has already been around but I cannot figure out how to solve it in my case. I try to use the UPDATE statement with sql but I have a problem when dealing with multiples rows.

This is my code:

$body = file_get_contents('php://input');
$jsonArray = json_decode($body, true);

$sql = array();
foreach ($jsonArray as $row) {   

    $sql[] = '("'.$row['firstname'].'", "'.$row['lastname'].'", "'.$row['sex'].'", "'.$row['dateOfBirth'].'", "'.$row['email'].'")';

}

$column_name = "(firstname, lastname, sex, dateOfBirth, email)";
$stringImplode = implode(',', $sql);

$action = $mysqli->query('INSERT INTO tbl_syncList '. $column_name .' VALUES '.$stringImplode .'ON DUPLICATE KEY UPDATE XXXXXXX'); // --> I get stuck here

I do not know how I should write after the ON DUPLICATE KEY UPDATE: I know it should be something like "column_name = value1, column_name2 = value2..." but could it be possible to use the arrays above in a way like "(column_name1, column_name2...) = (value1, value2...) that could be useful if I were to add multiples attributes?

Since this is located outside the loop I am not sure how to refer to my value inside the sql[] array since I could have a lot of different value for the same column_name.

Moreover I would like to add a WHERE clause at the end with something like "WHERE timestamp_column_name < timestamp_value" but I don't know if it is possible..

If anyone could help me to solve this problem that would be great.

Trichophyton
  • 625
  • 5
  • 21

1 Answers1

1

To answer your first question, you can use VALUES() in the ON DUPLICATE KEY UPDATE statement :

$action = $mysqli->query('INSERT INTO tbl_syncList '. $column_name .' VALUES '.$stringImplode .'ON DUPLICATE KEY UPDATE firstname = VALUES(firstname), lastname = VALUES(lastname), sex = VALUES(sex), dateofbirth = VALUES(dateofbirth), email = VALUES(email)');

However, I don't think your second problem can be resolved using this approach. My recommendation would be to first insert all of the records into a staging table which doesn't have any unique constraints on the columns you are inserting/updating. Then use further queries (or preferably a database procedure) to perform separate update and insert queries based on a join between the tables, and whatever other criteria you have.

My MySQL is a bit rusty, but I think this should be roughly what you need once you have inserted the data into the staging table :

UPDATE a
SET a.firstname = b.firstname,
    a.lastname  = b.lastname,
    a.sex       = b.sex,
    ...
FROM stagingtable b
JOIN tbl_syncList a ON a.idcolumn = b.idcolumn
WHERE b.timestamp > a.timestamp

INSERT tbl_syncList(firstname, lastname, sex, dateOfBirth, email)
SELECT 
    b.firstname,
    b.lastname,
    b.sex,
    b.dateOfBirth,
    b.email
FROM stagingtable b
LEFT JOIN tbl_syncList a ON a.idcolumn = b.idcolumn
WHERE a.idcolumn IS NULL
UberDoodles
  • 618
  • 5
  • 13
  • Thanks I will have a try! – Trichophyton Feb 21 '16 at 19:43
  • You are right I cannot use the same approach to update data. But I don't understand why it could not work with the WHERE condition as follows: ON DUPLICATE KEY UPDATE firstname = VALUES(firstname), ... WHERE timestamp (i.e which is a database column_name) < VALUES(lastModification) ) i.e which is a string containing the last modification date'); Because a relatively close statement [SELECT * from tbl_syncList WHERE timestamp < lastModification] is working when applied as a single sql query... Does the UPDATE statement manage things so differently? – Trichophyton Feb 21 '16 at 22:44
  • I am not sure to understand what is table 'a' in your example, is it my tbl_syncList of origin? Or are there more than two tables? – Trichophyton Feb 22 '16 at 10:26
  • I am sorry I tried but I am not sure I understand everything even after reading the documentation – Trichophyton Feb 22 '16 at 16:23
  • The fact that you can't use a WHERE clause in an ON DUPLICATE KEY UPDATE clause is just one of those annoying cases where it's easy to see how a little extra functionality could make your life a whole lot easier, but that little bit of functionality hasn't been developed yet. The 'a's in my example are an alias of 'tbl_syncList', and 'b' is an alias of 'stagingtable'. See http://www.techonthenet.com/mysql/alias.php. – UberDoodles Feb 22 '16 at 17:09
  • Indeed :-). I understand your alias but not sure if your code contains two steps or more. I do not see if you 1) update tbl_syncList (alias a) using the FROM-JOIN-WHERE procedure with my stagingtable elements (in this case I can understand the WHERE clause) and then 2) you add (INSERT procedure) the columns that have not been updated (that should correspond to idcolumn IS NULL). --- or if you 1) update only tbl_syncList (but I would not know why), then 2) JOIN the tables (but in this case they would be pretty much the same) and 3) INSERT element. Do you think you could just help me a bit more? – Trichophyton Feb 23 '16 at 09:46
  • It's the first case you described. The update statement updates any records in tbl_synclist which exist in both tables, and have a later timestamp in stagingtable. The insert statement then inserts any records from the stagingtable into tbl_synclist, where they don't already exist in tbl_synclist. – UberDoodles Feb 23 '16 at 12:44
  • Thanks a lot it's very clear! And (last question) if I were to keep an old entry (say A) in [tbl_a] and add the (previously same) entry that has been modified in the meantime from [tbl_b] (say A') (i.e instead of updating A->A' I'd like A+A'), I should just remove the "WHERE statement" and I would have at the end in [tbl_a]: 1) all the entries that have already been inserted on a previous synchronisation (say A+B+C) and 2) an insertion of those previous entries that have been updated in the meantime (say B'+C'). I'd like to be sure to have no duplicated rows (like A+A+A'+B+B+B' etc). Right? – Trichophyton Feb 24 '16 at 17:10
  • Or is there an other way to get only new and updated (>timestamp) entries from stagingTable (b) as INSERTED rows into tbl_syncList (a) and do nothing when entries are similar? – Trichophyton Feb 25 '16 at 08:29
  • Something like removing the first updating procedure and keeping only the second one by adding in the WHERE statement "a.column IS NULL OR b.timestamp > a.timestamp"? To sum up: INSERT a ... SELECT b.firstname/lastname... FROM b ... LEFT JOIN a ON... WHERE a.idcolumn IS NULL OR b.timestamp > a.timestamp? Does that make sense? – Trichophyton Feb 25 '16 at 08:41