2

I have a question between two very similar PostgreSQL statements:

  1. UPDATE classes SET year = 1
    FROM professors WHERE (professors.class = classes.class)
    AND professors.name = 'Smith'`
    

    This one seems to inner join the classes table and the professors table, and update only the record in classes where the corresponding professor's name is Smith.

  2. UPDATE classes c SET year = 1
    FROM classes cl JOIN professors on (professors.class_id = cl.class_id) 
    WHERE professors.name = 'Smith'`
    

    This updates every single record in classes. Why is this statement different from the first one?

paulinho
  • 292
  • 2
  • 13
  • From what I understand, the first one is the standard way to carry out and UPDATE-JOIN on two tables in PostgreSQL. – paulinho Jun 16 '19 at 16:26
  • 5
    In the second one `c.*`and `cl.*` are not linked. So every record of `c.` is updated by *every* result-row in the `FROM` term. – wildplasser Jun 16 '19 at 16:27
  • if you want to explicitly refer to `classes` in the `from` clause, add another predicate to the `where` clause: `cl.class_id = c.class_id`. also, seems like your data isn't normalized, i.e. it appears that you expect the join to be equivalent with either `class_id` or `class`. – Haleemur Ali Jun 16 '19 at 18:06

1 Answers1

3

In the second, you are referring to classes twice. These are two separate references, and the c and cl references are not correlated. In fact, there are no conditions on c, so all rows are updated.

You could add a correlation condition:

UPDATE classes 
     SET year = 1
FROM classes cl JOIN
     professors p
     ON p.class_id = cl.class_id
WHERE p.name = 'Smith' AND cl.class_id = classes.class_id;

However, the JOIN is unnecessary and the first query is a better approach (for this purpose).

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786