0

I have a table:

select * from users
id | name |   company_name
 1 |  Sam | Sam's Plumbing
 2 |  Pat |   Pat's Bakery
 3 |  Vic |

I want to move users.company_name to a new companies table, with users.company_id referencing companies.id. Preferably, I'd like to do this in one transaction.

This expresses what I want conceptually, but isn't valid SQL:

BEGIN;

-- 1: add companies
CREATE TABLE companies (
  id integer PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  name varchar(255) not null
);

-- 2: add users.company_id -> companies.id
ALTER TABLE users
ADD COLUMN company_id INT
CONSTRAINT users_company_id_fk REFERENCES companies (id);

-- 3: move users.company_name to companies.name; update FK
UPDATE users
SET users.company_id = inserted_companies.id
FROM (
  INSERT INTO companies (name)
  SELECT company_name FROM users
  WHERE company_name IS NOT NULL
  -- this isn't valid; RETURNING can't reference users
  RETURNING companies.id, users.id AS user_id
) AS inserted_companies;

-- 4: drop users.company_name
ALTER TABLE users 
DROP COLUMN company_name;

COMMIT;

Similar questions that don't quite help:

James A. Rosen
  • 64,193
  • 61
  • 179
  • 261

2 Answers2

1
create table companies
(
    id           serial primary key ,
    company_name text
);
insert into companies (company_name)
select distinct company_name
from users
where company_name is not null;

alter table users add company_id int null;
update users set company_id = companies.id
from companies where companies.company_name = users.company_name;

ALTER TABLE users
DROP COLUMN company_name;

ALTER TABLE users
ADD CONSTRAINT users_company_id_fk foreign key (company_id) REFERENCES companies (id);

DBFiddle demo

Cetin Basoz
  • 22,495
  • 3
  • 31
  • 39
  • I gave this an upvote because it does a good job of what I asked, but one thing I didn't state is that I don't want deduplication. That is, even if User 1 and User 2 happen to have the same `organization_name`, they shouldn't get the same `organization_id`. – James A. Rosen Dec 15 '22 at 17:42
  • @JamesA.Rosen, you should be kidding. That doesn't fit into RDBMS normalization. If you would have all those as separate, then what is the point of creating another table and adding a foreign key in the first place (let alone it wouldn't be a primary key then). – Cetin Basoz Dec 15 '22 at 17:45
  • Imagine you own "AAA Plumbing" in Lawrence, KS and I own "AAA Plumbing" in Fayetteville, AR. We wouldn't want this migration to merge those companies into one row because I might end up being able to edit your company. The solution could be to use `distinct company_name, company_city`, but I don't have company city data yet. That's to be filled out by users later. – James A. Rosen Dec 15 '22 at 17:54
  • @JamesA.Rosen, no that needs to be filled beforehand. At least you should make them different in current data to later fix names. Otherwise, instead of creating another table with a foreign key relation, maybe create one with a 1-to-1 relation, then that new table would have all those companies N times and current table doesn't need to have an company_id column (using the same primary Id key from users). – Cetin Basoz Dec 15 '22 at 17:59
0

Building off @Cetin.Basoz's excellent answer, here's what I ended up with:

-- 1: add companies:
CREATE TABLE companies
(
    id           serial primary key,
    company_name text
);

-- 2: move users.company_name to companies.name
-- using the users.id as the companies.id for the initial import:
INSERT INTO companies (id, company_name)
SELECT id, company_name
FROM users
WHERE company_name IS NOT NULL;

-- 3: update the companies PK index
-- so we don't try to insert duplicate IDs:
SELECT setval('companies_id_seq', (SELECT MAX(id) FROM companies));

-- 4: add references to the newly-inserted companies:
ALTER TABLE users ADD company_id int null;
UPDATE users
SET company_id = users.id
WHERE company_name IS NOT NULL;

-- 5: drop the vestigial users.company_name
ALTER TABLE users DROP COLUMN company_name;

-- 6: add a FK index:
ALTER TABLE users
ADD CONSTRAINT users_company_id_fk
FOREIGN KEY (company_id)
REFERENCES companies (id);

The companies table will have holes in it, but as long as the id space is sufficiently large, I don't see that as a problem. It would have holes if records were deleted.

James A. Rosen
  • 64,193
  • 61
  • 179
  • 261