I want to create a function that will update a column of type varchar
to a preferred string that is referenced in the column of another table to help me clean this column more iteratively.
CREATE TABLE big_table (
mn_uid NUMERIC PRIMARY KEY,
user_name VARCHAR
);
INSERT INTO big_table VALUES
(1, 'DAVE'),
(2, 'Dave'),
(3, 'david'),
(4, 'Jak'),
(5, 'jack'),
(6, 'Jack'),
(7, 'Grant');
CREATE TABLE nameKey_table (
nk_uid NUMERIC PRIMARY KEY,
correct VARCHAR,
wrong VARCHAR
);
INSERT INTO nameKey_table VALUES
(1, 'David', 'Dave_DAVE_dave_DAVID_david'),
(2, 'Jack', 'JACK_jack_Jak_jak');
I want to perform the following procedure:
UPDATE big_table
SET user_name = (SELECT correct
FROM nameKey_table
WHERE wrong
LIKE '%DAVE%')
WHERE user_name = 'DAVE';
but looped over each user_name
in big_table
so that I have a function that can do something like this:
UPDATE big_table SET user_name = corrected_name_fn();
Here is my attempt to do something like this but I can't seem to get it to work:
CREATE FUNCTION corrected_name_fn() RETURNS VARCHAR AS $$
DECLARE entry RECORD;
DECLARE correct_name VARCHAR;
BEGIN
FOR entry IN SELECT DISTINCT user_name FROM big_table LOOP
EXECUTE 'SELECT correct
FROM nameKey_table
WHERE wrong
LIKE ''%$1%'''
INTO correct_name
USING entry;
RETURN correct_name;
END LOOP;
END;
$$ LANGUAGE plpgsql;
I want the final output in big_table
to be:
| mn_uid | user_name |
| 1 | 'David' |
| 2 | 'David' |
| 3 | 'David' |
| 4 | 'Jack' |
| 5 | 'Jack' |
| 6 | 'Jack' |
| 7 | 'Grant' |
I realize rows 6 and 7 provide two unique cases that I want to build into the function with IF ELSE
statements.
- If
user_name
is innameKey_table.correct
, go to next - If
user_name
is not innameKey_table.correct
or does not match a string innameKey_table.wrong
, leave as is.
Thanks for any help on this!!