1

Problem:

One of the owners of the company that I work for has direct database access. He uses Navicat on a windows notebook. Apparently, it has a feature that he likes where he can import data from Excel. The problem is that text fields often (or maybe always) end up with a \r\n at the end of them. Which can lead to display, reporting and filtering issues. I've been asked to clean this up and to stop him from doing it.

I know I can just add a trigger to each table that will do something like:

NEW.customer_name := regexp_replace(NEW.customer_name, '\r\n', '', 'g');

However, I would prefer to not write a separate trigger function for each table that he has access to (there are over 100). My idea was to just write a generic function and then pass in an array of column names I want corrected via the TG_ARGV[] argument.

Is there a way to update a triggers NEW record dynamically based on the TG_ARGV array?

Details:

I'm using PostgreSQL 9.6.6 on x86_64-pc-linux-gnu

Zoe
  • 27,060
  • 21
  • 118
  • 148
David S
  • 12,967
  • 12
  • 55
  • 93

1 Answers1

1

There is no native means to dynamically access the columns of the new record in a plpgsql trigger function. The only way I know is to convert the record to jsonb, modify it and convert it back to record using jsonb_populate_record():

create or replace function a_trigger()
returns trigger language plpgsql as $$
declare
    j jsonb = to_jsonb(new);
    arg text;
begin
    foreach arg in array tg_argv loop
        if j->>arg is not null then
            j = j || jsonb_build_object(arg, regexp_replace(j->>arg, e'\r\n', '', 'g'));
        end if;
    end loop;
    new = jsonb_populate_record(new, j);
    return new;
end;
$$;

The case is much simpler if you can use plpython:

create or replace function a_trigger()
returns trigger language plpython3u as $$
    import re
    new = TD["new"]
    for col in TD["args"]:
        new[col] = re.sub(r"\r\n", "", new[col])
    return "MODIFY"
$$;
klin
  • 112,967
  • 15
  • 204
  • 232
  • Hmmm... I do have the uplpython extension installed on the box. Would be interesting to see which is faster. I'll run a few tests and post my very rough and unscientific benchmarks. – David S Jan 05 '18 at 16:42
  • Simpler not always means faster. I expect python would be a bit faster but the difference should not be great. – klin Jan 05 '18 at 19:00
  • TD["args"] is a list of function arguments, not a list of columns, right? – Tudor Jun 22 '20 at 09:24
  • @Tudor - right. The OP's idea is *to just write a generic function and then pass in an array of column names (...) via the TG_ARGV[] argument.* – klin Jun 22 '20 at 10:02
  • Sorry I wan't specific: you have a typo in the ``for col in TD["args"]:`` as it should be ``for col in new``, right? Thanks – Tudor Jun 22 '20 at 10:56
  • @Tudor - no, `TD["args"]` is an array of column names passed as arguments. The code is correct as is. – klin Jun 22 '20 at 12:30
  • oh.. ok, for me it didn't work as I hadn't pass any arguments, I expected it to clean any tabe on which the trigger is on; I did a ``for col in new:`` followed by an ``if type(new[col]) == str:`` to modify all the text columns. Thanks for replying. – Tudor Jun 24 '20 at 08:15