2

I'm new to regex and I'm using Postgres. I'm trying to make a check constraint in a table so that any data into some column always checks for only alphanumeric, numeric, alphanumeric without spaces, etc.

For example, given a table like:

CREATE TABLE client (
PRIMARY KEY (id),
id                  CHAR(18)    NOT NULL,
name                TEXT        NOT NULL,
promo_code          TEXT        NOT NULL
);

I'd like to make a constraint with a CHECK that only allows alphanumeric without spaces content in id, alphabetic with spaces in name, alphanumeric with symbols without spaces content in promo_code. id has a strict length of 18, but name and promo_code have variable length.

Would something like the following work?

ALTER TABLE client
ADD CONSTRAINT ck_client_one
    CHECK (
        id ~ '%\w%' AND id !~ '%\s%',
        name ~ '%\w%' AND name !~ '%\d%',
        promo_code ~ '%\w%' OR promo_code ~ '%\W%' AND promo_code !~ '%\s%'
    );

ALTER TABLE client
ADD CONSTRAINT ck_client_two
    CHECK (
        id ~ '%^[a-zA-Z0-9]*%' AND id !~ '%\s%',
        name ~ '%^[a-zA-Z]*%',
        promo_code ~ '%^[a-zA-Z0-9]*%' OR promo_code !~ '%^[a-zA-Z0-9]*%' 
            AND promo_code !~ '%\s%'
    );

rd_nielsen

ALTER TABLE client
ADD CONSTRAINT ck_client_one
    CHECK (
        id ~ '.+\w+.+' AND id !~ '.+\s+.+',
        name ~ '.+\w+.+' AND name !~ '.+\d+.+',
        promo_code ~ '.+\w+.+' OR promo_code ~ '.+\W+.+' 
            AND promo_code !~ '.+\s+.+'
    );

Thanks!

Dialvive
  • 356
  • 7
  • 19

2 Answers2

1

Something like your expressions will work, but your expressions contain a mixture of the SQL multiple-character wildcard ("%") and regular expression syntax. You should eliminate the "%" characters and modify the regexes as necessary. For example,

'%\w%'

should be

'.*\w*.*'

If the column is non-nullable, then \w* should be \w+ instead. Other expressions should be modified similarly.

rd_nielsen
  • 2,407
  • 2
  • 11
  • 18
  • Yes, your edit looks good in that it contains legitimate regexes. The CHECK syntax uses commas to separate the expressions for different column, but `AND` should be used instead. Or you could use three different CHECK expressions instead. – rd_nielsen Dec 12 '19 at 16:04
  • You don't need the `.*` at the beginning and the end of the pattern. – Laurenz Albe Dec 12 '19 at 16:12
  • @LaurenzAlba: Ah, yes, you're right for the `'\w'` pattern. It should be included for the `!~ '.*\s.*'` expression. – rd_nielsen Dec 12 '19 at 16:17
0

A few things worth mentioning:

  • The % character is used by LIKE expressions as a zero-or-more wildcard. The regex equivalent is .*.
  • If you want to match the entire string, then you need to anchor the start and end with ^ and $.
    • Conversely, unanchored patterns behave as though there were wildcards on both ends; for example, \s is the same as ^.*\s.*$.
  • \w matches underscores, and \s matches tabs and newlines, which you probably don't want to allow.
  • The ~* operator performs case-insensitive matching, letting you simplify [A-Za-z] to [a-z].

With all of that in mind:

id ~* '^[a-z0-9]{18}$'
name ~* '^[a-z ]*$'
promo_code !~ '\s'

If you want to make sure that name isn't completely blank, use ^[a-z ]+$ instead.

The promo_code check will allow much more than just "symbols"; Unicode contains a lot of control codes and non-printable characters which don't count as "space". You might want to use an explicit whitelist instead.

Nick Barnes
  • 19,816
  • 3
  • 51
  • 63