2

I'm trying to convert some queries from an Oracle environment to Postgres. This is a simplified version of one of the queries:

SELECT * FROM TABLE
  WHERE REGEXP_LIKE(TO_CHAR(LINK_ID),'\D')

I believe the equivalent postgreSQL should be this:

SELECT * FROM TABLE
    WHERE CAST(LINK_ID AS TEXT) ~ '\D'

But when I run these queries in their respective environments on the exact same dataset, the first query outputs no records (which is correct) and the second query outputs all records in the table. I didn't write the original code, but as I understand it, it's looking for any values in the numeric field LINK_ID that are non-digit characters. Is the \D metacharacter supposed to behave differently in Oracle vs. postgres? I'm not seeing anything in documentation to say they should.

geospatial
  • 39
  • 6
  • 2
    Could you please provide a couple example rows of data? – Gregory Arenius May 03 '18 at 21:15
  • 1
    What is the data type of `link_id` in each DBMS? –  May 04 '18 at 12:40
  • Difference in treatment of trailing whitespace? throw in an `rtrim`. – Ben May 04 '18 at 12:46
  • @GregoryArenius Here are some values... 22290704.00000000 40006661.00000000 22291131.00000000 0.00000000 0.00000000 All the values are numeric. – geospatial May 04 '18 at 14:04
  • @a_horse_with_no_name They are all numeric(38,8) – geospatial May 04 '18 at 14:12
  • @Ben No whitespace in numeric fields to trim. – geospatial May 04 '18 at 14:12
  • An `ID` with decimals - very uncommon. –  May 04 '18 at 14:13
  • @a_horse_with_no_name I think that's just a quirk of the pgAdmin4 display I copied and pasted from. In reality they are all integers. – geospatial May 04 '18 at 14:20
  • I assumed by "numeric" you meant "character field intended to contain only numeric characters, which is what I am checking", because why would you be checking a numeric field with a regex? – Ben May 04 '18 at 14:20
  • I.e. what problem are you trying to solve? – Ben May 04 '18 at 14:21
  • @Ben As I stated, I didn't write the original code... I'm only tasked with trying to convert it. The fields truly are numeric and I honestly can't figure out what they're trying to look for. I'm still curious why it works in Oracle and not in postgres. – geospatial May 04 '18 at 14:24
  • 1
    What do you mean by "works"? If you don't know what it is trying to do, how do you know it doesn't "work" on postgres and "not work" on oracle? :-) – Ben May 04 '18 at 14:25
  • @geospatial: well `numeric(38,8)` allows up to 8 decimal digits - that's what I was referring to. If you define that as `bigint` (in Postgres) you wouldn't even need that regex check as you can't have anything else than `0-9` in an integer –  May 04 '18 at 14:26
  • What was the datatype on Oracle? Also `number(38,8)`? or something else? – Ben May 04 '18 at 14:26
  • Psychic debugging: Once upon a time this field was a text field, then it was changed. This code is a hangover from that time. – Ben May 04 '18 at 14:28
  • @Ben I'll know it "works" when it gets the same results in postgres as it does in Oracle. It's less a question of is it working properly in Oracle and not in postgres or vice versa as why aren't the results identical. I'm going to contact the original writer of the code to find out his intent, but I'd still like to understand why they yield different outputs. – geospatial May 04 '18 at 14:28
  • I think he's looking for a nonzero fractional part.... TO_CHAR and CAST AS TEXT don't produce the same output. – Ben May 04 '18 at 14:44
  • @Ben I think you're right... either nonzero fractional part or it used to be a text field. The former would be really weird for an ID field though. I'm reaching out to see if I can make a functional change to things. And thanks for the answer below... that makes total sense. – geospatial May 04 '18 at 14:51

1 Answers1

1

The documentation for Oracle's TO_CHAR(number) states

If you omit fmt, then n is converted to a VARCHAR2 value exactly long enough to hold its significant digits.

This means that the only non-numeric character which might be produced is a negative sign or a decimal point. If the number is positive and has no fractional part, it will not match the regular expression \D.

On the other hand, on PostgreSQL CAST(numeric(38,8)as TEXT) returns a value with the number of decimal places specified by the type specification, in this case 8. E.g.:

cast( cast(12341234 as numeric(38,8)) as TEXT)

Generates 12341234.00000000 The result of such a cast will always contain a decimal point and therefore will always match the regular expression \D.

You may find that replacing it with this solves your problem:

(LINK_ID % 1) <> 0.0

Alternatively, If you need to use the regex (e.g. to simplify migration work), consider changing it to '\.0*[1-9]' i.e. to find a decimal point with any nonzero digit after it.

Ben
  • 34,935
  • 6
  • 74
  • 113