0

Norwegian social security numbers (SSN) is composed like this

substr(ssn,1,6) = ddmmyy
substr(ssn,7,3) =
Person individual numbers:
000–499 persons born 1900–1999.
500–749 persons born 1854–1899.
500–999 persons born 2000–2039.
900–999 persons born 1940–1999.

substr(ssn,11,2)=control digits

I am struggling to convert date of birth to date format DD.MM.YYYY for persons born between 1900 and 1949

select to_date('121049','dd.mm.rrrr') from dual; -- Returns: 12.10.2049
select to_date('121049','dd.mm.rr')from dual; -- Returns: 12.10.2049
select to_date('121049','dd.mm.yy')from dual; -- Returns: 12.10.2049

select to_date('121050','dd.mm.rrrr')from dual; -- Returns: 12.10.1950
select to_date('121050','dd.mm.rr')from dual; -- Returns: 12.10.1950
select to_date('121050','dd.mm.yy')from dual; -- Returns: 12.10.2050

I expecting 121049 to return 12.10.1949. How can I force oracle to return the expected date format given the information I have available to me?

From wikipedia: https://en.wikipedia.org/wiki/National_identification_number#Norway

Historically, the number has been composed of the date of birth (DDMMYY), a three digit individual number, and two check digits. The individual number and the check digits are collectively known as the Personal Number.

The individual number has been selected from a range depending on century of birth: for the years 1854–1899 the range is 500–749, for the years 1900–1999 the range is 000-499, for the years 2000–2039 the range is 500–999. For the years 1940–1999, the range 900–999 was also used for special purposes, such as adoptions from abroad and immigrants. Women have been assigned even individual numbers, men are assigned odd individual numbers.

MrM
  • 389
  • 1
  • 8
  • 23
  • 1
    Your 'person identify number' ranges overlap, is that correct? – Alex Poole Nov 08 '17 at 16:26
  • The issue here is exactly that overlap. How can 500-999 be 2000-2039 and also be 500-749 be 1854-1899. So so if someone has a PIN of 500 are they in the 1800's or the 2000? – xQbert Nov 08 '17 at 16:32
  • @xQbert - looks like that depends on whether the two-digit year is between 00 and 39, or between 54 and 99. I think... The combination of PIN range and two-digit-year range doesn't currently overlap. – Alex Poole Nov 08 '17 at 16:48
  • Ah I see now. For the like of me I was missing the pattern and couldn't get by the overlap. – xQbert Nov 08 '17 at 17:00

1 Answers1

2

The interpretation of the two-digit year and its implied century seems to be based both on its value and the PIN. The ranges for that overlap, but the full year is then restricted; so it looks like you can use a case expression that checks both:

-- CTE for dummy data
with t42 (ssn) as (
  select '12104900000' from dual
  union all select '12105099999' from dual
  union all select '01010000001' from dual
  union all select '02029949902' from dual
  union all select '03035450003' from dual
  union all select '04049974904' from dual
  union all select '05050050005' from dual
  union all select '06063999906' from dual
  union all select '07074090007' from dual
  union all select '08089999908' from dual
)
select ssn, to_date(substr(ssn, 1, 4)
  || case
      when to_number(substr(ssn, 7, 3)) between 0 and 499
        and to_number(substr(ssn, 5, 2)) between 0 and 99 then '19'
      when to_number(substr(ssn, 7, 3)) between 500 and 749
        and to_number(substr(ssn, 5, 2)) between 54 and 99 then '18'
      when to_number(substr(ssn, 7, 3)) between 500 and 999
        and to_number(substr(ssn, 5, 2)) between 0 and 39 then '20'
      when to_number(substr(ssn, 7, 3)) between 900 and 999
        and to_number(substr(ssn, 5, 2)) between 40 and 99 then '19'
    end
  || substr(ssn, 5, 2), 'DDMMYYYY') as dob
from t42;

which for that data, based on your two examples and the ranges involved, gives:

SSN         DOB       
----------- ----------
12104900000 1949-10-12
12105099999 1950-10-12
01010000001 1900-01-01
02029949902 1999-02-02
03035450003 1854-03-03
04049974904 1899-04-04
05050050005 2000-05-05
06063999906 2039-06-06
07074090007 1940-07-07
08089999908 1999-08-08

The case picks a two-digit century value based on on the PIN and then - because they overlap - the two-digit year range.

If the data design changes so the overlaps are no longer unique based on the two-digit year, you have further issues. It'll be interesting to see what happens when we reach 2040...

And if you had an SSN which didn't match the ranges you've shown, say 12105050000 (with PIN 500, but two-digit year not in either range 00-39 or 54-99) then the case expression will return null and the two-digit year will then be interpreted as 0050. You could make it error instead by changing the format model - depends if it can happen and how you want to handle it if it does.


You can probably figure this bit out anyway, but to handle the day+40 scenario mentioned in comments, you can use another case expression to adjust the day number:

select ssn, to_date(
    case
      when substr(ssn, 1, 2) > 31 then to_char(to_number(substr(ssn, 1, 2)) - 40, 'FM99')
      else substr(ssn, 1, 2)
    end
  || substr(ssn, 3, 2)
  || case
      when to_number(substr(ssn, 7, 3)) between 0 and 499
...
Alex Poole
  • 183,384
  • 11
  • 179
  • 318
  • Thank you Alex, Unfortunately exception ORA-01847 (day of month must be between 1 and last day of month) is raised after fetching a couple hundred rows from Person table in local database. – MrM Nov 08 '17 at 19:24
  • In addition, I am assuming ssn not matching any ranges is present which is causing the error. I have updated initial post with wiki details of how ssn is composed. Given the information, is it possible to determine exact date of birth with century and specified ranges alone? – MrM Nov 08 '17 at 19:34
  • @MrM - sounds like you have data which doesn't match your rules then. (Or I've misinterpreted them.... surely not...!). Can you identify a row which throws that error? If you have data which doesn't match the rules the best you can do is flag or ignore it. If they should all be valid, one quick thing to check for is leap years; as the wrong century can break those; do you have SSNs that start 2902? – Alex Poole Nov 08 '17 at 19:34
  • found the issue. Date of birth is out of range. Per wikipedia: People who do not permanently reside in Norway have been assigned a D-number upon registration in the population register. The D-number is like a birth number, but with the day of the month increased by 40 – MrM Nov 08 '17 at 19:41
  • OK, great, so you can filter that out or adjust it back using another case expression. – Alex Poole Nov 08 '17 at 19:42