I'm working on the ETL for a warehouse management system and I've come across a warehouse location map that's all over the place in terms of naming conventions so my usual method of parsing the warehouse location into warehouse location and warehouse location level isn't working. I think REGEXP can be used here, but I don't know it that well.
By looking at the last 5 characters, you can tell if it needs to be parsed. Here's the logic
*If First 3 of the last 5 are numbers and Last 2 are letters. Return anything left of last 2 as "ETL Location". Return last 2 as "Level ID".
If First 2 of the last 5 are numbers and 3rd character of last 5 is letter and Last 2 charcters are numbers. Return anything left of last 3 as "ETL Location". Return last 3 as "Level ID"
If first 4 of last 5 are numbers and last character is letter. Return anything left of last character as "ETL Location". Return last 1 as "Level ID"*
This is the desired result.
Given Location | ETL Location | Level |
---|---|---|
A103AB | A103 | AB |
A103A01 | A103 | A01 |
A103A02 | A103 | A02 |
A103456 | A103456 | null |
A103A | A103 | A |
A103B | A103 | B |
ABCDEFG | ABCDEFG | null |
Can Regexp be used here?
I've done the following, but I'm still getting incorrect results since there's a lot of different naming conventions. If I do it this way and the warehouse makes new locations, logic like this will probably get broken at some point even if it works currently so I'm thinking identifying what is a "Level" and going from there is a better approach here.
case
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '5'
THEN substr(loc_code,1,4)
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '7'
THEN substr(loc_code,1,6)
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '6'
THEN substr(loc_code,1,5)
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '8'
and loc_code like 'Z%'
THEN substr(loc_code,1,6)
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '8'
THEN substr(loc_code,1,5)
ELSE loc_code
END AS "Location",
case
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '5'
THEN substr(loc_code,5,1)
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '7'
THEN substr(loc_code,7,1)
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '6'
THEN substr(loc_code,6,1)
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '8'
and loc_code like 'Z%'
THEN substr(loc_code,7,2)
when loc_tp_code in ('RACK','PICK','RKL2','PLNW')
and length(loc_code) = '8'
THEN substr(loc_code,6,3)
ELSE null
END AS "Level ID"