0

Dynamically Identify Columns in External Tables

We have a process wherein we upload employee data from multiple legislations (ex. US, Philippines, Latin America) via a SQL Loader. This happens at least once a week and the current process is they create a control file every time they load employee information, Load that into Staging Tables using SQL*Loader.

I was hoping to simplify the process by creating an External Table and running a concurrent request to put the data into our staging Tables. There are two stumbling blocks i'm encountering:

There are some columns which are not being used by some legislations.

Example: US uses the column "Veteran_Information", while the Philippines and Latin America don't. Philippines uses "SSS_Number" while US and Latin America Don't. Latin America uses a "Medical_Insurance" Column while US and Philippines don't. Something like below:

US:     LEGISLATION, EMPLOYEE_NUMBER, DATE_OF_BIRTH, VETERAN_INFORMATION
PHL:    LEGISLATION, EMPLOYEE_NUMBER, DATE_OF_BIRTH, SSS_NUMBER
LAT:    LEGISLATION, EMPLOYEE_NUMBER, DATE_OF_BIRTH, MEDICAL_INSURANCE

Business Users don't use a Standard CSV Template/Format.

Since the File is being sent by Non-IT Business Users, they don't usually follow a prescribed format. (Training/User issue, probably). they often don't follow the correct order of columns they often don't follow the correct number of columns they often don't follow the correct names of columns Something like below:

US:     LEGISLATION, EMPLOYEE_ID, VETERAN_INFORMATION, DATE_OF_BIRTH, EMAIL_ADD
PHL:    EMP_NUM, LEGISLATION, DOB, SSS_NUMBER, EMAIL_ADDRESS
LAT:    LEGISLATION, PS_ID, BIRTH_DATE, EMAIL, MEDICAL_INSURANCE

Is there a way for External Tables to identify the correct order and naming of columns even if they're not in the correct order/naming convention in the File?

Taking the Column Data from Problem 2:

US:     LEGISLATION | EMPLOYEE_ID | VETERAN_INFORMATION | DATE_OF_BIRTH | EMAIL_ADD
        US          | 111         | No                  | 1967          | vet@gmail.com  

PHL:    EMP_NUM |  LEGISLATION  | DOB   | SSS_NUMBER | EMAIL_ADDRESS
        222     |  PHL          | 1898  | 456789     | pinoy@gmail.com

LAT:    LEGISLATION | PS_ID     | BIRTH_DATE | EMAIL         | MEDICAL_INSURANCE
        HON         | 333       | 1956       | hon@gmail.com | Yes

I would like it to be like this when it appears in the External Table:

    LEGISLATION | EMPLOYEE_NUMBER | DATE_OF_BIRTH | VETERAN_INFORMATION | SSS_NUMBER | MEDICAL_INSURANCE | EMAIL_ADDRESS
    US          | 111             | 1967          | Y                   | (NULL)     | (NULL)            | vet@gmail.com  
    PHL         | 222             | 1898          | (NULL)              | 456789     | (NULL)            | pinoy@gmail.com
    HON         | 333             | 1956          | (NULL)              | (NULL)     | Yes               | hon@gmail.com

Is there a way for External Tables to do something like above?

Thanks in advance!

Migs Isip
  • 1,450
  • 3
  • 23
  • 50
  • 1
    When you create a external table in Oracle, you have to specify the structure/column order, i.e. it is fixed. I would suggest having a separate table for each file type, and then a merged view/snapshot to drive your local processing off. Or, pre-process each file to standardise its format. – TenG Dec 11 '16 at 11:40

1 Answers1

0

The simplest would be: Use three distinct load scripts for each type of input (US, PHL, HON). Each script just discards the other 2 record types, and places the columns (possibly doing some transformation, like 'No' -> 'N') in the right place and inserts NULL for columns that were not present for that record type.

Rob Heusdens
  • 155
  • 5