0

There is a table 'TICKETS' in PostgreSQL.I perform an ETL job using Pentaho to populate this table. There is also a GUI on which a user makes changes and the result is reflected in this table. The fields in the table are :

"OID" Char(36)  <------ **PRIMARY KEY**
, "CUSTOMER" VARCHAR(255)
, "TICKETID" VARCHAR(255)
, "PRIO_ORIG" CHAR(36)
, "PRIO_COR" CHAR(36)
, "CATEGORY" VARCHAR(255)
, "OPENDATE_ORIG" TIMESTAMP
, "OPENDATE_COR" TIMESTAMP
, "TTA_ORIG" TIMESTAMP
, "TTA_COR" TIMESTAMP
, "TTA_DUR" DOUBLE PRECISION
, "MTTA_TARGET" DOUBLE PRECISION
, "TTA_REL_ORIG" BOOLEAN
, "TTA_REL_COR" BOOLEAN
, "TTA_DISCOUNT_COR" DOUBLE PRECISION
, "TTA_CHARGE_COR" DOUBLE PRECISION
, "TTR_ORIG" TIMESTAMP
, "TTR_COR" TIMESTAMP
, "TTR_DUR" DOUBLE PRECISION
, "MTTR_TARGET" DOUBLE PRECISION
, "TTR_REL_ORIG" BOOLEAN
, "TTR_REL_COR" BOOLEAN
, "TTR_DISCOUNT_COR" DOUBLE PRECISION
, "TTR_CHARGE_COR" DOUBLE PRECISION
, "COMMENT" VARCHAR(500)
, "USER" CHAR(36)
, "MODIFY_DATE" TIMESTAMP
, "CORRECTED" BOOLEAN
, "OPTIMISTICLOCKFIELD" INTEGER
, "GCRECORD" INTEGER
, "ORIGINATOR" Char(36)

I want to update the table when columns TICKETID+ORIGINATOR+CUSTOMERS are same. Otherwise, an insert will be performed.

How should I do it using Pentaho? Is the step Dimension/Lookup update fine for it or just the Update/Insert step will do the work ?

Any help would be much appreciated. Thanks in advance.

Pranjal Kaushik
  • 13
  • 1
  • 10
  • Is there an unique index on columns TICKETID+ORIGINATOR+CUSTOMERS ? Then you may use `INSERT ... ON CONFLICT DO UPDATE SET column = EXCLUDED.column;` https://www.postgresql.org/docs/9.5/static/sql-insert.html – Eugene Lisitsky Oct 12 '17 at 05:29

1 Answers1

0

Eugene Lisitsky suggestion is good practice: you may hard wire it in the database constraints and let PostgesSQL do the job.

For a PDI solution: your table does not look like Slowly Changing Dimension therefore the Insert/Update covers your need.

If you want to use the Dimension_update, you need to alter the table in the Pentaho SCD format: add a version column and valid_form_date/valid_upto_date (with PDI the alter is a one button operation).

After that, when a new row comes in, the TICKETID+ORIGINATOR+CUSTOMERS is searched in the table and if found it receives a valitity_upto=now(). At the same time, a version+1 is created in the table valid from now() to the end-of-time.

The (main) pro is that you can retrieve the state of the database as it was at any date in the past with a simple where now() between validity_from and validity_upto.

The (mian) con is that you have to alter the table which may have some impact on the GUIs (plural).

AlainD
  • 6,187
  • 3
  • 17
  • 31