0

I am having a local table XYZ in greenplum. I am populating that table from data from 5 other tables ( table XYZ has few columns and data from 5 different tables, populated by some join operation ).

This is working fine. But problem i am facing here are:

1> I need my table XYZ to be have most recent data. That is if any new entry comes in 5 tables ( from which XYZ is being populated ), my table XYZ should be updated.

2> If any existing record gets modified then in that case data in table XYZ should also be modified. I have one more table History_of_XYZ, this table contains all the data ( history ) of XYZ. For Example : Lets say their is one entry for customer ABC as he is living in USA. but now ABC has moved to new country lets say Russia. Then my history table will have data corresponding to entry USA and table XYZ will have most recent updated data which is customer living in Russia.

So i am not able to figure out best way to approach step 1 and 2.

How can it be done considering all data is in greenplum database.

I did some research on gpload and other loading options but not sure how to approach step 1 and 2.

Any pointers will be helpful. I am pretty new to DB. So setting all table structure and populating the table was itself a big learning curve for me.

Cœur
  • 37,241
  • 25
  • 195
  • 267
user1188611
  • 945
  • 2
  • 14
  • 38

3 Answers3

0

I guess you need to look at interactive ingesting tools like Spring XD, see the topic streams.

Regards, Moha.

0

Simple use case for triggers both 1 & 2. Use Insert/Update triggers.

Geek
  • 23,089
  • 20
  • 71
  • 85
0

Greenplum does not support triggers. To resolve your problem, you need to maintain Last updated timestamp in all 5 source tables. And based on the frequency of 5 source tables update, schedule your program to load (either insert/update) XYZ table. If there are too many deletes and updates everyday, then its better to follow CTAS operation to maintain free disk space.

Syamala
  • 23
  • 4