
Our project has a database function that stores millions of records from queries in intermediate tables, then joins those intermediate tables to produce the final results, which are stored in a final table. The intermediate tables are ordinary tables in the schema (our security team does not allow tables to be created dynamically), and they are truncated before the function returns.

When determining how much database storage space our project needs, how do these intermediate tables figure into the calculation? Do I need to size for the worst case of the final table plus the intermediate tables while the function is executing, or does the fact that the intermediate tables are emptied before the function returns mean I can leave them out of the overall storage estimate? I know those millions of records are stored somewhere before the tables are truncated and the function returns, but I was looking for a definitive answer.
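For concreteness, the pattern described above might look something like this (all table names and queries are hypothetical, just to illustrate the shape of the function):

```sql
-- Hypothetical sketch of the pattern described in the question.
-- The intermediate tables already exist in the schema; nothing is
-- created dynamically.
CREATE OR REPLACE FUNCTION build_final_results() RETURNS void AS $$
BEGIN
    -- Step 1: populate the pre-created intermediate tables.
    INSERT INTO intermediate_orders  SELECT * FROM orders  WHERE status = 'open';
    INSERT INTO intermediate_clients SELECT * FROM clients WHERE active;

    -- Step 2: join the intermediate tables into the final table.
    INSERT INTO final_results (order_id, client_name)
    SELECT o.id, c.name
    FROM intermediate_orders o
    JOIN intermediate_clients c ON c.id = o.client_id;

    -- Step 3: empty the intermediate tables before returning.
    TRUNCATE intermediate_orders, intermediate_clients;
END;
$$ LANGUAGE plpgsql;
```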

Thanks in advance

James

1 Answer


You should account for the worst case of the intermediate tables' size. An intermediate table is an ordinary table and stores its data the same way as your target table: rows inserted into it occupy space on the same disks (depending on your tablespace settings, of course, but usually they are the same). So size your storage for the final table plus the peak size of the intermediate tables while the function is running.
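One way to observe the footprint directly is to check the on-disk sizes while the function is running (table names here are hypothetical, matching nothing in particular):

```sql
-- Total on-disk size (including indexes and TOAST) of the final and
-- intermediate tables; run this while the function is executing to
-- see the peak footprint.
SELECT relname,
       pg_size_pretty(pg_total_relation_size(oid)) AS total_size
FROM pg_class
WHERE relname IN ('final_results',
                  'intermediate_orders',
                  'intermediate_clients');
```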

When you truncate a table, its data files are removed from the filesystem and the table's metadata in pg_class is updated (the table is switched to a new, empty data file), so the space is returned to the operating system.
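You can watch this happen through the relation's file node (hypothetical table name again):

```sql
-- TRUNCATE swaps the table over to a new, empty data file; the old
-- file is unlinked from the filesystem when the transaction commits.
SELECT pg_relation_filenode('intermediate_orders');  -- original file node
TRUNCATE intermediate_orders;
SELECT pg_relation_filenode('intermediate_orders');  -- different (new) file node
```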

0x0FFF