0

I need to merge the data from annual tables into one large table. I am confused about how dynamic SQL (or else) should be used for this.

All this is in MonetDB, which follows the SQL 2008 standard, I think. I am not sure they support partitioning though, and I would rather merge my tables in any case.

The discussion on SELECT * FROM sales + @yymm in The Curse and Blessings of Dynamic SQL does not mention a solution in the end.

I received some guidance about the relevant pieces from a MonetDB expert in a comment below his answer over on DBA, but without the loop over years, which I still need.

Think of my data having tables like CIVIL_1969, CIVIL_1970CIVIL_2012. These usually follow the same schema, but have no year column. I would want to have a single CIVIL table, with a year columns as well.

By the way, there are tables where the schema do change from year to year (e.g. tax forms collected different records for different years). Is it possible to merge these tables as well? Sure, some of the columns would have sparse records, missing for many years.

Some very tentative pseudocode on this:

USE dbfarm
DECLARE @i INT
SET @i = 1990
SELECT name FROM tables WHERE name LIKE 'data_@i';
WHILE @i < 2013
DO
    ALTER TABLE data_@i ADD COLUMN "year" INTEGER; UPDATE data_@i SET "year" = @i;
    SET @i = @i +1
END WHILE
CREATE TABLE data AS SELECT * FROM data_1990 UNION ALL SELECT * FROM data_1991 UNION ALL [...] WITH DATA;
Community
  • 1
  • 1
László
  • 3,914
  • 8
  • 34
  • 49

1 Answers1

1

I don't know monetdb, but I assume they support views. One approach is to create a view that describes the merging of data, and then use that to insert into the table. I.e.:

create view v as (
    select 1969 as year, c1, c2, ..., cn, cast(null as ...) -- col introduced in 1973
         , cast(null as ...) -- col introduced in 1987
    union all 
    ...
    select 1973 as year, c1, c2, ..., cn, cn_1, , cast(null as ...) -- introduced in 1987   
    union all
    ...
)

insert into data (...) select ... from v;
Lennart - Slava Ukraini
  • 6,936
  • 1
  • 20
  • 32
  • Thanks, that addresses the issue of the changing schema, right? But there is no way to automate things a bit more than spelling out each year? I have multiple decades, and dozens of tables for each, this would make very messy code, error-prone and very hard to edit. – László May 17 '14 at 21:33
  • You can generate the definition of the view(s) in your favourite language. – Lennart - Slava Ukraini May 17 '14 at 21:35
  • Ah! You mean to generate a text file first and run that in the SQL client? – László May 17 '14 at 21:40
  • Something like that, it will give you a chance to verify the view(s) before executing anything – Lennart - Slava Ukraini May 17 '14 at 21:46
  • Wait, you would also spell out the columns year by year? That is way too many. Why can't this be automated? – László May 17 '14 at 21:52
  • I'm not sure I understand, but surely you can automate. Assuming your partition tables have names like data_1969, data_1970, ..., data_2013. If you loop over your table names you can extract the year as the four last characters of the table name. – Lennart - Slava Ukraini May 17 '14 at 21:57
  • Thanks again! I also meant not spelling out all column names. – László May 17 '14 at 22:42
  • You should be able to extract the names of the columns for a given table from the catalog, have a look at SYS.COLUMNS – Lennart - Slava Ukraini May 17 '14 at 22:46