I need to transform data from transactional database into a star schema. I already created all dimension tables and one fact table (cinjenica) as shown on ERD picture here: left(star-schema) right(transactional database table)
I also created Unique index ON dimension tables so to limit maximum theoretical row count for example in dimenzija_platforma i have UNIQUE index on (widows, linux, mac) atributes, and those atributes can have values 1 or non-one (binary value) which yields 2^3=8 max theoretical rows. Same is done for other dimension tables that will be derived from transactional database table with 29k rows.
Table cinjenica(fact table) contains compound PK/FK and all foreign keys have ON UPDATE CASCADE.
Now to input the data inside my star schema i created following stored procedure:
CREATE DEFINER=`root`@`localhost` PROCEDURE `new_row_to_star_scheme`()
BEGIN
DECLARE i INTEGER;
DECLARE total_row_count INTEGER;
select appid from sppi.steam;
SELECT FOUND_ROWS() into total_row_count;
SET i = 0;
ponavljanje: LOOP
IF i > total_row_count THEN
LEAVE ponavljanje;
END IF;
INSERT INTO dimenzija_datum(ID,datum)
SELECT last_insert_id(),s.release_date
FROM steam as s LIMIT i,1
ON DUPLICATE KEY UPDATE ID=last_insert_id(ID);
SET @row_datum = last_insert_id();
INSERT INTO dimenzija_vlasnik(ID,developer,publisher)
SELECT last_insert_id(),s.developer, s.publisher
FROM steam as s LIMIT i,1
ON DUPLICATE KEY UPDATE ID=last_insert_id(ID);
SET @row_vlasnik = last_insert_id();
INSERT INTO dimenzija_platforme(ID,tekstualno)
SELECT last_insert_id(),s.platforms
FROM steam as s LIMIT i,1
ON DUPLICATE KEY UPDATE ID=last_insert_id(ID);
SET @row_platforme = last_insert_id();
INSERT INTO dimenzija_kategorija(ID,zanr_tagovi,tip_tagovi)
SELECT last_insert_id(),s.genres, s.categories
FROM steam as s LIMIT i,1
ON DUPLICATE KEY UPDATE ID=last_insert_id(ID);
SET @row_kategorija = last_insert_id();
INSERT INTO cinjenica(fk_datum,fk_vlasnik,fk_platforma,fk_kategorija,naziv,positive_rating,negative_rating,avg_playtime,median_playtime,cijena,owners)
SELECT @row_datum, @row_vlasnik, @row_platforme, @row_kategorija, s.name, s.positive_ratings, s.negative_ratings, s.average_playtime, s.median_playtime, s.price, s.owners
FROM steam as s LIMIT i,1
ON DUPLICATE KEY UPDATE ID=last_insert_id(ID);
SET i = i+1;
END LOOP;
END
All other data that isn't inserted into dimensions from shown procedure is created with BEFORE INSERT triggers on each dimension table.
With this stored procedure I can populate my star schema until any of dimension tables encounters duplicate row after that I get Error code 1054 Unknown column 'ID' in 'field list'. After each INSERT INTO if there is duplicate data - ON DUPLICATE KEY UPDATE should update primary key of dimension table(foreign key for fact table) and trigger ON UPDATE CASCADE so in theory it should work. How should I correct my procedure to populate my star schema ?