0

This question is a follow up question from Link. I have a table with person (id) and one characteristic (var0) at different timepoints t. At some timepoints the characteristic is missing and I would like to fill the gaps with the former value. Here is an example of a table:

+---+---+----+            +----+---+------+------+------------------+
|id | t |var0|            | id | t | var0 | var1 | @prev_id   := id |
+---+---+----+            +----+---+------+------+------------------+
| 1 | 1 | a  |            |  1 | 1 | a    | a    |                1 |
| 1 | 3 | \N |            |  1 | 3 | \N   | a    |                1 |
| 1 | 7 | \N |            |  1 | 7 | \N   | a    |                1 |
| 1 | 8 | b  |            |  1 | 8 | b    | b    |                1 |
| 1 | 9 | \N |            |  1 | 9 | \N   | b    |                1 |
| 2 | 2 | \N |            |  2 | 2 | \N   | \N   |                2 |
| 2 | 4 | u  |            |  2 | 4 | u    | u    |                2 |
| 2 | 5 | u  |            |  2 | 5 | u    | u    |                2 |
| 2 | 6 | \N |            |  2 | 6 | \N   | u    |                2 |
| 2 | 7 | \N |            |  2 | 7 | u    | u    |                2 |
| 2 | 8 | v  |            |  2 | 8 | v    | v    |                2 |
| 2 | 9 | \N |            |  2 | 9 | \N   | v    |                2 |
+---+---+----+            +----+---+------+------+------------------+

The left table is the orignal x1 table and the right table is the requested table. Here is the code to get the result:

DROP TABLE IF EXISTS test01.x1;
CREATE TABLE test01.x1 (
  id   INTEGER
, t    INTEGER
, var0 CHAR(1)
) ENGINE = InnoDB 
DEFAULT CHARACTER SET = utf8 COLLATE = utf8_unicode_ci
;

INSERT INTO test01.x1(id,t,var0) VALUES
( 1,1,'a' )
,(1,3,NULL)
,(1,7,NULL)
,(1,8,'b' )
,(1,9,NULL)
,(2,2,NULL)
,(2,4,'u' )
,(2,5,'u' )
,(2,6,NULL)
,(2,7,'u')
,(2,8,'v' )
,(2,9,NULL)
;

DROP TABLE IF EXISTS test01.x2;
CREATE TABLE test01.x2
SELECT id, t
       , var0
       , @prev_var0 := CAST(IF(id = @prev_id AND var0 IS NULL AND @prev_var0 IS NOT NULL 
                           , @prev_var0
                               , var0 
                               ) AS CHAR
                            )  var1
       , @prev_id   := id
FROM test01.x1, (SELECT @prev_id    := NULL
                    ,@prev_var0 := NULL
        ) init
ORDER BY id, t
;

ALTER TABLE test01.x2 MODIFY var1 CHAR(1) DEFAULT NULL;


DROP TABLE IF EXISTS test01.x2;
CREATE TABLE test01.x2
SELECT * FROM test01.x1;


UPDATE test01.x1, (SELECT   @prev_id    := NULL
                          , @prev_var0  := NULL
          ) init
SET var0 = @prev_vr0 := IF(id = @prev_id AND var0 IS NULL AND @prev_var0 IS NOT NULL 
                           , @prev_var0
                               , var0 
                               )
    , @prev_id   := id

ORDER BY id, t

I would be interested in another solution. Instead to create a new table x2 I would like to update var0 of table x1. I tried this:

UPDATE test01.x1, (SELECT   @prev_id    := NULL
                          , @prev_var0  := NULL
          ) init
SET var0 = @prev_vr0 := IF(id = @prev_id AND var0 IS NULL AND @prev_var0 IS NOT NULL 
                           , @prev_var0
                               , var0 
                               )
    , @prev_id   := id                 
ORDER BY id, t

But there are two reasons why it does not work (and maybe others):

  • ORDER BY is not allowed with multiple table UPDATE (see Link)
  • @prev_id := id does not work. Obviously, in SET statement it is not possible to assign a value directly to a user defined variable.

Does anyone have an idea how I can get the left table without gaps?

Thanks for help.

Community
  • 1
  • 1
giordano
  • 2,954
  • 7
  • 35
  • 57

2 Answers2

1

You can always use stored procedures or functions:

declare a stored function:

DELIMITER //
 CREATE FUNCTION fillGap(
   gapID INT, verID INT
 ) RETURNS VARCHAR(255)
 BEGIN
   DECLARE gapValue VARCHAR(255);

 -- gets the value
 SELECT var0
 FROM x1
  WHERE id = gapID AND t <= verID AND var0 IS NOT NULL 
  ORDER BY t DESC
  LIMIT 1
 INTO
  gapValue;

  RETURN gapValue;
END //
DELIMITER ;

Then you can call it in an UPDATE statement:

UPDATE x1 SET var0 = fillGap(id, t) WHERE var0 IS NULL

this functions gets one preceeding value from a database, assuming t is a version number and id is object_id.

The problem will appear in case (id=2, t=2) as there is no preceeding value for this object id. In any case - edit the provided function and add the required logics.

Artjoman
  • 286
  • 1
  • 9
  • Thanks a lot. This piece of code will be the base for many other stored functions. – giordano Nov 20 '14 at 10:50
  • There is no problem with case (id=2,t=2). It gives NULL and that is what I want since there is no prior information. – giordano Nov 20 '14 at 11:07
  • I tried to figure out how I generalize this function if I have several variable var0, var1, var2, etc. I tried with `CREATE FUNCTION fillGap(gapID INT, verID INT, xvar CHAR(1)) ` ..... `WHERE id=gapID t <= verID AND xvar IS NOT NULL)`. Then: `UPDATE x1 SET var0 = fillGap(id, t) WHERE var0 IS NULL`. It does any update. I suppose the problem is due to the parameter xvar in the first clause. I want to assign the name of the variable and not the content. – giordano Nov 20 '14 at 14:54
  • Excuse me, I could not get the thought You have many columns or many tables you want to pass in as a variable? – Artjoman Nov 20 '14 at 15:02
  • Sorry I didn't make myself clear. I have one table with many columns. And the update part should be `UPDATE x1 SET var0 = fillGap(id, t, var0) WHERE var0 IS NULL; ` resp. `UPDATE x1 SET var1 = fillGap(id, t, var1) WHERE var1 IS NULL;`etc.. – giordano Nov 20 '14 at 16:38
  • it is simple as `BEGIN DECLARE gapValue VARCHAR(255); -- gets the value SELECT varr FROM x1 WHERE id = gapID AND t <= verID AND varr IS NOT NULL ORDER BY t DESC LIMIT 1 INTO gapValue; RETURN gapValue; END` And afterwards use this query to call it `UPDATE x1 SET var0 = fillGap( id, t, 'var0' ) WHERE var0 IS NULL` And dont forget to quote the column name with '' – Artjoman Nov 21 '14 at 07:53
  • Thanks. I got the message: SQL Error (1406): _Data too long for column 'varr' at row 2_. The entire code I used is: `DROP FUNCTION IF EXISTS fillGap; CREATE FUNCTION fillGap(gapID INT, verID INT, varr CHAR(1)) RETURNS VARCHAR(255) BEGIN DECLARE gapValue VARCHAR(255); -- gets the value SELECT varr FROM x1 WHERE id = gapID AND t <= verID AND varr IS NOT NULL ORDER BY t DESC LIMIT 1 INTO gapValue RETURN gapValue; END //; UPDATE x1 SET var0 = fillGap( id, t, 'var0' );` What's wrong with the code? – giordano Nov 22 '14 at 21:34
  • dont declare varr as CHAR(1) use at least VARCHAR(255). I have no idea of datatypes used in your project though – Artjoman Nov 24 '14 at 07:42
  • Thanks for answer. I declared as you mentioned but I get still the same sql error 1406. The datatypes are declared in the CREATE TABLE part of my question. I tried several datatypes in the function (CHAR(1), VARCHAR(255)) but noone solved this problem. – giordano Nov 24 '14 at 08:57
  • Ohh, if the table itself has CHAR(1) Datatype then Function should return the same. Replace all VARCHAR(255) with CHAR(1) – Artjoman Nov 24 '14 at 09:06
  • I tried it with CHAR(1) but it does not work. I wanted also to use dynamich sql but MySQL doesn't support dynamic sql in stored function. – giordano Nov 24 '14 at 11:10
  • Well, I can suggest you to increase the length of the columns from CHAR(1) to **at least** VARCHAR(2) Seems "\N" is considered to be 2-char string – Artjoman Nov 24 '14 at 11:32
  • No, it does not work. I suppose that it has something to do with 'var0' as character value. I tried to write a stored procedure which allow dynamic sql but didn't succeed. – giordano Nov 25 '14 at 07:26
0

Thanks to Artjoman I could create a stored procedure to solve my problem. It is not so elegant as the stored function of Artjoman but it allows to pass table name and column name to the procedure. Any improvement or alternative is appreciated.

Firstly, I copied column var0 to test with two columns:

ALTER TABLE test01.x1 ADD var1 CHAR(1) DEFAULT NULL;
UPDATE test01.x1 SET var1 = var0;     

The stored procedure is:

DROP PROCEDURE IF EXISTS sp_fillGap;
DELIMITER //
CREATE PROCEDURE sp_fillGap(IN xtable VARCHAR(64)
                             , IN xvar VARCHAR(64)
                             )
BEGIN

SET @query1 = CONCAT('CREATE TABLE xt1 SELECT * FROM ',xtable,' WHERE ',xvar,' IS NOT NULL;');
SET @query2 = CONCAT('CREATE TABLE xt2 SELECT * FROM ',xtable,' WHERE ',xvar,' IS NOT NULL;');

SET @query1a = CONCAT('DROP TABLE IF EXISTS xt1;');
SET @query2a = CONCAT('DROP TABLE IF EXISTS xt2;');

SET @query3 = CONCAT('UPDATE ',xtable,' a'
                     ,' SET a.',xvar,'  = (SELECT b.',xvar
                                       ,' FROM xt1 b' 
                                   ,' WHERE a.id = b.id'
                                                -- select the last of the former cases
                                       ,'       AND b.t = (SELECT MAX(c.t)'
                                       ,'                  FROM xt2 c'
                                       ,'                  WHERE a.id = c.id'
                                       ,'                        AND c.t <= a.t'
                                       ,'                  )'
                     ,'                   );'
                     );

    PREPARE stmt1a FROM @query1a;
    EXECUTE stmt1a;

PREPARE stmt2a FROM @query2a;
EXECUTE stmt2a;

PREPARE stmt1 FROM @query1;
EXECUTE stmt1;

PREPARE stmt2 FROM @query2;
EXECUTE stmt2;

PREPARE stmt3 FROM @query3;
EXECUTE stmt3;

EXECUTE stmt1a;
EXECUTE stmt2a;

END //
DELIMITER ;

Test:

CALL sp_fillGap('test01.x1','var0');
CALL sp_fillGap('test01.x1','var1');
giordano
  • 2,954
  • 7
  • 35
  • 57