1

I have a table like this one:

 Yr  | Mnth | W_ID | X_ID | Y_ID | Z_ID | Purchases |    Sales | Returns |
2015 |  10  |    1 | 5210 | 1402 |    2 |   1000.00 |      etc |     etc |
2015 |  12  |    1 | 5210 | 1402 |    2 |  12000.00 |      etc |     etc |
2016 |   1  |    1 | 5210 | 1402 |    2 |   1000.00 |      etc |     etc |
2016 |   3  |    1 | 5210 | 1402 |    2 |       etc |      etc |     etc |
2014 |   3  |    9 |  880 |    2 |    7 |       etc |      etc |     etc |
2014 |  12  |    9 |  880 |    2 |    7 |       etc |      etc |     etc |
2015 |   5  |    9 |  880 |    2 |    7 |       etc |      etc |     etc |
2015 |   7  |    9 |  880 |    2 |    7 |       etc |      etc |     etc |

For each combination of (W, X, Y, Z) I would like to insert the months that don't appear in the table and are between the first and last month.

In this example, for combination (W=1, X=5210, Y=1402, Z=2), I would like to have additional rows for 2015/11 and 2016/02, where Purchases, Sales and Returns are NULL. For combination (W=9, X=880, Y=2, Z=7) I would like to have additional rows for months between 2014/4 and 2014/11, 2015/01 and 2015/04, 2016/06.

I hope I have explained myself correctly. Thank you in advance for any help you can provide.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
rffs
  • 45
  • 1
  • 1
  • 5
  • Possible duplicate of [Include missing months in Group By query](http://stackoverflow.com/questions/11479918/include-missing-months-in-group-by-query) – DavidG May 08 '16 at 12:25

2 Answers2

0

The process is rather cumbersome in this case, but quite possible. One method uses a recursive CTE. Another uses a numbers table. I'm going to use the latter.

The idea is:

  • Find the minimum and maximum values for the year/month combination for each set of ids. For this, the values will be turned into months since time 0 using the formula year*12 + month.
  • Generate a bunch of numbers.
  • Generate all rows between the two values for each combination of ids.
  • For each generated row, use arithmetic to re-extract the year and month.
  • Use left join to bring in the original data.

The query looks like:

with n as (
      select row_number() over (order by (select null)) - 1 as n -- start at 0
      from master.spt_values
     ),
     minmax as (
      select w_id, x_id, y_id, z_id, min(yr*12 + mnth) as minyyyymm,
             max(yr*12 + mnth) as maxyyyymm
      from t
      group by w_id, x_id, y_id, z_id
     ),
     wxyz as (
      select minmax.*, minmax.minyyyymm + n.n,
             (minmax.minyyyymm + n.n) / 12 as yyyy,
             ((minmax.minyyyymm + n.n) % 12) + 1 as mm
      from minmax join
           n
           on minmax.minyyyymm + n.n <= minmax.maxyyyymm
     )
select wxyz.yyyy, wxyz.mm, wxyz.w_id, wxyz.x_id, wxyz.y_id, wxyz.z_id, 
       <columns from t here>
from wxyz left join
     t
     on wxyz.w_id = t.w_id and wxyz.x_id = t.x_id and wxyz.y_id = t.y_id and
        wxyz.z_id = t.z_id and wxyz.yyyy = t.yr and wxyz.mm = t.mnth;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Thank you for your help.

Your solution works, but I noticed it is not very good in terms of performance, but meanwhile I have managed to get a solution for my problem.

DECLARE @start_date DATE, @end_date DATE;
SET @start_date = (SELECT MIN(EOMONTH(DATEFROMPARTS(Yr , Mnth, 1))) FROM Table_Input);
SET @end_date = (SELECT MAX(EOMONTH(DATEFROMPARTS(Yr , Mnth, 1))) FROM Table_Input);

DECLARE @tdates TABLE (Period DATE, Yr INT, Mnth INT);

WHILE @start_date <= @end_date
BEGIN
    INSERT INTO @tdates(PEriod, Yr, Mnth) VALUES(@start_date, YEAR(@start_date), MONTH(@start_date));
    SET @start_date = EOMONTH(DATEADD(mm,1,DATEFROMPARTS(YEAR(@start_date), MONTH(@start_date), 1)));
END

DECLARE @pks TABLE (W_ID NVARCHAR(50), X_ID NVARCHAR(50)
, Y_ID NVARCHAR(50), Z_ID NVARCHAR(50)
, PerMin DATE, PerMax DATE);

INSERT INTO @pks (W_ID, X_ID, Y_ID, Z_ID, PerMin, PerMax)
SELECT W_ID, X_ID, Y_ID, Z_ID
, MIN(EOMONTH(DATEFROMPARTS(Ano, Mes, 1))) AS PerMin
, MAX(EOMONTH(DATEFROMPARTS(Ano, Mes, 1))) AS PerMax
FROM Table1
GROUP BY W_ID, X_ID, Y_ID, Z_ID;


INSERT INTO Table_Output(W_ID, X_ID, Y_ID, Z_ID
, ComprasLiquidas, RTV, DevManuais, ComprasBrutas, Vendas, Stock, ReceitasComerciais)
SELECT TP.DB, TP.Ano, TP.Mes, TP.Supplier_Code, TP.Depart_Code, TP.BizUnit_Code
, TA.ComprasLiquidas, TA.RTV, TA.DevManuais, TA.ComprasBrutas, TA.Vendas, TA.Stock, TA.ReceitasComerciais
FROM 
(
SELECT W_ID, X_ID, Y_ID, Z_ID
FROM @tdatas CROSS JOIN @pks
WHERE Period BETWEEN PerMin And PerMax
) AS TP
LEFT JOIN Table_Input AS TA
ON TP.W_ID = TA.W_ID AND TP.X_ID = TA.X_ID AND TP.Y_ID = TA.Y_ID
AND TP.Z_ID = TA.Z_ID
AND TP.Yr = TA.Yr
AND TP.Mnth = TA.Mnth
ORDER BY TP.W_ID, TP.X_ID, TP.Y_ID, TP.Z_ID, TP.Yr, TP.Mnth;

I do the following:

  1. Get the Min and Max date of the entire table - @start_date and @end_date variables;
  2. Create an auxiliary table with all dates between Min and Max - @tdates table;
  3. Get all the combinations of (W_ID, X_ID, Y_ID, Z_ID) along with the min and max dates of that combination - @pks table;
  4. Create the cartesian product between @tdates and @pks, and in the WHERE clause I filter the results between the Min and Max of the combination;
  5. Compute a LEFT JOIN of the cartesian product table with the input data table.
rffs
  • 45
  • 1
  • 1
  • 5