0

I have a table for employees where I have running total of sales by every employee in each of the 12 months. Some of the values for sales running total are missing, as below:

e_id month sales_run_total
e1 Jan 10
e1 Feb null
e1 March null
e1 April 40
... ... ...
e1 Dec 120
e2 Jan null
e2 Feb null
e2 March 30
e2 April 40
... ... ...
e2 Dec 120

I want to impute these missing values with the previous month's sales_run_total value of an employee. If the missing value is for January month, I want to impute it with zero, as below:

e_id month sales_run_total sales_run_total_new
e1 Jan 10 10
e1 Feb null 10
e1 March null 10
e1 April 40 40
... ... ... ...
e1 Dec 120 120
e2 Jan null 0
e2 Feb null 0
e2 March 30 30
e2 April 40 40
... ... ... ...
e2 Dec 120 120

I have tried to do it using lag() function, as below :

select e_id,
       month,
       lag(sales_run_total) over (partition by e_id, month) as sales_run_total_new
from table_name

But this does not consider whether there is a null or non-null value in actual column. Also, it does not consider the case where we have null for January month.

How can this be done in SQL?

karan
  • 79
  • 1
  • 7

1 Answers1

1

You need to pick the window between proceedings and current row and find most recent not null value.
Use LAST_VALUE (value, ignore_null boolean) window function to pick last non null value ordering by e id and month.
LAG() isnt suitable because it can not reach beyond previous row and you can have null value in previous row.

can you pls try this ?

SELECT e_id,`month`,sales_run_total,
COALESCE(sales_run_total, LAST_VALUE(sales_run_total, TRUE) OVER(ORDER BY e_id, `month` ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW))  as sales_run_total_new
FROM table_name
Koushik Roy
  • 6,868
  • 2
  • 12
  • 33
  • Thanks @Koushik. I solved it in 2 steps: First, whenever the month is January and the sales value is NULL, I imputed it with 0. Then, I used the approach suggested by you to get the final table. – karan Apr 11 '22 at 16:14
  • 1
    First step to do before the solution in answer: SELECT e_id, `month`, sales_run_total, case when sales_run_total is NULL then ( case when service_month = 'Jan' then 0 else sales_run_total end ) else sales_run_total end as sales_run_total_temp FROM table_name – karan Apr 11 '22 at 16:20