
I want to find the number of ids whose value changes each week and each month

Every week a new dataset is loaded into the database. Each week the same ids are added with their values, and in some weeks the values for some ids change.

I want to find the number of ids that changed per month and per week across all the data I have for the last 2 years.

All of this is being done in Databricks.

I have attached an example dataset in which data is entered for 3 ids over two months and 2 ids changed their value. The desired output should show 2 changes for the second month and 0 changes for the first.

dataset

output needed

Ecstasy
H.M

1 Answer


Use a JOIN clause to match each row with the row for the same id in the adjacent month, keep the pairs whose values differ, and apply a GROUP BY clause. On the resulting dataset, apply the add_months function.

add_months(startDate, numMonths)

startDate takes your date, and numMonths takes the values from the result of the GROUP BY clause.
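As a rough illustration of the join-and-group idea, here is a minimal sketch. It makes several assumptions that are not in the question: the table name snapshots, the columns month, id and value, and the sample rows are all hypothetical, and SQLite (via Python's sqlite3 module) stands in for Databricks so the example runs anywhere. SQLite's date(prev.month, '+1 month') plays the role that add_months(prev.month, 1) would play in Spark SQL. A LEFT JOIN is used instead of a plain JOIN so that the first month still appears with 0 changes.

```python
import sqlite3

# Hypothetical sample data: 3 ids loaded in two monthly snapshots;
# ids 2 and 3 change their value in the second month.
rows = [
    ("2021-01-01", 1, "a"),
    ("2021-01-01", 2, "b"),
    ("2021-01-01", 3, "c"),
    ("2021-02-01", 1, "a"),
    ("2021-02-01", 2, "x"),
    ("2021-02-01", 3, "y"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE snapshots (month TEXT, id INTEGER, value TEXT)")
conn.executemany("INSERT INTO snapshots VALUES (?, ?, ?)", rows)

# Self-join each row to the same id's row from the previous month.
# In Spark SQL the date condition would be written with add_months instead:
#   add_months(prev.month, 1) = cur.month
# COUNT only counts non-NULL results, so rows with no previous month
# (or with an unchanged value) contribute 0.
query = """
SELECT cur.month AS month,
       COUNT(CASE WHEN prev.value <> cur.value THEN 1 END) AS changes
FROM snapshots AS cur
LEFT JOIN snapshots AS prev
       ON prev.id = cur.id
      AND date(prev.month, '+1 month') = cur.month
GROUP BY cur.month
ORDER BY cur.month
"""
changes_per_month = conn.execute(query).fetchall()
print(changes_per_month)
```

With the sample rows above this prints [('2021-01-01', 0), ('2021-02-01', 2)], which matches the shape of the desired output. A weekly count would follow the same pattern with a week-start column and a 7-day offset in the join condition.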

Utkarsh Pal