0

I am trying to calculate percentage change between elements in a vector of a dataframe:

Year

Year Count
2010 55302
2011 58943
2012 59633
2013 50194

But I the new column I create is only giving NA values instead of the percentage change calculation I want between the count of each year:

Year Count Pct_change
2010 55302 NA
2011 58943 NA
2012 59633 NA
2013 50194 NA

My code is as follows:

df1    <- df %>% 
  group_by(Year) %>% 
  mutate(Pct_change = (Count - lead(Count) / lead(Count)*100)
 

I thought it may be the fact the Count column is integer values so I changed it to numeric, no success. Thanks in advance for any help

MSMagona
  • 1
  • 1
  • 1
    Just remove the `group_by(Year)` and it should work i.e. `df %>% mutate(Pct_change = (Count - lead(Count) / lead(Count)*100)`. There is only a single observation per 'Year' – akrun Dec 14 '22 at 17:39
  • Just tried but it hasn't unfortunately, still getting NA values down the whole column – MSMagona Dec 14 '22 at 17:58
  • This is what I get from your data `df %>% mutate(Pct_change = (Count - lead(Count) / lead(Count)*100)) %>% pull(Pct_change)# [1] 55202 58843 59533 NA` – akrun Dec 14 '22 at 18:00
  • 1
    Perhaps you have non-numeric column. Change it to numeric and do the calcuation – akrun Dec 14 '22 at 18:00
  • Maybe you miss some `(` like this: `df %>% mutate(Pct_change = ((Count - lead(Count)) / lead(Count)*100)) ` – Quinten Dec 14 '22 at 18:04
  • Changed all columns to numeric but still getting the same result – MSMagona Dec 14 '22 at 18:07
  • I have changed the brackets around and still the same result..feel like I've tried everything at this point – MSMagona Dec 14 '22 at 18:11
  • @akrun is my syntax wrong then? Weird that you're getting numbers as I'm still getting NA – MSMagona Dec 14 '22 at 18:20
  • Please check whether your column is `character` class. – akrun Dec 14 '22 at 18:23
  • @akrun all columns are numeric – MSMagona Dec 14 '22 at 18:27
  • Then, I am not sure. Perhaps use `dplyr::lead` instead of just lead (in case you have some packages loaded with same function – akrun Dec 14 '22 at 18:28
  • @akrun still no good, the dataframe that is piping into mutate was collapsed and duplicate years and counts were removed ... would that have anything to do with it? – MSMagona Dec 14 '22 at 18:31
  • You may have to show the `dput(yourexample)` to understand the correct structure – akrun Dec 14 '22 at 18:32
  • @akrun > dput(atm_per_year) structure(list(Year = c(2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019), Count = c(55302, 58943, 59633, 50194, 49276, 58619, 54073, 52968, 55243, 58765)), row.names = c(NA, -10L), vars = "Year", drop = TRUE, indices = list(0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), group_sizes = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), biggest_group_size = 1L, labels = structure(list( Year = 2010:2019), class = "data.frame", row.names = c(NA, -10L), vars = "Year", drop = TRUE), class = c("grouped_df", "tbl_df", "tbl", "data.frame")) – MSMagona Dec 14 '22 at 18:37
  • I get error `! Corrupt `grouped_df` using old (< 0.8.0) format.`. You may `ungroup` the data and test – akrun Dec 14 '22 at 18:38
  • i.e. `atm_per_year %>% ungroup %>% mutate(Pct_change = (Count - lead(Count) / lead(Count)*100))` – akrun Dec 14 '22 at 18:40
  • @akrun Still no good, looks like I might have to calculate manually :( – MSMagona Dec 14 '22 at 18:41
  • I was able to get the output with `ungroup`ing your data because the `grouped_df` in the class is giving me corrupt data, so i removed that class. IN your dataset, I would assume `ungroup` to work or just convert to `data.frame` and then apply ie.. `atm_per_year %>% as.data.frame %>% mutate(Pct_change = (Count - lead(Count) / lead(Count)*100))` – akrun Dec 14 '22 at 18:41
  • forget that last comment, it's worked!! Thanks @akrun you're a life safer – MSMagona Dec 14 '22 at 18:43

1 Answers1

0

The dataset had some groupings as the class had grouped_df. We could ungroup before applying the code

library(dplyr)
atm_per_year %>%
   ungroup %>% 
   mutate(Pct_change = (Count - lead(Count) / lead(Count)*100))
akrun
  • 874,273
  • 37
  • 540
  • 662