I have a dataframe that has multiple columns named as "avg_metric", "wkday_avg_metric", "event_avg_metric" and "monthly_avg_metric", in which "metric" consists of multiple metrics with these calculations (orders, revenue, etc). I have to check for multiple columns if their rows have NAs and replace them with a row from another column. For that, I created a function that does the same verification for the column "metric" I specify. The thing is that I'm getting the same value for the entire new column that I'm creating, which should not be the case.
I added below an example_fixed on what should be the outcome.
Is there an easier way of doing that? Or am I lacking some logic in the function?
Tks.
Edit: I got the errors on my function, but I'm sure there's a better solution to mine. I tried your solutions, but couldn't apply them for my dataframe. I updated the reprex so you can help me better.
library(tidyverse)
(example <- tibble(country = c("A", "B", "C", "D"),
brand = c("A", "A", "B", "B"),
event = c(1:4),
month = c(1:4),
weekday = c(1:4),
avg_visits = c(5028, NA, NA, NA),
avg_revenue = c(12345, NA, NA, NA),
wkday_avg_visits = c(1234, 4355, NA, NA),
wkday_avg_revenue = c(12345, 54321, NA, NA),
event_avg_visits = c(51271, 59212, 98773, NA),
event_avg_revenue = c(98764, 56435, 35634, NA),
monthly_avg_visits = c(5028, 5263, 6950, 8902),
monthly_avg_revenue = c(63457, 34536, 34574, 23426))) %>%
print(width = Inf)
#> # A tibble: 4 x 13
#> country brand event month weekday avg_visits avg_revenue wkday_avg_visits
#> <chr> <chr> <int> <int> <int> <dbl> <dbl> <dbl>
#> 1 A A 1 1 1 5028 12345 1234
#> 2 B A 2 2 2 NA NA 4355
#> 3 C B 3 3 3 NA NA NA
#> 4 D B 4 4 4 NA NA NA
#> wkday_avg_revenue event_avg_visits event_avg_revenue monthly_avg_visits
#> <dbl> <dbl> <dbl> <dbl>
#> 1 12345 51271 98764 5028
#> 2 54321 59212 56435 5263
#> 3 NA 98773 35634 6950
#> 4 NA NA NA 8902
#> monthly_avg_revenue
#> <dbl>
#> 1 63457
#> 2 34536
#> 3 34574
#> 4 23426
subs_metric <- function(data, metric) {
avg <- paste0("avg_", metric)
wkday_avg <- paste0("wkday_avg_", metric)
event_avg <- paste0("event_avg_", metric)
monthly_avg <- paste0("monthly_avg_", metric)
for (i in nrow(data)) {
value <- if (is.na(data[[avg]][i]) & is.na(data[[wkday_avg]][i]) & is.na(data[[event_avg]][i])) {
data[[monthly_avg]][i]
} else if (is.na(data[[avg]][i]) & is.na(data[[wkday_avg]][i])) {
data[[event_avg]][i]
} else if (is.na(data[[avg]][i])) {
data[[wkday_avg]][i]
} else {
data[[avg]][i]
}
return(value)
}
}
example %>%
mutate(avg_visits_new = subs_metric(., "visits"),
avg_revenue_new = subs_metric(., "revenue")) %>%
print(width = Inf)
#> # A tibble: 4 x 15
#> country brand event month weekday avg_visits avg_revenue wkday_avg_visits
#> <chr> <chr> <int> <int> <int> <dbl> <dbl> <dbl>
#> 1 A A 1 1 1 5028 12345 1234
#> 2 B A 2 2 2 NA NA 4355
#> 3 C B 3 3 3 NA NA NA
#> 4 D B 4 4 4 NA NA NA
#> wkday_avg_revenue event_avg_visits event_avg_revenue monthly_avg_visits
#> <dbl> <dbl> <dbl> <dbl>
#> 1 12345 51271 98764 5028
#> 2 54321 59212 56435 5263
#> 3 NA 98773 35634 6950
#> 4 NA NA NA 8902
#> monthly_avg_revenue avg_visits_new avg_revenue_new
#> <dbl> <dbl> <dbl>
#> 1 63457 8902 23426
#> 2 34536 8902 23426
#> 3 34574 8902 23426
#> 4 23426 8902 23426
(example_fixed <- tibble(country = c("A", "B", "C", "D"),
brand = c("A", "A", "B", "B"),
event = c(1:4),
month = c(1:4),
weekday = c(1:4),
avg_visits = c(5028, NA, NA, NA),
avg_revenue = c(12345, NA, NA, NA),
wkday_avg_visits = c(1234, 4355, NA, NA),
wkday_avg_revenue = c(12345, 54321, NA, NA),
event_avg_visits = c(51271, 59212, 98773, NA),
event_avg_revenue = c(98764, 56435, 35634, NA),
monthly_avg_visits = c(5028, 5263, 6950, 8902),
monthly_avg_revenue = c(63457, 34536, 34574, 23426),
avg_visits_new = c(5028, 4355, 98773, 8902),
avg_revenue_new = c(12345, 54321, 35634, 23426))) %>%
print(width = Inf)
#> # A tibble: 4 x 15
#> country brand event month weekday avg_visits avg_revenue wkday_avg_visits
#> <chr> <chr> <int> <int> <int> <dbl> <dbl> <dbl>
#> 1 A A 1 1 1 5028 12345 1234
#> 2 B A 2 2 2 NA NA 4355
#> 3 C B 3 3 3 NA NA NA
#> 4 D B 4 4 4 NA NA NA
#> wkday_avg_revenue event_avg_visits event_avg_revenue monthly_avg_visits
#> <dbl> <dbl> <dbl> <dbl>
#> 1 12345 51271 98764 5028
#> 2 54321 59212 56435 5263
#> 3 NA 98773 35634 6950
#> 4 NA NA NA 8902
#> monthly_avg_revenue avg_visits_new avg_revenue_new
#> <dbl> <dbl> <dbl>
#> 1 63457 5028 12345
#> 2 34536 4355 54321
#> 3 34574 98773 35634
#> 4 23426 8902 23426
Created on 2020-07-07 by the reprex package (v0.3.0)