
I have a data frame:

df <- data.frame(Name = c("Terry", "Bob", "Jerry"),
                 ContractsSold = c(30, 40, 50),
                 agentsHired = c(10, 12, 14),
                 Sales = c(3500, 4000, 5000),
                 stringsAsFactors = FALSE)

I also have a vector of values, e.g.:

days = seq(0,20,5)

I have a for loop that takes each row (e.g. Terry, 30, 10, 3500) and, for each value in days, applies a function that is currently working. It calculates the total revenue and outputs it in this format:

days Name  Revenue
0    Terry 10000
1    Terry 10000
2    Terry 10300
3    Terry 19000
4    Terry 14000
5    Terry 10090
....
20   Terry 10000
0    Bob 20000
...
20   Bob 20000

And so on. I want to get this as my output:

days  Terry  Bob   Jerry
0     10000  20000 .....
1     10000  ....
2     13000
3     (the revenues calculated)
....
20

This is my current code, which produces the first output:

for (j in seq_len(nrow(df))) {
  for (d in days) {
    totalRevenue <- (function is here and working)
    return(new_df)
  }
}

Could someone please help me store each name as its own column in my for loop, with the corresponding revenues placed in the row for the matching value in the "days" column? Thank you!
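
One way to get the wide layout straight from the loop is to start with a data frame that holds only the days column and add one column per name as each row of df is processed. The block below is a minimal sketch of that idea, not the actual calculation: `calc_revenue` is a hypothetical stand-in for the unshared revenue function, and its formula is made up purely so the example runs.

```r
df <- data.frame(Name = c("Terry", "Bob", "Jerry"),
                 ContractsSold = c(30, 40, 50),
                 agentsHired = c(10, 12, 14),
                 Sales = c(3500, 4000, 5000),
                 stringsAsFactors = FALSE)
days <- seq(0, 20, 5)

# Hypothetical stand-in for the real (unshared) revenue calculation.
# The formula is an assumption for illustration only.
calc_revenue <- function(row, d) {
  row$Sales + row$ContractsSold * d
}

# Start with one row per day, then add one column per name.
wide <- data.frame(days = days)
for (j in seq_len(nrow(df))) {
  revenue <- numeric(length(days))
  for (i in seq_along(days)) {
    revenue[i] <- calc_revenue(df[j, ], days[i])
  }
  # Use the person's name as the column name.
  wide[[df$Name[j]]] <- revenue
}
wide
```

Because every name is evaluated on the same days vector, each column has the same length and lines up with the days column, so no reshaping step is needed afterwards.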

  • `tidyr::spread(df_with_calculated_revenues, Name, Revenue)` ought to do it (or one of several other options at the duplicate) – Gregor Thomas Feb 28 '20 at 04:32
  • When I ran your solution, I received this error: `Error in : Each row of output must be identified by a unique combination of keys. Keys are shared for 2 rows:` Do you know how to get around this in this case? – Taylor Coleman Feb 28 '20 at 04:50
  • That means you've got some rows in your input where the `days` and the `Name` are duplicated---like maybe you have 2 rows where `days = 15` and `Name = "Bob"`, so spread doesn't know what value to put in the output because it needs to find a single `Revenue` value for that combination. You should aggregate your data first so there is one row per `days` per `Name` and then you won't have a problem. – Gregor Thomas Feb 28 '20 at 07:16
  • Thank you so much! It works. However, I now have many rows with the same "days" value, each holding a value for only one name and NAs for the other names. Is there a way to consolidate rows that share the same "days" value, merging each name's value into a single row? Thank you again for all of the help. – Taylor Coleman Feb 28 '20 at 21:32
  • If you can make your example reproducible, that would help a lot. You shared your original `df` with `dput`, which is good, but it's not the `df` you are working with. It doesn't really matter to this issue that you started with `df` and then used some unshared code to calculate revenue... just share a subset of the revenue data that illustrates your problem... – Gregor Thomas Feb 29 '20 at 03:07
  • I used `df_rev = structure(list(days = c(0L, 1L, 2L, 0L, 1L, 2L), Name = structure(c(2L, 2L, 2L, 1L, 1L, 1L), .Label = c("Jerry", "Terry"), class = "factor"), Revenue = c(10000L, 10000L, 10300L, 19000L, 14000L, 10090L )), class = "data.frame", row.names = c(NA, -6L))`, based off of what you show, and `tidyr::spread(df_rev, Name, Revenue)` works great on it. I need to see data for the problem you're having to do any more debugging. – Gregor Thomas Feb 29 '20 at 03:08
  • Hi Gregor, it worked great with your suggestion, thank you very much! I found that I did have duplicates; I removed them and it worked out. – Taylor Coleman Mar 01 '20 at 16:58
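
The fix discussed in the comments (make sure there is exactly one row per days/Name pair, then go from long to wide) can be sketched without extra packages. This is only an illustration: the data below is made up, with one deliberately duplicated pair, and base R `aggregate()` plus `reshape()` are swapped in for the `tidyr::spread()` call used in the thread. Summing the duplicates is an assumption; dropping them or taking a mean may suit the real data better.

```r
# Made-up long-format data with one duplicated (days, Name) pair:
# Jerry appears twice on day 2.
df_rev <- data.frame(
  days    = c(0, 1, 2, 0, 1, 2, 2),
  Name    = c("Terry", "Terry", "Terry", "Jerry", "Jerry", "Jerry", "Jerry"),
  Revenue = c(10000, 10000, 10300, 19000, 14000, 10090, 10),
  stringsAsFactors = FALSE
)

# Collapse duplicates so each (days, Name) pair has a single Revenue.
# Using sum here is an assumption, not something the thread prescribes.
agg <- aggregate(Revenue ~ days + Name, data = df_rev, FUN = sum)

# Reshape long to wide: one row per day, one column per name.
wide <- reshape(agg, idvar = "days", timevar = "Name", direction = "wide")

# reshape() prefixes the new columns with "Revenue."; strip that prefix.
names(wide) <- sub("^Revenue\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

Keeping only the days, Name, and Revenue columns before reshaping also avoids the situation where rows with the same day stay separate and are padded with NAs, since any additional column that differs between rows would otherwise keep them apart.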
