Calculations between columns in data frame: week-on-week 'trending' sales detection

Question

I have a data.frame representing weekly book sales counts for a set of authors over 25 weeks:

author       week_1 week_2 week_3 week_4 ...
author1      7      4      5          2
author2      3      6      18         5
author3      1      0      2          4
author4      0      1      1          2
author5      0      1      0          0

First, I want to use this data to build a new data frame, which shows the fraction of [currentWeek / previousWeek]. Something like this perhaps:

author       week_1 week_2  week_3 week_4 ...
author1      NA      0.57   1.25   0.4
author2      NA      2      3      0.28
author3      NA      0      2      2
author4      NA      1      1      2   
author5      NA      1      0      0

(I would like to substitute zeros with 1s to avoid dividing by zero.)

Second, I want to run a quick iteration over all the rows, check for any triplets of adjacent weeks where sales for that author have increased by 100% twice in two consecutive week-pairs, and report this in some kind of output table. Perhaps like this:

author  startTrendWeek endTrendWeek
author2 1              3
author3 2              4

Any ideas for how I could solve either of these in R?

I don't know how to create a new table where each cell is a calculation from another table. I find manipulating data like this intuitive in Perl, but (as yet) R baffles me. — Harry Palmer, Jul 10 '12 at 16:42
`test1<-c(1,2,3,4,5); test2<-c(test1[-1],NA);test3<-test2/test1` Read some introductory text for R. Also, leave your data.frame in long format and don't use `cast` when creating it (see your recent question). — Roland, Jul 10 '12 at 16:48

score 4 · Accepted Answer · answered Jul 10 '12 at 16:56

Recreate your data:

x <- read.table(text=
"author       week_1 week_2 week_3 week_4 
author1      7      4      5          2
author2      3      6      18         5
author3      1      0      2          4
author4      0      1      1          2
author5      0      1      0          0
                ", header=TRUE)

One line of code:

cbind(x[1], t(apply(x[, -1], 1, function(xx)xx[-1]/xx[-length(xx)])))

   author    week_2 week_3    week_4
1 author1 0.5714286   1.25 0.4000000
2 author2 2.0000000   3.00 0.2777778
3 author3 0.0000000    Inf 2.0000000
4 author4       Inf   1.00 2.0000000
5 author5       Inf   0.00       NaN
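
The one-liner above keeps `Inf` and `NaN` wherever the previous week had zero sales. A minimal sketch of one way to get the table the question asks for, and then the trend table, follows. It assumes zeros are replaced by 1 only in the denominator (which reproduces the question's example output) and that "increased by 100%" means a ratio of at least 2. The names `sales`, `denom`, `ratios`, `doubled`, `hits` and `trends` are illustrative, not part of the original answer.

```r
# Recreate the example data from the question (first four weeks only).
x <- read.table(text =
"author  week_1 week_2 week_3 week_4
author1 7      4      5      2
author2 3      6      18     5
author3 1      0      2      4
author4 0      1      1      2
author5 0      1      0      0", header = TRUE)

sales <- as.matrix(x[, -1])
rownames(sales) <- as.character(x$author)
n <- ncol(sales)

# Assumption: a previous week with zero sales counts as 1, so no ratio
# divides by zero. Only the denominator is changed.
denom <- sales[, -n, drop = FALSE]
denom[denom == 0] <- 1

# currentWeek / previousWeek for weeks 2..n, computed on whole columns.
ratios <- sales[, -1, drop = FALSE] / denom

# Same layout as in the question: week_1 has no previous week, so it is NA.
ratio_df <- data.frame(author = rownames(sales), week_1 = NA, ratios,
                       row.names = NULL)

# Assumption: "increased by 100%" means the ratio is at least 2.
doubled <- ratios >= 2

# A trend is two consecutive doublings: ratio column j and column j + 1,
# which together cover weeks j, j + 1 and j + 2.
hits <- doubled[, -ncol(doubled), drop = FALSE] & doubled[, -1, drop = FALSE]
idx  <- which(hits, arr.ind = TRUE)

trends <- data.frame(author         = rownames(hits)[idx[, "row"]],
                     startTrendWeek = idx[, "col"],
                     endTrendWeek   = idx[, "col"] + 2L,
                     row.names      = NULL)
print(trends)
#    author startTrendWeek endTrendWeek
# 1 author2              1            3
# 2 author3              2            4
```

Working on the matrix column-wise avoids the `apply`/`t` round trip. With all 25 weeks the same code should apply unchanged, and an author with several overlapping trends would get one row per triplet.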

Andrie

This is very helpful Andrie thank you! I can't quite figure out how all the functions used by that command are working together - like why is there a transpose, a matrix multiplication, and what is the length doing? If you can explain this briefly that would be very helpful. But anyway, thanks very much and I am starting to learn R properly with tutorials and an introductory book. – Harry Palmer Jul 11 '12 at 15:20
There's no matrix multiplication here. `xx[-length(xx)]` means take `xx` and cut off the last element. `apply(x, 1, FUN)` applies a function to each row of your data. And you need to transpose the result because the results of `apply` come out as the transpose of what you need. Good luck with your R journey. – Andrie Jul 11 '12 at 15:51
`xx` indicates the variable in function(xx), i.e. each row of data. – Andrie Jul 11 '12 at 16:25
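
To make the explanation in these comments concrete, here is a small sketch on a two-row matrix. The names `m`, `ratio` and `res` are made up for illustration. It shows that `apply` over rows returns one column per input row, which is why the answer wraps the call in `t()`.

```r
# Two authors, four weeks, taken from the question's data.
m <- matrix(c(7, 4, 5, 2,
              3, 6, 18, 5),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("author1", "author2"), paste0("week_", 1:4)))

# xx[-1] drops the first element (weeks 2..4, the numerators);
# xx[-length(xx)] drops the last element (weeks 1..3, the denominators).
ratio <- function(xx) xx[-1] / xx[-length(xx)]

# MARGIN = 1 calls ratio() once per row. Each call returns a vector of
# length 3, and apply() stores each returned vector as a column.
res <- apply(m, 1, ratio)

dim(res)     # 3 2: weeks in rows, authors in columns
dim(t(res))  # 2 3: transposed back to one row per author
```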
