Probability distribution over time?

Question

I'm new to R. I've been all over Stack Overflow about this, and perhaps am not searching correctly for the answer I want.

I have a matrix with unique dyadic relationships as rows and years as columns. The cells are populated with a 0 if the two people did not interact in that year, and a 1 if they did.

I am trying to calculate a percentage for each cell: the number of 1s so far in the row, divided by the number of entries from the first occurrence of 1 onward (including that first 1). Colloquially, this is how often two people have interacted each year since they met.

The first occurrence of 1 in a row would always be 100%. For example, Row B from the example below:

  V1 V2 V3 V4
A  0  0  1  0
B  1  1  0  0

Becomes

 100 100 66 50

I got as far as calculating the cumulative sum for each cell of the matrix

data <- matrix(sample(0:1, 5 * 4, replace = TRUE), 4)
test <- t(apply(data, 1, cumsum))

And then my idea was to create a function sort of like the below, but I'm stuck on what expression to use for the denominator (below only removes the number of entries prior to the first occurrence of one). I don't quite know how to subset out future cases, or reference the column index of a matrix directly.

mm<-function(x){(x)/(ncol(data)-(which(x>0)[1]))} 
tmp_int<-apply(data, 1:2, mm)

Or is there a much easier way to do this? I tried using the ecdf function but it was returning NAs.

Thanks so much.

Is `t(apply(data,1,cumsum)/(1:ncol(data)))*100` what you are looking for? — nicola, Apr 03 '16 at 22:49

Julius Vainora · Accepted Answer · 2016-04-05T14:09:35.370

data <- matrix(sample(0:1, 5 * 4, rep = TRUE), 4)

f <- function(m) t(apply(m, 1, cumsum))
f(data) / (f(f(data) >= 1) + (f(data) == 0)) * 100
#      [,1] [,2]     [,3]     [,4] [,5]
# [1,]  100   50 66.66667 75.00000   60
# [2,]  100  100 66.66667 50.00000   40
# [3,]    0  100 50.00000 33.33333   25
# [4,]  100   50 66.66667 50.00000   60

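Here f is what you already have: the row-wise cumulative sum. f(f(data) >= 1) counts, for each cell, the number of years from the first interaction onward, which is the right denominator everywhere except before the first 1, where it is 0. Adding f(data) == 0 turns those zero denominators into 1, so such cells evaluate to 0 / 1 = 0 rather than 0 / 0 = NaN.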
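
As a quick check, the same expression can be run on rows A and B from the question. This is only a sketch: the matrix m and the intermediate names num, den and pct are introduced here for illustration and are not part of the original code.

```r
# Rows A and B from the question (m is an illustrative name)
m <- matrix(c(0, 0, 1, 0,
              1, 1, 0, 0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "B"), paste0("V", 1:4)))

# Row-wise cumulative sum, as in the answer
f <- function(m) t(apply(m, 1, cumsum))

num <- f(m)          # interactions so far in each row
den <- f(num >= 1)   # years since (and including) the first interaction
pct <- num / (den + (num == 0)) * 100

round(pct, 2)
# row A:   0   0 100.00 50
# row B: 100 100  66.67 50
```

Row B matches the 100 100 66 50 expected in the question, and the cells before the first interaction in row A come out as 0 instead of NaN.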
