EDIT: Realised it could be simpler.
2nd EDIT: Adjusted to account for month as well.
First let's grab a year-month variable from your data.
If it isn't already, make sure the `Usage` column is of type `Date`.
software_data$Usage <- as.Date(software_data$Usage)
Once it's of type `Date`, we can compress it to a year-month column. Note that `format` returns a character string such as "2011-12", not a `Date`.
software_data$Usage_Year_Month <- format(software_data$Usage, format = "%Y-%m")
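To make the two conversion steps concrete, here is a minimal sketch on a few made-up values (the names `usage_raw`, `usage` and `usage_year_month` are just for illustration and are not part of your data):

```r
# Hypothetical sample values, not taken from the original data
usage_raw <- c("2011-12-01", "2011-12-15", "2012-01-03")

# character -> Date
usage <- as.Date(usage_raw)

# Date -> "YYYY-MM" character string
usage_year_month <- format(usage, format = "%Y-%m")

print(class(usage))       # "Date"
print(usage_year_month)   # "2011-12" "2011-12" "2012-01"
```

Two dates in the same month collapse to the same string, which is what lets us group by month later.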
Once your data frame is in this form, getting from it to the output I have in `mau2` takes just 3 lines of code.
library(plyr)  # provides ddply
mau <- ddply(software_data, c("id", "software_v", "Usage_Year_Month"), nrow)
mau <- mau[mau$V1 > 1, ]  # keep only groups with more than one use
mau2 <- ddply(mau, c("software_v", "Usage_Year_Month"), nrow)
Now let me explain that.
We can use `ddply` (from the `plyr` package) to apply the `nrow` function to each subset of the data. We subset on `id`, `software_v` and the variable we created, `Usage_Year_Month`, so the function returns the number of rows in each subgroup. At the end we just need to filter so we only keep those rows with a value greater than 1.
mau <- ddply(software_data, c("id", "software_v", "Usage_Year_Month"), nrow)
mau <- mau[mau$V1 > 1, ]
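As a cross-check on the count-then-filter step, here is a rough sketch of the same idea in base R using `aggregate` instead of `ddply`. This is a swapped-in technique, not what the code above does; the `toy` data frame and the helper column `n` are made up for illustration.

```r
# Hypothetical toy data: user 1 uses version 2 twice in December 2011,
# user 2 uses it once, and user 1 uses version 3 once in January 2012
toy <- data.frame(
  id = c(1, 1, 2, 1),
  software_v = c(2, 2, 2, 3),
  Usage_Year_Month = c("2011-12", "2011-12", "2011-12", "2012-01")
)

# Helper column of ones so that summing it counts rows per group
toy$n <- 1
counts <- aggregate(n ~ id + software_v + Usage_Year_Month, data = toy, FUN = sum)

# Keep only the groups with more than one use
repeat_users <- counts[counts$n > 1, ]
print(repeat_users)
```

Only the row for user 1, version 2, "2011-12" survives the filter, with a count of 2.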
I've set up a mock example of your data as below (I just picked an arbitrary sequence of dates for `Usage`).
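# Random ids between 0 and 5, plus ids 6 to 10 appearing once each
id = round(runif(100)*5)
id = c(id, seq(6,10))
# One usage date per row, on consecutive days
Usage = seq(as.Date("2011-12-01"), as.Date("2011-12-01")+length(id)-1, by = "+1 day")
# Random software version between 0 and 3
software_v = round(runif(length(id))*3)
software_data <- data.frame(id, Usage, software_v)
software_data$Usage_Year_Month <- format(software_data$Usage, format = "%Y-%m")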
Using this input the code produces the below.

The `V1` column contains the number of uses for each unique `id`, `software_v` and `Usage_Year_Month` grouping. If you just want the unique ids for which you have more than 1 use, just use `unique(mau$id)`.
If you then want this by software version and month, let's just go one more round with `ddply`.
mau2 <- ddply(mau, c("software_v", "Usage_Year_Month"), nrow)

In this output `software_v` is the unique software version, `Usage_Year_Month` is the matching year and month, and `V1` holds the number of unique users who used that version more than once in that month.
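To see what the final `V1` means, here is a small end-to-end sketch on hand-checkable data. It assumes the `plyr` package is installed, and the `toy` data frame is made up for illustration.

```r
library(plyr)  # assumes the plyr package is installed

# Hypothetical toy data, one row per usage event
toy <- data.frame(
  id = c(1, 1, 2, 2, 3, 1),
  software_v = c(2, 2, 2, 2, 2, 3),
  Usage_Year_Month = c("2011-12", "2011-12", "2011-12",
                       "2011-12", "2011-12", "2012-01")
)

# Number of usage events per user, version and month
mau <- ddply(toy, c("id", "software_v", "Usage_Year_Month"), nrow)

# Keep only user/version/month groups with more than one use
mau <- mau[mau$V1 > 1, ]

# Number of repeat users per version and month
mau2 <- ddply(mau, c("software_v", "Usage_Year_Month"), nrow)
print(mau2)
```

Users 1 and 2 each used version 2 twice in "2011-12", while user 3 used it only once and user 1 used version 3 only once, so `mau2` has a single row for version 2 in "2011-12" with `V1` equal to 2.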