Filtering out rows based on a criterion of number of months of data

Question

I have a data frame like the one below:

df

  Device_No Consumer.Account.Id    Transaction_Date Transaction.Amount Transaction.Liter TXT_Month
1  1100110065         1.01014e+11 2014-01-02 13:04:45               0.09               0.3         1
2  1100110071         1.01014e+11 2014-01-03 20:53:58               0.39               1.3         1
3  1100110071         1.01014e+11 2014-01-04 18:08:39               0.06               0.2         1
4  1100110071         1.01014e+11 2014-01-04 18:10:37               1.62               5.4         1
5  1100110071         1.01014e+11 2014-01-04 23:23:04               0.42               1.4         1
6  1100110071         1.01014e+11 2014-01-05 09:47:17               0.63               2.1         1
7  1100110071         1.01014e+11 2014-01-05 15:27:02               0.57               1.9         1
8  1100110071         1.01014e+11 2014-01-08 11:30:20               0.63               2.1         1
9  1100110071         1.01014e+11 2014-01-08 16:42:27               0.72               2.4         1
10 1100110071         1.01014e+11 2014-01-12 15:21:06               0.00               0.0         1

I have about 800 customer IDs, with a varying amount of information for each customer. I want to filter out the customers who have more than 10 months of information. My plan was to use dplyr to group by customer ID (Consumer.Account.Id), then count the number of unique months for each customer. From there I can easily filter out the customers that have >10 months of info.

I tried:

library(dplyr)
library(lubridate)  # month() is assumed to come from lubridate

df_sum <- mutate(df, TXT_Month = month(Transaction_Date)) %>%
   group_by(Consumer.Account.Id) %>%
   summarise(no_months = length(unique(TXT_Month)))

but get the error

"Error in eval(expr, envir, enclos) : 
  column 'Transaction_Date' has unsupported type : POSIXlt, POSIXt"

I have tried formatting Transaction_Date as.numeric and as.character, but get the same error. Any advice would be much appreciated!

score 0 · Answer 1 · answered Aug 19 '16 at 23:17

I tracked it down to the date format. I had parsed the date with strptime, which returns a POSIXlt object:

sg_data$Transaction_Date<-strptime(sg_data$Transaction.Date,"%d-%b-%Y %H:%M:%S")

Once I changed it to a POSIXct date, I could use

summarise("no_month"=length(unique(TXT_Month)))
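
For anyone who wants the whole pipeline in one place, here is a minimal sketch of the approach described above: parse the dates as POSIXct, count distinct months per customer, then filter on that count. The toy data frame `raw` and the names `months_per_customer`, `keep_ids` and `df_filtered` are made up for illustration. The format string matches the timestamps printed in the question, so swap in "%d-%b-%Y %H:%M:%S" if your raw column looks like the one in the strptime call. The month is extracted with base `format()` rather than `month()` just to keep the example down to one package.

```r
library(dplyr)

# Toy data (hypothetical): customer "A" has 11 distinct months, "B" has 2.
raw <- data.frame(
  Consumer.Account.Id = c(rep("A", 11), rep("B", 2)),
  Transaction.Date = c(
    sprintf("2014-%02d-01 12:00:00", 1:11),
    "2014-01-02 13:04:45", "2014-02-03 20:53:58"
  ),
  stringsAsFactors = FALSE
)

# as.POSIXct() gives an atomic date-time vector that dplyr can store in a column;
# strptime() on its own would return POSIXlt, the type named in the error.
raw$Transaction_Date <- as.POSIXct(
  raw$Transaction.Date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)

# Count distinct month numbers per customer.
months_per_customer <- raw %>%
  mutate(TXT_Month = as.integer(format(Transaction_Date, "%m"))) %>%
  group_by(Consumer.Account.Id) %>%
  summarise(no_months = n_distinct(TXT_Month))

# Customers with more than 10 months of data.
keep_ids <- months_per_customer %>%
  filter(no_months > 10) %>%
  pull(Consumer.Account.Id)

# Keep only those customers' rows; negate the %in% test to drop them instead.
df_filtered <- raw %>% filter(Consumer.Account.Id %in% keep_ids)
```

If the data span more than one year, counting on `format(Transaction_Date, "%Y-%m")` instead of the month number would keep January 2014 and January 2015 distinct.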

Ashley Thomas
