Assume I have a data frame, df, with this data:

group wk source revenue
1     1  C      100
1     1  D      200
1     1  A      300
1     1  B      400
1     2  C      500
1     2  D      600

I'm trying to programmatically filter down to the rows for each unique combination of group, wk and source, perform some operations on them, and then combine the results back into another data frame. I want to write a function that scales to any number of segments (not just the example scenario here). All I should need to pass is the names of the columns by which I want to segment,

e.g. seg <- c("group", "wk", "source")

One unique combination to filter rows in df would be
df %>% filter(group == 1 & wk == 1 & source == "A")

I wrote a recursive function (get_rows) to do so, but it doesn't seem to do what I want. Could anyone point out where I'm going wrong?

library(dplyr)

filter_row <- function(df,x)
{
 df %>% filter(group == x$group & wk == x$wk & source == x$source)
}

seg <- c("group", "wk", "source")

get_rows <- function(df,seg,pos = 1, l = list())
{
  while(pos <= (length(seg) + 1))
  {
      if(pos <= length(seg))
        for(j in 1:length(unique(df[,seg[pos]])))
        {
          k <- unique(df[,seg[pos]])
          l[seg[pos]] <- k[j]
          get_rows(df,seg,pos+1,l)
          return()
        }
      if(pos > length(seg))
      {
        tmp <- df %>% filter_row(l)
        <call some function on tmp>
        return()
      }
  }
}

get_rows(df,seg)

EDIT: I understand there are prebuilt methods I can use to get what I need, but I'm curious about where I'm going wrong in the recursive function I wrote.

Karthik g
  • If you want unique combinations, try [this](http://stackoverflow.com/questions/8363278/how-to-filter-for-unique-combination-of-columns-from-an-r-dataframe). If you want to operate on unique subsections, try [this](http://www.magesblog.com/2012/01/say-it-in-r-with-by-apply-and-friends.html), in particular data.table. – bwarren2 Jun 25 '15 at 17:09
  • Thanks! That's super helpful for this context! I'm still curious, though, about where I'm going wrong in my recursive function. – Karthik g Jun 25 '15 at 17:31
  • What makes you think it is recursive? The while loop? – Molx Jun 25 '15 at 19:18
  • `split( df, interaction( df[ , c("group", "wk", "source")] ) )` – IRTFM Jun 25 '15 at 21:46

1 Answer

There might be a data.table/dplyr solution out there, but this one is pretty simple.

# Paste together the values of the columns you want to group by.
# This returns a character vector of keys, one per row; by() treats it as a factor.
f <- function(data, v) apply(data[, v, drop = FALSE], 1, paste, collapse = ".")

# aggregate, tapply, ave, and a few more functions can do similar things
by(data = df,                                   # Your data here
   INDICES = f(df, c("group", "wk", "source")), # Your data and columns here
   FUN = identity, simplify = FALSE)            # Your function here
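If the goal is a single function that takes the segment column names and returns one combined data frame, the same idea can be sketched with split(), as IRTFM's comment suggests. This is only a minimal sketch: apply_by_segments and the per-group operation (total revenue) are hypothetical names and choices made for illustration, and the example data is retyped from the question.

```r
# Hypothetical helper, not part of the original answer.
apply_by_segments <- function(df, seg, fun) {
  # df[seg] is a list of grouping columns; split() crosses them via interaction().
  # drop = TRUE discards combinations that never occur in the data.
  pieces <- split(df, df[seg], drop = TRUE)
  # Apply fun to each piece and stack the results into one data frame.
  do.call(rbind, lapply(pieces, fun))
}

# Example data retyped from the question.
df <- data.frame(
  group = c(1, 1, 1, 1, 1, 1),
  wk = c(1, 1, 1, 1, 2, 2),
  source = c("C", "D", "A", "B", "C", "D"),
  revenue = c(100, 200, 300, 400, 500, 600)
)
seg <- c("group", "wk", "source")

# Example operation (an assumption for illustration): total revenue per combination.
totals <- apply_by_segments(df, seg, function(d) {
  data.frame(d[1, seg, drop = FALSE], total = sum(d$revenue))
})
print(totals)
```

Because the grouping columns are passed as a character vector, the same call should work for any number of segment columns without changing the helper.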

You can also use library(dplyr) and library(data.table):

# your_function is a placeholder; it receives the current group's rows as `.` and must return a data frame
df %>% data.table %>% group_by(group, wk, source) %>% do(your_function(.))
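As for the recursive function in the question, a few things in it appear to prevent it from visiting every combination: the return() inside the for loop exits on the first iteration, so only the first unique value at each level is tried; pos never changes inside the while loop, so the loop adds nothing; the result of the recursive call is discarded; and filter_row hard-codes the three column names. One possible rewrite is sketched below. It assumes no missing values in the segment columns, and the fun argument is a hypothetical stand-in for "call some function on tmp".

```r
# Hypothetical rewrite of get_rows(); fun stands in for the per-group operation.
get_rows <- function(df, seg, fun, pos = 1) {
  # Base case: every segment column is fixed, so df holds exactly one combination.
  if (pos > length(seg)) {
    return(list(fun(df)))
  }
  col <- seg[pos]
  pieces <- list()
  # Narrow df by each value of the current column, then recurse on the next column.
  for (value in unique(df[[col]])) {
    subset_df <- df[df[[col]] == value, , drop = FALSE]
    pieces <- c(pieces, get_rows(subset_df, seg, fun, pos + 1))
  }
  pieces
}

# Example data retyped from the question.
df <- data.frame(
  group = c(1, 1, 1, 1, 1, 1),
  wk = c(1, 1, 1, 1, 2, 2),
  source = c("C", "D", "A", "B", "C", "D"),
  revenue = c(100, 200, 300, 400, 500, 600)
)
seg <- c("group", "wk", "source")

pieces <- get_rows(df, seg, fun = identity)  # one list element per combination
result <- do.call(rbind, pieces)             # combine back into one data frame
```

Filtering df before recursing, rather than looping over the unique values of the full data frame, means only combinations that actually occur are visited, so fun is never called on an empty subset.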
Vlo