I am fairly new to R and am trying to get all the values in one dataframe column that correspond to each unique level in another. My dataframe, called df, has two columns: preds and group, where group contains 20 unique levels. I am trying to get all the values of preds for each individual level in group.

An example of the dataframe:

   preds           group
1  18       (0,6.49e+03]
2  20       (0,6.49e+04]
3  49       (0,6.49e+02]
4  49       (0,6.49e+03]
5  20       (0,6.49e+04]

My for loop to try to get this is as follows:

for (i in unique(levels(df$group))){
  results <- df$preds[df['group'] == i]
  print(i)
  print(results)}

This should print the preds for each unique level, like so:

(0,6.49e+03]
18, 49

(0,6.49e+04]
20, 20

(0,6.49e+02]
49

However, this just seems to print an empty vector every time. Can someone help me understand how to do this, and whether I am even attempting it the correct way at all?

Thanks

geds133

3 Answers

You can avoid loops using this approach:

#Data
df <- structure(list(preds = c(18L, 20L, 49L, 49L, 20L), group = c("(0,6.49e+03]", 
"(0,6.49e+04]", "(0,6.49e+02]", "(0,6.49e+03]", "(0,6.49e+04]"
)), class = "data.frame", row.names = c("1", "2", "3", "4", "5"
))

The code:

#Code
aggregate(preds~group,data=df,function(x) paste0(x,collapse = ', '))

The output:

         group  preds
1 (0,6.49e+02]     49
2 (0,6.49e+03] 18, 49
3 (0,6.49e+04] 20, 20
Duck
  • I think it may be a problem with my code or something but when I try and run this, it produces one result and crashes R studio? – geds133 Aug 26 '20 at 13:43
  • Try `result <- aggregate(preds~group,data=df,function(x) paste0(x,collapse = ', '))` it might be because of a large dataframe. Let me know if that works! – Duck Aug 26 '20 at 13:44
  • Again, this gives me a single group with no `preds` value and proceeds to crash RStudio. – geds133 Aug 26 '20 at 13:58
  • Could you please tell me what is the structure of your dataframe using `str(df)`? – Duck Aug 26 '20 at 13:59

Maybe you can try tapply

with(df,tapply(preds,group,c))

or split

with(df,split(preds,group))

which gives

$`(0,6.49e+02]`
[1] 49

$`(0,6.49e+03]`
[1] 18 49

$`(0,6.49e+04]`
[1] 20 20
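
To get output closer to the layout shown in the question, the list returned by split can be looped over by name. This is only a sketch using the sample data from the question; the variable name by_group is an assumption made for illustration, and group is assumed to be stored as character.

```r
# Sample data copied from the question; group is stored as character here
df <- data.frame(
  preds = c(18L, 20L, 49L, 49L, 20L),
  group = c("(0,6.49e+03]", "(0,6.49e+04]", "(0,6.49e+02]",
            "(0,6.49e+03]", "(0,6.49e+04]"),
  stringsAsFactors = FALSE
)

# split() returns a named list: one vector of preds per group
by_group <- split(df$preds, df$group)

# Print each group label followed by its comma-separated preds
for (g in names(by_group)) {
  cat(g, "\n", paste(by_group[[g]], collapse = ", "), "\n\n", sep = "")
}
```

Because by_group is an ordinary list, a single group can also be pulled out directly, for example by_group[["(0,6.49e+03]"]].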
ThomasIsCoding
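If you really want to use your for loop, here is an adapted version that gets at what you want.

#Data
preds <- c(18, 20, 49, 49, 20)
group <- c("a", "b", "c", "a", "b")
df <- data.frame(preds, group)
# Loop over the distinct group values. unique() on the column works whether
# group is character (the data.frame() default since R 4.0) or a factor,
# whereas levels() returns NULL for a character column.
for (g in unique(as.character(df$group))) {
  # Index preds with a logical vector built from the column itself
  value <- df$preds[df$group == g]
  print(paste(g, value))
}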
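
One possible reason a loop over levels() produces nothing useful is that the column is a character vector rather than a factor. This is only a guess, since the question does not show str(df). The sketch below uses hypothetical data with groups "a", "b", "c" to show the difference; the names no_levels, n_iter, fct_levels and results are assumptions made for illustration.

```r
# Hypothetical data: group is a plain character vector, not a factor
df <- data.frame(
  preds = c(18, 20, 49, 49, 20),
  group = c("a", "b", "c", "a", "b"),
  stringsAsFactors = FALSE
)

str(df)                          # shows whether group is chr or Factor

# levels() only reads the "levels" attribute, so it is NULL for character data
no_levels <- levels(df$group)

# Looping over NULL runs zero times, so the loop body never executes
n_iter <- 0
for (i in unique(no_levels)) n_iter <- n_iter + 1

# Converting the column to a factor gives it levels to loop over
df$group <- factor(df$group)
fct_levels <- levels(df$group)

# Collect the preds for each level in a named list
results <- list()
for (i in fct_levels) {
  results[[i]] <- df$preds[df$group == i]
}
print(results)
```

If str(df) already reports group as a Factor, this is not the cause, and comparing the printed levels against the values in the column would be the next thing to check.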
Tanner33