I have a data frame such as:

x <- data.frame("Names"= c("name1","name2","name3"), "A" = c(0.1,0.1,0.8), "B" = c(0.3,0.4,0.3), "C" = c(0.05,0.9,0.05),"D" =c(0.6,0.1,0.3))

> x
  Names   A   B    C   D
1 name1 0.1 0.3 0.05 0.6
2 name2 0.1 0.4 0.90 0.1
3 name3 0.8 0.3 0.05 0.3

I would like to remove all rows where the maximum value across A, B, C and D is below 0.8, giving:

> x
  Names   A   B    C   D
2 name2 0.1 0.4 0.90 0.1
3 name3 0.8 0.3 0.05 0.3

The row name1 was removed because its maximum value was only 0.6.

I would then like to write a file that lists, for each remaining name, the column that holds the maximum value. In this example it would be:

name2 : C with value 0.9
name3 : A with value 0.8

Thank you for your help.

zx8754
bewolf

4 Answers

You can use pmax, i.e.

x[do.call(pmax, x[-1]) >= 0.8,]
#  Names   A   B    C   D
#2 name2 0.1 0.4 0.90 0.1
#3 name3 0.8 0.3 0.05 0.3
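
For the second part of the question, one possible sketch pairs the same pmax call with base R's max.col to look up which column holds each row's maximum. The names `keep`, `vals`, `best` and `labels` are illustrative and not part of the answer above, and `ties.method = "first"` is an assumption about how ties should be resolved.

```r
x <- data.frame(Names = c("name1", "name2", "name3"),
                A = c(0.1, 0.1, 0.8), B = c(0.3, 0.4, 0.3),
                C = c(0.05, 0.9, 0.05), D = c(0.6, 0.1, 0.3))

# Keep rows whose row-wise maximum is at least 0.8
keep <- x[do.call(pmax, x[-1]) >= 0.8, ]

# Row-wise maximum, and the index of the column that holds it
vals <- do.call(pmax, keep[-1])
best <- max.col(as.matrix(keep[-1]), ties.method = "first")

# Build one "name: column with value v" string per remaining row
labels <- sprintf("%s: %s with value %s",
                  as.character(keep$Names), names(keep)[-1][best], vals)
labels
```

Both pmax and max.col are vectorised over rows, so no explicit loop is needed.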
Sotos

To filter rows, you could do something like this using any:

df <- x[apply(x[, -1], 1, function(x) any(x >= 0.8)), ]
df
#  Names   A   B    C   D
#2 name2 0.1 0.4 0.90 0.1
#3 name3 0.8 0.3 0.05 0.3

As for your second question, I'm not sure what you're trying to do. If this is about generating a vector of "result" strings, you could do:

apply(df, 1, function(x) {
    idx <- which.max(x[-1])
    sprintf("%s: %s with value %s", x[1], colnames(df)[idx + 1], x[-1][idx]) })
#                         2                          3
#"name2: C with value 0.90"  "name3: A with value 0.8"

Or, if you prefer a data.frame, perhaps something like this:

ret <- data.frame(result = rep("", nrow(df)), stringsAsFactors = FALSE)
for (i in seq_len(nrow(df))) {
    vals <- unlist(df[i, -1])   # named numeric vector of this row's values
    idx <- which.max(vals)      # position of the row maximum
    ret$result[i] <- sprintf(
        "%s: %s with value %s",
        df[i, 1], names(vals)[idx], vals[idx])
}
ret
#                   result
#1 name2: C with value 0.9
#2 name3: A with value 0.8
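
Since the question asks for a file, the result strings could then be written out with writeLines. This is only a minimal sketch: it rebuilds a two-row `ret` by hand so that it runs on its own, and the output path (a temporary file here, stored in the hypothetical variable `out_file`) is an assumption, not something specified in the question.

```r
# Stand-in for the `ret` data frame built above
ret <- data.frame(result = c("name2: C with value 0.9",
                             "name3: A with value 0.8"),
                  stringsAsFactors = FALSE)

# Hypothetical output path; replace with the file you actually want
out_file <- tempfile(fileext = ".txt")

# Write one result string per line
writeLines(ret$result, out_file)

# Read the file back to check its contents
readLines(out_file)
```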
Maurits Evers

x[rowSums(x[-1] >= 0.8) != 0, ]

  Names   A   B    C   D
2 name2 0.1 0.4 0.90 0.1
3 name3 0.8 0.3 0.05 0.3
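
To see why this works, it may help to look at the intermediate pieces: `x[-1] >= 0.8` is a logical matrix, and rowSums counts the TRUE cells in each row, so a non-zero count means at least one value reached the threshold. The sketch below spells that out on the example data; the names `hits` and `n_hits` are illustrative only.

```r
x <- data.frame(Names = c("name1", "name2", "name3"),
                A = c(0.1, 0.1, 0.8), B = c(0.3, 0.4, 0.3),
                C = c(0.05, 0.9, 0.05), D = c(0.6, 0.1, 0.3))

hits <- x[-1] >= 0.8      # logical matrix, one cell per value
n_hits <- rowSums(hits)   # number of cells per row that reach 0.8
n_hits

# Rows with at least one qualifying cell are kept
x[n_hits != 0, ]
```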
s_baldur

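A data.table solution:

x <- data.table::data.table(x)
# Keep (and store) the rows whose row-wise maximum is at least 0.8
x <- x[pmax(A, B, C, D) >= 0.8]
# Per name: the column(s) holding the maximum, and that maximum
x[, paste(colnames(x)[1 + which(c(A, B, C, D) == max(A, B, C, D))], "with value", max(A, B, C, D)), by = Names]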
JCR