
I have two data frames, and I am trying to write a function that takes a data frame and a row name as arguments and returns the three highest values in that row (in descending order), together with the names of the columns those values come from.


    set.seed(0)
    df <- data.frame(A=c(3,2,1,4,5),B=c(1,6,3,8,4),C=c(2,1,4,8,9), D=c(4,1,2,4,6))
    row.names(df)<-c("R1","R2","R3","R4","R5")

    df2 <- data.frame(E=c(2,5,6,1,4),F=c(2,4,2,5,1),G=c(5,6,2,7,3),H=c(8,2,7,4,1))
    row.names(df2)<-c("R6","R7","R8","R9","R10")

    print(df)

       A B C D
    R1 3 1 2 4
    R2 2 6 1 1
    R3 1 3 4 2
    R4 4 8 8 4
    R5 5 4 9 6

    print(df2)

        E F G H
    R6  2 2 5 8
    R7  5 4 6 2
    R8  6 2 2 7
    R9  1 5 7 4
    R10 4 1 3 1

Here is an example of a result:

Let the function be maxthree. Now


    maxthree(df2, "R7")

    G E F
    6 5 4

Here is what I have done so far:


    maxthree <- function(data,row) {
      if(!row %in% rownames(data)) {
        print("Check value")
      } else { 
        max_col <- which.max(data[row,])
        print(max_col)
      }
    }

This function currently prints the position and column name of the maximum value in that row. However, I don't know how to add the second- and third-highest values to the function.

Joe

5 Answers

3
    maxthree = function(data, row) {
      data[row, order(unlist(data[row, ]), decreasing = TRUE)[1:3]]
    }

    maxthree(df2, "R7")
    #    G E F
    # R7 6 5 4

The result is a 1x3 data frame.
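
If a named vector is preferred over a data frame (as in the output shown in the question), one possible variant is sketched below. It assumes all columns are numeric. The name `maxthree_vec`, the `n` argument, and the `stop()` row check are illustrative additions, not part of the answer above.

```r
df2 <- data.frame(E = c(2, 5, 6, 1, 4), F = c(2, 4, 2, 5, 1),
                  G = c(5, 6, 2, 7, 3), H = c(8, 2, 7, 4, 1))
row.names(df2) <- c("R6", "R7", "R8", "R9", "R10")

# Hypothetical variant: returns a named vector instead of a 1x3 data frame
maxthree_vec <- function(data, row, n = 3) {
  # Fail early if the row name does not exist (assumed behaviour)
  if (!row %in% rownames(data)) stop("Row not found: ", row)
  v <- unlist(data[row, ])            # named vector; names are the column names
  head(sort(v, decreasing = TRUE), n) # n largest values, largest first
}

maxthree_vec(df2, "R7")
# G E F
# 6 5 4
```

Using `head()` rather than `[1:3]` avoids padding the result with `NA` when the data frame has fewer than three columns.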

Gregor Thomas
3

This should work great

    maxthree <- function(data, roww) {
      x <- data[roww, ]
      # unlist() turns the one-row data frame into a plain vector for order()
      x[order(unlist(x), decreasing = TRUE)][1:3]
    }

    > maxthree(df2, "R7")
       G E F
    R7 6 5 4
Daniel O
2

Try this:

    df <- data.frame(A=c(3,2,1,4,5),B=c(1,6,3,8,4),C=c(2,1,4,8,9), D=c(4,1,2,4,6))
    row.names(df)<-c("R1","R2","R3","R4","R5")

    df2 <- data.frame(E=c(2,5,6,1,4),F=c(2,4,2,5,1),G=c(5,6,2,7,3),H=c(8,2,7,4,1))
    row.names(df2)<-c("R6","R7","R8","R9","R10")

    maxthree <- function(data, row) {
      named_vec <- t(data)[, row]
      return(sort(named_vec, decreasing = TRUE)[1:3])
    }

    maxthree(df2, "R7")

    # G E F 
    # 6 5 4

This approach transposes the data frame with `t()` so that the row can be extracted as a named vector. `sort()` can then be used to order the values as desired.

Sef
  • Transposing the entire data frame is unnecessary and will slow this down on even moderately sized data. Extract the row first and then transpose, i.e. `t(data[row, ])` instead of `t(data)[, row]`, for a much more efficient solution. – Gregor Thomas May 12 '20 at 16:53
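
The row-first variant suggested in the comment could look roughly like the sketch below. The name `maxthree_row_first` is made up for illustration, and the code assumes numeric columns as in the question's data.

```r
df2 <- data.frame(E = c(2, 5, 6, 1, 4), F = c(2, 4, 2, 5, 1),
                  G = c(5, 6, 2, 7, 3), H = c(8, 2, 7, 4, 1))
row.names(df2) <- c("R6", "R7", "R8", "R9", "R10")

# Hypothetical name; transposes only the selected row, not the whole data frame
maxthree_row_first <- function(data, row) {
  # t() of a one-row data frame is a one-column matrix;
  # taking column 1 drops it to a vector named by the original columns
  named_vec <- t(data[row, ])[, 1]
  sort(named_vec, decreasing = TRUE)[1:3]
}

maxthree_row_first(df2, "R7")
# G E F
# 6 5 4
```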
2

You can use sort and [1:3] to get the first 3 elements like:

    maxthree <- function(data, row) {sort(data[row, ], TRUE)[1:3]}
    maxthree(df2, "R7")
    #   G E F
    #R7 6 5 4

If the row name should not be shown, you can add `unlist`:

    maxthree <- function(data, row) {head(unlist(sort(data[row, ], TRUE)), 3)}
    maxthree(df2, "R7")
    #G E F 
    #6 5 4 
GKi
1

You can use the order function.

    maxthree <- function(data, row_name) data[row_name, order(-data[row_name, ])][, 1:3]
    maxthree(df2, 'R7')
       G E F
    R7 6 5 4
user2474226
  • Using `-data[row_name, ]` instead of the `decreasing = TRUE` argument limits your function to only work with numeric data. `decreasing = TRUE` makes it a little more flexible just in case it's ever needed for, say, `character` data. – Gregor Thomas May 12 '20 at 16:56
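
To illustrate the point made in the comment, here is a small sketch using `decreasing = TRUE` on character data. The data frame `chr` and its values are invented for this example. Negating the row with `-` is not defined for character values, so the `order(-...)` form would not be expected to work here.

```r
# Made-up character data, purely for illustration
chr <- data.frame(A = "pear", B = "apple", C = "plum", D = "fig",
                  stringsAsFactors = FALSE)
row.names(chr) <- "R1"

maxthree <- function(data, row_name) {
  # order() with decreasing = TRUE also works on character vectors
  idx <- order(unlist(data[row_name, ]), decreasing = TRUE)[1:3]
  data[row_name, idx]
}

maxthree(chr, "R1")
#       C    A   D
# R1 plum pear fig
```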