I have been delving into the tidyverse ecosystem (just a little bit) and have been wondering how to solve the following problem with it. More generally, I would like to know how to write custom functions inside transmute that operate on (loop over) rows.

My problem: I would like to compute a distance metric between the last row of my dataset and all other rows. Afterwards, I would like to add the vector with the distance metrics to my data.

Here is a minimal reproducible example of what I am trying to do in a non-tidyverse way:

data(iris)
mydata <- iris[, -5]

mydata$distance <- sapply(1:nrow(mydata), function(j){
  dist(rbind(mydata[nrow(mydata), ], mydata[j, ]))})

This works and gives me what I need.

However, my tidyverse attempt at a solution fails, and I have been banging my head against this problem. I would appreciate the help!

mydata <- select(iris, -Species)
mydata %>% transmute(function(x){
 for (i in 1:nrow(x)) {
    dist(rbind(x[i, ],x[nrow(x), ]))
 }
})

Thanks a lot!

iod
Jean_N

1 Answer

If you're looking to add the vector to your data, you'd want to use mutate rather than transmute. transmute returns only the columns you create in the call, while mutate adds the new column to the existing data.
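To make the difference concrete, here is a minimal sketch on the first three rows of iris. The object `small` and the column `ratio` are made up for illustration and are not part of the question.

```r
library(dplyr)

# Hypothetical three-row subset, just to keep the output small
small <- head(select(iris, -Species), 3)

# mutate() keeps the four existing columns and appends the new one
with_mutate <- mutate(small, ratio = Sepal.Length / Sepal.Width)
names(with_mutate)

# transmute() returns only the column created in the call
with_transmute <- transmute(small, ratio = Sepal.Length / Sepal.Width)
names(with_transmute)
```

Since the goal here is to keep the measurements and add a distance column alongside them, mutate is the better fit.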

Your original code works as well, but if you want to go full tidyverse you'd swap sapply for map_dbl (from purrr) and rbind for bind_rows:

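library(dplyr); library(purrr)
mydata <- select(iris, -Species)  # define mydata first so the mutate call below can refer to it
mydata <- mutate(mydata, distance = map_dbl(seq_len(nrow(mydata)),
  ~ as.numeric(dist(bind_rows(mydata[nrow(mydata), ], mydata[.x, ])))))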
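If the metric is the Euclidean distance (the default for dist), the row-by-row loop can probably be avoided altogether. The following is only a sketch under that assumption; `last_row` and `diffs` are helper names introduced here, and a different metric would need a different formula.

```r
library(dplyr)

mydata <- select(iris, -Species)

# Reference row (the last one) as a plain numeric vector
last_row <- unlist(mydata[nrow(mydata), ])

# Subtract the reference row from every row (sweep's default FUN is "-")
diffs <- sweep(as.matrix(mydata), 2, last_row)

# Row-wise Euclidean norm of the differences, added as a new column
mydata <- mutate(mydata, distance = unname(sqrt(rowSums(diffs^2))))
```

This computes all distances in one matrix operation instead of calling dist once per row, which should matter more as the number of rows grows.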
Jake Kaupp