I have a tibble (I'm already using tidyverse), but since it's large, let's use a slightly modified iris dataset:

iris %>%
  as_tibble() %>%
  select(-5) 

I want to create another column that is the square root of the sum of the squares of these columns (which are all dbl, as you may notice). However, the method should generalize to any number of columns with any names, not only because the original dataset has more columns, but also because it's more elegant.

This code works, but is specific to these columns with these names:

iris %>%
  as_tibble() %>%
  select(-5) %>%
  mutate(linear = sqrt(Sepal.Length^2+Sepal.Width^2+Petal.Length^2+Petal.Width^2)) %>%
  arrange(linear)

This gives the expected output:

# A tibble: 150 x 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width linear
          <dbl>       <dbl>        <dbl>       <dbl>  <dbl>
 1          4.5         2.3          1.3         0.3   5.23
 2          4.3         3            1.1         0.1   5.36
 3          4.4         2.9          1.4         0.2   5.46
 4          4.4         3            1.3         0.2   5.49
 5          4.4         3.2          1.3         0.2   5.60
 6          4.6         3.1          1.5         0.2   5.75
 7          4.6         3.2          1.4         0.2   5.78
 8          4.8         3            1.4         0.1   5.83
 9          4.7         3.2          1.3         0.2   5.84
10          4.8         3            1.4         0.3   5.84
# … with 140 more rows
Érico Patto

3 Answers

2

Will this work for you?

iris %>%
  as_tibble() %>%
  select(-5) %>%
  mutate(linear = sqrt(rowSums(iris[-5]^2))) %>%
  arrange(linear)
# A tibble: 150 x 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width linear
          <dbl>       <dbl>        <dbl>       <dbl>  <dbl>
 1          4.5         2.3          1.3         0.3   5.23
 2          4.3         3            1.1         0.1   5.36
 3          4.4         2.9          1.4         0.2   5.46
 4          4.4         3            1.3         0.2   5.49
 5          4.4         3.2          1.3         0.2   5.60
 6          4.6         3.1          1.5         0.2   5.75
 7          4.6         3.2          1.4         0.2   5.78
 8          4.8         3            1.4         0.1   5.83
 9          4.7         3.2          1.3         0.2   5.84
10          4.8         3            1.4         0.3   5.84
# ... with 140 more rows
Karthik S
    Here I was cracking my brain with `purrr` functions and there was this super simple and elegant answer lurking just beyond my grasp... Yes, it works! I'll accept it as soon as StackOverflow lets me! – Érico Patto Apr 07 '21 at 16:01
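
One thing to keep in mind with the answer above: `iris[-5]` refers to the original data frame by name, not to whatever is flowing through the pipe. A minimal sketch of the same `rowSums()` idea that only looks at the piped columns, assuming dplyr 1.0.0 or later for `across()` (the name `result` is just for illustration):

```r
library(dplyr)

result <- iris %>%
  as_tibble() %>%
  select(-5) %>%
  # across() squares every column currently in the pipeline and returns
  # them as a tibble; rowSums() then adds the squares up row by row
  mutate(linear = sqrt(rowSums(across(everything(), ~ .x^2)))) %>%
  arrange(linear)

result
```

Because nothing inside `mutate()` names the dataset or its columns, this should carry over to a tibble with more columns or different names, as long as all of them are numeric.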
2

You can also use this solution:

library(dplyr)
library(purrr)
library(tidyr)  # needed for unnest()

iris %>%
  as_tibble() %>%
  select(-5) %>%
  mutate(linear = pmap(., ~ sqrt(sum(c(...) ^ 2)))) %>%
  unnest(linear) %>%
  arrange(linear)

# A tibble: 150 x 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width linear
          <dbl>       <dbl>        <dbl>       <dbl>  <dbl>
 1          4.5         2.3          1.3         0.3   5.23
 2          4.3         3            1.1         0.1   5.36
 3          4.4         2.9          1.4         0.2   5.46
 4          4.4         3            1.3         0.2   5.49
 5          4.4         3.2          1.3         0.2   5.60
 6          4.6         3.1          1.5         0.2   5.75
 7          4.6         3.2          1.4         0.2   5.78
 8          4.8         3            1.4         0.1   5.83
 9          4.7         3.2          1.3         0.2   5.84
10          4.8         3            1.4         0.3   5.84
# ... with 140 more rows

If you would like to know how you can apply purrr functions to problems of this sort, check this comprehensive answer by dear akrun.

Anoushiravan R
1

This is a fun question. I was hoping to find a map_dbl/pmap_dbl solution, though I ended up with this instead. Note the rowwise():

iris %>%
  as_tibble() %>%
  select(-5) %>%
  rowwise() %>% 
  mutate(linear = sqrt(sum(cur_data()^2))) %>%
  arrange(linear)

# A tibble: 150 x 5
# Rowwise: 
   Sepal.Length Sepal.Width Petal.Length Petal.Width linear
          <dbl>       <dbl>        <dbl>       <dbl>  <dbl>
 1          4.5         2.3          1.3         0.3   5.23
 2          4.3         3            1.1         0.1   5.36
 3          4.4         2.9          1.4         0.2   5.46
 4          4.4         3            1.3         0.2   5.49
M.Viking
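
For what it's worth, here is a rough sketch of the `pmap_dbl` variant mentioned above, next to a row-wise version that uses `c_across()`. As far as I know, `cur_data()` is deprecated in newer dplyr releases in favor of `pick()`, so `c_across(everything())` may be the safer choice there. The names `measurements`, `via_pmap` and `via_rowwise` are made up for this example.

```r
library(dplyr)
library(purrr)

measurements <- iris %>%
  as_tibble() %>%
  select(-5)

# pmap_dbl() hands each row's values to the lambda as arguments and
# returns a plain double vector, so no unnest() step is needed
via_pmap <- measurements %>%
  mutate(linear = pmap_dbl(., ~ sqrt(sum(c(...)^2)))) %>%
  arrange(linear)

# Row-wise variant: c_across() collects the current row's columns
# into a single vector before squaring and summing
via_rowwise <- measurements %>%
  rowwise() %>%
  mutate(linear = sqrt(sum(c_across(everything())^2))) %>%
  ungroup() %>%
  arrange(linear)
```

Both should give the same `linear` column as the hard-coded formula in the question. On large tibbles the row-by-row approaches tend to be slower than a vectorized `rowSums()`, so they are mostly useful when the per-row function can't be vectorized.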