
I am trying to make squares (polygons) from a data frame in R. To do that (according to this guide), I need a data frame in which each square is described by 4 paired points, given as lon-lat coordinates.

Using this example:

sample_df <- data.frame(id = c(1,2),
                        t = c('2020-01-01','2020-01-01'),
                        intensity = c(1.3,0.6),
                        x1 = c(113.75,114.00),
                        x2 = c(114.00,114.25),
                        y1 = c(8.75,8.75),
                        y2 = c(9.00,9.00))

id  t           intensity  x1      x2      y1    y2
1   2020-01-01  1.3        113.75  114.00  8.75  9.00
2   2020-01-01  0.6        114.00  114.25  8.75  9.00

What I would like is a data frame that keeps the id, t, and intensity columns and repeats each row four times, once for each corner of the square: (x1, y1), (x2, y1), (x2, y2), and (x1, y2), stored as lon and lat columns.

The expected output would be a data frame looking something like this:

id  t           intensity  lon     lat
1   2020-01-01  1.3        113.75  8.75
1   2020-01-01  1.3        114.00  8.75
1   2020-01-01  1.3        114.00  9.00
1   2020-01-01  1.3        113.75  9.00
2   2020-01-01  0.6        114.00  8.75
2   2020-01-01  0.6        114.25  8.75
2   2020-01-01  0.6        114.25  9.00
2   2020-01-01  0.6        114.00  9.00

I am currently stuck, but I have been playing around with the mutate() function of the dplyr package and melt() from reshape2.

I would be very grateful for your input.

2 Answers


I believe this takes two reshapes/pivots:

library(tidyr)
library(dplyr)
sample_df %>%
    pivot_longer(c("x1","x2"), names_to=NULL, values_to="lon") %>%
    pivot_longer(c("y1","y2"), names_to=NULL, values_to="lat")

## A tibble: 8 × 5
#     id t          intensity   lon   lat
#  <dbl> <chr>          <dbl> <dbl> <dbl>
#1     1 2020-01-01       1.3  114.  8.75
#2     1 2020-01-01       1.3  114.  9   
#3     1 2020-01-01       1.3  114   8.75
#4     1 2020-01-01       1.3  114   9   
#5     2 2020-01-01       0.6  114   8.75
#6     2 2020-01-01       0.6  114   9   
#7     2 2020-01-01       0.6  114.  8.75
#8     2 2020-01-01       0.6  114.  9   

The lon variable is right. The print method for tibbles is really odd, however: by default it shows only three significant digits, so 113.75 and 114.25 both appear as 114. (the trailing dot marks hidden decimals), while 114.0 appears as 114.
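To check that the stored values are intact, here is a minimal sketch of two ways to see the full numbers. It reuses the `sample_df` from the question; the name `res` is only introduced here for illustration, and the `pillar.sigfig` value of 5 is an arbitrary choice that happens to be enough for these coordinates.

```r
library(tidyr)
library(dplyr)

# Same input as in the question
sample_df <- data.frame(id = c(1, 2),
                        t = c("2020-01-01", "2020-01-01"),
                        intensity = c(1.3, 0.6),
                        x1 = c(113.75, 114.00),
                        x2 = c(114.00, 114.25),
                        y1 = c(8.75, 8.75),
                        y2 = c(9.00, 9.00))

# 'res' is a hypothetical name for the pivoted result
res <- sample_df %>%
    pivot_longer(c("x1", "x2"), names_to = NULL, values_to = "lon") %>%
    pivot_longer(c("y1", "y2"), names_to = NULL, values_to = "lat")

# Option 1: convert to a base data.frame, whose print method shows the decimals
print(as.data.frame(res))

# Option 2: ask the tibble printer (pillar) for more significant digits
options(pillar.sigfig = 5)
print(res)
```

Either way the underlying numbers are unchanged; only the display differs.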

thelatemail
  • The 'erroneous' values in the "lon" column were driving me crazy, I'd assumed I'd done something wrong in my attempts. Your point regarding the _tibbles_ print output is appreciated. – L Tyrone May 05 '23 at 05:16
  • Thank you for this solution @thelatemail, I've tried it in R and it works well with the `sf` package in creating polygons. For my case, the `tibble` displays the 114 and 114. but if turned into a `data.frame`, it correctly displays 114 and 113.75. Again, thank you – cinnamoroll May 05 '23 at 05:57

I would unnest a nested tibble (which is generally a phenomenal approach for generating combinations):

library(tidyr)
x1 = c(113.75,114.00)
x2 = c(114.00,114.25)
y1 = c(8.75,8.75)
y2 = c(9.00,9.00)
tibble(
  id = c(1,2),
  t = c('2020-01-01','2020-01-01'),
  intensity = c(1.3,0.6),
  long = list(c(x1, x2)),
  lat = list(c(y1, y2))
) %>% 
  unnest(c(long, lat))
#> # A tibble: 8 × 5
#>      id t          intensity  long   lat
#>   <dbl> <chr>          <dbl> <dbl> <dbl>
#> 1     1 2020-01-01       1.3  114.  8.75
#> 2     1 2020-01-01       1.3  114   8.75
#> 3     1 2020-01-01       1.3  114   9   
#> 4     1 2020-01-01       1.3  114.  9   
#> 5     2 2020-01-01       0.6  114.  8.75
#> 6     2 2020-01-01       0.6  114   8.75
#> 7     2 2020-01-01       0.6  114   9   
#> 8     2 2020-01-01       0.6  114.  9

Created on 2023-05-04 with reprex v2.0.2
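Note that in the snippet above the single list element is recycled, so both ids receive the same four values. As a hedged sketch of how the list-columns could instead be built per row from the question's `sample_df`, one option is `rowwise()` followed by `unnest()`. The name `corners` and the corner ordering (x1, x2, x2, x1 paired with y1, y1, y2, y2) are assumptions taken from the expected output in the question, not part of the original reprex.

```r
library(dplyr)
library(tidyr)

# Same input as in the question
sample_df <- data.frame(id = c(1, 2),
                        t = c("2020-01-01", "2020-01-01"),
                        intensity = c(1.3, 0.6),
                        x1 = c(113.75, 114.00),
                        x2 = c(114.00, 114.25),
                        y1 = c(8.75, 8.75),
                        y2 = c(9.00, 9.00))

# 'corners' is a hypothetical name for the result
corners <- sample_df %>%
  rowwise() %>%
  # One length-4 vector per row, walking around the square:
  # (x1, y1) -> (x2, y1) -> (x2, y2) -> (x1, y2)
  mutate(lon = list(c(x1, x2, x2, x1)),
         lat = list(c(y1, y1, y2, y2))) %>%
  ungroup() %>%
  select(-x1, -x2, -y1, -y2) %>%
  # Expand both list-columns in parallel: 4 rows per id
  unnest(c(lon, lat))

# Print as a base data.frame so the decimals are visible
as.data.frame(corners)
```

Because the corners are listed in ring order, the rows should line up with the expected table in the question rather than with the all-combinations order that two successive pivots give.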

Baraliuh
  • But the `sample_df` doesn't match your input data? – thelatemail May 05 '23 at 04:03
  • @thelatemail That sounds like an XY-problem. – Baraliuh May 05 '23 at 04:03
  • The `sample_df` is provided in the question. It isn't nested in the example, so I'm not sure how it helps to skip half the manipulation and use a function that wouldn't otherwise work. – thelatemail May 05 '23 at 04:09
  • I'm confused, you give me the data, I'll generalize. Again, XY-problem. The problem: how to get combinations of pairs in groups (Y), by pivot/reshape data to get the desired structure (X). Any appropriate `nested` `tibble` can be made from vectors of the columns. – Baraliuh May 05 '23 at 04:12
  • @Baraliuh What if OP's data is imported from somewhere else? How can the OP modify his/her original data so that your code can be used directly? It'd be better if you can demonstrate how to transform four columns (`x1` to `y2`) into two list-columns (your `long` and `lat`) as the input. – benson23 May 05 '23 at 04:27