11

I was wondering if there is a way to create linestring from two points given in the same row in a dataframe in a new geometry column. In other words longitudes and latitudes of the two points are given in a dataframe like the following:

df <- data.frame(id = c("a", "b"), lon1 = c(1,2), lat1 = c(3,4), lon2 = c(5,6), lat2 = c(7,8))  

where lon1 and lat1 represent the coordinates of the first point and lon2 and lat2 are the coordinates of the second point. The desired dataframe would have two rows and two columns - the id column and a geometry column.

I tried with sf::st_linestring but seems this function only works with matrices.

Desired dataframe:

desired_df <- data.frame(id = c("a", "a", "b", "b"), lon = c(1,2,5,6), lat = c(3,4,7,8)) %>% st_as_sf(coords = c("lon", "lat"), dim = "XY") %>% st_set_crs(4236) %>% group_by(id) %>% summarise(geometry = st_union(geometry), do_union = FALSE) %>% st_cast("LINESTRING")
adl
  • 1,390
  • 16
  • 36

4 Answers4

10

Update - 30th Jan 2021

The issue with my original answer is it doesn't correctly set the bounding box.

Today I would use this approach using sfheaders and data.table

library(data.table)
library(sfheaders)

dt <- as.data.table(df)

## To use `sfheaders` the data needs to be in long form

dt1 <- dt[, .(id, lon = lon1, lat = lat1)]
dt2 <- dt[, .(id, lon = lon2, lat = lat2)]

## Add on a 'sequence' variable so we know which one comes first
dt1[, seq := 1L ]
dt2[, seq := 2L ]

## put back together
dt <- rbindlist(list(dt1, dt2), use.names = TRUE)
setorder(dt, id, seq)

sf <- sfheaders::sf_linestring(
  obj = dt
  , x = "lon"
  , y = "lat"
  , linestring_id = "id"
)

sf

# Simple feature collection with 2 features and 1 field
# geometry type:  LINESTRING
# dimension:      XY
# bbox:           xmin: 1 ymin: 3 xmax: 6 ymax: 8
# CRS:            NA
#   id              geometry
# 1  a LINESTRING (1 3, 5 7)
# 2  b LINESTRING (2 4, 6 8)


Original Answer

An alternative approach using data.table

require(data.table)

dt <- as.data.table(df)

sf <- dt[
    , {
        geometry <- sf::st_linestring(x = matrix(c(lon1, lon2, lat1, lat2), nrow = 2, ncol = 2))
        geometry <- sf::st_sfc(geometry)
        geometry <- sf::st_sf(geometry = geometry)
    }
    , by = id
]

sf::st_as_sf(sf)
# Simple feature collection with 2 features and 1 field
# geometry type:  LINESTRING
# dimension:      XY
# bbox:           xmin: 1 ymin: 3 xmax: 5 ymax: 7
# epsg (SRID):    NA
# proj4string:    NA
# id              geometry
# 1  a LINESTRING (1 3, 5 7)
# 2  b LINESTRING (2 4, 6 8)
SymbolixAU
  • 25,502
  • 4
  • 67
  • 139
8
df = data.frame(id = c("a", "b"), lon1 = c(1,2), lat1 = c(3,4), lon2 = c(5,6), lat2 = c(7,8))  
df
##   id lon1 lat1 lon2 lat2
## 1  a    1    3    5    7
## 2  b    2    4    6    8

Here is another way, going through WKT:

library(sf)
df$geom = sprintf("LINESTRING(%s %s, %s %s)", df$lon1, df$lat1, df$lon2, df$lat2)
df = st_as_sf(df, wkt = "geom")
df
## Simple feature collection with 2 features and 5 fields
## geometry type:  LINESTRING
## dimension:      XY
## bbox:           xmin: 1 ymin: 3 xmax: 6 ymax: 8
## CRS:            NA
##   id lon1 lat1 lon2 lat2                  geom
## 1  a    1    3    5    7 LINESTRING (1 3, 5 7)
## 2  b    2    4    6    8 LINESTRING (2 4, 6 8)
Michael Dorman
  • 950
  • 6
  • 8
  • 3
    Thanks for the WKT implementation. It is a bit less intuitive, but is very easy to read and takes FAR FAR FAR less code if you're starting with a df with start/end columns. – D. Woods Feb 15 '22 at 16:17
  • 1
    In all my sf spatial temporal travels, I somehow often encounter this problem, and this solution is the most elegant in my opinion, far less cumbersome than nesting, mapping, etc... – Xavier GB Oct 28 '22 at 00:54
7

We can loop through the rows, with pmap and apply the st_linestring on a matrix created

library(tidyverse)
library(sf)
out <- pmap(df[-1], ~
               c(...) %>%
                matrix(., , ncol=2, byrow = TRUE) %>% 
                st_linestring) %>%
          reduce(st_sfc) %>%
          mutate(df, geometry = .)

out$geometry
#Geometry set for 2 features 
#geometry type:  LINESTRING
#dimension:      XY
#bbox:           xmin: 1 ymin: 3 xmax: 6 ymax: 8
#epsg (SRID):    NA
#proj4string:    NA
#LINESTRING (1 3, 5 7)
#LINESTRING (2 4, 6 8)
akrun
  • 874,273
  • 37
  • 540
  • 662
  • is there a possibility the linestring column to be a geometry column instead of a list ? – adl Aug 19 '18 at 14:26
  • @adl Can you check if the change is okay – akrun Aug 19 '18 at 14:34
  • the result is still a list. Each row has a list in the lstring column instead of just the linestring – adl Aug 19 '18 at 14:37
  • @adl What about now – akrun Aug 19 '18 at 14:39
  • the format is ok, but the content inside the linestring is not correct. The desired dataframe should be like the following: df <- data.frame(id = c("a", "a", "b", "b"), lon = c(1,2,5,6), lat = c(3,4,7,8)) %>% st_as_sf(coords = c("lon", "lat"), dim = "XY") %>% st_set_crs(4236) %>% group_by(id) %>% summarise(geometry = st_union(geometry), do_union = FALSE) %>% st_cast("LINESTRING") – adl Aug 19 '18 at 14:41
  • 1
    @adl I think I missed the `st_sfc` earlier. – akrun Aug 19 '18 at 14:46
  • I'm trying to figure out the purrr-magic that has happend here. What does `c(...)`in `pmap(df[-1], ~c(...))` do exactly? Would you mind replacing the formula `~` notation with an anonymous function? – Ratnanil Nov 02 '18 at 08:45
  • In addition, `reduce(st_sfc)` ony works with two rows.. if the dataframe is longer than that, an error is returned (`Error in vapply(lst, class, rep(NA_character_, 3)) : values must be length 3, but FUN(X[[1]]) result is length 2`) – Ratnanil Nov 02 '18 at 09:08
  • @Ratnanil It is just to concatenate `c`. You could use anonymous function call, but part of the reason people use these functions are to make it clean, compact etc (that is also subjective). Regarding the error, you may have to check the behavior of `st_sfc` – akrun Nov 02 '18 at 15:38
3

This solution also uses purrr's pmap, obtaining the result in the desired format

library(tidyverse)
library(sf) 

df <- data.frame(id = c("a", "b"), lon1 = c(1,2), lat1 = c(3,4), lon2 = c(5,6), lat2 = c(7,8))  

make_line <- function(lon1, lat1, lon2, lat2) {
    st_linestring(matrix(c(lon1, lon2, lat1, lat2), 2, 2))
}

df %>%
    select(-id) %>% 
    pmap(make_line) %>% 
    st_as_sfc(crs = 4326) %>% 
    {tibble(id = df$id, geometry = .)} %>% 
    st_sf() 

Result:

Simple feature collection with 2 features and 1 field
geometry type:  LINESTRING
dimension:      XY
bbox:           xmin: 1 ymin: 3 xmax: 6 ymax: 8
epsg (SRID):    4326
proj4string:    +proj=longlat +datum=WGS84 +no_defs
# A tibble: 2 x 2
  id            geometry
  <fct> <LINESTRING [°]>
1 a           (1 3, 5 7)
2 b           (2 4, 6 8)
HAVB
  • 1,858
  • 1
  • 22
  • 37