I have a data frame of x-y coordinates and I would like to calculate distances between consecutive points. My issue is that I don't know how to calculate a new column using multiple points from another column.

I know how to calculate a new column from another column, as covered here: "Creating a new column to a data frame using a formula from another variable".

However, the distance formula requires two values from the same column to calculate a value (distance = sqrt((x2-x1)^2+(y2-y1)^2)). How do I specify x1 vs x2 and y1 vs y2 if I only have a single column for x and y? Also, will there be any issues for the first point since it doesn't have a point before it? (i.e. there will be an empty distance cell - is that an issue?)

I know I can do this in Excel fairly easily, but I have a lot of different datasets that need the same treatment so I'd like to automate it in R.

N. Anderson

1 Answer

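You can refer to the row above by offsetting the index vectors, so that rows 2 to n are paired with rows 1 to n-1. Note that the first row doesn't have a prior coordinate set, so its distance stays NA. FYI, if you have really large datasets you may want to use data.table and its shift function.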

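n <- 10

# Example data; dist starts as NA so the first row keeps NA
df2 <- data.frame(x = rnorm(n = n), y = rnorm(n = n), dist = as.numeric(NA))

# Rows 2..n minus rows 1..(n-1). The parentheses matter: 1:n-1 parses as (1:n)-1, i.e. 0:(n-1)
df2$dist[2:n] <- sqrt((df2$x[2:n] - df2$x[1:(n - 1)]) ^ 2 + (df2$y[2:n] - df2$y[1:(n - 1)]) ^ 2)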
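Since the question mentions many datasets, the same calculation could be wrapped in a small function. This is only a sketch: add_step_dist is a hypothetical helper name, it assumes the columns are called x and y, and it uses base R's diff() in place of explicit index offsets. The last part shows roughly how the data.table shift approach might look, and only runs if that package is installed.

```r
# Hypothetical helper: adds a `dist` column holding the distance from the previous row.
# Assumes the data frame has numeric columns named `x` and `y` and at least one row.
add_step_dist <- function(df) {
  # diff() returns v[i] - v[i-1]; prepend NA because the first point has no predecessor
  df$dist <- c(NA_real_, sqrt(diff(df$x)^2 + diff(df$y)^2))
  df
}

# Small made-up example with easy-to-check distances
pts <- data.frame(x = c(0, 3, 3), y = c(0, 4, 0))
pts <- add_step_dist(pts)
pts$dist  # NA 5 4

# The same calculation with data.table's shift(), which lags a column by one row
if (requireNamespace("data.table", quietly = TRUE)) {
  dt <- data.table::as.data.table(pts[, c("x", "y")])
  dt[, dist := sqrt((x - data.table::shift(x))^2 + (y - data.table::shift(y))^2)]
}
```

To process several datasets, something like lapply(list_of_dfs, add_step_dist) should work, where list_of_dfs is a list of data frames you have already read in.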
ddunn801