
I would like to process rows of GPS data pairwise, comparing each row with the one that follows it.

For now, I am doing it in a normal for-loop but I'm sure there is a better and faster way.

library(sp)  # provides spDists()

n <- 100
testdata <- data.frame(speed = runif(n, 1, 10), heading = runif(n, 0, 360),
                       long = runif(n, 14, 16), lat = runif(n, 46, 49))
head(testdata)

diffmatrix <- as.data.frame(matrix(ncol = 3, nrow = nrow(testdata) - 1))
colnames(diffmatrix) <- c("distance", "heading_diff", "speed_diff")

# Compare each row with the one that follows it
for (i in 1:(nrow(testdata) - 1)) {
  # Distance between consecutive points, converted from km to m
  diffmatrix[i, 1] <- spDists(as.matrix(testdata[i:(i + 1), c("long", "lat")]),
                              longlat = TRUE, segments = TRUE) * 1000
  diffmatrix[i, 2] <- testdata$heading[i + 1] - testdata$heading[i]
  diffmatrix[i, 3] <- testdata$speed[i + 1] - testdata$speed[i]
}
head(diffmatrix)

How would I do that with an apply function?

Or is it even possible to do that calculation in parallel?

Thank you very much!

SeGa

1 Answer


I'm not sure how you want to handle the last row, but with dplyr you can do all of this without a for loop.

library(dplyr)
library(sp)  # provides spDists()

# Each %>% must end its line; a line that starts with %>% is a syntax error
testdata %>%
  mutate(heading_diff = c(diff(heading), 0),
         speed_diff = c(diff(speed), 0),
         longdiff = c(diff(long), 0),
         latdiff = c(diff(lat), 0)) %>%
  rowwise() %>%
  mutate(spdist = spDists(cbind(c(long, long + longdiff), c(lat, lat + latdiff)),
                          longlat = TRUE, segments = TRUE) * 1000) %>%
  select(heading_diff, speed_diff, distance = spdist)

#   heading_diff speed_diff distance
#          <dbl>      <dbl>    <dbl>
# 1         15.9      0.107   326496
# 2       -345       -4.64     55184
# 3        124       -1.16     25256
# 4         85.6      5.24    221885
# 5         53.1     -2.23     17599
# 6       -184        2.33    225746

I will explain each part below:

The pipe operator %>% is essentially a chain that sends the results from one operation into the next. So we start with your test data and send it to the mutate function.

Use mutate to create 4 new columns that hold the differences from one row to the next. A 0 is appended for the last row because no measurement follows the last data point (you could use NA instead).
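To make the padding step concrete, here is a minimal base R sketch. The `speed` vector is made-up toy data rather than the question's `testdata`, and the variable names are only illustrative.

```r
# Hypothetical toy vector standing in for one GPS column such as `speed`
speed <- c(3.0, 4.5, 4.0, 7.0)

# diff() returns successive differences, so the result is one element shorter
step <- diff(speed)

# Pad the last position so the result lines up with the original rows
padded_zero <- c(step, 0)   # the choice made in this answer
padded_na   <- c(step, NA)  # marks the last row as having no successor

print(padded_zero)
print(padded_na)
```

Padding with NA keeps the last row from looking like a genuine "no change" measurement, which may matter if the differences are summarised later.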

Next, once you have the differences, use rowwise so that spDists is applied to each row separately.

Last, we create another column with mutate that computes the distance from long and lat together with the longdiff and latdiff columns created earlier.

To get only the 3 columns that you were concerned with I used a select statement at the end. You can leave this out if you want the entire dataframe.
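If the row-by-row call is still too slow, one possible alternative is to compute all consecutive distances in a single vectorised call. The sketch below swaps spDists for a hand-written haversine formula on a spherical Earth (radius 6371 km is an assumption), so its values may differ slightly from what spDists returns. The names `haversine_m` and `gps` are hypothetical and not part of sp or dplyr.

```r
# Great-circle distance in metres between paired points, vectorised over rows.
# Assumption: spherical Earth with a mean radius of 6371 km.
haversine_m <- function(long1, lat1, long2, lat2, radius_m = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlong <- (long2 - long1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlong / 2)^2
  # pmin() guards against rounding pushing sqrt(a) just above 1
  2 * radius_m * asin(pmin(1, sqrt(a)))
}

# Toy data in the same coordinate ranges as the question
set.seed(1)
n <- 100
gps <- data.frame(long = runif(n, 14, 16), lat = runif(n, 46, 49))

# Rows 1..(n-1) paired with rows 2..n: one distance per consecutive pair
distance <- haversine_m(gps$long[-n], gps$lat[-n], gps$long[-1], gps$lat[-1])
head(distance)
```

Because every operation here works on whole vectors, no loop, rowwise step, or parallel backend is needed for data of this size.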

jasbner