I have obtained many different GPS traces and am trying to plot all the paths onto one map using ggmap. I am trying to produce something similar to the work done by Flowing Data, where that author used plotKML instead of ggmap (which I prefer). Unfortunately, after adding several hundred paths, my R script grinds to a halt, apparently because of excessive memory use. Can someone help out?

I am using R version 3.0.2 on MacOS.

Here is a fully functioning script. I've manually put in three example data frames, where each frame comes from one trace file of GPS coordinates. In reality, I have over 4000 such data frames.

library(ggmap)

printf <- function(...) cat(sprintf(...))

# ---------------------------------------------------------
# Create three example data frames
df1 <- structure(list(latitude = c(37.789165, 37.827332, 37.851501),
    longitude = c(-122.411667, -122.210167, -122.265831)), .Names = c("latitude",
    "longitude"), class = "data.frame", row.names = c(4634L, 5415L,
    30777L))

df2 <- structure(list(latitude = c(37.331951, 37.332291), longitude = c(-122.029579,
-122.028625)), .Names = c("latitude", "longitude"), class = "data.frame", row.names = c(18781L,
18787L))

df3 <- structure(list(latitude = c(37.789333, 37.780834, 37.622833,
37.7775), longitude = c(-122.408669, -122.423668, -122.330666,
-122.43383)), .Names = c("latitude", "longitude"), class = "data.frame", row.names = c(44107L,
44182L, 44237L, 44277L))

allDataFrames <- list(df1, df2, df3)
numDataFrames <- length(allDataFrames)
# ---------------------------------------------------------


# Get the background map
map <- get_googlemap(center=c(lon=-122.237, lat=37.753), maptype="roadmap", color="bw", zoom=9)
p <- ggmap(map)

dfCounter <- 0

# Loop over all the data frames.
for (df in allDataFrames)
{
    # Plot this data frame as a path (this adds one new layer per data frame).
    p <- p + geom_path(aes(x=longitude, y=latitude), data=df, colour="#ff0000", size=0.5)
    # Increment first so the progress line reads "1 / N" through "N / N".
    dfCounter <- dfCounter+1
    printf("%d / %d\n", dfCounter, numDataFrames)
}

print(p)

With the full data set, by the time the loop reaches its 300th iteration, the script grinds to a halt and makes no further progress. I suspect the culprit is this line:

p <- p + geom_path(aes(x=longitude, y=latitude), data=df, colour="#ff0000", size=0.5)

but I don't know how to improve upon that. I need to reach 4000 data frames, and it's already halted at 300.

The output of "top" on my Mac shows that RSIZE (physical memory) and VSIZE (virtual memory) climb sharply while the script runs. What is the problem?

Before running script:

PID    COMMAND      %CPU  TIME     #TH   #WQ  #POR #MREG RPRVT  RSHRD  RSIZE  VPRVT  VSIZE  PGRP  PPID  STATE    UID  FAULTS
42708  R            0.0   00:00.18 1     0    21   126   30M    244K   34M    47M    2440M  42708 448   sleeping 501  9464 

While running script, around iteration 150:

PID    COMMAND      %CPU  TIME     #TH   #WQ  #POR #MREG RPRVT  RSHRD  RSIZE  VPRVT  VSIZE  PGRP  PPID  STATE    UID  FAULTS
42708  R            100.1 00:45.87 1/1   0    22   544+  2002M+ 244K   2018M+ 2034M+ 4429M+ 42708 448   running  501  773737+
Gavin Simpson
stackoverflowuser2010

1 Answer

Create a single data frame, as is done on the Flowing Data site you link to above. That data frame should have three columns: index, latitude, and longitude. Latitude and longitude are self-explanatory. The index takes a unique value for each path, so a single geom_path layer can draw every path through the group aesthetic instead of adding one layer per trace. Again, this is already done in the code from the Flowing Data site. For the rest:

# Experiment with zoom and other ggmap options as you see fit.
# The center below is an arbitrary lat/long.

baseMap <- get_googlemap(center=c(lon = -80, lat = 38), maptype="roadmap", color="bw", zoom=11)

# routes is the combined data frame described above (index, latitude, longitude).
map <- ggmap(baseMap) +
    geom_path(aes(x=longitude, y=latitude, group=index), data=routes, alpha=0.2, color="red")

map
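
The code above assumes that routes already exists. As a minimal sketch of one way to build it from the list of per-trace data frames in the question, using base R only: the two small data frames below are stand-ins for the real traces, and the intermediate name tagged is just illustrative.

```r
# Stand-ins for the per-trace data frames from the question.
df1 <- data.frame(latitude = c(37.789165, 37.827332, 37.851501),
                  longitude = c(-122.411667, -122.210167, -122.265831))
df2 <- data.frame(latitude = c(37.331951, 37.332291),
                  longitude = c(-122.029579, -122.028625))
allDataFrames <- list(df1, df2)

# Tag every row of each trace with that trace's position in the list,
# so each path keeps its own group value after the traces are stacked.
tagged <- lapply(seq_along(allDataFrames), function(i) {
    cbind(index = i, allDataFrames[[i]])
})

# Stack all traces into one data frame: index, latitude, longitude.
routes <- do.call(rbind, tagged)
rownames(routes) <- NULL
```

With routes built this way, the single geom_path call with group=index replaces the whole loop, so the plot gets one layer rather than thousands. For several thousand traces, do.call(rbind, ...) should be adequate, though faster row-binding helpers exist in add-on packages if it becomes a bottleneck.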
forlooper