2

I am trying to transfer data from the Thingspeak API into a postgres database. The API limits each request to 8000 observations, but I need to pull millions! I'm using R to iteratively pull from the API, do a bunch of wrangling, and then submit the results as a data.frame to my table within the db.

The current way I am doing this relies on the dbWriteTable() function from the RPostgres package. However, this method does not account for existing observations in the db. I have to manually DELETE FROM table_name before running the script or I'll end up writing duplicate observations each time I try to update the db. I'm still wasting time re-writing observations that I deleted, and the script takes ~2 days to complete because of this.

I would prefer a script that incorporates the functionality of postgres-9.5' ON CONLFICT DO NOTHING clause, so I don't have to waste time re-uploading observations that are already within the db. I've found the st_write() and st-read() functions from the sf packages to be useful for running SQL queries directly from R, but have hit a roadblock. Currently, I'm stuck trying to upload the 8000 observations within each df from R to my db. I am getting the following error:

Connecting to database:

# db, host, port, pw, and user are all objects in my R environment
con <- dbConnect(drv = RPostgres::Postgres()
             ,dbname = db
             ,host = host
             ,port = port
             ,password = pw
             ,user = user)

Current approach using RPostgres:

dbWriteTable(con
    ,"table_name" 
    ,df 
    ,append = TRUE
    ,row.names = FALSE)

New approach using sf:

st_write(conn = conn
      ,obj = df
      ,table = 'table_name'
      ,query = "INSERT INTO table_name ON CONFLICT DO NOTHING;"
      ,drop_table = FALSE
      ,try_drop = FALSE
      ,debug = TRUE)

Error message:

Error in UseMethod("st_write") : 
  no applicable method for 'st_write' applied to an object of class "c('tbl_df', 'tbl', 'data.frame')"

Edit:

Converting to strictly a dataframe, i.e. df <- as.data.frame(df) or attributes(df)$class <- "data.frame", resulted in a similar error message, only without the tbl_df or tbl classes.

Most recent approach with sf:

I'm making some progress with using st_write() by changing to the following:

# convert geom from WKT to feature class
df$geom <- st_as_sfc(df$geom)

# convert from data.frame to sf class
df <- st_as_sf(df)

# write sf object to db
st_write(dsn = con # changed from drv to dsn argument
    ,geom_name = "geom"
    ,table = "table_name"
    ,query = "INSERT INTO table_name ON CONFLICT DO NOTHING;"
    ,drop_table = FALSE
    ,try_drop = FALSE
    ,debug = TRUE
    )

New Error:

Error in result_create(conn@ptr, statement) : 
  Failed to fetch row: ERROR:  type "geometry" does not exist at character 345

I'm pretty sure that this is because I have not yet installed the PostGIS extension within my PostgreSQL database. If anyone could confirm I'd appreciate it! Installing PostGIS a pretty lengthy process, so I won't be able to provide an update for a few days. I'm hoping I've solved the problem with the st_write() function though!

philiporlando
  • 941
  • 4
  • 19
  • 31
  • 1
    Can you try by converting to `data.frame` i.e. `df <- as.data.frame(df)` – akrun May 31 '18 at 20:21
  • `attributes(df)$class <- "data.frame"` -might also work – Mike May 31 '18 at 20:22
  • My `df` is already of class `data.frame`. After paying closer attention to the docs, the `obj` argument must be fed a `sf` class object, not a `data.frame`. I'm away from my desktop at the moment, but I'll try this again soon. – philiporlando May 31 '18 at 21:02
  • perhaps try `st_is_valid(df)` to check whether there is a problem with the geometry – sebdalgarno Jun 01 '18 at 19:02

0 Answers0