0

I've been playing around with database queries in R that are executed on a Postgres database with the PostGIS extension. This means I use some of the PostGIS functions that do not have an R equivalent. If it wasn't for that, I could probably just execute the same function on a local test dataframe instead of a database connection, but due to the PostGIS functions, that's not possible.

Is there a simple approach to create test data in a test database and run the query on that and assess the outcome? I have a WKB column which R does not directly support, so I'm not even sure a simple copy_to could work with inserting a character vector into a geometry column, not to speak of resolving potential key constraints. A local sqlite database does not work because it does not provide these functions.

Has someone found a viable solution to this problem?

telegott
  • 196
  • 1
  • 10

1 Answers1

0

It sounds like you can not collect tables from postgresql back into R, hence your comparison has to happen in sql.

I would do the following:

  1. define text strings to generate sql tables
  2. execute the strings to generate the tables
  3. run your code
  4. make your comparison

For doing a comparison in sql that two tables are identical I would follow the method in this question or this one.

This would look something like this:

# Define text strings
create_string = 'CREATE TABLE test1 (code VARCHAR(4), size INTEGER);'
insert_string = 'INSERT INTO test1 (code, size) VALUES ('AAA', 123);'

# Execute strings
db_con = create_connection()
dbExecute(db_con, create_string)
dbExecute(db_con, insert_string)

# optional validate new table with contents now exists in datbase

# run code
test1 = tbl(db_con, "test1")
test2 = my_function_to_test_that_does_nothing(test1)

# comparison
num_records_not_in_both = test1 %>%
    full_join(test2, by = colnames(test2), suffix = c("_1","_2") %>%
    filter(is.na(id_1) | is.na(id_2)) %>%
    ungroup() %>%
    summarise(num = n()) %>%
    collect()
require(num_records_not_in_both == 0)

# optional delete test functions
Simon.S.A.
  • 6,240
  • 7
  • 22
  • 41
  • Thanks for the answer! I'm not sure I understand you correctly, I *can* retrieve PostGIS data into R, however some of the wrangling needs to happen in SQL, and I want to check whether that wrangling does what it should somehow. – telegott Nov 08 '19 at 19:15
  • In that case, you still probably want steps 1-3 to write test data into sql and run the sql commands. But you can replace step 4 by collecting the data back into R and testing the output there. – Simon.S.A. Nov 09 '19 at 00:25