2

I have a postgresql database connection and want to get a table from the database. Presumably it's good practice to keep the connection info in a different file? I have two files just now:

#getthetable.R
library(tidyverse)
library(dbplyr)


## connect to db
con <- src_postgres(dbname = "thedbname",
                    host = "blablabla.amazonaws.com",
                    port = NULL,
                    user = "myname",
                    password = "1234")

thetable <- tbl(con, "thetable") %>% select(id, apples, carrots) %>% collect

And then:

#main.R
library(tidyverse)

## get data from getthetable script with connection
source("rscripts/getthetable.R") 

This now makes both con and thetable variables available in main.R. I just want the variable thetable from getthetable.R. How do I do that? Leaving out con variable?

Also, is there a best practice here when working with db connections in r? Is my thinking logical? Are there drawbacks to what I'm doing or do most people just put the connection in together with the main scripts?

zx8754
  • 52,746
  • 12
  • 114
  • 209
Doug Fir
  • 19,971
  • 47
  • 169
  • 299
  • 1
    Objects in R belongs to *environments* and not to files. If you source two files from the same environment, every object defined there belongs to the environment (unless you explicitly put objects in other environments). You can remove any object with `rm` (and so also your `con` object) or define `con` inside a function. – nicola Jul 12 '17 at 04:43
  • Thanks for the response @nicola. Does my approach make sense to you? Is it logical to separate the connection from the data in my scripts or am I over worrying? Also, I'm not sure how to action your feedback... are you saying I should use rm() within the connections details script? Will that delete my desired tbl `thetable`? What's a conventional approach here? – Doug Fir Jul 12 '17 at 06:16

1 Answers1

3

I also like to capture such things (like connections) in a different file, but also in an designated environment like this:

ConnectionManager <- local({

  con <- src_postgres(dbname = "thedbname",
                      host = "blablabla.amazonaws.com",
                      port = NULL,
                      user = "myname",
                      password = "1234")



  collectTable <- function() {

    tbl(con, "thetable") %>% select(id, apples, carrots) %>% collect

  }

  list(collectTable = collectTable)


})

This way you have only one object ConnectionManager after sourcing the file and can get the table with ConnectionManager$collectTable(). Additionally you can easily extend it to fetch other tables or to include some connection utility functions.

AEF
  • 5,408
  • 1
  • 16
  • 30