0

I'm trying to build an HTTP API (using R/plumber). One feature I'd like is for the user to make a GET request and then return some data depending on the factors they'd like data for. However, querying the data requires me to build an odbc database connection and then submit the query, and return the results as a JSON object.

At the moment I have a function like:

#* @param username
#* @param password
#* @param factors
#* @get /data
function(username, password, factors){
  # build database connection
  # query data
  # return results
}

This would be using HTTPS, but I'm still worried this is unsafe (although the API would only be hosted on internal servers, so I think in general security is less of an issue). Is there a better way to do this? I don't have a better way to authenticate users at the moment - I need to pass in the user/password combination to build a database connection using a different R function and then I can access data from our database. I might be able to get a system database account and just store the username/password from a config file and then get the user's ID from a system environment variable, but this would require navigating my companies bureaucracy :(

Is what I'm doing OK or would you recommend going a different route?

r2evans
  • 141,215
  • 6
  • 77
  • 149
lmeninato
  • 462
  • 4
  • 12
  • Using internal RStudio Connect server for deployment btw. – lmeninato Aug 29 '20 at 22:03
  • (1) DB user/pass: I suggest using env-vars for the username and password, as suggested (among other ideas) at https://db.rstudio.com/best-practices/managing-credentials/. Once you `rsconnect::deployApp` your plumber app, go into the RSC interface and set the env-vars, and all is good. And certainly more secure (and flexible) than adding that stuff in your *code*. (2) For your user-authentication, I strongly discourage handling user/pass in the app: it will leak in logs and be somewhat sniffable. Use RSC's security (ldap, saml, pam) and require all users be authenticated. – r2evans Aug 29 '20 at 23:32
  • (3) Depending on your expected frequency of connections, you can choose to connect/query/disconnect with every call to the function, which might add a *little* overhead to each HTTPS connection; or you can use `pool` (with `odbc`) *outside* of your function, so that the connection pool is controlled and maintained in a persistent fashion. (4) You haven't talked about how `factors` is structured/controlled; be very careful with inadvertent or malicious sql-injection, make sure you are binding your user-provided values, not interpolating them into query strings. – r2evans Aug 29 '20 at 23:45
  • I'll try and go with getting a custom db account we can use to env-vars, but might not be possible. – lmeninato Aug 31 '20 at 15:26
  • You would benefit from having a "service" account (read-only) regardless of how you store the username and password. Since you're deploying your app to RSC, some of the secure password storage mechanisms cannot be used (e.g., windows credentials, macos keychain, linux keyring, and R's `keyring` package for all of them); the `config` package allows you to not put the user/pass in the source code, but it does put it in a text file, potentially readable. Env vars are the last option and are allegedly securely stored in RSC, so ... you're best option regardless of which db-user you are using. – r2evans Aug 31 '20 at 15:36

0 Answers0