0

I'm trying to make my analysis workflow more efficient connecting my SQL Server Database to R-Studio.

What I do now?

  1. I develop my queries using MS SQL Management Studio to extract relevant information for the analysis.
  2. I export the query result to a .csv UTF-8 file.
  3. I read the file in R-Studio and perform analyses.
  4. I export the final dataframe to a .csv UTF-8 file.
  5. I use this file to create a table back in SQL Management Studio.

What am I trying to do?

I want to create a connection between SQL Server and R-Studio so I can directly import/export my dataframe/table without having these files in my computer.

To do that I have followed these steps:

  1. Installed DBI and odbc packages in R.
  2. Configured the connection using "ODBC Data Source Connection Administrator"
    • My connection name is "Enterprise" (SQL Server Native Client 10.0)
  3. In R-Studio, using the Connection tab, I can seen a list of data sources containing "Enterprise".
    • Choosing "Enterprise" data source create an R object with the name "con" that contains the connection information.
    • At this point I can see all databases in the connection tab.
  4. My code for testing in R is the following:
---
title: "R Notebook"
output: html_notebook
---
```{r setup, include=FALSE}
library(odbc)
con <- dbConnect(odbc::odbc(), "Enterprise", timeout = 10) ```

```{sql, connection = con, output.var = "DataFrame"}
SELECT TOP 1000 *
FROM [DB].[dbo].[Table] ```

Data<-DataFrame[1:100,1:10]
head(Data,11)```

Var1    Var2    Var3    Var4    Var5    Var6    Var7    Var8    Var9    Var10
   A       1    2.4      5.5    13.1    30.8    72.4    170.5   401.6   945.7
   B       2    4.7     11.1    26.1    61.5    144.8   341.1   803.2   1891.5
   C       3    7.1     16.6    39.2    92.3    217.3   511.6   1204.8  2837.2
   D       4    9.4     22.2    52.2    123.0   289.7   682.2   1606.4  3782.9
   E       5    11.8    27.7    65.3    153.8   362.1   852.7   2008.0  4728.6
   F       6    14.1    33.3    78.4    184.5   434.5   1023.2  2409.6  5674.4
   G       7    16.5    38.8    91.4    215.3   506.9   1193.8  2811.2  6620.1
   H       8    18.8    44.4    104.5   246.0   579.4   1364.3  3212.8  7565.8
   I       9    21.2    49.9    117.5   276.8   651.8   1534.9  3614.4  8511.5
   J       10   23.5    55.5    130.6   307.5   724.2   1705.4  4016.0  9457.3
   K       11   25.9    61.0    143.6   338.3   796.6   1875.9  4417.6  10403.0
```{sql, connection = con}
SELECT * INTO [DB].[dbo].[New_Table]
FROM ?Data```

Here I'm using the ? to indicate that "Data" is an R dataframe object in the SQL Chunk.

Error in vapply(values, function(x) { : values must be length 1,
 but FUN(X[[1]]) result is length 100
Failed to execute SQL chunk

My guess is I'm doing something wrong in the last chunk.

What will be the proper way to push "Data" to my data base?

Dale K
  • 25,246
  • 15
  • 42
  • 71
Regressionist
  • 53
  • 1
  • 7
  • You are trying to run a SQL query against local data and *not in the SQL server*. How would you insert data into this table on the console? The only ways I know about (short of the literal SQL code `insert (col1,col2,...) values (...)`) are `DBI::dbWriteTable`, `DBI::dbAppendTable`, and (if your data is somewhere on the server) `insert into tablename select ...`. The latter won't work for you since your data is not in a table the server can see. – r2evans Sep 27 '19 at 17:39
  • So, perhaps to answer your question, replace your inserting code chunk with an R-code chunk including `DBI::dbInsertTable(con, "New_Table", Data)`. – r2evans Sep 27 '19 at 17:40
  • (And just to be certain ... unless I'm mistaken, `select ... into ...` will only work if the table does not yet exist; it will not append and will error if the table already exists. Is that your intent?) – r2evans Sep 27 '19 at 17:42
  • @r2evans This is making perfect sense to me. I have changed my chunk for the following: `dbWriteTable(con, "[DB].[dbo].[New_Table]", Data)` I tried with `dbInsertTable` but It was not recognized. The issue now is I'm getting the following error: `nanodbc/nanodbc.cpp:1587: 42000: [Microsoft][SQL Server Native Client 10.0][SQL Server]CREATE TABLE permission denied in database 'master'.` because is trying to copy the table into the master and is ignoring the path to the DB I have the permission. `"[DB].[dbo].[New_Table]"` – Regressionist Sep 27 '19 at 18:33
  • @r2evans This is the SQL interpretation of the last function: `> dbWriteTable(con, "[DB].[dbo].[New_Table]", Data) Error: 'CREATE TABLE "[DB].[dbo].[New_Table]" ( "Var1" varchar(255), "Var2" FLOAT, "Var3" FLOAT, "Var4" FLOAT, "Var5" FLOAT )` – Regressionist Sep 27 '19 at 18:46
  • `dbInsertTable` was a typo, glad you saw through that. You need to figure out the "permissions" issue on your own, and perhaps that discussion will guide you into using `create=` and perhaps even `DBI::dbCreateTable(..., temporary=...)` before inserting the data. – r2evans Sep 27 '19 at 21:24

1 Answers1

2

How I solved it?

I was on the right track until I got to the part of pushing my dataframe to the DB. @r2evans pointed out that I was trying to create a table from the server with information that only existed locally with this code:

```{sql, connection = con}
SELECT * INTO [DB].[dbo].[New_Table]
FROM ?Data```

At this point, I changed the method and tried to pushed the dataframe using r-code:

```{r}
dbWriteTable(con, "[DB].[dbo].[New_Table]", Data)```

The result was the following error:

nanodbc/nanodbc.cpp:1587: 42000: [Microsoft][SQL Server Native Client 10.0][SQL Server]CREATE TABLE permission denied in database 'master'.

The problem here: I was trying to access to the database I have the permission using "[DB].[dbo].[New_Table]". The dbWriteTable function understand this parameter just as a pure name for the table, so I had to find a way to specify the right path to the database.

I did not find the way to do it using code directly. Solution: I went to the "ODBC Data Sources (32-bit)" (Windows APP); and changed the configuration of my connection. The connection I had, was taking the master as the default database. I changed it for the database I have the permission. So now I'm using the following code to push the table from R-notebooks:

```{r}
dbWriteTable(con, "[R_Test2]", Data, overwrite = T)```

It's working perfectly for what I need.

If someone know how to specify this path using code, will be good to know. Maybe in the first configuration chunk:

```{r setup, include=FALSE}
library(odbc)
library(DBI)
con <- dbConnect(odbc::odbc(), "Enterprise", timeout = 10)```
Regressionist
  • 53
  • 1
  • 7