I'm using R to do a statistical analysis on a SQL Server 2008 R2 database. I connect through the JDBC driver, so I'm using the RJDBC package.

My query is pretty simple, and I'm sure it returns a lot of rows (about 2 million).

SELECT * FROM [maindb].[dbo].[users]

My R script is as follows.

library(RJDBC)

# Driver class name and path to the SQL Server JDBC JAR
javaPackageName <- "com.microsoft.sqlserver.jdbc.SQLServerDriver"
clientJarFile <- "/home/abforce/mystuff/sqljdbc_3.0/enu/sqljdbc4.jar"
driver <- JDBC(javaPackageName, clientJarFile)
conn <- dbConnect(driver, "jdbc:sqlserver://192.168.56.101", "username", "password")

# Send the query without fetching any rows, then ask whether it has completed
query <- "SELECT * FROM [maindb].[dbo].[users]"
result <- dbSendQuery(conn, query)
dbHasCompleted(result)

In the code above, the last line always returns TRUE. What could be wrong here?

frogatto

1 Answer

That dbHasCompleted always returns TRUE seems to be a known issue: I've found other places on the Internet where people were struggling with it.

So I came up with a workaround. Instead of calling dbHasCompleted, we can fetch the next chunk and check whether it is empty, i.e. test the condition nrow(chunk) == 0.

For example:

result <- dbSendQuery(conn, query)
repeat {
    chunk <- dbFetch(result, n = 10)
    if (nrow(chunk) == 0) {
        break
    }
    # Do something with 'chunk'
}
dbClearResult(result)
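
The same loop can be wrapped in a small reusable function. This is only a sketch: process_in_chunks, make_fake_cursor and the users data frame are hypothetical names made up for illustration, and the cursor here is an in-memory stand-in so the example runs without a database.

```r
# Hypothetical helper: keep fetching until an empty chunk comes back.
# fetch_chunk(n) must return a data frame with at most n rows.
process_in_chunks <- function(fetch_chunk, handle_chunk, chunk_size = 10000) {
  total <- 0
  repeat {
    chunk <- fetch_chunk(chunk_size)
    if (nrow(chunk) == 0) break   # an empty chunk means the result set is exhausted
    handle_chunk(chunk)
    total <- total + nrow(chunk)
  }
  total                           # number of rows processed
}

# Stand-in for a database cursor (assumption: illustration only).
# It hands out the rows of a data frame in order, then empty data frames.
make_fake_cursor <- function(df) {
  pos <- 0
  function(n) {
    if (pos >= nrow(df)) return(df[0, , drop = FALSE])
    idx <- seq(pos + 1, min(pos + n, nrow(df)))
    pos <<- max(idx)
    df[idx, , drop = FALSE]
  }
}

users <- data.frame(id = 1:25, name = paste0("user", 1:25))
rows_seen <- process_in_chunks(make_fake_cursor(users),
                               function(chunk) NULL,  # replace with real work
                               chunk_size = 10)
print(rows_seen)
```

With a real connection, the first argument would presumably be something like function(n) dbFetch(result, n = n), followed by dbClearResult(result) once the loop ends. If the installed RJDBC version does not provide dbFetch, DBI's older fetch(result, n) generic is the usual alternative.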
frogatto