0

I'm storing data as very long character strings in a text field in PostgreSQL, but I'm hitting a limit when I retrieve the data. The table is as follows:

CREATE TABLE test
(
  a integer,
  b text
)

I insert data using R and RODBC with unixodbc configured with MaxLongVarcharSize=256000. As running the code below shows, the data is inserted into the table correctly with no truncation, but extracting the data with sqlQuery truncates the data at 65534 characters.

library(RODBC)
pg <- odbcConnect("pgScarabParallel")

test <- data.frame(a = 1:3, 
    b = c(
        paste(rep("test", 10000), collapse = " "), 
        paste(rep("test", 15000), collapse = " "), 
        paste(rep("test", 20000), collapse = " ")
    )
)
test$b <- as.character(test$b)
nchar(test$b)
sqlSave(pg, test, append = TRUE, rownames = FALSE)

sqlQuery(pg, "SELECT LENGTH(b) FROM test")[[1]]

test2 <- sqlQuery(pg, "SELECT * FROM test", stringsAsFactors = FALSE)
nchar(test2$b)

The inserted fields are 49999, 74999, and 99999 characters long, but when I query them they are truncated to 49999, 65534, and 65534 respectively.

Is there any way to avoid the truncation? Is there an easy way to find out if this is caused by odbc or R?

MiG62
  • 198
  • 4
  • 10
  • 3
    A careful reading of the RODBC documentation (the associated vignette) pays off here. I know it's long, but there is a _ton_ of important information in there. Specifically: "There is a sanity check which will allow only strings of up to 65535 bytes when reading: this can be removed by recompiling RODBC." Of course, there is always RPostgreSQL which is likely to be perform better anyway. – joran Oct 31 '16 at 19:26
  • The limit you're hitting is very possibly a limit set in your ODBC driver. It can be changed. I've done this before myself. – Hack-R Oct 31 '16 at 19:47
  • @joran: Thanks. I hadn't spotted it. This is almost certainly the answer as the variables in the ODBC driver appear to be correctly configured. – MiG62 Oct 31 '16 at 20:16
  • @Hack-R: Yes, there is a configuration option in the ODBC driver that I have configured correctly, which makes it all the more puzzling to find that the limit is duplicated in compiled code. – MiG62 Oct 31 '16 at 20:32

0 Answers0