
ISSUE

Cannot insert NA values from a data.frame into a DB table

STEPS

  1. Read a table from SQL Server into an R data.frame. The table is decimal with some NULLs; the data.frame is numeric with some NAs.

  2. dbWriteTable throws the following error:

    Error in .local(conn, statement, ...) : execute JDBC update query failed in dbSendUpdate (The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 57 (""): The supplied value is not a valid instance of data type float. Check the source data for invalid values. An example of an invalid value is data of numeric type with scale greater than precision.)

  3. Overwrite the NAs with 0s, e.g. `dataset$column[is.na(dataset$column)] <- 0`

  4. dbWriteTable then writes to the DB successfully (a minimal sketch of the whole workflow follows below).
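A minimal sketch of the workflow above, assuming the Microsoft JDBC driver; the driver path, connection string, credentials, table, and column names are placeholders to adjust to your setup:

library(DBI)
library(RJDBC)

# Placeholder driver path, connection string, and credentials
drv  <- JDBC("com.microsoft.sqlserver.jdbc.SQLServerDriver", "/path/to/mssql-jdbc.jar")
conn <- dbConnect(drv, "jdbc:sqlserver://myserver;databaseName=mydb", "user", "password")

# 1. Read the table; DECIMAL NULLs arrive as NA in the numeric column
dataset <- dbGetQuery(conn, "SELECT * FROM dbo.source_table")

# 2. Writing the data.frame back fails while it still contains NA
# dbWriteTable(conn, "target_table", dataset)   # -> "not a valid instance of data type float"

# 3. Workaround: overwrite the NAs with 0s
dataset$column[is.na(dataset$column)] <- 0

# 4. Now the write succeeds
dbWriteTable(conn, "target_table", dataset)

dbDisconnect(conn)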

R Details

> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server 7.5 (Maipo)

RJDBC_0.2-7.1       rJava_0.9-10       DBI_1.0.0 
  • Has the underlying table been configured to allow nulls? – critical_error Jul 30 '18 at 17:11
  • This is an interesting error and if you can set up a reproducible example, post the issue on the [rjdbc git page](https://github.com/s-u/RJDBC). If you can download the [SQL Server ODBC driver](https://www.microsoft.com/en-us/download/details.aspx?id=53339) and use the `odbc` package (also uses DBI), it would be interesting to know if you can recreate the `dbWriteTable()` error with NAs. – Parfait Jul 30 '18 at 18:29
  • @Parfait: Just forgot to mention that if I use `RODBC` with `sqlSave`, everything works smoothly but a lot slower! Anyway, thanks for pointing out the git page; there is a fellow there with the same issue as mine [LINK](https://github.com/s-u/RJDBC/issues/63). Reverting to an older version of the package did the trick (see the sketch after these comments), but execution times are now worse: `0.2-7` 0.97'' vs `0.2-5` 39.8''. I think bulk insert was introduced with `-7`. – gmeroni Jul 31 '18 at 06:53
  • Do note: `odbc` is not the same as `RODBC`. However, both use ODBC drivers. You may have better luck with that DBI-compliant library. – Parfait Jul 31 '18 at 11:34
  • @Parfait I even tried `odbc`, but the result is the same from a performance perspective. Thanks a lot for the support. – gmeroni Jul 31 '18 at 12:29
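For reference, a sketch of how the package could be pinned back to the older release mentioned in the comments above, assuming the `remotes` package is installed and that `0.2-5` is still available in the CRAN archive:

# Install the older RJDBC release from the CRAN archive
remotes::install_version("RJDBC", version = "0.2-5")

# Restart R (or unload/reload the package), then confirm the version in use
library(RJDBC)
packageVersion("RJDBC")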

1 Answer


The problem occurs because R has three (maybe more) different values for data that is not concretely defined: "Inf", "NaN" and "NA". dbWriteTable can handle "NA" by converting it to "NULL" when moving the data to SQL. However, "NaN" and "Inf" are not recognized, which produces the "Check the source data for invalid values" error. The fix is as follows:

Suppose this is the table you want to write to SQL:

Tablename: "df"
USER quality
1    Inf
2    NaN
3    0.3

The first thing to do is convert all "NaN" values to "NA", because dbWriteTable can recognize those. This can be done simply with:

df[is.na(df)] <- NA
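This works because is.na() is also TRUE for "NaN", while "Inf" needs a separate test; a quick illustration (not part of the original answer):

x <- c(NA, NaN, Inf, 0.3)
is.na(x)        # TRUE  TRUE  FALSE FALSE  (NaN counts as NA)
is.nan(x)       # FALSE TRUE  FALSE FALSE
is.infinite(x)  # FALSE FALSE TRUE  FALSE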

Then you still have your "Inf" values. Unfortunately I haven't found an easy way to go through the entire table in one line (though one option is sketched below). Check which columns contain "Inf" and convert them one by one as follows:

df$quality[is.infinite(df$quality)] <- NA
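If you do want to sweep the whole table in one go, one possible sketch (not from the original answer) is to replace every non-finite value in every numeric column, since is.finite() is FALSE for "NA", "NaN" and "Inf":

# Replace NA, NaN, Inf and -Inf with NA in every numeric column of df
num_cols <- sapply(df, is.numeric)
df[num_cols] <- lapply(df[num_cols], function(x) replace(x, !is.finite(x), NA))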

Once your table is cleaned up, it shouldn't give these errors. Here is another example of how to use dbWriteTable, just to make things clearer:

dbWriteTable(ODBCconnectionname, DBMStablename, YourRtablename,
             field.types = c(USER = "integer",
                             quality = "float(53)"))

Check which data types you need on this page: https://www.w3schools.com/sql/sql_datatypes.asp

If you need more info just PM me.

  • This minimal example doesn't work: `x <- data.frame(a=c(1.1,NA_real_)); dbWriteTable(db_conn, "test_tbl", x)` – Marek Mar 16 '20 at 14:45
  • There's `is.infinite` to check for `NaN` and `Inf`, just for reference and for future visitors. – hannes101 Nov 03 '20 at 09:57