2

I would like to write forecasted data into a sql-server using R and RODBC. Each forecast are for the next six hours and I would like to only save the newest generation of each foreacst. Illustrated here:

set.seed(1)
# First forecast at 00:00:00
df.0 <- data.frame(Dates = seq.POSIXt(from = as.POSIXct("2015-10-29 00:00:00"), 
                                      to = as.POSIXct("2015-10-29 5:00:00"), by = "hour"), 
                   Value = runif(6, min = 0, max = 6))

# Second forecast at 01:00:00
df.1 <- data.frame(Dates = seq.POSIXt(from = as.POSIXct("2015-10-29 01:00:00"), 
                                      to = as.POSIXct("2015-10-29 6:00:00"), by = "hour"), 
                   Value = runif(6, min = 0, max = 6))

Now, at 00:00:00 I would save my first forecast into my data base dbdata:

require(RODBC)

sqlSave(channel = dbdata, data = df.0, tablename = "forecasts", 
append = TRUE, rownames = FALSE, fast = FALSE, verbose = TRUE)

# Query: INSERT INTO "forecast" ( "Dates", "Values") VALUES 
( '2015-10-29 00:00:00', '1.59')
# Query: INSERT INTO "forecast" ( "Dates", "Values") VALUES 
( '2015-10-29 00:00:00', '2.23')
# etc for all 6 forecasts

Now, at 01:00:00 I get a new forecast. I want to save/update this forecast, so I replace all the values from 01:00:00 to 05:00:00 and the add the newest forecast at 06:00:00 as well.

The update works well - so I can overwrite the files - but update can't insert the last 06:00:00 forecast.

sqlUpdate(channel = dbdata, dat = df.1, tablename = "forecasts", 
fast = FALSE, index = c("Dates"), verbose = TRUE)

# Query: UPDATE "forecast" SET "Value" = 5.668 WHERE "Dates" = '2015-10-29 00:01:00'
# etc. until 
# Error in sqlUpdate(channel = prognoser, dat = df.1[, ], 
# table = "forecast",  : 
# [RODBC] ERROR: Could not SQLExecDirect 
# 'UPDATE " "forecast" SET "Value" = 1.059 WHERE "Dates" = '2015-10-29 06:00:00'

So, this can be probably be solved in a lot of ways - but what are the good ways to do this?

I think there must be better ways than to read the table and find out how long the forecast is in the database. Then split the new data into an update and a save part, and write these in.

It is a t-sql, microsoft server. The tables are in the same database - but this a pure coincidence. Which means this: RODBC: merge tables from different databases (channel) shouldn't be a issue and perhaps I can get away with a t-sql "MERGE INTO". But next time I probably won't be able to.

Community
  • 1
  • 1
Thorst
  • 1,590
  • 1
  • 21
  • 35
  • Not familiar with `t-sql`. Can you add a primary key on the `Dates` column? Then, you could use `INSERT` in both queries and the engine (maybe) automatically should update duplicate `Dates` values. – nicola Oct 29 '15 at 07:59
  • That won't work :) Dates are already the key. – Thorst Oct 29 '15 at 09:07

1 Answers1

0

You can try making a conditional insert followed by an update, the conditional insert means you only insert if the Date does not exist yet and the update always succeeds (you do some unnecessary updates if the value was succesfully inserted)

Something like the following for the conditional insert:

INSERT INTO "forecast" ( "Dates", "Values") VALUES ( '2015-10-29 00:00:00', '2.23') where not exists (select 1 from "forecast"  where "Dates"='2015-10-29 00:00:00')
Bert Neef
  • 743
  • 1
  • 7
  • 14