0

I struggle to call a plr function in postgreSQL from a R-script and use it in ggplot2 - geom_function. The following examples are extremly simplified but hopefully show the issue.

Let's assume I have the following plr function:

CREATE OR REPLACE FUNCTION public.mypgfunc(
    x numeric,
    a numeric)
    RETURNS numeric
    LANGUAGE 'plr'
    COST 100
    VOLATILE PARALLEL UNSAFE
AS $BODY$
    return (x*a)
$BODY$;

To call this from ggplot - geom_function I thought, I could write the following simple wrapper function to do the query (I use the rpostgres package):

myWrapper <- function(x , a) {
  con <- dbConnect(drv = RPostgres::Postgres() , dbname='mydb')
  
  q <- dbSendQuery(con , "select mypgfunc( $1 , $2 )")
  dbBind(q , c(x,a))
  y <- dbFetch(q)
  dbClearResult(q)
  
  dbDisconnect(con)
  
  return(y)
}

But if I now call this function from ggplot, I get the following warning message and an empty plot:

Computation failed in stat_function(): Query requires 2 params; 102 supplied.

The ggplot code looks as follows:

ggplot() +
  geom_function(fun = myWrapper , args = list(a = 5))

If I write the plr function in R instead and call this from geom_function, everything works fine. If I call the myWrapper directly (outside ggplot) with just one value for x and a, respectively, everything works fine as well.

So, what do I have to change?

bthebread
  • 253
  • 1
  • 8

2 Answers2

1

I don't now much about using SQL databases in R. But I think I know the reason of your error.

If you look at the help page for stat_function (which is what geom_function uses under the hood), you will see that, by default, it creates 101 values of x over an arbitrary range (I think it is [0,1]).

This is why the error message mentions that "Query requires 2 params; 102 supplied". You are concatenating x (which is a vector of length 101) with a (length 1), when you call dbBind(q , c(x,a)).

The size of x is defined by the parameter n. So if you run:

ggplot() + geom_function(fun = myWrapper , args = list(a = 5), n=1)

You are passing a single value of x to myWrapper(I think x=0) and you should get the same result as the situation you described in the last sentence (I got a warning message because ggplot could not draw a line with only 1 point).

So, basically, you need to do separate queries for each value of the vector x. A straight forward way of doing this is looping over the values of x:

y <- NULL
for (xx in x) {
    q <- dbSendQuery(con , "select mypgfunc( $1 , $2 )")
    dbBind(q , c(xx, a))
    if (is.null(y)) {
        y <- dbFetch(q)
    } else {
        y <- rbind(y, dbFetch(q))
    }        
    dbClearResult(q)
}

Like I said, I have not worked with SQL in R. So I am sure there are smarter ways of doing this where you don't need to call an ugly for loop (not sure also if you need the dbSendQuery inside the loop).

kikoralston
  • 1,176
  • 5
  • 6
0

I did some more research and found another solution which avoids a for-loop.

I just had to change the myWrapper function into the following code:

myWrapper <- function(x , a) {
  con <- dbConnect(drv = RPostgres::Postgres() , dbname='mydb')
  
  a <- rep(a , length(x))
  q <- dbSendQuery(con , "select mypgfunc( $1::numeric , $2::numeric )")
  dbBind(q , list(x,a))
  y <- dbFetch(q)
  dbClearResult(q)
  
  dbDisconnect(con)
  
  return(y[,1])
}
bthebread
  • 253
  • 1
  • 8