0

I want to do something really similar to these:

When mysql WHERE clause is empty, return all rows

Possible to have PHP MYSQL query ignore empty variable in WHERE clause?

I want the where clause to be ignored if the variable I am subsetting by is NULL. However, I am accessing my MySQL database from R using dbGetQuery. So far, I have code like this

write_pid_clause = function(p_id=NULL){
 if(is.null(p_id)){return(NULL)}
 else {return(paste0("where project_id = ",p_id)  )}
}

which writes the correct where statement behaviour if p_id is not specified, together with the code:

  dbGetQuery( con,paste0("select MIN(completion_date) from run ",write_pid_clause(p_id))))

This works properly, however, I struggle if I wish to insert more conditions in the where clause, e.g. if I wanted to add the condition and status = 'complete' when p_id = NULL.

Does anyone have any good ideas on how I can elegantly do this in R?

EDIT Here is some more code to demonstrate what I am trying to do, this is with respect to and clauses after a join (doing the where is a little bit trickier)

make_and_clauses = function(p_id = "",start_date="", end_date=""){

 conditions = list(
   list(" and r.project_id ='", p_id,"'"),
   list(" and r.completion_date >= '",start_date,"'"),
   list(" and r.completion_date <= '", end_date, "'"))
 condition_values = c(p_id,start_date, end_date)
 conditions[which(condition_values =="")] <- ""
 conditions = unlist(conditions,recursive=TRUE)
 paste0(conditions,collapse="")

}

which gives output

> make_and_clauses(2,3,4)
[1] " and r.project_id ='2' and r.completion_date >= '3' and r.completion_date <= '4'"
> make_and_clauses(2,,4)
[1] " and r.project_id ='2' and r.completion_date <= '4'"
> make_and_clauses(,3,2)
[1] " and r.completion_date >= '3' and r.completion_date <= '2'"

>
Community
  • 1
  • 1
Alex
  • 15,186
  • 15
  • 73
  • 127
  • 2
    I don't think there is an easy/simple/clear way to achieve what you want. If you want to be able to add multiple clauses, each dependent upon the contents of a variable, you have to build up each one. The only relevant SQL trick I can think of is to put `WHERE 1=1 ` in the SQL statement, then add `AND column = value` for each value. This way, if you leave one out, the syntax is still correct. – Turophile May 13 '14 at 06:33
  • Hi, that actually sounds really good. See the new code I have written to handle multiple missing cases. In that case the `Where 1 = 1` command is exactly what I need to handle the case where I have to use `where`. If you post it as an answer I'l upvote :) – Alex May 13 '14 at 06:37

2 Answers2

1

I don't think there is an easy/simple/clear way to achieve what you want. If you want to be able to add multiple clauses, each dependent upon the contents of a variable, you have to build up each one. The only relevant SQL trick I can think of is to put WHERE 1=1 in the SQL statement, then add AND column = value for each value. This way, if you leave one out, the syntax is still correct.

I see that @Marek has provided an answer with code; perhaps you can combine the answers to produce a workable solution.

Turophile
  • 3,367
  • 1
  • 13
  • 21
  • `status` could be null as well. Also, the where clause may contain more than 2 conditional checks, and it becomes impractical to write all the possible if clauses. – Alex May 13 '14 at 06:16
1

Start with your function I proposed something like:

make_and_clauses <- function(p_id, start_date, end_date) {
    conditions = c(
        if (!missing(p_id)) sprintf("r.project_id ='%i'", p_id),
        if (!missing(start_date)) sprintf("r.completion_date >= '%i'", start_date),
        if (!missing(end_date)) sprintf("r.completion_date <= '%i'", end_date)
    )
    paste(conditions, collapse=" and ")
}

which returns:

make_and_clauses(2,3,4)
# [1] "r.project_id ='2' and r.completion_date >= '3' and r.completion_date <= '4'"
make_and_clauses(2,,4)
# [1] "r.project_id ='2' and r.completion_date <= '4'"
make_and_clauses(,3,2)
# [1] "r.completion_date >= '3' and r.completion_date <= '2'"
make_and_clauses(,3,)
# [1] "r.completion_date >= '3'"

What important that no arguments at all leads to:

make_and_clauses(,,)
# [1] ""

So it's possible to write WHERE clause:

make_and_clauses <- function(p_id, start_date, end_date) {
    conditions = c(
        if (!missing(p_id)) sprintf("r.project_id ='%i'", p_id),
        if (!missing(start_date)) sprintf("r.completion_date >= '%i'", start_date),
        if (!missing(end_date)) sprintf("r.completion_date <= '%i'", end_date)
    )
    if (length(conditions)>0) paste("WHERE", paste(conditions, collapse=" and ")) else ""
}

and use it as

paste("select MIN(completion_date) from run", make_and_clauses(,3,2))
# [1] "select MIN(completion_date) from run WHERE r.completion_date >= '3' and r.completion_date <= '2'"
paste("select MIN(completion_date) from run", make_and_clauses())
# [1] "select MIN(completion_date) from run "
Marek
  • 49,472
  • 15
  • 99
  • 121
  • Thanks. This is exactly what I needed, without resorting to `where 1 = 1`. I particularly like the usage of `!missing` and `sprintf`. Now to convert the first batch of code to something which can handle an arbitrary list of arguments. – Alex May 14 '14 at 04:51
  • something like `lapply( as.list(args(make_and_clauses), function(arg) {if(!missing(arg)) sprintf(arg, eval(arg)})`... – Alex May 14 '14 at 04:56