0

So I'm using these three packages in R in order to connect to some databases and query them:

library(rJava)
library(RJDBC)
library (RPostgreSQL)

id_query <- dbGetQuery(conn2, "SELECT b.id id FROM table1 a LEFT JOIN table2 b ON a.id = b.id WHERE a.id = 1684 AND b.id <> 40378;")

This produces an output as such:

id
25559
30352
15352
17587
16480
16296
40449
34962
25827
37282

But then I want to take those results and paste them into a WHERE clause that uses IN

results_query <- dbGetQuery(con, "SELECT
i.event_date,
                              i.id,
                              i.id2,
                              i.id3,
                              i.id4,
                              i.id5,
                              COUNT(i.sales) sales,
                              COUNT(c.volume) volume
                              FROM table1 i
                              LEFT JOIN
                              table2 c
                              ON i.id = c.id
                              AND i.id2 = c.id2
                              AND i.id3 = c.id3
                              WHERE i.event_date = DATE('2018-06-18')
                              AND i.id IN (**RESULTS FROM id_query**)
                              GROUP BY 1,2,3,4,5,6
                              LIMIT 10
                              ;")

So I would like for that line to read like this:

AND i.id IN (25559, 30352, 15352, 17587, 16480, 16296, 40449, 34962, 25827, 37282)

I've tried doing this:

results_query <- dbGetQuery(con, "SELECT
    i.event_date,
                                  i.id,
                                  i.id2,
                                  i.id3,
                                  i.id4,
                                  i.id5,
                                  COUNT(i.sales) sales,
                                  COUNT(c.volume) volume
                                  FROM table1 i
                                  LEFT JOIN
                                  table2 c
                                  ON i.id = c.id
                                  AND i.id2 = c.id2
                                  AND i.id3 = c.id3
                                  WHERE i.event_date = DATE('2018-06-18')
                                  AND i.id IN (", paste(id_query$id, collapse = ", "), ")
                                  GROUP BY 1,2,3,4,5,6
                                  LIMIT 10
                                  ;")

Which is syntax I got from this answer but it yields this error:

Error in .verify.JDBC.result(s, "Unable to execute JDBC prepared statement ",  : 
  Unable to execute JDBC prepared statement SELECT
        i.event_date,
                                      i.id,
                                      i.id2,
                                      i.id3,
                                      i.id4,
                                      i.id5,
                                      COUNT(i.sales) sales,
                                      COUNT(c.volume) volume
                                      FROM table1 i
                                      LEFT JOIN
                                      table2 c
                                      ON i.id = c.id
                                      AND i.id2 = c.id2
                                      AND i.id3 = c.id3
                                      WHERE i.event_date = DATE('2018-06-18')
                                      AND i.id IN  ( (Method Connection.prepareStatement is not yet implemented)

Anyone have either:

A.) A solution to fixing my current query OR

B.) An alternative?

  • Instead of `(**RESULTS FROM id_query**)` you could put the actual query inside the parentheses. – wildplasser Jun 27 '18 at 19:14
  • @wildplasser that would not work because the queries are from two completely different databases. FYI I'm not actually putting `(**RESULTS FROM id_query**)` exactly like that. I'm sure you're aware but just wanted to clarify. –  Jun 27 '18 at 19:20
  • Curious, what is the other database you use for RJDBC (i.e., *con* object)? – Parfait Jun 27 '18 at 20:10
  • @parfait one is a Redshift database (the first one) and the second query is using the AWS Athena database. They both require separate logins and everything. Unless I'm wrong and you can combine them and do something like: `i.id IN (id_query <- dbGetQuery(conn2, "SELECT b.id id FROM table1 a LEFT JOIN table2 b ON a.id = b.id WHERE a.id = 1684 AND b.id <> 40378;")` –  Jun 27 '18 at 20:14

1 Answers1

1

try put a paste(..., sep = '') in your original query:

results_query <- dbGetQuery(con,  paste("SELECT 
                                      i.event_date,
                                      i.id,
                                      i.id2,
                                      i.id3,
                                      i.id4,
                                      i.id5,
                                      COUNT(i.sales) sales,
                                      COUNT(c.volume) volume
                                      FROM table1 i
                                      LEFT JOIN
                                      table2 c
                                      ON i.id = c.id
                                      AND i.id2 = c.id2
                                      AND i.id3 = c.id3
                                      WHERE i.event_date = DATE('2018-06-18')
                                      AND i.id IN (", paste(id_query$id, collapse = ", "), ")
                                      GROUP BY 1,2,3,4,5,6
                                      LIMIT 10
                                      ;", sep = ''))
SeaSprite
  • 564
  • 8
  • 12
  • That still gives me the same error with: `AND i.id IN ( (Method Connection.prepareStatement is not yet implemented)` –  Jun 27 '18 at 19:34
  • Are you sure you wrapped the SQL inside a larger `paste` call as this answer recommends? This is NOT the same as your code attempt. – Parfait Jun 27 '18 at 20:50