0

How do I return a string with only single quotes from the glue::glue_sql command? I have tried two different approaches:

Approach 1

con <- RAthena::dbConnect(RAthena::athena(),
                 s3_staging_dir = s3_staging_dir)

column_values <- c("thing1","thing2","thing3")

query1 <- glue::glue_sql("SELECT COUNT (*) as venues FROM schema.table WHERE column IN ({column_values*})", .con = con)
query1

This returns:

'SELECT COUNT (*) as venues FROM schema.table WHERE column IN (\'thing1\', \'thing2\', \'thing3\')'

Approach 2

con <- RAthena::dbConnect(RAthena::athena(),
                 s3_staging_dir = s3_staging_dir)

column_values <- c("thing1","thing2","thing3")

query2 <- glue::glue_sql("SELECT COUNT (*) as venues FROM schema.table WHERE column IN ({`column_values`*})", .con = con)
query2

this returns:

'SELECT COUNT (*) as venues FROM schema.table WHERE column IN ("thing1", "thing2", "thing3")'

My desired output

"SELECT COUNT (*) as venues FROM schema.table WHERE column IN ('thing1', 'thing2', 'thing3')"
r2evans
  • 141,215
  • 6
  • 77
  • 149
Douglas
  • 223
  • 1
  • 9
  • The use of single-quotes or double-quotes is DBMS-specific, but I believe that Athena uses single-quotes for string literals; because of this I think your approach 1 is correct, and approach 2 is wrong since double-quotes do not define string literals. What happens when you send approach 1 to the database? – r2evans Oct 07 '22 at 14:46
  • 1
    I get `Error: SYNTAX_ERROR: line 1:86: Column 'thing1' cannot be resolved` However I think the issue is coming from me using R in a jupyterlab notebook. When I run the same command in rstudio everything looks fine. Don't know how to resolve this issue... – Douglas Oct 07 '22 at 14:50
  • Gotcha. I have no experience with jupyterlab, unfortunately, but I've added the tag, hope somebody else can provide insight. – r2evans Oct 07 '22 at 14:54

2 Answers2

0

Name your vector:

col_val <- c(t1 = "thing1", t2 = "thing2", t3 = "thing2")

Then use the {vars*} syntax:

query3 <- glue::glue_sql("SELECT COUNT (*) as venues FROM schema.table WHERE column IN ({col_val*})", .con = con)
query3 

<SQL> SELECT COUNT (*) as venues FROM schema.table WHERE column IN ('thing1', 'thing2', 'thing2')

Edit: see discussion below - this appears to be local to me and NOT a good answer to the question.

Paul Stafford Allen
  • 1,840
  • 1
  • 5
  • 16
  • What does adding names to `col_val` do here? How is your `col_val` different from the OP's `column_values`? When I run your code with and without names in `col_val`, there is no difference in the output. – r2evans Oct 07 '22 at 14:50
  • As I stated in my comment above, The real issue is doing this in jupyterlab. Perhaps a different question altogether. – Douglas Oct 07 '22 at 14:52
  • @r2evans when It tested this I was getting different outputs depending on named or unnamed variables. I'll have a poke around and see if I can figure out why. If it's not happening at your end it must be something specific to my setup or environment. – Paul Stafford Allen Oct 10 '22 at 07:08
0

Okay the real issue was I am using R in jupyterlab. All I needed to do was to print() the string and the backslashes were removed.

Similar stacks answer: python string adding backslash before single quotes

con <- RAthena::dbConnect(RAthena::athena(),
                 s3_staging_dir = s3_staging_dir)

column_values <- c("thing1","thing2","thing3")

query1 <- glue::glue_sql("SELECT COUNT (*) as venues FROM schema.table WHERE column IN ({column_values*})", .con = con)
print(query1)

returns:

<SQL> SELECT COUNT (*) as venues FROM schema.table WHERE column IN ('thing1', 'thing2', 'thing3')
Douglas
  • 223
  • 1
  • 9