2

I'm learning how to query SQLite dbs from R, and building those queries with glue_sql(). Below is a simplified example of a sub-query from my workflow. Is there a way I can create s10_wtX and s20_wtX without using paste0(), as in the code below?

library(DBI)
library(dplyr)
library(glue)

# example database
set.seed(1)
ps <- data.frame(plot = rep(1:3, each = 4),
                 spp = rep(1:3*10, 2),
                 wtX = rnorm(12, 10, 2) %>% round(1))
con <- dbConnect(RSQLite::SQLite(), "")
dbWriteTable(con, "ps", ps)

# species of interest
our_spp <- c(10, 20)

# for the spp of interest, sum wtX on each plot
sq <- glue_sql(paste0(
  'SELECT ps.plot,\n',
  paste0('SUM(CASE WHEN ps.spp = ', our_spp,
         ' THEN (ps.wtX) END) AS s', our_spp,
         '_wtX',
         collapse = ',\n'), '\n',
  '  FROM ps
    WHERE ps.spp IN ({our_spp*}) -- spp in our sample
    GROUP BY ps.plot'),
  .con = con)

# the result of the query should look like:
dbGetQuery(con, sq)
  plot s10_wtX s20_wtX
1    1    21.9    10.4
2    2    11.0    22.2
3    3     9.4    13.0

In my actual workflow, I have more than two species of interest, so I'd rather not fully write out each line (e.g., SUM(CASE WHEN ps.spp = 10 THEN (ps.wtX) END) AS s10_wtX).

CzechInk
  • 443
  • 2
  • 13
  • You could more easily do simple aggregation in SQL and pivot in R, are you trying to avoid that? – r2evans Aug 13 '21 at 22:20
  • With `glue` you can use `{our_spp}` and avoid the `paste` – akrun Aug 13 '21 at 22:20
  • 2
    (From my previous comment) For instance, you can use `dbGetQuery(con, "select ps.plot, ps.spp, sum(ps.wtX) as wtX from ps where ps.spp in (10,20) group by ps.plot, ps.spp") %>% tidyr::pivot_wider(plot, names_from="spp", values_from="wtX")` (if you can use `dplyr+tidyr`, similar results with `reshape2` or `data.table`) to get the desired output. – r2evans Aug 13 '21 at 22:21
  • @r2evans - great call, I have this nested within a few other queries, but that logic works -- I'll give that a try! – CzechInk Aug 13 '21 at 22:26
  • 1
    I'm not sure if you only have [tag:sqlite], but other DBMS have a `PIVOT` operator. E.g., in Oracle it would be `SELECT * FROM (SELECT ps.plot, ps.spp, ps.wtX FROM ps WHERE ps IN (10, 20)) PIVOT (SUM(wtX) FOR spp IN (10 as s10_wtx, 20 as s20_wtx)`. But... doing processing in R probably makes more sense. – Cole Aug 13 '21 at 22:32
  • @Cole, unfortunately it has to be SQLite, so alas, no `PIVOT` option (to my knowledge) – CzechInk Aug 13 '21 at 23:10
  • 1
    Marked @akrun's answer as it showed `glue_collapse()` and fit easily into my workflow; but will be revisiting @r2evans' idea to avoid `glue_sql()` when possible ++ – CzechInk Aug 13 '21 at 23:26

2 Answers2

2

To formalize this a little (even if it is not what you ultimately use), here are my comments, in detail:

out <- DBI::dbGetQuery(con, "
  select ps.plot, ps.spp, sum(ps.wtX) as wtX
  from ps
  where ps.spp in (10,20)
  group by ps.plot, ps.spp")
out
#   plot spp  wtX
# 1    1  10 21.9
# 2    1  20 10.4
# 3    2  10 11.0
# 4    2  20 22.2
# 5    3  10  9.4
# 6    3  20 13.0

This can be easily pivoted to what you need. Using tidyr::pivot_wider, for instance,

tidyr::pivot_wider(out, plot, names_from="spp", values_from="wtX")
# # A tibble: 3 x 3
#    plot  `10`  `20`
#   <int> <dbl> <dbl>
# 1     1  21.9  10.4
# 2     2  11    22.2
# 3     3   9.4  13  

(The names will need to be cleaned up.)

r2evans
  • 141,215
  • 6
  • 77
  • 149
2

The OP's original question is

Is there a way I can create s10_wtX and s20_wtX without using paste0(), as in the code below?

If we want to construct only with glue, use glue_collapse as well

library(glue)
sq1 <- glue_sql('SELECT ps.plot,', glue_collapse(glue('SUM(CASE WHEN ps.spp = {our_spp} THEN (ps.wtX) END) AS s{our_spp}_wtX'), sep = ",\n"), '\nFROM ps\n WHERE ps.spp IN ({our_spp*}) -- spp in our sample\n    GROUP BY ps.plot', .con = con)
dbGetQuery(con, sq1)
  plot s10_wtX s20_wtX
1    1    21.9    10.4
2    2    11.0    22.2
3    3     9.4    13.0
akrun
  • 874,273
  • 37
  • 540
  • 662
  • Curious (since I don't use `glue` frequently), is `glue_collapse` effectively just `paste`-ing it with `collapse=""`? – r2evans Aug 14 '21 at 00:24
  • 1
    It uses `paste` and `collapse` inside and returns the format in glue as it calls `as_glue` – akrun Aug 14 '21 at 00:26