I have a database that I want to query in R using duckdb. The two tables in question are large, 183 million rows by eight columns. When I execute the following code:
compareid <- dbGetQuery(con, "(SELECT UserId FROM d14072021 EXCEPT SELECT UserId FROM d15072021 ) UNION ALL
( SELECT UserId FROM d15072021 EXCEPT SELECT UserId FROM d14072021 )")
R crashes outright with a fatal error. Since no error message appears in the console, I have no idea what's going wrong.
I suspect it might be a memory issue. I've tried limiting the size by only looking at the first 100 rows:
compareid <- dbGetQuery(con, "(SELECT UserId FROM d14072021 EXCEPT SELECT UserId FROM d15072021 LIMIT 100) UNION ALL
( SELECT UserId FROM d15072021 EXCEPT SELECT UserId FROM d14072021 LIMIT 100)")
but this crashes all the same.
EDIT: I just found out that LIMIT is an output modifier, so it only caps the number of rows returned; it doesn't limit the number of rows the query has to process.
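To actually restrict the input rather than the output, I assume LIMIT would have to go inside a subquery on each table, so that EXCEPT only ever sees the reduced row sets. Below is a minimal sketch of what I mean. It runs against a toy in-memory database with made-up data; con_toy, compareid_small, the aliases a and b, and the LIMIT of 3 are all just for illustration and are not my real setup. Since there is no ORDER BY, which rows survive the LIMIT is arbitrary.

```r
library(DBI)

# Toy in-memory DuckDB database standing in for the real one (made-up data).
con_toy <- dbConnect(duckdb::duckdb())
dbWriteTable(con_toy, "d14072021", data.frame(UserId = 1:5))
dbWriteTable(con_toy, "d15072021", data.frame(UserId = 3:7))

# LIMIT sits inside each subquery, so each table is cut down
# before EXCEPT is evaluated, instead of capping the final output.
compareid_small <- dbGetQuery(con_toy, "
  (SELECT UserId FROM (SELECT UserId FROM d14072021 LIMIT 3) AS a
   EXCEPT
   SELECT UserId FROM (SELECT UserId FROM d15072021 LIMIT 3) AS b)
  UNION ALL
  (SELECT UserId FROM (SELECT UserId FROM d15072021 LIMIT 3) AS b
   EXCEPT
   SELECT UserId FROM (SELECT UserId FROM d14072021 LIMIT 3) AS a)
")

# Close the connection and shut down the in-memory database.
dbDisconnect(con_toy, shutdown = TRUE)
```

This only tells me whether the query shape itself works on a small input; the result is not a meaningful comparison of the full tables.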