I have a partitioned table and am trying to limit my search to a few partitions. To do this I am running a query (using legacy SQL) that looks like the following:
SELECT
*
FROM
[project:dataset.table]
WHERE
_PARTITIONTIME >= "2018-07-10 00:00:00"
AND _PARTITIONTIME < "2018-07-11 00:00:00"
AND col IN (
SELECT
col
FROM
[project:dataset.table]
WHERE
_PARTITIONTIME >= "2018-07-10 00:00:00"
AND _PARTITIONTIME < "2018-07-11 00:00:00"
AND col2 > 0)
I limit the main query and the subquery using _PARTITIONTIME
, so big query should only need to search those partitions. When I run this query though I get billed as if I'd just queried the entire table without using _PARTITIONTIME
. Why does this happen?
UPDATE The equivalent query using standard SQL does not have this problem, so use that as a workaround. I'd still like to know why this happens though. If it's just a bug or if legacy SQL actually does attempt to access all the data in a table for a query like this.