0

What's the problem?

I'm trying to select two different date periods at once within a query (to be used in Data Studio), to do some complex period comparison calculations.

This has to all happen in the query, because it's getting used in Data Studio (and other reporting platforms).

However the logic I'm using to prune the partitions, seems to be inconsistent on the BigQuery side.

One of them prunes partitions correctly, the other doesn't. Some times variations seem to use arbitrary amounts of data.

Examples of partition pruning

Here's the first example. It either selects the correct amount ~ 12MB. Or the entire table ~5GB.

SELECT
  columns...
FROM
  table
WHERE
    (
      date(_PARTITIONTIME) >= PARSE_DATE('%Y%m%d', "20220509")
      AND date(_PARTITIONTIME) <= PARSE_DATE('%Y%m%d', "20220509")
    ) 
  OR
    (
      date(_PARTITIONTIME) >= DATE_SUB(PARSE_DATE('%Y%m%d', "20220509"), INTERVAL 1 DAY)
      AND date(_PARTITIONTIME) <= DATE_SUB(PARSE_DATE('%Y%m%d', "20220509"), INTERVAL 1 DAY)
    )
    AND REGEXP_CONTAINS (path, r"/c/")

enter image description here enter image description here

I can't consistently recreate the 12MB version, it's more typically returning the 5GB version.

Question 1: I'm assuming this is some sort of weird cache thing, where it's somehow run the date calculation before?

Question 2: I'm also not sure why it's not pruning the tables here as I'm not doing any subqueries, just some date calculations. Why is it pruning?

Dominic Woodman
  • 719
  • 2
  • 8
  • 18
  • It appears that this issue has to be investigated further, so if you have a support plan please create a new GCP [support case](https://cloud.google.com/support/). Otherwise, you can open a new issue on the [issue tracker](https://cloud.google.com/support/docs/issue-trackers) describing your issue. – Prajna Rai T Dec 03 '22 at 11:58

0 Answers0