11

I'm running a query over multiple tables unioned together over a particular time range.

In the past, a particular field did not exist in the "schema" but about halfway through that time range, the field began its existence and started getting populated with data.

Is there a way to conditionally select it if it exists, else arbitrarily populate a named field with a value?

like this:

SELECT
  (CASE WHEN exists(my_field) THEN my_field ELSE "0" END) as "my_field"
FROM <somewhere>
Kristian
  • 21,204
  • 19
  • 101
  • 176
  • in regular sql, apparently it cannot be done: http://stackoverflow.com/questions/16952442/select-columnvalue-if-the-column-exists-otherwise-null so, is that also true of BigQuery's SQL functions? – Kristian Aug 28 '15 at 17:50
  • 1
    This has been asked multiple times and it's not possible. – Pentium10 Aug 28 '15 at 17:57
  • @Pentium10 for BQ specifically? or is the problem the same no matter what platform you're talking about? – Kristian Aug 28 '15 at 18:07
  • 2
    Create views on the table that have all the columns. Columns not in the data can be given a `NULL` value. Then query through the views. – Gordon Linoff Aug 28 '15 at 18:15

1 Answers1

12

Below should give you direction

SELECT * FROM
(SELECT * FROM <somewhere w/o my_field>),
(SELECT * FROM <somewhere with my_field>)

Assuming you have a, b and c as a fields in your original table () - above can be used (see below) if you need to change missing values from NULL to 0:

SELECT a, b, c, COALESCE(my_field, 0) as my_field
FROM
(SELECT * FROM <somewhere w/o my_field>),
(SELECT * FROM <somewhere with my_field>)
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • creative solution! i wonder, though, if the initial aggregation of all data skips over any performance you'd gain through the mapping and reducing that happens on the backend. i predict it would be the same as normal – Kristian Aug 29 '15 at 01:31
  • 4
    BigQuery engineer here, and this is the right solution. It has zero performance overhead. One caveat, though, is that you may get errors if your source tables contain independently repeating fields. (Also, this class of errors is a bug in BQ that we hope to fix soonish. Unfortunately, the fix is more complex than it might seem, so it might take some time.) – Jeremy Condit Aug 29 '15 at 04:19
  • 1
    old answer, but is there a solution for standard sql? – Sagi Mann Jun 28 '20 at 14:27