BigQuery's documentation on cost optimization states that:
BigQuery can provide incredible performance because it stores data as a columnar data structure. This means SELECT * is the most expensive way to query data. This is because it will perform a full query scan across every column present in the table(s), including the ones you might not need.
However I could not find anywhere in the documentation that discuss whether the BigQuery engine optimizes subqueries/virtual views that use SELECT *
or not. Examples:
(1)
CREATE VIEW my_view AS ( SELECT * FROM my_table );
SELECT a, b FROM my_view -- does this only access {a, b} or all columns?
(2)
SELECT a, b FROM (SELECT * FROM my_table) AS t -- does this only access {a, b} or all columns?
So the question is: Does BigQuery optimize subquery/view SELECT * to minimize costs, and if so, does it do it reliably?