We have found BigQuery to work great on data sets larger than 100M rows, where the 'initialization time' doesn't really come into play (or is negligible compared to the rest of the query).
However, on anything under that, performance is quite poor, which makes BigQuery (1) ill-suited to use behind an interactive BI tool; and (2) inferior to other products, such as Redshift or even ElasticSearch, when the data size is under 100M rows. For example, an engineer at our organization was evaluating technologies for querying data sets of 1M to 100M rows for an analytics product with about 1,000 users, and his feedback was that he could not believe how slow BigQuery was.
Not looking for a defense of the BigQuery product, I was wondering:
- Are there any plans to improve the speed of BigQuery, especially its initialization time, on queries over non-massive data sets?
- Will BigQuery ever be able to deliver sub-second response times on 'regular' queries (such as the simple GROUP BY aggregation sketched below) on data sets under a certain size?
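To make 'regular query' concrete, here is a minimal sketch of the kind of aggregation I mean, timing the full round trip with the google-cloud-bigquery Python client; the `my_project.my_dataset.events` table name is a placeholder, and default application credentials are assumed:

```python
# Minimal sketch: time a simple GROUP BY aggregation against BigQuery.
# Assumes the google-cloud-bigquery library and default credentials;
# `my_project.my_dataset.events` is a hypothetical table name.
import time

from google.cloud import bigquery

client = bigquery.Client()

sql = """
    SELECT category, COUNT(*) AS row_count
    FROM `my_project.my_dataset.events`
    GROUP BY category
"""

start = time.monotonic()
rows = list(client.query(sql).result())  # blocks until the job completes
elapsed = time.monotonic() - start

print(f"{len(rows)} groups returned in {elapsed:.2f}s")
```

On the data sizes described above, the elapsed time measured this way appears dominated by the fixed per-query overhead rather than by the scan itself, which is exactly the initialization cost the questions refer to.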