2

Looking into getting near-realtime analytics data from bigquery, and considering costs vs. accuracy, it seems like using materialized view might be a great win.
Considering "near real time" will change to a minimum of 1 minute refresh_interval_minutes, my main concern is that while from documentation it seems that the query will be only on the delta data, the billing will be on the "standard" minimum of 10MB per table.
As I see it, if this minimum is being forced, it dismiss using materialized view as a valid solution for near-realtime.

I would have used "standard" caching over the queries, but caching does not work when querying table with "buffered data", as far as I understand it.

Please advise,
Shushu

Shushu
  • 774
  • 6
  • 19

1 Answers1

0

The following are key characteristics of BigQuery Materialized Views:

Zero maintenance: A materialized view is recomputed in background once the base table has changed. All incremental data changes from the base tables are automatically added to the materialized views. No user inputs are required.

In other words it means that incremental changes like streaming data into, is automatically added to the materialized view. That means you don't need to set 1 minute refresh period. The maximum refresh frequency cap is 7 days. You could set it to 7 days, or even disable manual refresh as you don't have deletes.

Pentium10
  • 204,586
  • 122
  • 423
  • 502
  • Thanks, it clarifies the issue of refreshing data. It gives no answer regarding to the main question, which is the costs. Does it mean we will pay only based on total incremental data, regardless of the number of usage ? – Shushu Aug 03 '20 at 06:15
  • What costs? What size of incremental data? You have these costs 1) loading data to the base table 2) refresh of materialized view as often as you set, perhaps every 7 days in your situation 3) query costs based on the query. The advantage is that over to the base table you are able to use MViews to have a smaller table for your queries, to have different clustering fields than the base table. This way you ensure even better optimized queries that reduce costs if clustering is used right. – Pentium10 Aug 03 '20 at 18:21
  • As far as I know, google bigquery bill a minimum of 10MB per query. With standard view - when I query the view, it triggers the query, and I am billed accordingly. If caching is enabled - I don't pay for querying the same data over and over again. What I don't know is how I will be billed with MViews - is in every time the view is used ? Will minimum 10MB is applied here ? I can't tell from docs. – Shushu Aug 03 '20 at 19:47
  • As you say a query has a minimum of 10mb. It's a query, so no mater if that is for a base table, for a view, for a materialized view etc. – Pentium10 Aug 03 '20 at 21:09