3

Athena has some default service limits that can help cap the cost of accidental "runaway" queries on a large data lake in S3. They are not great (they are based on time, not on the volume of data scanned), but they are still helpful.

What about Redshift Spectrum? What mechanisms does it provide that can easily be used to cap cost or mitigate the risk of "accidentally" scanning too much data in a single runaway query against S3? What's a good way of tackling this problem?

Amelio Vazquez-Reina

1 Answer

5

Amazon Redshift allows you to apply granular controls over Spectrum query execution using WLM Query Monitoring Rules.

There are two Spectrum metrics available: Spectrum scan size (the number of MB scanned by the query) and Spectrum scan row count (the number of rows scanned by the query).

You can also use Query execution time to enforce a maximum duration, but this applies to all query types, not just Spectrum.

Please note that these are sampled metrics. Queries are not aborted at precisely the point when they exceed the rule; they are aborted at the next sample interval.
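As a rough illustration of the rules described above, here is a minimal Python sketch that builds a WLM configuration with two Spectrum rules. The thresholds, rule names and helper functions are hypothetical, and the metric identifiers (`spectrum_scan_size_mb`, `spectrum_scan_row_count`) and JSON layout are written as I understand the `wlm_json_configuration` parameter, so check them against the current WLM documentation before applying anything to a cluster.

```python
import json

# Hypothetical thresholds; tune them to your own workload.
MAX_SCAN_MB = 100_000            # abort once roughly 100 GB has been scanned from S3
MAX_SCAN_ROWS = 1_000_000_000    # only log queries that scan over a billion rows


def spectrum_rule(name, metric, threshold, action):
    """Build one QMR rule: a single predicate `metric > threshold` plus an action."""
    return {
        "rule_name": name,
        "predicate": [
            {"metric_name": metric, "operator": ">", "value": threshold}
        ],
        "action": action,  # assumed to be one of "log", "hop" or "abort"
    }


def build_wlm_config(concurrency=5):
    """Return a one-queue WLM configuration carrying the two Spectrum rules."""
    queue = {
        "query_concurrency": concurrency,
        "rules": [
            spectrum_rule("abort_big_spectrum_scan",
                          "spectrum_scan_size_mb", MAX_SCAN_MB, "abort"),
            spectrum_rule("log_many_spectrum_rows",
                          "spectrum_scan_row_count", MAX_SCAN_ROWS, "log"),
        ],
    }
    return [queue]


# Serialise to the JSON string that would go into the parameter group.
wlm_json = json.dumps(build_wlm_config(), indent=2)
print(wlm_json)
```

Because the metrics are sampled, a query can overshoot `MAX_SCAN_MB` before the abort fires, so it may be worth setting the threshold somewhat below the real budget.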

If you have already been running Spectrum queries on your cluster, you can get started with QMR by using our script wlm_qmr_rule_candidates to generate candidate rules. The generated rules are based on the 99th percentiles for each metric.

Joe Harris
  • Thanks @joe. That's helpful. What about things like the rate of queries executed (e.g. #/hour)? or the total number of queries it runs concurrently? I don't see them in the list of metrics/rules. Any ideas on them? – Amelio Vazquez-Reina Aug 06 '18 at 18:17
  • 1
    Concurrent queries on Redshift are governed by the cluster's WLM configuration. Each WLM queue allows a specific number of concurrent queries. This applies to Spectrum queries the same as "normal" queries because Spectrum query execution is shared between the Redshift cluster and the Spectrum layer. – Joe Harris Aug 07 '18 at 13:03
  • 1
    Queries per hour would need to be calculated externally. – Joe Harris Aug 07 '18 at 13:04
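One possible way to do that external calculation, sketched in Python: bucket query start times by clock hour and flag the hours that exceed a limit. The function names and the limit are hypothetical, and the sketch assumes the timestamps have already been pulled from a query log (for example the `starttime` column of `STL_QUERY`).

```python
from collections import Counter
from datetime import datetime


def queries_per_hour(start_times):
    """Count queries per clock hour from an iterable of datetime start times."""
    return Counter(
        t.replace(minute=0, second=0, microsecond=0) for t in start_times
    )


def hours_over_limit(start_times, limit):
    """Return the hours (sorted) in which more than `limit` queries started."""
    counts = queries_per_hour(start_times)
    return sorted(hour for hour, n in counts.items() if n > limit)


# Made-up sample data: three queries in the 13:00 hour, one in the 14:00 hour.
sample = [
    datetime(2018, 8, 7, 13, 3),
    datetime(2018, 8, 7, 13, 4),
    datetime(2018, 8, 7, 13, 59),
    datetime(2018, 8, 7, 14, 1),
]
print(hours_over_limit(sample, limit=2))
```

This only reports after the fact; it does not throttle anything on the cluster.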