0

I have a Cloud Spanner query that is the UNION ALL of potentially many subqueries, which are all identical except for a few parameters that vary with each subquery:

(SELECT t.FirstName, t.BirthDate FROM Singers AS t WHERE SingerId=2 AND t.LastName="Trentor")
UNION ALL
(SELECT t.FirstName, t.BirthDate FROM Singers AS t WHERE SingerId=3 AND t.LastName="Smith")
UNION ALL
...

It is sometimes taking several seconds for this query to execute. Is there something I can do to make it faster?

Dan McGrath
  • 41,220
  • 11
  • 99
  • 130
Mike Curtiss
  • 1,838
  • 2
  • 17
  • 33

1 Answers1

0

It might help for you to parameterize your query. This will also make your query more concise. Spanner has a query cache that stores the shapes of recent queries so that it can use the same execution plan if it sees another query of similar shape. A UNION ALL of 20 subqueries has a different shape than the UNION ALL of 21 subqueries, so this may be thwarting the cache. It's a bit tricky to pass in an array of structs (i.e. n-tuples) as query parameters, but here's an approach that works:

# bind seq1 to [2,3] in your query parameters
# bind seq2 to ["Trentor","Smith"] in your query parameters

SELECT Singers.FirstName,Singers.BirthDate FROM
 (SELECT *
  FROM
  (SELECT id, x_1 FROM UNNEST(@seq1) AS id WITH OFFSET AS x_1)
  JOIN
  (SELECT name, x_2 FROM UNNEST(@seq2) AS name WITH OFFSET AS x_2)
  ON x_1 = x_2) AS params
JOIN Singers
ON params.id=Singers.SingerId AND params.name=Singers.LastName

The inner query that produces params is zipping seq1 and seq2 together, producing a table containing correlated entries from seq1 and seq2.

Mike Curtiss
  • 1,838
  • 2
  • 17
  • 33