0

I have a question about using subqueries in BigQuery in Datalab. This is specific to google.datalab.bigquery.Query. I can use subqueries with %bq command-line in Datalab just fine.

Assuming, I did these in Datalab by using %bq as:

%bq query --name my_table1
select col1, col2 from dataset1.table1

%bq query --subqueries my_table1
select count(col1) as some_count from my_table1 where col1 is null

Above %bq command-line codes work fine. But, I would like to use python Datalab API to do it in a more programmatic way. So in Datalab, I did:

sql_str1 = '''select col1, col2 from dataset1.table1'''

my_table1 = bq.Query(sql_str1)

sql_str2 = '''select count(col1) as some_count from my_table1 where col1 is null'''

bq.Query(sql_str2, subqueries= my_table1).execute().result()

Then, I got error messages:

TypeErrorTraceback (most recent call last)
<ipython-input-6-c625b8e326b9> in <module>()
----> 1 bq.Query(sql_command, subqueries=web_activity).execute().result()

/usr/local/lib/python2.7/dist-packages/google/datalab/bigquery/_query.pyc in __init__(self, sql, env, udfs, data_sources, subqueries)
     79 
     80     if subqueries:
---> 81       _expand_objects(subqueries, Query, self._subqueries)
     82     if udfs:
     83       _expand_objects(udfs, _udf.UDF, self._udfs)

/usr/local/lib/python2.7/dist-packages/google/datalab/bigquery/_query.pyc in _expand_objects(obj_container, obj_type, target_list)
     59     # and add them to the target dictionary
     60     def _expand_objects(obj_container, obj_type, target_list):
---> 61       for item in obj_container:
     62         # for a list of objects, we should find these objects in the given environment
     63         if isinstance(obj_container, list):

TypeError: 'Query' object is not iterable

But according to document at http://googledatalab.github.io/pydatalab/google.datalab.bigquery.html, I should be able to use this:

class google.datalab.bigquery.Query(sql, env=None, udfs=None, data_sources=None, subqueries=None)

What did I do wrong? Any suggestions?

Choppy
  • 372
  • 1
  • 3
  • 9

1 Answers1

1

You just need to pass in an array:

bq.Query(sql_str2, subqueries=[my_table1]).execute().result()
yelsayed
  • 5,236
  • 3
  • 27
  • 38
  • Thanks yelsayed. But, I got another error message when I try to use list of subqueries. So, I went back to the document. Then, I tried to put subqueries into a dictionary. Finally, I got a successful query result! – Choppy Oct 04 '17 at 13:01