1

I am using CosmosDB (Azure documentDB) in my project, written in Python 3.

I have been looking for a while now, but I cannot find out how to query my table. I have seen some example code, but I do not see an example of how to query... all I can do is get all documents (not ideal when my DB is > 80GB).

The GitHub repo shows a very tiny set of operations for database and collections: https://github.com/Azure/azure-documentdb-python/blob/master/samples/CollectionManagement/Program.py

And the following SO post shows how to read all documents... but not how to perform querying such as "WHERE = X;"

I'd really appreciate it if someone can point me in the right direction, and possibly supply an example showing how to run queries.

pookie
  • 3,796
  • 6
  • 49
  • 105

3 Answers3

5

Based on my understanding, I think you want to know how to perform a SQL-like query using Python to retrieve documents on Azure CosmosDB of DocumentDB API, please refer to the code below from here.

A query is performed using SQL

# Query them in SQL
query = { 'query': 'SELECT * FROM server s' }    

options = {} 
options['enableCrossPartitionQuery'] = True
options['maxItemCount'] = 2

result_iterable = client.QueryDocuments(collection['_self'], query, options)
results = list(result_iterable);

print(results)

The above code is using the method QueryDocuments.

Any concern, please feel free to let me know.


Update: Combine with my sample code for the other SO thread you linked, as below.

from pydocumentdb import document_client

uri = 'https://ronyazrak.documents.azure.com:443/'
key = '<your-primary-key>'

client = document_client.DocumentClient(uri, {'masterKey': key})

db_id = 'test1'
db_query = "select * from r where r.id = '{0}'".format(db_id)
db = list(client.QueryDatabases(db_query))[0]
db_link = db['_self']

coll_id = 'test1'
coll_query = "select * from r where r.id = '{0}'".format(coll_id)
coll = list(client.QueryCollections(db_link, coll_query))[0]
coll_link = coll['_self']

query = { 'query': 'SELECT * FROM server s' }    
docs = client.QueryDocuments(coll_link, query)
print list(docs)
Peter Pan
  • 23,476
  • 4
  • 25
  • 43
  • Thanks, can you please define `collection` for completeness? I have an existing collection, I am not creating one. – pookie Jun 26 '17 at 09:14
  • @pookie Please see my update, it combined with my sample code for the other SO thread you linked. Is it you want? – Peter Pan Jun 26 '17 at 09:20
  • Thanks, this worked with some modifications: need to include the `options` in the `QueryDocuments` call, changed `print` to Python3, `list` does not work (causes program to hang). I instead used: constraint = 'something' query = "SELECT * FROM r WHERE r. ='{0}'".format(constraint) docs = client.QueryDocuments(coll_link, query, options) for doc in docs: print(doc) I could not edit your answer as it kept telling me that there was code not formatted correctly (it was a lie!) – pookie Jun 26 '17 at 09:56
  • I am following this same procedure however I cannot get my query to work, I use an `order by` clause like so; `SELECT * FROM c ORDER BY c._ts desc` – mp252 Aug 22 '18 at 09:35
1
query = 'SELECT * FROM c'
docs = list(client.QueryItems(coll_link,query))

QueryDocuments has been replaced with QueryItems.

Anurag
  • 117
  • 10
0

I have a similar problem recently. You can fetch blocks (not entire query set) by calling fetch_next_block().

query = "select * from c"
options = {'maxItemCount': 1000, 'continuation': True}
q = db_source._client.QueryDocuments(collection_link, query, options)
block1 = q.fetch_next_block()
block2 = q.fetch_next_block()
Boyang
  • 93
  • 1
  • 6