I have an issue with the Azure Cosmos DB (SQL API) Python SDK and I have no idea how to fix it.

I have some data in a data explorer, and I am using the Python SDK to query this data and save the output to a JSON file. So far everything works fine. Now I want to take it a step further: rather than saving the query result to a JSON file, I would like to pass it directly to Cosmos DB to be stored.

And here is the main problem.

I followed the azure-cosmos guide, and I am able to connect to my Cosmos DB account from Python.

Then I used this block of code:

######################################################
##                   COSMOS-DB                      ##
######################################################

url = "<my-url>"
key = "my-key"
client = CosmosClient(url, key)
database_name = "My-Database"
container_name = "Table"
database = client.get_database_client(database_name)
container = database.get_container_client(container_name)
data = json.dumps(str(df))
data_dict = json.loads(data)
print(data_dict)
container.create_item(body=str(data_dict))

df is a data frame, which was giving me problems, so I tried to parse it into a dictionary.

But when I call container.create_item(body=data_dict)

I get this error:

Traceback (most recent call last):
  File "query.py", line 72, in <module>
    container.create_item(body=data_dict)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/core/tracing/decorator.py", line 83, in wrapper_use_tracer
    return func(*args, **kwargs)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/container.py", line 511, in create_item
    result = self.client_connection.CreateItem(
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/_cosmos_client_connection.py", line 1084, in CreateItem
    options = self._AddPartitionKey(database_or_container_link, document, options)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/_cosmos_client_connection.py", line 2512, in _AddPartitionKey
    partitionKeyValue = self._ExtractPartitionKey(partitionKeyDefinition, document)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/_cosmos_client_connection.py", line 2526, in _ExtractPartitionKey
    return self._retrieve_partition_key(partition_key_parts, document, is_system_key)
  File "/Users/user/opt/anaconda3/lib/python3.8/site-packages/azure/cosmos/_cosmos_client_connection.py", line 2539, in _retrieve_partition_key
    partitionKey = partitionKey.get(part)
AttributeError: 'str' object has no attribute 'get'

I am totally lost at this point and I don't understand how to solve this issue.

UPDATE: this is the data I am trying to pass to cosmos:

[
  {
    "_timestamp": 1622036400000,
    "name": "User Log Off",
    "message": "message",
    "userID": "userID",
    "Events": "SignOff event",
    "event_count": 1
  },
  {
    "_timestamp": 1622035800000,
    "name": "User Log Off",
    "message": "message",
    "userID": "userID",
    "Events": "SignOff event",
    "event_count": 1
  }
]

Those are just 2 samples from the whole array, which has around 300 items.

I fixed the previous error.

Now I have proper JSON being dumped, and it looks like the sample posted above. I run container.create_item(item) but I get this error:

azure.cosmos.exceptions.CosmosHttpResponseError: (BadRequest) Message: {"Errors":["The input content is invalid because the required properties - 'id; ' - are missing"]}

I was confident that Cosmos DB would add the id automatically.

Nayden Van
  • Can you edit your question and show what your input data looks like? – Gaurav Mantri Jun 03 '21 at 14:58
  • Is this what is getting passed to the `create_item` method? In other words, is your `data_dict` an array? – Gaurav Mantri Jun 03 '21 at 15:37
  • Also, what's the partition key for the container that you created? – Gaurav Mantri Jun 03 '21 at 15:38
  • I am so sorry mate. I just realised that my IDE wasn't running properly. I restarted it and dumped the JSON into a file, and what I found is that what I am passing to create_item is ONE string with all the objects inside. I am even more confused now. My `partitionKey` is `/_timestamp`. I am so sorry to bother you with this issue but I am completely new to this. – Nayden Van Jun 03 '21 at 15:46
  • What you have to do is loop through each item in the array (`data_dict`) and save each item separately. – Gaurav Mantri Jun 03 '21 at 15:49
  • Done, I got the error about the id being missing – Nayden Van Jun 03 '21 at 15:53
  • Please see my answer below. You will need to import `uuid` package and assign a random GUID as id to the document. – Gaurav Mantri Jun 03 '21 at 15:57

1 Answer

Considering your data_dict is an array of items, what you would want to do is loop through this array and save each item separately.
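It is worth checking that assumption first, because the traceback in the question ('str' object has no attribute 'get') suggests a string reached create_item. The sketch below uses only the standard library and a plain list as a stand-in for the query result (the `records` name and the trimmed sample fields are illustrative, not from the original code). It shows that wrapping the data in str() before json.dumps produces a single string after the round trip, while serializing the objects themselves keeps the list of dicts.

```python
import json

# Stand-in for the query result; the real code starts from a DataFrame `df`.
records = [
    {"_timestamp": 1622036400000, "name": "User Log Off", "event_count": 1},
    {"_timestamp": 1622035800000, "name": "User Log Off", "event_count": 1},
]

# Pattern from the question: str() first, then a JSON round trip.
# json.dumps receives a string, so json.loads gives that same string back.
round_tripped = json.loads(json.dumps(str(records)))
print(type(round_tripped).__name__)  # str

# Serializing the objects themselves preserves the structure.
parsed = json.loads(json.dumps(records))
print(type(parsed).__name__, type(parsed[0]).__name__)  # list dict
```

If df is a pandas DataFrame, `df.to_json(orient="records")` should give the equivalent JSON array to pass to json.loads.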

Please try this code:

import json
import uuid

from azure.cosmos import CosmosClient

url = "<my-url>"
key = "my-key"
client = CosmosClient(url, key)
database_name = "My-Database"
container_name = "Table"
database = client.get_database_client(database_name)
container = database.get_container_client(container_name)
# Assumes df is a pandas DataFrame: serialize the rows themselves, not str(df),
# so that data_dict is a list of dicts rather than one big string.
data_dict = json.loads(df.to_json(orient="records"))
print(data_dict)
# Loop through each item in your "data_dict" array.
for item in data_dict:
    # Every Cosmos DB item needs a string "id"; assign a random GUID.
    item['id'] = str(uuid.uuid4())
    print(item)
    container.create_item(body=item)
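
The id step in the loop above can also be pulled into a small helper so it can be checked without a Cosmos DB connection. This is only a sketch: `with_ids` and `sample` are hypothetical names introduced here, and the helper assumes each record is a plain dict.

```python
import uuid


def with_ids(records):
    """Return copies of the records, each with a string "id" if it lacks one."""
    prepared = []
    for record in records:
        item = dict(record)  # shallow copy so the caller's data is not mutated
        if "id" not in item:
            item["id"] = str(uuid.uuid4())  # random GUID, stored as a string
        prepared.append(item)
    return prepared


# Two records shaped like the sample data in the question.
sample = [
    {"_timestamp": 1622036400000, "name": "User Log Off", "event_count": 1},
    {"_timestamp": 1622035800000, "name": "User Log Off", "event_count": 1},
]
items = with_ids(sample)
print(all(isinstance(item["id"], str) for item in items))
```

Each element of `with_ids(data_dict)` could then be passed to container.create_item(body=item). Because the ids are random, running the script twice would store every record twice; if that matters, an id derived from the record's own fields may be a better fit.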
Gaurav Mantri