
I hope somebody can help me solve this issue.

I have several storage accounts with some data in table storage. I want to loop over the tables and copy their contents into another storage account.

This is what I have so far:

source_table_service_client = TableServiceClient(endpoint="https://<storagename>.table.core.windows.net/", credential=credential)
destination_table_service_client = TableServiceClient(endpoint="https://<storagename>.table.core.windows.net/", credential=credential)
query_size = 1000
def queryAndSaveAllDataBySize(source_table_name, target_table_name, resp_data:ListGenerator,source_table:TableService, destination_table:TableService, query_size:int):
    for item in resp_data:
        tb_name = source_table_name
        del item.etag
        del item.Timestamp
        print("INSERT data:" + str(item) + "into TABLE:" + tb_name)
        destination_table.insert_or_replace_entity(target_table_name, item)
    if resp_data.next_marker:
        data = table_out.query_entities(table_name=source_table_name,num_results=query_size,marker=resp_data.next_marker)
        queryAndSaveAllDataBySize(source_table_name, target_table_name, data, source_table, destination_table)


source_table_service_client_list = source_table_service_client.list_tables()
for tb in source_table_service_client_list:
    table = tb.name
    print(table)
    destination_table_service_client.create_table_if_not_exists(table_name=table)
    # first query
    data = source_table_service_client.query_tables(query_filter=str(tb.name))
    print(data)
    queryAndSaveAllDataBySize(tb.name, table, data, source_table_service_client, destination_table_service_client, query_size)

I am able to retrieve the tables and the objects, as you can see:

table2
<iterator object azure.core.paging.ItemPaged at 0x10491a040>

After this it hangs for a moment and then returns the following error:

Traceback (most recent call last):
  File "/Users/users/Documents/GitHub/my-project/venv/lib/python3.8/site-packages/azure/data/tables/_models.py", line 312, in _get_next_cb
    return self._command(
  File "/Users/users/Documents/GitHub/my-project/venv/lib/python3.8/site-packages/azure/data/tables/_generated/operations/_table_operations.py", line 120, in query
    raise HttpResponseError(response=response)
azure.core.exceptions.HttpResponseError: Operation returned an invalid status 'Internal Server Error'
Content: {"odata.error":{"code":"InternalError","message":{"lang":"en-US","value":"Server encountered an internal error. Please try again after some time.}}}

So basically, what I am trying to do is copy all the content of a source table into a destination table (creating the destination table if it does not exist).

I also want to avoid using any connection string or key. I have a managed identity that I can use for this purpose, but in the code above I am authenticating with the Azure CLI at runtime.

To give the bigger picture: previously I was using the following library:

    table_service_out = TableService(account_name=client_keyvault.get_secret("secretKey").value, account_key=client_keyvault.get_secret("key-value").value)
    table_service_in = TableService(account_name=client_keyvault.get_secret("secret name").value, account_key=client_keyvault.get_secret("secretKey-in").value)


    query_size = 1000

    # save data to storage2 and check whether any data is left in the current table; if so, recurse
    def queryAndSaveAllDataBySize(source_table_name, target_table_name,resp_data:ListGenerator ,table_out:TableService,table_in:TableService,query_size:int):
        for item in resp_data:
            tb_name = source_table_name + today
            del item.etag
            del item.Timestamp
            print("INSERT data:" + str(item) + "into TABLE:"+ tb_name)
            table_in.insert_or_replace_entity(target_table_name,item)
        if resp_data.next_marker:
            data = table_out.query_entities(table_name=source_table_name,num_results=query_size,marker=resp_data.next_marker)
            queryAndSaveAllDataBySize(source_table_name, target_table_name, data,table_out,table_in,query_size)


    tbs_out = table_service_out.list_tables()
    print(tbs_out)

    for tb in tbs_out:
        table = tb.name + today
        #create table with same name in storage2
        table_service_in.create_table(table_name=table, fail_on_exist=False)

        #first query
        data = table_service_out.query_entities(tb.name,num_results=query_size)
        queryAndSaveAllDataBySize(tb.name, table,data,table_service_out,table_service_in,query_size)

This was working just fine, but TableService can only authenticate with an account name and account key, so if I ever rotate my keys, I need to update all those secrets.

Nayden Van
  • Shouldn't you be using table client and query_entities function here: `data = source_table_service_client.query_tables(query_filter=str(tb.name))`? – Gaurav Mantri Oct 06 '22 at 18:00
  • @GauravMantri thank you for your reply. I tried that but I got this error `AttributeError: 'TableServiceClient' object has no attribute 'query_entities'` this is the reason why I changed it to query tables – Nayden Van Oct 06 '22 at 18:04
  • Sorry I was not clear. You will need to use TableClient and query_entities function. So your code would be something like `data = source_table_service_client.get_table_client(table_name=table).query_entities()` and that should give you the entities in the table. – Gaurav Mantri Oct 06 '22 at 18:20
  • Yep, that did the trick, but I am getting a permission error, as I feared: `ErrorCode:AuthorizationPermissionMismatch`. Is it possible to make this work with a managed identity? – Nayden Van Oct 06 '22 at 18:41
  • @GauravMantri Sorry, I realised the above question is not related to the OP. But I am facing another problem: `AttributeError: 'ItemPaged' object has no attribute 'next_marker'` – Nayden Van Oct 06 '22 at 21:54
  • `ErrorCode:AuthorizationPermissionMismatch is it possible to make this work with a managed identity?` - Not locally. Managed identity can be used only when your code is running in Azure as you assign Managed identity to an Azure resource. – Gaurav Mantri Oct 07 '22 at 02:03
  • @GauravMantri thank you for your time, mate. I really cannot get past the second error with `ItemPaged`. Any idea what's wrong? :( – Nayden Van Oct 07 '22 at 06:05
  • Can you try iterating by page? Something like `for page in resp_data.by_page():` and the next line would be `for item in page`. I think you should find `next_marker` in `page`. – Gaurav Mantri Oct 07 '22 at 06:21
  • @GauravMantri Hi, I tried your advice, but `resp_data` has only `next_marker` and `items`, no `by_page`. This is driving me crazy – Nayden Van Oct 07 '22 at 08:12
  • @GauravMantri solved the problem by rolling back to TableService and using the secrets from the key vault. I have another issue but I will open a new question for it. Thank you so much for your time and help – Nayden Van Oct 07 '22 at 14:18
  • **@Nayden Van / @Gaurav Mantri**, glad that you have solved your issue. Can you please post your answer so that it will be helpful for other community members? – RajkumarPalnati Oct 12 '22 at 08:26
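
For readers who want to stay on the newer `azure-data-tables` client rather than roll back, the approach suggested in the comments above could be sketched roughly as follows. This is a minimal sketch, not a tested migration: the helper names `copy_entities`, `copy_all_tables` and `main` are made up for illustration, the endpoints are placeholders, and it assumes the identity in use has been granted a data-plane role on both accounts (the `AuthorizationPermissionMismatch` error usually points at a missing role such as Storage Table Data Contributor). The key difference from `TableService` is that `list_entities()` returns an `ItemPaged` iterator that requests further pages on its own, so there is no `next_marker` to follow and no recursion.

```python
from typing import Any, Dict


def copy_entities(source_table: Any, destination_table: Any, mode: Any) -> int:
    """Copy every entity from one table client to another; return the count."""
    copied = 0
    # Iterating the result of list_entities() walks all pages transparently,
    # so no continuation marker has to be handled here.
    for entity in source_table.list_entities():
        # With mode=UpdateMode.REPLACE this plays the role of the old
        # insert_or_replace_entity call.
        destination_table.upsert_entity(entity, mode=mode)
        copied += 1
    return copied


def copy_all_tables(source_service: Any, destination_service: Any, mode: Any) -> Dict[str, int]:
    """Mirror every table of the source account into the destination account."""
    counts: Dict[str, int] = {}
    for table in source_service.list_tables():
        # create_table_if_not_exists returns a table client for the table,
        # whether it was just created or already existed.
        destination_table = destination_service.create_table_if_not_exists(
            table_name=table.name
        )
        source_table = source_service.get_table_client(table_name=table.name)
        counts[table.name] = copy_entities(source_table, destination_table, mode)
    return counts


def main() -> None:
    # Imported here so the helpers above do not need the SDK to be importable.
    from azure.data.tables import TableServiceClient, UpdateMode
    from azure.identity import DefaultAzureCredential

    # DefaultAzureCredential tries a managed identity when running in Azure
    # and falls back to other sources such as the Azure CLI login locally.
    credential = DefaultAzureCredential()
    # Placeholder endpoints: replace with the real account names.
    source = TableServiceClient(
        endpoint="https://<source-account>.table.core.windows.net/",
        credential=credential,
    )
    destination = TableServiceClient(
        endpoint="https://<destination-account>.table.core.windows.net/",
        credential=credential,
    )
    print(copy_all_tables(source, destination, UpdateMode.REPLACE))
```

Calling `main()` would run the copy against the two placeholder accounts. As far as I know, entities returned by this client are dict-like and keep the etag and timestamp in a separate `metadata` attribute, so the `del item.etag` / `del item.Timestamp` lines from the older code should not be needed, but that is worth verifying against the installed SDK version.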

0 Answers