
I'm trying to query my delta tables using Azure Synapse Serverless SQL Pool. I'm logged in to Azure Data Studio using the SQL admin credentials.

This is the simple query I'm trying to run against the table:

SELECT
    TOP 100 *
FROM
    OPENROWSET(
        BULK 'https://(...).dfs.core.windows.net/(...)/table/',
        FORMAT = 'DELTA'
    ) AS [result]

I get the error:

Content of directory on path 'https://.../table/_delta_log/*.*' cannot be listed.

If I query any other table, e.g. table_copy, I get no error. I can query every table I have except this one.

Following every piece of documentation and every thread I could find, I tried the following:

  • (IAM) Assigning the Storage Blob Contributor, Storage Blob Owner, Storage Queue Data Contributor and Owner roles
  • Going into the ACLs and setting Read, Write, Execute access and default permissions for the Managed Identity (Synapse Studio)
  • Propagating the ACLs to every child
  • Restoring the default permissions for the folder
  • Making a copy of the table, deleting the original, and overwriting it again (PySpark):
# Read original table
table_copy = (spark.read.format("delta")
                   .option("recursiveFileLookup", "True")
                   .load("abfss://...@....dfs.core.windows.net/.../table/"))

# Create a copy of it
(table_copy.write.format("delta")
           .mode("overwrite")
           .option("overwriteSchema", "true")
           .save("abfss://...@....dfs.core.windows.net/.../table_copy/"))

# Remove original one
dbutils.fs.rm("abfss://...@....dfs.core.windows.net/.../table/", recurse=True)

# Overwrite it
(table_copy.write.format("delta")
           .mode("overwrite")
           .option("overwriteSchema", "true")
           .save("abfss://...@....dfs.core.windows.net/.../table/"))

If I make a copy of the table to table_copy, I can read it.

Note that in the Azure Synapse UI I can query the table; outside of it I can't.
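
One way to take the server-level credential out of the equation is to query through an explicit database-scoped credential and external data source instead of the direct path; a sketch, assuming the workspace managed identity has access to the storage (all names and the storage URL below are placeholders):

```sql
-- Sketch: requires a master key in the database first, e.g.
-- CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';
CREATE DATABASE SCOPED CREDENTIAL SynapseIdentity
WITH IDENTITY = 'Managed Identity';

CREATE EXTERNAL DATA SOURCE MyDataSource
WITH (
    LOCATION = 'https://<storage_account_name>.dfs.core.windows.net/<container>',
    CREDENTIAL = SynapseIdentity
);

SELECT TOP 100 *
FROM OPENROWSET(
    BULK 'table/',
    DATA_SOURCE = 'MyDataSource',
    FORMAT = 'DELTA'
) AS [result];
```

If this works while the plain OPENROWSET query fails, the problem lies in the credential used for the direct path rather than in the table itself.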

Dinis Rodrigues

1 Answer


It seems like the permission and firewall settings are set up correctly.

One thing you can check is whether the table is in the correct format (Delta), whether it has the correct schema, and whether its _delta_log directory was created.

Try this approach:

First, since I didn't have any delta table, I created a sample dataframe df using spark.read.

Then I wrote dataframe df out in delta format to an abfss://<container_name>@<storage_account_name>... path and, at the same time, created a table named test_table using saveAsTable:

table_path = f"abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<folder>"

df.write.format("delta").mode("overwrite").option("path", table_path).saveAsTable("test_table")


You can check both test_table and the abfss storage location; I successfully got the data in delta format.


An alternative way: create a new delta table and copy the data from the old table into the new one. You can use a query like this:

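A hedged sketch of what such a copy query might look like in Spark SQL (the table names and path are placeholders, not from the original):

```sql
-- Hypothetical names: create a fresh delta table and copy the old data into it
CREATE TABLE new_table
USING DELTA
LOCATION 'abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<folder>/new_table'
AS SELECT * FROM old_table;
```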

B. B. Naga Sai Vamsi
    Thank you for the answer. My delta table is properly set up. I also tried rewriting the table. But I still can't access and query it. I'm in contact with Azure technical support for a few days now, trying to solve the issue. Will give an update once it's solved – Dinis Rodrigues Jan 26 '23 at 13:20
  • 1
    Closed the ticket with Azure support today. The problem was that I had overlapping server credentials. One was global, so I could access any table, but I had another, with outdated credentials, that was specific to this table. I reset the credentials and everything works as expected. – Dinis Rodrigues Feb 15 '23 at 17:20
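
For anyone hitting the same issue: in a serverless SQL pool, server-scoped credentials can be inspected and a stale path-specific one dropped roughly like this (the credential name below is a placeholder; serverless pools typically name server-scoped credentials after the storage URL they apply to):

```sql
-- Run in the master database: list existing server-scoped credentials
SELECT name, credential_identity FROM sys.credentials;

-- Drop the outdated credential scoped to the problematic path,
-- so the global credential applies again
DROP CREDENTIAL [https://<storage_account_name>.dfs.core.windows.net/<container>/table];
```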