I have incoming streaming data that I am saving as a Delta Lake table using the code below:

    cast_dataframe.writeStream.format("delta").outputMode("append")
        .option("checkpointLocation", checkpointLocation)
        .table(databasename + "." + tablename)

Here `databasename` is the name of the database and `tablename` is the name of the table to create.

When I run `SHOW CREATE TABLE` to inspect the table, the location field points to a default location, something like abfs://storageaccount@storageaccount.dfs.core.net/default/db_name/table_name. However, I want the table to be stored at a location that I specify. How can I do that?

I have tried the approach below, but it didn't work.

    cast_dataframe.writeStream.format("delta").outputMode("append")
        .option("checkpointLocation", checkpointLocation)
        .table(databasename + "." + tablename)
        .save(my_specified_location)

Please help me with this.

Jacek Laskowski
Pankaj Mishra

1 Answer

`.table` and `.save` are both actions, so they cannot be chained in the same call. This has nothing to do with the Delta data source itself.


In the official documentation on using a Delta Lake table as a sink, you can find the following:

.start("/delta/events") // as a path

which seems to be exactly what you are asking for with "I however want to point the location to my specified location."
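Applied to the code in the question, that could look like the following minimal PySpark sketch. It assumes `cast_dataframe` is a streaming DataFrame and that the Delta Lake package is available on the cluster; the helper name `start_delta_stream` is hypothetical and not part of any Spark or Delta API.

```python
def start_delta_stream(streaming_df, target_path, checkpoint_location):
    """Start an append-mode streaming write to a Delta table at target_path.

    Hypothetical helper: streaming_df is expected to be a streaming
    pyspark.sql.DataFrame. Returns whatever .start() returns
    (a StreamingQuery when used with Spark).
    """
    return (
        streaming_df.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", checkpoint_location)
        # start(path) is the only terminal call: no .table() and no .save()
        .start(target_path)
    )
```

It could be called as `query = start_delta_stream(cast_dataframe, my_specified_location, checkpointLocation)`. Because the data is written by path, it can later be read back by path as well, for example with `spark.read.format("delta").load(my_specified_location)`.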

Jacek Laskowski

In addition to providing the path when you invoke `.start("/delta/path")`, you can also provide the location as an option, e.g. `.option("path", "/delta/path")`, and then invoke `.start()` without an argument. [docs here, scroll down to file sink options table](https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#output-sinks) – Jim Hibbard Mar 29 '23 at 21:08
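
The option-based variant from this comment might be sketched in PySpark as follows. As before, the helper name `start_delta_stream_with_path_option` is hypothetical, and the sketch assumes a streaming DataFrame with the Delta Lake package available.

```python
def start_delta_stream_with_path_option(streaming_df, target_path, checkpoint_location):
    """Same streaming write, but the target location is passed as the "path" option.

    Hypothetical helper: streaming_df is expected to be a streaming
    pyspark.sql.DataFrame.
    """
    return (
        streaming_df.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", checkpoint_location)
        # The location is supplied as an option instead of an argument...
        .option("path", target_path)
        # ...so start() is called without a path.
        .start()
    )
```

Which form to use is mostly a matter of taste; the option form can be convenient when all sink settings are collected in one place.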