Questions tagged [databricks-autoloader]
69 questions
0
votes
1 answer
Databricks Auto Loader - why is new data not written to the table when the original CSV file is deleted and a new CSV file is uploaded?
I have a question about Auto Loader's writeStream.
My use case is as follows:
A few days ago I uploaded 2 CSV files to the Databricks file system, then read them and wrote them to a table with Auto Loader.
Today, I found that the files uploaded days ago contain wrong data…

peace
0
votes
1 answer
Can Databricks Auto loader infer partitions?
By default, when you're using a Hive-style partition directory structure, the Auto Loader option cloudFiles.partitionColumns adds these columns automatically to your schema (using schema inference).
This is the code:
checkpoint_path =…

alxsbn
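To illustrate what a Hive-style partition directory structure means in the question above, here is a minimal sketch in plain Python that pulls `key=value` directory segments out of a file path. The helper name `parse_hive_partitions` and the sample path are hypothetical, and this is not how Auto Loader itself is implemented; it only shows the naming convention that partition inference relies on.

```python
from typing import Dict


def parse_hive_partitions(path: str) -> Dict[str, str]:
    """Extract key=value directory segments from a Hive-style path (illustrative helper)."""
    partitions: Dict[str, str] = {}
    # The last segment is the file name, so only directory segments are inspected.
    for segment in path.strip("/").split("/")[:-1]:
        key, sep, value = segment.partition("=")
        if sep and key:
            partitions[key] = value
    return partitions


# Hypothetical path, used only for illustration.
example = parse_hive_partitions("/data/events/year=2023/month=05/part-0000.csv")
```

Values come back as strings here; any type conversion is left out of this sketch.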
0
votes
0 answers
Databricks Autoloader - dealing with combined files
I'm working with some files that have some complexities:
multiple tab files concatenated into 1
csv files with some metadata prior to the csv data
csv files with an extra row after the header that should be ignored
csv files with log information…

stuartp
0
votes
1 answer
Databricks autoloader writing data with invalid characters in column name
When trying to use Databricks' Auto Loader to write data, the nested columns contain invalid characters:
Found invalid character(s) among " ,;{}()\n\t=" in the column names of your schema.
How to deal with this issue?
Note again that it is the…

Preben Brudvik Olsen
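For the invalid-character error quoted in the question above, one common workaround is to rename columns before writing. The following is a minimal sketch in plain Python, assuming a hypothetical helper `sanitize_column_name` and made-up column names; it only covers flat names, and nested fields would presumably need the same renaming applied recursively to the schema.

```python
import re

# Characters listed in the quoted error message: space , ; { } ( ) newline tab =
INVALID_CHARS = re.compile(r"[ ,;{}()\n\t=]")


def sanitize_column_name(name: str, replacement: str = "_") -> str:
    """Replace each invalid character with `replacement` (illustrative helper)."""
    return INVALID_CHARS.sub(replacement, name)


# Hypothetical column names, used only for illustration.
cleaned = [sanitize_column_name(c) for c in ["order id", "price(usd)", "a=b"]]
```

Note that two different source names can collapse to the same cleaned name, so a real pipeline would want to check for collisions.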
0
votes
1 answer
Trigger workflow job with Databricks Autoloader
I have a requirement to monitor an S3 bucket for incoming files (zip). As soon as a file is placed in the S3 bucket, the pipeline should start processing it. Currently I have a Workflow Job with multiple tasks that perform the processing. In Job…

Saravanan Ponnaiah
0
votes
0 answers
Creating a Spark DataFrame within foreach() while using Auto Loader with the BinaryFile option in Databricks
I am using Auto Loader with the BinaryFile option to decode .proto based files in Databricks. I am able to decode the proto file and write it in CSV format using foreach() and the pandas library, but I am having trouble writing it in Delta format. End of the…

pavan
0
votes
1 answer
Not able to access certain JSON properties in Autoloader
I have a JSON file that is loaded by two different Autoloaders.
One uses schema evolution and, besides replacing spaces in the JSON property names, writes the JSON directly to a Delta table, and I can see that all the values are there properly.
In the…

Chris de Groot
0
votes
1 answer
Read data from mount in Databricks (using Autoloader)
I am using Azure Blob Storage to store data and feeding this data to Auto Loader through a mount. I was looking for a way to allow Auto Loader to load a new file from any mount. Let's say I have these folders in my mount:
mnt/
├─ blob_container_1
├─…

Mansimar anand