Questions tagged [azure-data-lake-gen2]

Ask questions related to Azure Data Lake Storage Gen2.

Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob storage.

669 questions
2 votes · 1 answer

Save JSON to ADLS Gen2 using the Azure Synapse Copy activity with content type application/json

I'm using a REST API source in an Azure Synapse Copy activity and trying to save the API response as JSON with content-type application/json in an Azure Data Lake Gen2 storage container. After saving the JSON documents I see the content-type set to…
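A minimal sketch of one workaround, assuming the `azure-storage-file-datalake` Python SDK: write the document yourself and set the content type explicitly through `ContentSettings`. The account URL, container, and file path below are placeholders, not values from the question.

```python
import json

JSON_CONTENT_TYPE = "application/json"


def to_json_bytes(payload: dict) -> bytes:
    """Serialize an API response body to UTF-8 JSON bytes."""
    return json.dumps(payload).encode("utf-8")


def upload_json(account_url: str, container: str, path: str, payload: dict, credential) -> None:
    """Upload a JSON document to ADLS Gen2 with an explicit content type."""
    # SDK import deferred so the helper above stays importable without the package.
    from azure.storage.filedatalake import ContentSettings, DataLakeServiceClient

    service = DataLakeServiceClient(account_url=account_url, credential=credential)
    file_client = service.get_file_system_client(container).get_file_client(path)
    file_client.upload_data(
        to_json_bytes(payload),
        overwrite=True,
        content_settings=ContentSettings(content_type=JSON_CONTENT_TYPE),
    )
```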
asked by pbj
2 votes · 1 answer

How does streaming get triggered in Databricks with the File Notification option

How does Spark readStream code get triggered in Databricks Auto Loader? I understand it is an event-driven process and a new file notification causes the file to be consumed. Should the below code be run as a job? If that's the case, how is…
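A hedged sketch of what such a job can look like (PySpark on Databricks; `cloudFiles.useNotifications` is the Auto Loader option that switches from directory listing to file-notification mode). Run as a scheduled job, `trigger(availableNow=True)` processes whatever notifications have accumulated and then exits; the source path, table, and checkpoint names are placeholders.

```python
def build_autoloader_options(schema_path: str) -> dict:
    """Auto Loader source options: file-notification mode instead of directory listing."""
    return {
        "cloudFiles.format": "json",            # format of the incoming files
        "cloudFiles.useNotifications": "true",  # subscribe to storage events instead of listing
        "cloudFiles.schemaLocation": schema_path,
    }


def start_autoloader(spark, source_path: str, target_table: str, checkpoint: str):
    """Start the stream; on Databricks, `spark` is the session the cluster provides."""
    return (
        spark.readStream.format("cloudFiles")
        .options(**build_autoloader_options(checkpoint + "/schema"))
        .load(source_path)
        .writeStream.option("checkpointLocation", checkpoint)
        .trigger(availableNow=True)  # batch-style run: process pending files, then stop
        .toTable(target_table)
    )
```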
2 votes · 1 answer

How can I stream changes from Azure Cosmos DB (MongoDB API) and save the data to Azure Data Lake

Problem and research: trying to get real-time data from Cosmos DB to Data Lake. This is what I have understood from my research: I have to create a function app to monitor the changes in Cosmos using Change Feed, then I have to bind it to event…
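For the MongoDB API specifically, one hedged sketch uses pymongo change streams (`Collection.watch()`, which the Cosmos DB Mongo API exposes with some restrictions) rather than the SQL-API change feed; writing each document out to ADLS Gen2 is left as a `sink` callback, and the connection details are assumptions.

```python
def extract_document(change: dict):
    """Pull the full document out of one change-stream event (None when absent, e.g. deletes)."""
    return change.get("fullDocument")


def stream_changes(collection, sink):
    """Tail the change stream and pass each changed document to `sink`.

    `collection` is a pymongo Collection connected to the Cosmos DB Mongo API
    endpoint; `sink` could, for example, append JSON lines to an ADLS Gen2 file.
    """
    with collection.watch(full_document="updateLookup") as stream:
        for change in stream:
            doc = extract_document(change)
            if doc is not None:
                sink(doc)
```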
2 votes · 2 answers

Unity Catalog - External location AbfsRestOperationException

I'm trying to set up a connection between Databricks and Azure Data Lake Storage Gen2 using the Unity Catalog External Locations feature. Assumptions: ADLS is behind a private endpoint; the Databricks workspace is in a private VNet; I've added Private and Public…
2 votes · 1 answer

Azure Data Lake storage account

When trying to create a new Data Lake Gen2 account, I am getting the error "There was an error trying to validate storage account name. Please try again". I tried multiple names but it didn't work.
asked by Jo5689
2 votes · 0 answers

Azure Automation runbook Python parameter and path on Data Lake

I'm working on generating JSON schema documents based on JSON documents we import from our vendors and partners. The below code works fine. But now, as I'm working on building on it to create a module where it would take input parameters for the path…
asked by paone
2 votes · 0 answers

Informatica Developer 10.5.2.1 - To read multiple part parquet files from Azure DataLake Storage Gen2

I have one folder in ADLS Gen2 which has more than one part parquet file. I need to read all these parquet files in one shot with Informatica Developer, and I need to write all of them into another folder in ADLS Gen2. Do you have any…
2 votes · 1 answer

Delta Live Table able to write to ADLS?

I have an architectural requirement to have the data stored in ADLS under a medallion model, and am trying to achieve writing to ADLS using Delta Live Tables as a precursor to creating the Delta table. I've had success using CREATE TABLE…
2 votes · 2 answers

How to mount an ADLS Gen2 account in Databricks using access keys

I want to mount ADLS Gen2 storage accounts in Azure Databricks, but I am using an Azure account where I don't have access to create a service principal. So I am trying to mount the containers using access keys, but I keep on getting…
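A minimal sketch of the account-key approach the question describes (no service principal): pass the key through `extra_configs` when mounting. `dbutils` exists only on a Databricks cluster, and the account, container, and mount-point names below are placeholders.

```python
def account_key_conf(account: str, key: str) -> dict:
    """Spark conf entry that lets the ABFSS driver authenticate with the account key."""
    return {f"fs.azure.account.key.{account}.dfs.core.windows.net": key}


def mount_container(dbutils, account: str, container: str, key: str, mount_point: str):
    """Mount an ADLS Gen2 container with an account key (Databricks only)."""
    dbutils.fs.mount(
        source=f"abfss://{container}@{account}.dfs.core.windows.net/",
        mount_point=mount_point,  # e.g. "/mnt/raw"
        extra_configs=account_key_conf(account, key),
    )
```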
2 votes · 2 answers

How to ingest data from ADLS into Azure Data Explorer by subscribing to Event Grid

I am trying to ingest data from ADLS Gen2 into Azure Data Explorer through Event Grid. I could find a few Microsoft docs explaining how to ingest blobs into ADX through Event Grid, but not ADLS. The file path to the ADLS storage account is…
2 votes · 2 answers

Performance issue in Synapse serverless SQL pool while reading CSV stored in ADLS

I have enabled the Export to Data Lake feature in D365 F&O and created an external table in a serverless SQL pool database in Synapse to read the CSV. It has been working fine for 6 months; however, now I am facing a performance issue due to the huge amount of data and…
2 votes · 3 answers

Found more columns than expected column count in Azure data factory while reading CSV stored in ADLS

I am exporting D365 F&O data to ADLS in CSV format. Now I am trying to read the CSV stored in ADLS and copy it into an Azure Synapse dedicated SQL pool table using Azure Data Factory. However, I can create the pipeline and it's working for a few tables…
2 votes · 1 answer

How can I upload a .parquet file from my local machine to Azure Storage Data Lake Gen2?

I have a set of .parquet files on my local machine that I am trying to upload to a container in Data Lake Gen2. I cannot do the following: def upload_file_to_directory(): try: file_system_client =…
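A hedged sketch with the `azure-storage-file-datalake` SDK: collect the local `.parquet` files, then stream each one into a directory client. The account URL, container, and directory names are placeholders; the SDK import is deferred into the function so the listing helper stays importable anywhere.

```python
import os


def local_parquet_files(local_dir: str) -> list:
    """List the .parquet files in a local folder, sorted for stable ordering."""
    return sorted(f for f in os.listdir(local_dir) if f.endswith(".parquet"))


def upload_parquet_files(account_url, container, target_dir, local_dir, credential):
    """Upload every local .parquet file into an ADLS Gen2 directory."""
    from azure.storage.filedatalake import DataLakeServiceClient  # deferred import

    service = DataLakeServiceClient(account_url=account_url, credential=credential)
    dir_client = service.get_file_system_client(container).get_directory_client(target_dir)
    for name in local_parquet_files(local_dir):
        with open(os.path.join(local_dir, name), "rb") as fh:
            dir_client.get_file_client(name).upload_data(fh, overwrite=True)
```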
2 votes · 1 answer

Read CSV from Azure Data Lake Storage Gen 2 to Pandas Dataframe | NO DATABRICKS

I am trying for the last 3 hours to read a CSV from Azure Data Lake Storage Gen2 (ADLS Gen2) into a pandas dataframe. This is very easy in Azure Blob Storage (ABS) but I can't figure out how to do this in ADLS Gen2. I have developed the following…
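A sketch of one way to do it without Databricks, assuming `azure-storage-file-datalake` and pandas: download the file's bytes, then hand them to `pandas.read_csv` through a `BytesIO` buffer. The account URL, container, and path values are placeholders.

```python
import io


def csv_bytes_to_frame(data: bytes):
    """Parse raw CSV bytes into a pandas DataFrame."""
    import pandas as pd
    return pd.read_csv(io.BytesIO(data))


def read_adls_csv(account_url: str, container: str, path: str, credential):
    """Download a CSV from ADLS Gen2 and return it as a DataFrame (no Databricks)."""
    from azure.storage.filedatalake import DataLakeServiceClient  # deferred import

    file_client = (
        DataLakeServiceClient(account_url=account_url, credential=credential)
        .get_file_system_client(container)
        .get_file_client(path)
    )
    return csv_bytes_to_frame(file_client.download_file().readall())
```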
asked by Mayank
2 votes · 0 answers

Long Azure Data Factory mapping data flow "file system init duration"

I have an ADF mapping data flow that uses an ADLS Gen2 source with a large number of small, say 10 kB, files. 98% of the flow's time is spent in "file system init duration" in this source. I can't seem to find any documentation on what may affect…