Questions tagged [azure-data-lake-gen2]

Ask questions related to Azure Data Lake Storage Gen2.

Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob storage.

669 questions
3
votes
1 answer

Read schema information from a parquet format file stored in azure data lake gen2

I have a parquet format table stored in Azure Data Lake Gen2 which is directly connected to an external table in Azure Synapse. I am trying to formulate logic in SQL which will read the schema of that parquet file table and…
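For questions like this one, the schema can also be inspected directly from the Parquet footer. A minimal Python sketch, assuming adlfs and pyarrow are installed; the account, key, container and file path are placeholders:

```python
# A minimal sketch: inspect the schema of a Parquet file sitting in ADLS Gen2
# with pyarrow + adlfs. Account name, key, container and path are placeholders.
import pyarrow.parquet as pq
from adlfs import AzureBlobFileSystem

fs = AzureBlobFileSystem(account_name="<storage-account>", account_key="<account-key>")

# Open one part file of the external table and read only its footer metadata.
with fs.open("<container>/path/to/table/part-00000.parquet", "rb") as f:
    schema = pq.ParquetFile(f).schema_arrow

for field in schema:
    print(field.name, field.type)
```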
3
votes
2 answers

Accessing Azure ADLS gen2 with Pyspark on Databricks

I'm trying to learn Spark, Databricks & Azure. I'm trying to access Gen2 from Databricks using Pyspark. I can't find a proper way; I believe it's super simple, but I have failed. Currently, each time I receive the following: Unable to access container…
QbS
  • 425
  • 1
  • 4
  • 17
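A minimal sketch of the usual direct-access pattern from a Databricks notebook (where spark and dbutils already exist), authenticating with the storage account key; the account, container, secret scope and path are placeholders:

```python
# A minimal sketch of direct access to ADLS Gen2 from a Databricks notebook,
# assuming account-key authentication; all names below are placeholders.
spark.conf.set(
    "fs.azure.account.key.<storage-account>.dfs.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="storage-key"),  # hypothetical secret scope
)

df = (
    spark.read
    .option("header", "true")
    .csv("abfss://<container>@<storage-account>.dfs.core.windows.net/path/to/data")
)
df.show(5)
```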
3
votes
1 answer

Read and write file from Azure Data Lake Storage Gen2 in python

As per Microsoft documentation: Connect to Azure Data Lake Storage Gen2 by using an account key: def initialize_storage_account(storage_account_name, storage_account_key): try: global service_client service_client =…
Sohel Reza
  • 281
  • 1
  • 6
  • 23
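A minimal sketch of the account-key pattern from the Microsoft docs with the azure-storage-file-datalake SDK, plus a small write/read round trip; all names are placeholders:

```python
# A minimal sketch of the documented account-key pattern with the
# azure-storage-file-datalake SDK; names are placeholders.
from azure.storage.filedatalake import DataLakeServiceClient

def initialize_storage_account(storage_account_name, storage_account_key):
    return DataLakeServiceClient(
        account_url=f"https://{storage_account_name}.dfs.core.windows.net",
        credential=storage_account_key,
    )

service_client = initialize_storage_account("<storage-account>", "<account-key>")

# Example round trip: upload a small text file, then read it back.
file_client = (
    service_client.get_file_system_client("<container>")
    .get_file_client("folder/hello.txt")
)
file_client.upload_data(b"hello gen2", overwrite=True)
print(file_client.download_file().readall())
```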
3
votes
2 answers

Why does Pyspark throw "AnalysisException: `/path/to/adls/mounted/interim_data.delta` is not a Delta table" even though the file exists?

I am using Databricks on Azure; Pyspark reads data that's dumped in Azure Data Lake Storage [ADLS]. Every now and then, when I try to read the data from ADLS like so: spark.read.format('delta').load(`/path/to/adls/mounted/interim_data.delta`) it…
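A minimal sketch of a defensive read for this situation, assuming a Databricks/Delta runtime; the path is the one quoted in the question:

```python
# A minimal sketch: check whether the path is actually a Delta table
# (i.e. has a _delta_log) before loading it; assumes a Databricks/Delta runtime.
from delta.tables import DeltaTable

path = "/path/to/adls/mounted/interim_data.delta"

if DeltaTable.isDeltaTable(spark, path):
    df = spark.read.format("delta").load(path)  # plain string, no backticks
    df.show(5)
else:
    print(f"{path} has no _delta_log yet - the writer may still be running")
```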
3
votes
1 answer

Delta lake and ADLS Gen2 transactions

We are running a Delta lake on ADLS Gen2 with plenty of tables and Spark jobs. The Spark jobs are running in Databricks and we mounted the ADLS containers into DBFS (abfss://delta@.dfs.core.windows.net/silver). There's one…
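A minimal sketch of mounting a Gen2 container into DBFS with an OAuth service principal, which is the setup this question describes; every id, secret and name below is a placeholder:

```python
# A minimal sketch of mounting an ADLS Gen2 container into DBFS with an OAuth
# service principal; all ids, secrets and names are placeholders.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret":
        dbutils.secrets.get(scope="my-scope", key="sp-secret"),  # hypothetical scope
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

dbutils.fs.mount(
    source="abfss://delta@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/delta",
    extra_configs=configs,
)
```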
3
votes
0 answers

Copy file from one container to another container in Azure Data Lake Gen2 programmatically

I have a storage account of type ADLS Gen2, with 2 containers. I would like to copy a blob file from container A to container B with the Java SDK. I'm using DataLakeFileSystemClient and I'm looking for something like the rename method of…
Yaniv Irony
  • 149
  • 4
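The question asks about the Java SDK; purely for comparison, a minimal Python sketch that does a server-side copy between two containers of the same account through the Blob API (start_copy_from_url), with placeholder names:

```python
# A minimal sketch: server-side copy of a blob between two containers of the
# same Gen2 account via the Blob API. Account, key, containers and blob paths
# are placeholders.
from azure.storage.blob import BlobServiceClient

svc = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    credential="<account-key>",
)

source = svc.get_blob_client("container-a", "path/data.csv")
target = svc.get_blob_client("container-b", "path/data.csv")

# For a source in a different storage account, the copy URL usually needs a SAS appended.
target.start_copy_from_url(source.url)
```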
3
votes
1 answer

Is Azure BlockBlobStorage or General-Purpose v2 better for premium, low-latency, lots of small json files with search?

This link suggests that BlockBlobStorage is ideal for my scenario, where I have lots of small JSON files, want low latency, expect a lot of upsert activity, and plan to use Azure Cognitive Search. I will be arranging my files in folders (one entity…
3
votes
1 answer

Azure Data Factory - extracting information from Data Lake Gen 2 JSON files

I have an ADF pipeline loading raw log data as JSON files into a Data Lake Gen 2 container. We now want to extract information from those JSON files and I am trying to find the best way to get information from said files. I found that Azure Data…
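If the extraction can run outside ADF, for example in a Spark notebook, a minimal PySpark sketch for pulling fields out of the raw JSON files; the account, container, paths and field names are placeholders:

```python
# A minimal PySpark sketch (outside ADF) that flattens fields out of raw JSON
# log files in the Gen2 container; paths and column names are placeholders.
from pyspark.sql import functions as F

raw = spark.read.option("multiLine", "true").json(
    "abfss://<container>@<storage-account>.dfs.core.windows.net/raw-logs/*.json"
)

extracted = raw.select(
    F.col("timestamp"),
    F.col("level"),
    F.col("properties.requestId").alias("request_id"),  # example of a nested field
)
extracted.show(5)
```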
3
votes
0 answers

Read Parquet file from Azure DataLake Gen2 to DataTable / SQLView to query for C#.Net Core Automation Testing

https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-dotnet#list-directory-contents I am trying to read a Parquet file located in an Azure Data Lake Gen2 container for an automation framework. Able to connect…
shashi k
  • 31
  • 1
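The question targets C#/.NET Core; as a sketch of the same flow in Python, downloading the Parquet file with the Data Lake SDK and loading it into a queryable in-memory table, with placeholder names:

```python
# A minimal sketch (Python, for comparison with the C#/.NET question):
# download the Parquet file with the Data Lake SDK and load it into an
# in-memory table. All names are placeholders.
import io
import pyarrow.parquet as pq
from azure.storage.filedatalake import DataLakeServiceClient

svc = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential="<account-key>",
)
file_client = svc.get_file_system_client("<container>").get_file_client(
    "path/to/data.parquet"
)

payload = file_client.download_file().readall()
table = pq.read_table(io.BytesIO(payload))
print(table.schema)
print(table.num_rows)
```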
3
votes
1 answer

How do I build a Docker image representing Azure's Data Lake (gen 2)?

I'm using the following Docker image for an MS SQL Server ... version: "3.2" services: sql-server-db: image: mcr.microsoft.com/mssql/server:latest ports: - 1433:1433 env_file: ./tests/.my_test_env How do I construct a Docker…
Dave
  • 15,639
  • 133
  • 442
  • 830
3
votes
1 answer

Azure databricks dataframe write gives job abort error

I am trying to write data to a CSV file and store the file on Azure Data Lake Gen2, and I run into a "job aborted" error message. This same code used to work fine previously. Error Message: org.apache.spark.SparkException: Job aborted. Code: import…
paone
  • 828
  • 8
  • 18
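A minimal sketch of the operation that is failing here, assuming df is an existing DataFrame in a Databricks notebook; the output path is a placeholder:

```python
# A minimal sketch: write a DataFrame as CSV to a Gen2 path. Account,
# container and path are placeholders; df is assumed to already exist.
out = "abfss://<container>@<storage-account>.dfs.core.windows.net/exports/report"

(
    df.coalesce(1)            # optional, only to get a single output file
    .write.mode("overwrite")
    .option("header", "true")
    .csv(out)
)
```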
3
votes
1 answer

Save REST API GET method response as a JSON document

I am using the code below to read from a REST API and write the response to a JSON document in pyspark, then save the file to Azure Data Lake Gen2. The code works fine when the response has no blank data, but when I try to get all the data back then…
paone
  • 828
  • 8
  • 18
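A minimal sketch of the overall flow, assuming a Databricks notebook (dbutils available); the endpoint URL and mount point are placeholders:

```python
# A minimal sketch: call the REST endpoint, keep null/blank fields, and write
# the raw response to a JSON document on a mounted Gen2 path. The URL and
# mount point are placeholders; assumes a Databricks notebook (dbutils).
import json
import requests

response = requests.get("https://api.example.com/v1/items", timeout=30)
response.raise_for_status()

payload = response.json()  # may contain nulls / empty strings

dbutils.fs.put(
    "/mnt/datalake/raw/items.json",
    json.dumps(payload),   # json.dumps keeps null values as-is
    overwrite=True,
)
```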
3
votes
2 answers

Connect to Azure Data Lake Storage Gen 2 with a SAS token in Power BI

I'm trying to connect to an ADLS Gen 2 container with Power BI, but I've only found the option to connect with key1/key2 from the container (Active Directory is not an option in this case). However, I don't want to use those keys since they are…
Rodrigo A
  • 657
  • 7
  • 23
3
votes
1 answer

Azure ADLSGEN2 - API Error 400 - DatalakeStorageException The request URI is invalid

I'm using the Azure SDK (Java) to create directories, upload files, and move files in ADLSGEN2. My input is very simple; it looks like: path : /path/to/fileOrFolder But I get the following error: com.azure.storage.file.datalake.models.DatalakeStorageException:…
tdebroc
  • 1,436
  • 13
  • 28
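The question uses the Java SDK; as a reference point, a minimal Python sketch of directory creation that targets the dfs endpoint and passes a container-relative path without a leading slash, with placeholder names:

```python
# A minimal sketch of directory creation against a Gen2 account, pointing at
# the dfs endpoint and passing a container-relative path (no leading slash).
# All names are placeholders.
from azure.storage.filedatalake import DataLakeServiceClient

svc = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",
    credential="<account-key>",
)
file_system = svc.get_file_system_client("<container>")

directory_client = file_system.create_directory("path/to/folder")
print(directory_client.url)
```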
3
votes
3 answers

How to browse Azure Data Lake Gen 2 using a GUI tool

First, some background: I want to facilitate access for the different groups of data scientists to Azure Data Lake Gen 2. However, we don't want to give them access to the entire data lake because they are not supposed to see all the data for…