Questions tagged [azure-data-lake-gen2]

Ask questions related to Azure Data Lake Storage Gen2.

Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob storage. Data…

669 questions
2
votes
1 answer

Azure Data Lake Gen 2 default access control list not being applied to new files

Azure Data Lake Gen 2 has two levels of access control: role-based access control (RBAC) and access control lists (ACLs). RBAC functions at the container level and ACLs can function at the directory and file level. For child objects of a directory to…
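For the default-ACL question above, a minimal sketch using the azure-storage-file-datalake Python SDK; the account URL, container, directory name and <object-id> are placeholders, and the point is that only entries carrying the default: scope are inherited by new children:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",  # placeholder account
    credential=DefaultAzureCredential(),
)
directory = service.get_file_system_client("raw").get_directory_client("landing")

# Entries prefixed with "default:" are what new children inherit; entries without
# the prefix only affect the directory itself and are never copied to new files.
directory.set_access_control(
    acl="user::rwx,group::r-x,other::---,"
        "user:<object-id>:rwx,default:user:<object-id>:rwx"
)

# Default ACLs are not applied retroactively, so push them onto existing children too.
directory.update_access_control_recursive(
    acl="user:<object-id>:rwx,default:user:<object-id>:rwx"
)
```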
2
votes
0 answers

Uploading CSV file to Azure Data Lake Store (ADLS) Gen 2 using Python SDK

[UPDATE - 5/15/2020 - I got this code and the entire flow working with the parquet file format. However, I would still be interested in the approach using CSV] I am trying to upload a CSV file from a local machine to ADLS Gen 2 storage using the below…
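A minimal upload sketch with the azure-storage-file-datalake package (pip install azure-storage-file-datalake azure-identity); the account URL, container and paths are placeholders, and the same call works for CSV and parquet alike since the service is format-agnostic:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",  # placeholder account
    credential=DefaultAzureCredential(),
)
file_client = service.get_file_system_client("raw").get_file_client("landing/data.csv")

# upload_data creates (or overwrites) the file and flushes it in one call; the
# older create_file / append_data / flush_data sequence achieves the same thing.
with open("data.csv", "rb") as local_file:
    file_client.upload_data(local_file, overwrite=True)
```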
2
votes
2 answers

How can I read a file from Azure Data Lake Gen 2 using Python

I have a file lying in an Azure Data Lake Gen 2 filesystem. I want to read the contents of the file and make some low-level changes, i.e. remove a few characters from a few fields in the records. To be more explicit - there are some fields that also have…
Kamal Nandan
  • 233
  • 1
  • 5
  • 11
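A minimal read-modify-write sketch for the question above, again assuming the azure-storage-file-datalake package; the account, container, paths and the characters being stripped are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",  # placeholder account
    credential=DefaultAzureCredential(),
)
fs = service.get_file_system_client("raw")                         # placeholder container

# Download the whole file, edit it in memory, then write the cleaned copy back.
source = fs.get_file_client("input/records.csv")
text = source.download_file().readall().decode("utf-8")
cleaned = text.replace('"', "")        # e.g. strip the unwanted characters from the fields

fs.get_file_client("output/records.csv").upload_data(cleaned.encode("utf-8"), overwrite=True)
```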
2
votes
2 answers

Workaround for soft delete not available in ADLS Gen2

As of now the blob feature 'soft delete' is not yet supported for ADLS Gen2 (hierarchical namespace turned on). Soft delete is really useful for accidental deletes, whether from human error or programmatic deletion. Considering soft delete is not yet…
Dhiraj
  • 3,396
  • 4
  • 41
  • 80
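One common workaround is to make deletes logical: rename (move) the file into a quarantine folder and purge that folder after a retention window. A sketch of that idea with the azure-storage-file-datalake SDK; the account, container and path names are placeholders:

```python
from datetime import datetime
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",  # placeholder account
    credential=DefaultAzureCredential(),
)
fs = service.get_file_system_client("data")                        # placeholder container

def soft_delete(path: str) -> None:
    """Move the file into a timestamped .trash folder instead of deleting it;
    a scheduled cleanup job can purge .trash once the retention window passes."""
    trash_dir = f".trash/{datetime.utcnow():%Y%m%dT%H%M%S}"
    fs.get_directory_client(trash_dir).create_directory()
    # rename_file expects "<filesystem>/<new path>"; flatten the path so only
    # the trash folder itself has to exist.
    fs.get_file_client(path).rename_file(
        f"{fs.file_system_name}/{trash_dir}/{path.replace('/', '__')}"
    )

soft_delete("raw/accounts/account.parquet")
```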
2
votes
1 answer

Referencing ADL storage gen2 files from U-SQL

I have an ADL account set up with two storages: the regular ADLS Gen1 storage set up as the default, and a blob storage with "Hierarchical namespace" enabled which is connected to ADLS using a storage key, if that matters (no managed identities at this…
n0rd
  • 11,850
  • 5
  • 35
  • 56
2
votes
1 answer

Azure SQL Data Warehouse PolyBase query to Azure Data Lake Gen 2 returns zero rows

Why does an Azure SQL Data Warehouse PolyBase query against Azure Data Lake Gen 2 return many rows for a single-file source, but zero rows for the parent folder source? I created: Master Key (CREATE MASTER KEY;) Credential (CREATE DATABASE…
Andy Jones
  • 1,472
  • 9
  • 15
2
votes
1 answer

Azure Data Factory - Incremental Load to Azure Data Lake

I want to implement an incremental load pattern for a source system where there are no audit fields stating when a record was last modified, e.g. Last Modified On (datetime). But these tables are defined with primary keys and unique keys which…
Sreedhar
  • 29,307
  • 34
  • 118
  • 188
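When the source has no last-modified column, the usual workaround is to derive one: compute a checksum per row, keyed by the primary key, and load only rows whose checksum changed. A pandas sketch of that comparison logic (in Data Factory itself this would typically sit in a data flow or a staging-table compare); column and key names are placeholders:

```python
import hashlib

import pandas as pd

def row_hashes(df: pd.DataFrame) -> pd.Series:
    """A stable per-row checksum, used as a stand-in for a last-modified column."""
    return df.astype(str).apply(lambda r: hashlib.md5("|".join(r).encode()).hexdigest(), axis=1)

def incremental_delta(source: pd.DataFrame, loaded: pd.DataFrame, key: str) -> pd.DataFrame:
    """Rows from source that are new, or whose checksum differs from the loaded copy."""
    src = source.assign(row_hash=row_hashes(source))
    tgt = loaded.assign(row_hash=row_hashes(loaded))[[key, "row_hash"]]
    merged = src.merge(tgt, on=key, how="left", suffixes=("", "_loaded"))
    changed = merged["row_hash"] != merged["row_hash_loaded"]   # NaN on the right = new row
    return merged.loc[changed, source.columns]
```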
2
votes
2 answers

Grant access to Azure Data Lake Gen2 via ACLs only (no RBAC)

My goal is to restrict access to an Azure Data Lake Gen 2 storage account at the directory level (which should be possible according to Microsoft's promises). I have two directories, data and sensitive, in a Data Lake Gen 2 container. For a specific user, I…
SherwoodCH
  • 23
  • 1
  • 4
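A sketch of granting one user access to /data but not /sensitive purely through ACLs, using the azure-storage-file-datalake SDK; the account, container and object ID are placeholders. The key points are that the user must hold no broad RBAC data role (RBAC is evaluated first and would bypass the ACLs) and needs execute (--x) on the container root and every parent directory in order to traverse to /data:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

OBJECT_ID = "<aad-object-id>"   # placeholder: the user's Azure AD object ID

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",  # placeholder account
    credential=DefaultAzureCredential(),
)
fs = service.get_file_system_client("lake")                        # placeholder container

# Grant rwx on /data plus a default entry so new children inherit it;
# nothing is granted on /sensitive.
data_dir = fs.get_directory_client("data")
existing = data_dir.get_access_control()["acl"]                    # keep the current entries
data_dir.set_access_control(
    acl=f"{existing},user:{OBJECT_ID}:rwx,default:user:{OBJECT_ID}:rwx"
)
```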
2
votes
1 answer

Read Azure Data Lake Gen2 images from Azure Databricks

I am working on .tif files stored in Azure Data Lake Gen2 and want to open these files using rasterio from Azure Databricks. For example, reading the image file from the Data Lake with spark.read.format("image").load(filepath) works fine. But trying to open…
Sreedhar
  • 29,307
  • 34
  • 118
  • 188
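rasterio cannot read an abfss:// URL directly, so one approach is to pull the raw bytes through Spark's binaryFile reader and open them in memory. A Databricks notebook sketch; spark comes from the runtime and the path, container and account names are placeholders:

```python
from rasterio.io import MemoryFile

path = "abfss://lake@<storage-account>.dfs.core.windows.net/images/scene.tif"  # placeholder

# binaryFile returns one row per file with the raw bytes in the `content` column.
row = spark.read.format("binaryFile").load(path).select("content").first()

with MemoryFile(bytes(row.content)) as memfile, memfile.open() as src:
    band = src.read(1)
    print(src.count, src.width, src.height)

# Alternatively, if the lake is mounted with dbutils.fs.mount, rasterio can open the
# FUSE path directly, e.g. rasterio.open("/dbfs/mnt/lake/images/scene.tif").
```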
2
votes
2 answers

How to rename a Data Lake Gen2 folder using the Azure CLI?

I'm using Azure Data Lake Gen2 and I have a folder named myfolder with thousands of files. Is there a command in the Azure Storage CLI for renaming the folder and/or moving the entire folder to another location in ADLS Gen2? Inside Azure Databricks I…
Renan Vilas Novas
  • 1,210
  • 1
  • 10
  • 22
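Besides the CLI, the azure-storage-file-datalake SDK exposes the same rename operation; a minimal sketch with placeholder account and container names. With hierarchical namespace enabled this is a single server-side rename of the folder, not a copy of each of the thousands of files inside it:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://<storage-account>.dfs.core.windows.net",  # placeholder account
    credential=DefaultAzureCredential(),
)
fs = service.get_file_system_client("my-container")                # placeholder container

# rename_directory expects "<filesystem>/<new path>"; pointing it at a different
# existing parent path moves the folder rather than just renaming it.
fs.get_directory_client("myfolder").rename_directory(f"{fs.file_system_name}/myfolder-renamed")
```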
2
votes
1 answer

Unable to access FileSystem of Azure Data Lake Gen2 with Angular using azure-sdk-for-js

I am developing an application with Angular 8 and trying to connect to Azure Data Lake Gen 2's FileSystem through its REST API in order to be able to retrieve the list of folders as well as do a file import. For authentication I use the library…
2
votes
1 answer

Azure Databricks - Write Parquet file to Curated Zone

Writing a parquet file back to Data Lake Gen2 creates additional files. Example: %python rawfile = "wasbs://xxxx@dxxxx.blob.core.windows.net/xxxx/2019-09-30/account.parquet" curatedfile =…
Sreedhar
  • 29,307
  • 34
  • 118
  • 188
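Spark always writes a directory of part files plus _SUCCESS/_committed/_started markers, which is where the "additional files" come from. If a single named parquet file is required, one common workaround in a Databricks notebook is to coalesce to one partition and then promote the part file; spark and dbutils come from the runtime and the curated path is a placeholder:

```python
rawfile = "wasbs://xxxx@dxxxx.blob.core.windows.net/xxxx/2019-09-30/account.parquet"
target_dir = "abfss://curated@<storage-account>.dfs.core.windows.net/2019-09-30/account"  # placeholder

df = spark.read.parquet(rawfile)
df.coalesce(1).write.mode("overwrite").parquet(target_dir)

# Keep only the single part file under the desired name and drop the marker files.
part = [f.path for f in dbutils.fs.ls(target_dir) if f.name.startswith("part-")][0]
dbutils.fs.mv(part, target_dir + ".parquet")
dbutils.fs.rm(target_dir, True)
```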
1
vote
2 answers

Azure Storage Account stuck on 0% for Data Lake Gen2 validation

I am trying to upgrade my Azure Storage Account to the Gen2 Data Lake. I am running the three-step process in the UI, but when I get to step 2 (validation), it just sits at 0% and never progresses. I don't see any errors come up or anything, just no…
1
vote
1 answer

Load MongoDB data incrementally through Azure Data Factory

I would like to load data from MongoDB incrementally into Azure storage using Azure Data Factory. I couldn't find any relevant documentation on how to do this. I would appreciate it if there is a way to achieve this with Azure Data Factory. I have already checked the…
1
vote
3 answers

PERMISSION_DENIED: Invalid permissions on the specified KeyVault

com.databricks.common.client.DatabricksServiceHttpClientException: PERMISSION_DENIED: Invalid permissions on the specified KeyVault https://azkv*.vault.azure.net/. Wrapped Message: Status code 403, {"error": {"code":"Forbidden","message": …