Questions tagged [azure-data-lake]

Azure Data Lake is a suite of three big data services in Microsoft Azure: HDInsight, Data Lake Store, and Data Lake Analytics. These fully managed services make it easy to get started with and scale big data jobs written in Hive, Pig, Spark, Storm, and U-SQL.

  • HDInsight is a fully managed, monitored and supported Apache Hadoop service, bringing the power of Hadoop clusters to you with a few clicks.
  • Data Lake Store is a cloud-scale service designed to store all data for analytics. The Data Lake Store allows for petabyte-sized files and unlimited account sizes, surfaced through an HDFS API that enables any Hadoop component to access data. Additionally, data in Data Lake Store is protected via ACLs that can be tied to an OAuth2-based identity, including those from your on-premises Active Directory.
  • Data Lake Analytics is a distributed service built on Apache YARN that dynamically scales on demand while you only pay for the job that is running. Data Lake Analytics also includes U-SQL, a language designed for big data, keeping the familiar declarative syntax of SQL, easily extended with user code authored in C#.

To learn more, check out: https://azure.microsoft.com/en-us/solutions/data-lake/

1870 questions
0
votes
1 answer

Is HDInsight cluster setup based on ADLS persistent?

Is an HDInsight cluster set up on ADLS a persistent cluster? What is the equivalent storage to ADLS in AWS?
Deepak Janyavula
  • 348
  • 4
  • 17
0
votes
1 answer

Error when connecting to azure-datalakes using continuation token

I'm currently trying to list files/directories inside ADLS Gen2 using a continuation token (our folder currently has over 5000 files). I am able to send my first request, but I receive a 403 error (presumably meaning incorrect formatting) when trying…
Fastas
  • 79
  • 1
  • 8
0
votes
1 answer

Python PermissionError uploading to an Azure Datalake folder

I am trying to upload a file to Azure Data Lake using a Python script. I am able to download a file from the data lake, but uploading raises a permission error, even though I checked all permissions at all levels (Read, Write, Execute, and the option for…
Dave
  • 33
  • 3
0
votes
1 answer

Referencing Assemblies when executing U-SQL Application scripts against local-project

I have a U-SQL DB Project (USQLdb) that defines a U-SQL database and its constituent tables, procedures, etc. This project also references two assemblies for use in one of the stored procedures. The DLL files are held within a folder called…
iamdave
  • 12,023
  • 3
  • 24
  • 53
0
votes
1 answer

Flask File upload to Azure Data Lake Store

I am trying to upload a file from a Flask (Flask-restplus) application directly to Azure Data Lake Store (Gen1). The Flask application is running on an Azure Web App. Is that even possible, or would I need to upload it to the Azure Web App server…
candidson
  • 516
  • 3
  • 18
0
votes
1 answer

Which Azure storage technology for weather forecast data

I would like some advice/tips about the right technology to select in order to store some forecast data on Azure. My team and I are scraping weather forecast data every day from various sources and storing it as is on an Azure File…
0
votes
1 answer

What is the content-type and x-ms-version to be used to load the azure datalake file to azure datalake gen2?

I have to load a data lake file (CSV format) to Azure Data Lake Storage Gen2 using a logic app. I have created the logic app using an HTTP action and am able to create the file and append the data. For the next HTTP action I need to give the length. What is the content-type…
pythonUser
  • 183
  • 2
  • 7
  • 20
0
votes
1 answer

Syncing Azure Data Lake Storage User permission with Apache Ranger & Active Directory

Architecture: a big data cluster deployed using Hortonworks Cloudbreak on Microsoft Azure, with Azure Data Lake Storage (ADLS) as storage. Users will be synced from the client's Active Directory to Azure Active Directory. Apache Ranger will be used to…
0
votes
1 answer

Power BI Data Flows Access to the resource is forbidden error

I have created a new workspace in Power BI that has dataflows which are backed by Azure Data Lake Gen2 storage. I have created several of these and can use them without issue in Power BI Desktop. I have given another person permission (currently…
Paul Cavacas
  • 4,194
  • 5
  • 31
  • 60
0
votes
1 answer

Not able to get the folder path of azure storage gen2 of file systems

I have created a logic app. I'm reading a file from the data lake and need to load it to storage Gen2 in Azure. I have created a connection for storage Gen2 using the Azure File Storage action and need to create the file in the file system. I have full…
pythonUser
  • 183
  • 2
  • 7
  • 20
0
votes
2 answers

U-SQL External table error: 'Unable to cast object of type 'System.DBNull' to type 'System.Type'.'

I'm failing to create external tables for two specific tables from Azure SQL DB, although I have already created a few external tables with no issues. The only difference I can see between the failed and the successful external tables is that the tables that failed…
Dor Meiri
  • 389
  • 1
  • 4
  • 13
0
votes
1 answer

can't get ADLS Gen2 REST continuation token to work

I'm trying to retrieve a list of files and folders from ADLS Gen2. I can get the first 5000 items, but when I use continuation to get the rest (about 17,000 items or so), I get Error 403 (Forbidden). According to the documentation, I add the continuation…
gunta
  • 71
  • 6
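
A note on the 403 in the question above: the excerpt does not say what caused it, but one commonly reported cause is passing the continuation token back without percent-encoding it. The minimal Python sketch below assumes the token comes from the x-ms-continuation response header of the previous page and is sent back as the continuation query parameter of the ADLS Gen2 Path List operation. The helper name build_list_url, the account and filesystem names, and the sample token are hypothetical, and authentication headers are left out.

```python
from urllib.parse import urlencode, quote


def build_list_url(account, filesystem, continuation=None, max_results=5000):
    """Build a Path List URL for one page of results (hypothetical helper).

    `continuation` is assumed to be the raw value of the x-ms-continuation
    header returned with the previous page; None requests the first page.
    """
    params = {
        "resource": "filesystem",
        "recursive": "true",
        "maxResults": str(max_results),
    }
    if continuation:
        params["continuation"] = continuation
    # quote_via=quote with safe="" percent-encodes every reserved character,
    # including the '+', '/' and '=' that base64-style tokens often contain.
    query = urlencode(params, quote_via=quote, safe="")
    return f"https://{account}.dfs.core.windows.net/{filesystem}?{query}"


if __name__ == "__main__":
    # Made-up token, used only to show the encoding.
    print(build_list_url("myaccount", "myfilesystem", "abc+/=="))
```

If the request is signed with a Shared Key, the string that gets signed also has to match the request that is actually sent, so it is worth building the URL in one place and reusing it for both.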
0
votes
1 answer

Service to Support Data Lake Set Up

I have to test and compare the available solutions to create a Data Lake. Is there any other service that makes it easy to set up a secure data lake besides AWS Lake Formation? I know that I can create an account on Azure and Google Cloud…
0
votes
1 answer

Does hdfs know about umi security context when gets run from hdi worker nodes?

We have an Azure HDI cluster (Linux worker nodes) with its primary storage account linked to ADLS Gen2 storage. We use a user-managed identity (UMI) to connect the HDI cluster to its primary storage. Everything works fine: the cluster runs successfully and creates…
Alexey Melezhik
  • 962
  • 9
  • 27
0
votes
1 answer

How to fix "Object reference not set to an instance of an object" error when running Get-AzDataLakeStoreChildItem cmdlet?

I'm getting an error while running the Azure cmdlet in PowerShell. How do I resolve this? I'm trying to get details of the folders and files present in Azure Data Lake through PowerShell. I'm able to access the data lake through the portal and access all…
Vasanth
  • 25
  • 2
  • 5