Questions tagged [azure-data-lake]

Azure Data Lake is a suite of three big data services in Microsoft Azure: HDInsight, Data Lake Store, and Data Lake Analytics. These fully managed services make it easy to get started and easy to scale big data jobs written in U-SQL, Apache Hive, Pig, Spark, and Storm.

  • HDInsight is a fully managed, monitored and supported Apache Hadoop service, bringing the power of Hadoop clusters to you with a few clicks.
  • Data Lake Store is a cloud-scale service designed to store all data for analytics. Data Lake Store allows for petabyte-sized files and unlimited account sizes, surfaced through an HDFS API that enables any Hadoop component to access the data. Additionally, data in Data Lake Store is protected via ACLs that can be tied to an OAuth2-based identity, including identities from your on-premises Active Directory (see the Python sketch after this list).
  • Data Lake Analytics is a distributed service built on Apache YARN that dynamically scales on demand while you only pay for the job that is running. Data Lake Analytics also includes U-SQL, a language designed for big data, keeping the familiar declarative syntax of SQL, easily extended with user code authored in C#.
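
As a concrete illustration of the HDFS-style API and OAuth2-based identities mentioned above, here is a minimal Python sketch using the azure-datalake-store (Gen1) package; the tenant ID, store name, and paths are placeholders.

    # Minimal sketch, assuming the azure-datalake-store (Gen1) package is installed;
    # "<tenant-id>", "<store-name>" and the paths below are placeholders.
    from azure.datalake.store import core, lib

    # Acquire an OAuth2 token from Azure Active Directory (interactive login here;
    # lib.auth also accepts service-principal credentials).
    token = lib.auth(tenant_id="<tenant-id>")

    # Open the store through its HDFS-style filesystem API.
    adls = core.AzureDLFileSystem(token, store_name="<store-name>")

    # Familiar filesystem-style calls: list the root and read the start of a file.
    print(adls.ls("/"))
    with adls.open("/raw/sample.csv", "rb") as f:
        print(f.read(100))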

To learn more, check out: https://azure.microsoft.com/en-us/solutions/data-lake/

1870 questions
0 votes, 0 answers

How to ingest data to ADLS using NiFi?

I am trying to ingest data from my local system to Azure Data Lake Storage using NiFi. I have been told to use the PutHDFS processor for that, but I do not have Hadoop on my machine. Is there any alternative way to ingest the data or any…
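
One alternative that avoids Hadoop entirely, sketched under the assumption that the target is ADLS Gen2 and that the azure-storage-file-datalake Python package is available; the account name, key, file system, and paths are placeholders.

    # Minimal sketch: upload a local file to ADLS Gen2 without any Hadoop client.
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url="https://<account>.dfs.core.windows.net",
        credential="<account-key>",  # an azure.identity credential also works
    )
    fs = service.get_file_system_client("raw")               # target file system (container)
    file_client = fs.get_file_client("landing/sample.json")  # destination path

    with open("sample.json", "rb") as data:
        file_client.upload_data(data, overwrite=True)         # create or replace the file

Staying inside NiFi, PutHDFS generally needs Hadoop client configuration on the NiFi host, which is exactly what the question is trying to avoid.
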
0 votes, 2 answers

Can we use Azure CLI to upload files to Azure Data Lake Storage Gen2

All I want to do is upload files from on-premises to Azure Data Lake Storage Gen2 using the Azure CLI (via ` command), but I get a connection error! Can I use the Azure CLI to do that? Or do I have to use another tool? PS: I cannot use Azure Data Factory,…
benabderrahmane
  • 55
  • 1
  • 1
  • 8
0 votes, 1 answer

What is the purpose of having two folders in Azure Data Lake Analytics

I am a newbie to Azure Data Lake. The screenshot below has 2 folders (Storage Account and Catalog), one for Data Lake Analytics and the other for Data Lake Store. My question is what is the purpose of each folder, and why are we using U-SQL for transformations…
addy
  • 29
  • 1
  • 6
0 votes, 1 answer

Output path folders to Data Lake Store without "ColumnName="

Is it possible to use the partitionBy function, or another one, without returning the path "ColumnName=Value"? I'm using a Python notebook in Azure Databricks to send a CSV file to Azure Data Lake Store. The Cmd used is the following: %scala val filepath=… (one PySpark workaround is sketched below)
ptfaferreira
  • 592
  • 5
  • 20
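
Spark's partitionBy always produces column=value folder names, so a common workaround is to write each distinct value to an explicitly named folder instead. A minimal PySpark sketch; the toy DataFrame, column name, and output root are placeholders.

    # Minimal PySpark sketch: one output folder per value, without the "ColumnName=" prefix.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame(
        [("A", 1), ("A", 2), ("B", 3)], ["ColumnName", "value"])

    base_path = "/mnt/datalake/out"   # placeholder, e.g. a mounted ADLS path

    values = [r[0] for r in df.select("ColumnName").distinct().collect()]
    for v in values:
        (df.filter(df["ColumnName"] == v)
           .drop("ColumnName")                # optional: omit the partitioning column
           .write.mode("overwrite")
           .csv(f"{base_path}/{v}"))          # folder named after the value only
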
0 votes, 1 answer

Azure Data Lake Gen2 Integration with Data Factory

I am trying to connect Data Lake Gen2 with Data Factory v2, where we need to add the user through the Add User wizard in Data Lake. But we couldn't see that option, and we are not able to connect to Data Lake Gen2 from Data Factory. Please help…
0 votes, 1 answer

When we use Azure Data Lake Store as a data source for Azure Analysis Services, are Parquet file formats supported?

Could you please help with a sample tutorial on whether 'Parquet' file formats are supported when we use Data Lake as a data source?
Idleguys
  • 325
  • 1
  • 7
  • 18
0 votes, 1 answer

Test-AzureRmDataLakeStoreItem throws error "Account name is invalid. Specify the full account including the domain name."

I am using the latest AzureRM 6.13.0. I am sure I passed in the correct data lake store. Thanks.
lidong
  • 556
  • 1
  • 4
  • 20
0 votes, 1 answer

Routing telemetry messages from Azure IoT Hub to Data Lake Store using an Azure Function

How can we route telemetry messages from Azure IoT Hub to Data Lake Store using an Azure Function? (One possible approach is sketched below.)
Murthy
  • 21
  • 2
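
One possible approach, sketched under several assumptions: a Python Azure Function whose Event Hub trigger is pointed at the IoT Hub's built-in Event Hub-compatible endpoint (that binding lives in function.json and is not shown), writing to a Gen1 store via the azure-datalake-store package; the tenant, service-principal credentials, and store name are placeholders.

    # __init__.py of a Python Azure Function; the Event Hub trigger binding is configured
    # separately in function.json and points at the IoT Hub's Event Hub-compatible endpoint.
    import uuid

    import azure.functions as func
    from azure.datalake.store import core, lib

    def main(event: func.EventHubEvent) -> None:
        payload = event.get_body()  # raw telemetry bytes of one device message

        # Service-principal OAuth2 login; all identifiers below are placeholders.
        token = lib.auth(tenant_id="<tenant-id>",
                         client_id="<app-id>",
                         client_secret="<app-secret>")
        adls = core.AzureDLFileSystem(token, store_name="<store-name>")

        # Write each message to its own file under a telemetry folder.
        with adls.open(f"/telemetry/{uuid.uuid4()}.json", "wb") as f:
            f.write(payload)

In practice the token and filesystem client would be created once and reused across invocations, and IoT Hub's built-in message routing to a storage endpoint (Blob or ADLS Gen2) is also worth evaluating, since it needs no function code at all.
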
0 votes, 1 answer

Copy only the latest file from Azure Data Lake Store with Azure Data Factory (ADF)

I'm trying to copy data from Azure Data Lake Store, perform some processing, and move it into a different folder in the same data lake using Azure Data Factory. The source data is organized by year, month, and date. I only want to copy the latest file… (one way to locate the latest dated folder is sketched below)
user9705761
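
Outside Data Factory itself, one way to locate the newest dated folder is a small Python helper against the Gen1 SDK, whose result could then be passed into the copy pipeline as a parameter; the tenant, store name, and the /source/<year>/<month>/<day> layout are assumptions.

    # Minimal sketch: walk a /source/<year>/<month>/<day> layout and return the newest path,
    # assuming zero-padded folder names so that a lexical max() is also the newest date.
    from azure.datalake.store import core, lib

    token = lib.auth(tenant_id="<tenant-id>")                     # placeholder tenant
    adls = core.AzureDLFileSystem(token, store_name="<store-name>")

    def newest_child(path):
        children = [c.split("/")[-1] for c in adls.ls(path)]      # keep only the folder names
        return f"{path}/{max(children)}"

    latest = newest_child(newest_child(newest_child("/source")))  # year, then month, then day
    print(latest)
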
0 votes, 1 answer

How to process data with Data Lake Analytics into multiple files with max size?

I am processing a huge number of small JSON files with Azure Data Lake Analytics, and I want to save the result into multiple JSON files (if needed) with a max size (e.g. 128 MB). Is this possible? I know that there is an option to write a custom…
Tomáš Čičman
  • 281
  • 1
  • 5
  • 17
0 votes, 1 answer

Azure Data Factory passing a parameter into a function (string replace)

I'm trying to use ADF to create Azure Table storage tables from one source SQL table. Within my pipeline, I can query a distinct list of customers and pass this into a for-each task. Inside the for-each, I select the data for each customer. But when…
0 votes, 1 answer

Get-AzureRmDataLakeStoreChildItem access issue

I am trying to run this PowerShell cmdlet: Get-AzureRmDataLakeStoreChildItem -AccountName "xxxx" -Path "xxxxxx". It fails with an access error. It does not really make sense, because I have complete access to the ADLS account. I can browse in the…
faizal
  • 3,497
  • 7
  • 37
  • 62
0 votes, 1 answer

.usqldbproj could not be opened -- 'File is corrupt.'

Inexplicably, my U-SQL database and U-SQL script projects seem to have become broken, and result in an error like this in the output window when submitting a file in the script project to a remote ADLS account with a database project referenced to…
Alex KeySmith
  • 16,657
  • 11
  • 74
  • 152
0 votes, 0 answers

Are Guids unique when using a U-SQL Extractor?

As these questions point out, Guid.NewGuid will return the same value for all rows due to the enforced deterministic nature of U-SQL, i.e. if it's scaled out and an element (vertex) needs retrying, then it should return the same value…
Alex KeySmith
  • 16,657
  • 11
  • 74
  • 152
0 votes, 2 answers

Can I use the Python SDK to access data from Azure Data Lake Gen2?

A Python SDK is available for Azure Data Lake Gen1; the documentation is here. Can I use the same SDK to access files in Azure Data Lake Gen2 from Python? (A sketch of the Gen2 SDK is shown below.)
Radhi
  • 6,289
  • 15
  • 47
  • 68
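
The Gen1 package (azure-datalake-store) is not the one used against Gen2; Gen2 has its own Python SDK, azure-storage-file-datalake. A minimal sketch of reading from Gen2; the account URL, file system, and paths are placeholders, and azure-identity is assumed for the AAD login.

    # Minimal sketch: list and read files in ADLS Gen2 with the dedicated Gen2 Python SDK.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url="https://<account>.dfs.core.windows.net",
        credential=DefaultAzureCredential(),   # AAD login; an account key also works
    )
    fs = service.get_file_system_client("raw")

    # List a folder, then download one file's contents.
    for path in fs.get_paths("landing"):
        print(path.name)

    data = fs.get_file_client("landing/sample.json").download_file().readall()
    print(data[:100])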