Questions tagged [azure-data-lake]

Azure Data Lake is a suite of three big data services in Microsoft Azure: HDInsight, Data Lake Store, and Data Lake Analytics. These fully managed services make it easy to get started with, and to scale, big data jobs written in Hive, Pig, Spark, Storm, and U-SQL.

  • HDInsight is a fully managed, monitored and supported Apache Hadoop service, bringing the power of Hadoop clusters to you with a few clicks.
  • Data Lake Store is a cloud-scale service designed to store all data for analytics. The Data Lake Store allows for petabyte-sized files and unlimited account sizes, surfaced through an HDFS API enabling any Hadoop component to access data. Additionally, data in Data Lake Store is protected via ACLs that can be tied to an OAuth2-based identity, including those from your on-premises Active Directory.
  • Data Lake Analytics is a distributed service built on Apache YARN that dynamically scales on demand while you only pay for the job that is running. Data Lake Analytics also includes U-SQL, a language designed for big data, keeping the familiar declarative syntax of SQL, easily extended with user code authored in C#.

To learn more, check out: https://azure.microsoft.com/en-us/solutions/data-lake/

1870 questions
0 votes
2 answers

Compile or validate U-SQL programmatically

I have a requirement, where based on certain rules or conditions, the U-SQL script is generated. This is done via templating. I want some way to validate the generated U-SQL script, similar to the "compile script" feature in Visual Studio Code (for…
Joy
  • 92
  • 1
  • 9
0 votes
1 answer

Getting files and folders in the data lake while reading from Data Factory

I am reading Azure SQL table data (which consists of directory paths) from Azure Data Factory. Using those paths, how can I dynamically get the files from the data lake? Can anyone tell me what I should specify in the dataset?
0 votes
1 answer

Getting "File Format not supported" while uploading an XLS file to Azure Data Lake Gen1

I am developing an application that takes file input from the user, using Angular 7 and .NET Core. I am passing the payload to my backend over a WebSocket. I am able to upload to Azure Data Lake successfully, but when I am previewing or downloading…
Vikas Singh
  • 181
  • 7
0 votes
2 answers

Calling ADLS data in a Python script inside VS Code

I have already installed the ADL extension in VS Code, and now I am writing a Python script in which I need to read a CSV file stored in Azure Data Lake Storage (ADLS Gen1). For a local file, the following code works: df =…
Lav Mehta
  • 92
  • 1
  • 2
  • 13
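A minimal sketch of one way to approach this kind of read, assuming the `azure-datalake-store` package and service-principal credentials. The helper `read_csv_rows`, the store name, and the file path are hypothetical. The helper only needs an object with an `open(path, "rb")` method, so it can be tried against any file-like source first:

```python
import csv
import io


def read_csv_rows(fs, path, encoding="utf-8"):
    """Read a CSV file through a filesystem-like object and return dict rows.

    `fs` is anything exposing open(path, "rb") that returns a binary file
    object usable as a context manager (hypothetical helper, not a library API).
    """
    with fs.open(path, "rb") as handle:
        text = handle.read().decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Assumed usage with the azure-datalake-store package (ADLS Gen1); the
# credential values, store name, and file path below are placeholders:
#
#   from azure.datalake.store import core, lib
#   token = lib.auth(tenant_id="...", client_id="...", client_secret="...")
#   adl = core.AzureDLFileSystem(token, store_name="mystore")
#   rows = read_csv_rows(adl, "/folder/data.csv")
```

If pandas is already in use, the opened file handle can be passed to `pandas.read_csv` instead of going through the `csv` module.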
0 votes
1 answer

Data from HTTP endpoint to be loaded into Azure Data Lake using Azure Data Factory

I am trying to build a so-called "modern data warehouse" using Azure services. The first step is to gather all the data in its native raw format into Azure Data Lake Store. For some of the data sources we have no other choice than to use an API for…
0 votes
1 answer

Write data to a file in ADLS

I have a List collection of base-class objects that is retrieved from JSON serialization. Before I write the data to a table, I need to keep a copy of the data in Azure Data Lake. With the sample code below I'm able to create a folder and a sample file.…
user1941025
  • 541
  • 6
  • 21
0 votes
1 answer

Capturing metadata using Azure Data Factory and storing it in SQL Database?

The sort of metadata that I am after includes file sizes, number of rows, file names, whether the file has already been processed, etc. I want to capture the flow of data from source to target, including capturing data from Azure Data Lake and SQL…
0 votes
2 answers

Will Azure Data Lake Analytics support ADLS Gen2?

We have a number of projects based on ADLA + ADLS Gen1, and we recently noticed that prices for Gen1 are not available here any more. Also, ADLA isn't listed in the Gen1->Gen2 migration guide. Googling brought no relief, so we are seeking advice and…
0 votes
1 answer

Append data to an existing file in Azure Data Lake Storage Gen2 using the REST API

A REST API is available for Azure Data Lake Storage Gen2; the documentation is here. Does anyone have examples for Postman or anything like that?
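As a rough illustration, appending through the Gen2 REST API is a two-step operation on the path: a `PATCH` with `action=append` at the current end-of-file offset, followed by a `PATCH` with `action=flush` at the new total length. The sketch below only builds the two requests and sends nothing. The function name `plan_append`, the account, the filesystem, and the token are hypothetical, and the `x-ms-version` value is just an example:

```python
from urllib.parse import quote


def plan_append(account, filesystem, path, current_length, data, token):
    """Describe the two PATCH calls that append `data` to an existing file.

    Returns a list of (method, url, headers, body) tuples; nothing is sent.
    """
    base = "https://{}.dfs.core.windows.net/{}/{}".format(
        account, quote(filesystem), quote(path.lstrip("/"))
    )
    common = {
        "Authorization": "Bearer " + token,
        "x-ms-version": "2018-11-09",  # example API version, adjust as needed
    }
    # Step 1: upload the bytes at the current end-of-file offset.
    append_headers = {**common, "Content-Length": str(len(data))}
    # Step 2: commit the uploaded bytes; position is the new file length.
    flush_headers = {**common, "Content-Length": "0"}
    new_length = current_length + len(data)
    return [
        ("PATCH", "{}?action=append&position={}".format(base, current_length),
         append_headers, data),
        ("PATCH", "{}?action=flush&position={}".format(base, new_length),
         flush_headers, b""),
    ]
```

The same method, URL, headers, and body can be entered by hand in Postman. The current file length has to be known up front, for example from the `Content-Length` returned when reading the path's properties.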
0 votes
1 answer

As a software tester, how can I test the data in Azure Data Lake?

I would like to validate the data in Azure Data Lake that is being ingested by Azure Data Factory. How can I validate it? What are the different checks that I can do as part of the validation process?
tpitta
  • 13
  • 2
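To make the idea of "different validations" concrete, here is a small sketch of three common checks on an ingested CSV file: schema (column names), required fields that must not be empty, and an expected row count. The function `validate_csv` and its parameters are hypothetical, and it works on text that has already been downloaded from the lake:

```python
import csv
import io


def validate_csv(text, expected_columns, required, expected_rows=None):
    """Return a list of human-readable problems found in CSV text."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    # Schema check: header must match the expected column list exactly.
    if reader.fieldnames != list(expected_columns):
        problems.append("unexpected columns: {}".format(reader.fieldnames))
        return problems
    count = 0
    # Line numbers assume one record per line (line 1 is the header).
    for line_no, row in enumerate(reader, start=2):
        count += 1
        for column in required:
            if not (row[column] or "").strip():
                problems.append("line {}: empty {}".format(line_no, column))
    # Completeness check: compare against the count reported by the source.
    if expected_rows is not None and count != expected_rows:
        problems.append("expected {} rows, found {}".format(expected_rows, count))
    return problems
```

An empty result means all three checks passed; the expected row count would typically come from the source system or from the copy activity's output.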
0 votes
2 answers

U-SQL custom extractor on custom row delimiter and JSON

I have several text files with the following data structure: { huge json block that spans across multiple lines } --#newjson#-- { huge json block that spans across multiple lines } --#newjson#-- { huge json block that spans across multiple…
jayt.dev
  • 975
  • 6
  • 14
  • 36
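U-SQL custom extractors are written in C#, so the following Python sketch is not an extractor; it only illustrates the splitting logic such an extractor would need for the layout described above: split the text on the `--#newjson#--` marker, then parse each multi-line chunk as one JSON document. The function name `split_json_blocks` is hypothetical:

```python
import json

DELIMITER = "--#newjson#--"


def split_json_blocks(text, delimiter=DELIMITER):
    """Split text on the delimiter and parse each non-empty chunk as JSON."""
    records = []
    for chunk in text.split(delimiter):
        chunk = chunk.strip()
        if chunk:  # skip empty pieces, e.g. after a trailing delimiter
            records.append(json.loads(chunk))
    return records
```

This version holds the whole input in memory; for very large files a streaming reader that buffers lines until it sees the delimiter would be the safer choice.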
0 votes
1 answer

Presto query engine with Azure Data Lake

I have a requirement to deploy a Presto server that can help me query data stored in ADLS in the Avro file format. I have gone through this tutorial, and it seems that Hive is used as a catalog/connector in Presto to query from ADLS. Can I…
Bhanuday Birla
  • 969
  • 1
  • 10
  • 23
0 votes
1 answer

Why am I getting a missing header error when calling the put file API for Azure Data Lake Gen2?

I am trying to call the Gen2 REST endpoint directly and keep getting an error saying that I am missing a required header (MissingRequiredHeader, message: "An HTTP header that's mandatory for this request is not specified."). I fail to see what header is missing.…
0 votes
1 answer

Delete temporary files from Azure Data Lake Storage in an Azure Data Factory pipeline (U-SQL preferred)

We are using ADLS (Azure Data Lake Storage) as temporary storage in our ADF (Azure Data Factory V2) pipeline. What is the best way to delete the data that is stored temporarily in ADLS? U-SQL only supports DDL and not DML, so we can’t delete the…
Manjunath Rao
  • 1,397
  • 4
  • 26
  • 42
0 votes
1 answer

How do I query Azure Data Lake Store?

Do I have to learn U-SQL to query data in ADLS? Or is there a way to query it using SQL?