Questions tagged [azure-data-lake]

Azure Data Lake is a suite of three big data services in Microsoft Azure: HDInsight, Data Lake Store, and Data Lake Analytics. These fully managed services make it easy to get started and easy to scale big data jobs written in U-SQL, Apache Hive, Pig, Spark, and Storm.

  • HDInsight is a fully managed, monitored and supported Apache Hadoop service, bringing the power of Hadoop clusters to you with a few clicks.
  • Data Lake Store is a cloud-scale service designed to store all data for analytics. The Data Lake Store allows for petabyte-sized files and unlimited account sizes, surfaced through an HDFS-compatible API that enables any Hadoop component to access the data (see the sketch after this overview). Additionally, data in Data Lake Store is protected via ACLs that can be tied to an OAuth2-based identity, including identities from your on-premises Active Directory.
  • Data Lake Analytics is a distributed service built on Apache YARN that dynamically scales on demand, while you pay only for the job that is running. Data Lake Analytics also includes U-SQL, a language designed for big data that keeps the familiar declarative syntax of SQL and is easily extended with user code authored in C#.

To learn more, check out: https://azure.microsoft.com/en-us/solutions/data-lake/
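
As a concrete illustration of the HDFS-compatible access mentioned above, here is a minimal sketch of reading a Data Lake Store file with Spark from a Databricks notebook. The account name, path, and column name are hypothetical, and the cluster is assumed to already have OAuth2 credentials configured and a spark session in scope.

    // Minimal sketch: Data Lake Store exposes an HDFS-compatible interface,
    // so Spark can address it with an adl:// URI like any Hadoop filesystem.
    // Account name, path, and column name are placeholders.
    val events = spark.read
      .option("header", "true")
      .csv("adl://myaccount.azuredatalakestore.net/clickstream/events.csv")

    events.groupBy("eventType").count().show()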

1870 questions
0 votes, 3 answers

Azure Databricks writing a file into Azure Data Lake Gen 2

I have an Azure Data Lake Gen1 and an Azure Data Lake Gen2 (Blob Storage with hierarchical namespace) and I am trying to create a Databricks notebook (Scala) that reads 2 files and writes a new file back into the Data Lake. In both Gen1 and Gen2 I am…
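
For the scenario in this question, a minimal sketch of the notebook flow might look as follows, assuming the lake is already mounted at /mnt/lake (the paths, file names, and join column are hypothetical).

    // Read two input files from the mounted lake (paths are placeholders).
    val orders    = spark.read.option("header", "true").csv("/mnt/lake/raw/orders.csv")
    val customers = spark.read.option("header", "true").csv("/mnt/lake/raw/customers.csv")

    // Combine them; "customerId" is an invented join column.
    val enriched = orders.join(customers, Seq("customerId"))

    // Write the result back. A Gen2 path can also be addressed directly with
    // an abfss://<container>@<account>.dfs.core.windows.net/... URI.
    enriched.write.mode("overwrite").parquet("/mnt/lake/curated/enriched")
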
0 votes, 2 answers

How to use Data Factory to ingest all Dynamics 365 entities to a Data Lake?

I'm currently using a Data Factory (V2) to copy a few entities from Dynamics 365 to an Azure Data Lake (Gen1). So far I've just been creating each sink dataset individually as they become relevant. But there are hundreds of potential entities to…
Chris • 1,150 • 3 • 13 • 29
0 votes, 2 answers

Azure Data Factory - Copy specific files from multiple parent folders from FTP Server

I am trying to copy .ZIP files from an FTP server to Azure Data Lake. I need to copy specific files from specific parent folders (I have 6 parent folders in total on the FTP), and this pipeline needs to be scheduled. So how should I provide the parameters…
user10813834 • 43 • 1 • 3 • 11
0 votes, 1 answer

Data masking in Azure Data Lake Store Gen2

We have multiple pipelines which ingest data from various data sources into Azure Data Lake Store Gen2. We have a couple of trusted datasets which need data masking in addition to the ACL and RBAC implementation. Is there any way that we can mask the…
Shankar • 571 • 14 • 26
0 votes, 1 answer

Pipeline Upload is failing on Sink side with cryptic error message

The pipeline is supposed to copy several tables from on-prem SQL Server to ADLS parquet files (dropped and re-created during each run). The intake relies on a self-hosted integration runtime. All tests during configuration are successful (i.e. table…
Dan • 494 • 2 • 14
0 votes, 1 answer

Unable to connect to Azure Data Lake Storage Gen1, forbidden error

I am using .NET 4.8. I need to connect to Azure Data Lake Storage Gen1. I found the sample below on GitHub: https://github.com/Azure-Samples/data-lake-store-adls-dot-net-get-started/ Now in my Azure account: registered a new application, got the Application…
knowdotnet • 839 • 1 • 15 • 29
0 votes, 3 answers

How can I query data in Azure Analysis Services from an ASP.NET Core application?

I have a cloud application that dumps all its data into an Azure Data Lake. Using Azure Data Factory, I have built a pipeline that extracts and transforms the data from the lake and saves it in local .csv files. These .csv files are accessible in…
0 votes, 1 answer

What is the alternative for the double datatype from Spark SQL (Databricks) in SQL Server Data Warehouse?

I have to load the data from Azure Data Lake into the data warehouse. I have created the setup for creating external tables. There is one column which is of double datatype; I have used the decimal type in SQL Server Data Warehouse for creating the external table and…
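
Since Spark's double is an approximate floating-point type while decimal is exact, one possible approach is to cast the column to an explicit DecimalType in Spark before exporting, so the written schema lines up with the DECIMAL declared on the external table. A sketch only; the paths, column name, and precision/scale are assumptions.

    import org.apache.spark.sql.functions.col
    import org.apache.spark.sql.types.DecimalType

    // Placeholder paths and column; DECIMAL(38,18) must match the precision
    // and scale declared in the warehouse-side external table definition.
    val df    = spark.read.parquet("/mnt/lake/export/table1_raw")
    val fixed = df.withColumn("amount", col("amount").cast(DecimalType(38, 18)))
    fixed.write.mode("overwrite").parquet("/mnt/lake/export/table1")
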
0 votes, 1 answer

How to dynamically get all JSON files' data into a table (SQL Server Data Warehouse) using Azure Data Factory (load from ADF to DWH)

I have to get all JSON files' data into a table, going from Azure Data Factory to SQL Server Data Warehouse. I'm able to load the data into a table with static values (by giving column names in the dataset), but when generating them dynamically I'm unable to get that…
pythonUser • 183 • 2 • 7 • 20
0 votes, 1 answer

How to resolve special character issue in SQL Server data warehouse

I have to load the data from the data lake into a SQL Server data warehouse using PolyBase tables. I have created the setup for the creation of external tables. I have created the external tables and I am trying to do select * from ext_t1 but…
pythonUser • 183 • 2 • 7 • 20
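
Special-character problems with PolyBase external tables are often an encoding mismatch, since PolyBase expects UTF-8 (or UTF-16) encoded text files. One possible fix, sketched here with hypothetical paths and an assumed source encoding, is to re-write the files as UTF-8 before querying them.

    // Read with the encoding the files were actually written in (assumed
    // here to be ISO-8859-1), then re-write as UTF-8 for PolyBase to read.
    val raw = spark.read
      .option("header", "true")
      .option("encoding", "ISO-8859-1")
      .csv("/mnt/lake/raw/t1")

    raw.write
      .mode("overwrite")
      .option("encoding", "UTF-8")
      .csv("/mnt/lake/clean/t1")
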
0 votes, 1 answer

Azure Lake to Lake transfer of files

My company has two Azure environments. The first one was a temporary environment and is being re-purposed or decommissioned; I'm not sure. All I know is I need to get files from the Data Lake in one environment to the Data Lake in another. I've looked…
Beth • 3 • 1
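
If both lakes can be mounted into a single Databricks workspace, a one-off transfer can be as small as the sketch below (the mount names are hypothetical); Azure Data Factory is the more usual choice for large or scheduled lake-to-lake copies.

    // Recursively copy everything under the source mount to the target.
    // Both mounts are invented names and must be configured beforehand.
    dbutils.fs.cp("/mnt/lake-source/data", "/mnt/lake-target/data", recurse = true)
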
0 votes, 1 answer

What is the difference between using a COPY DATA activity to a SQL table vs using CREATE EXTERNAL TABLE?

I have a bunch of U-SQL activities that manipulate & transform data in an Azure Data Lake. Out of this, I get a csv file that contains all my events. Next, I would just use a Copy Data activity to copy the csv file from the Data Lake directly into…
Kzryzstof • 7,688 • 10 • 61 • 108
0 votes, 1 answer

How to dispose connections to services such as Azure Storage

My function stores data in Azure Data Lake Storage Gen1, but I got the error "An error occurred while sending the request." When I investigated, I found that when the connections in my Azure Function exceed 8k it breaks. Here is my code (Append to file…
Duc Dang • 23 • 1 • 1 • 9
0 votes, 0 answers

mkdir -p skips creating a middle directory on the Azure Data Lake

Why does this consistently create the path /dbfs/mnt/datalake/data/staging/steve/3/14 in Databricks %sh? No matter what, it skips creating the 2019. If I leave out /2019, then it skips creating the 3. Here is my Cmd: mkdir -p…
Steve Lyle-CSG • 117 • 1 • 4 • 12
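
A workaround worth noting: rather than shelling out to mkdir -p against the /dbfs FUSE mount, Databricks' own filesystem utilities can create the full path, intermediate directories included. A minimal sketch, assuming the mount and date path from the question:

    // Creates the directory and any missing parents in one call.
    dbutils.fs.mkdirs("/mnt/datalake/data/staging/steve/2019/3/14")
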
0 votes, 1 answer

Databricks fails accessing a Data Lake Gen1 while trying to enumerate a directory

I am using (well... trying to use) Azure Databricks and I have created a notebook. I would like the notebook to connect to my Azure Data Lake (Gen1) and transform the data. I followed the documentation and put the code in the first cell of my notebook:…
Kzryzstof • 7,688 • 10 • 61 • 108
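
For reference, the documented service-principal pattern for reading ADLS Gen1 from a Databricks notebook looks roughly like the sketch below; the IDs, secret, and account name are placeholders (in practice they would come from a secret scope). A "forbidden" error while enumerating a directory typically means the principal is missing execute (x) ACLs on the intermediate folders along the path.

    // Placeholder credentials; store real values in a Databricks secret scope.
    spark.conf.set("fs.adl.oauth2.access.token.provider.type", "ClientCredential")
    spark.conf.set("fs.adl.oauth2.client.id", "<application-id>")
    spark.conf.set("fs.adl.oauth2.credential", "<service-credential>")
    spark.conf.set("fs.adl.oauth2.refresh.url",
      "https://login.microsoftonline.com/<directory-id>/oauth2/token")

    // Listing a folder succeeds only if the principal has execute ACLs on
    // every folder along the path.
    val files = dbutils.fs.ls("adl://myaccount.azuredatalakestore.net/landing")
    files.foreach(f => println(f.path))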