Questions tagged [u-sql]

U-SQL is a query language designed for Azure Data Lake. It provides a way to mingle SQL keywords with syntactic C# expressions, so that within a single script, a programmer can schematize the data from an unstructured source, use SQL to aggregate the data into the desired form, and then write the output to a file or table.

771 questions
3 votes, 3 answers

Error while running U-SQL Activity in Pipeline in Azure Data Factory

I am getting the following error while running a U-SQL Activity in a pipeline in ADF: Error in Activity: {"errorId":"E_CSC_USER_SYNTAXERROR","severity":"Error","component":"CSC", "source":"USER","message":"syntax error. Final statement did not…
Jai
3 votes, 2 answers

Config file for input and output folder location

I have multiple U-SQL scripts, and I use a filename variable at the top of each one. Is there any way to define the input and output folders in a config file and read them as variables or constants for use with EXTRACT and…
Ajay
3 votes, 1 answer

How do I partition a large file into files/directories using only U-SQL and certain fields in the file?

I have an extremely large CSV, where each row contains customer and store IDs, along with transaction information. The current test file is around 40 GB (about 2 days' worth), so partitioning is an absolute must for any reasonable return time on…
Travis Manning
3 votes, 2 answers

U-SQL Extract Statement - working with hundreds of columns

Is there any way in the U-SQL EXTRACT statement to specify only the input columns that I care about? I'm working with a legacy database that exports several tables to CSV files that have about 200 columns. I only care about 10 of those fields. I was…
Dan
3 votes, 2 answers

Memory limit in Azure Data Lake Analytics

I have implemented a custom extractor for NetCDF files and load the variables into arrays in memory before outputting them. Some arrays can be quite big, so I wonder what the memory limit is in ADLA. Is there some max amount of memory you can…
3 votes, 2 answers

USQL - How To Select All Rows Between Two String Rows in USQL

Here is my complete task description: I have to extract data from multiple files using U-SQL and output it into a CSV file. Every input file contains multiple reports based on some string rows ("START OF ..." and "END OF ..." working as report…
Kishan Gupta
3 votes, 2 answers

Transfer data from U-SQL managed table to Azure SQL Database table

I have a U-SQL managed table that contains schematized structured data. CREATE TABLE [AdlaDb].[dbo].[User] ( UserGuid Guid, Postcode string, Age int?, DateOfBirth DateTime? ) And an Azure SQL Database table. CREATE TABLE…
databash
3 votes, 2 answers

Slow execution of USQL

I have created a simple script to compute a score between two strings. The U-SQL and back-end .NET code is below. CN_Matcher.usql: REFERENCE ASSEMBLY master.FuzzyString; @searchlog = EXTRACT ID int, Input_CN string, …
The6thSense
3 votes, 3 answers

How to move processed file to another directory using U-SQL?

I am writing a U-SQL query that extracts information from a file (say query.txt) stored in a directory (say A). Now I want to move query.txt to another directory (say processed) after I am done processing it and have…
Abhishek Singh
3 votes, 1 answer

Out of memory Exception running U-SQL Activity using Azure Data Factory

I am running a U-SQL Activity as part of a pipeline in Azure Data Factory for a defined time slice. The U-SQL Activity runs a sequence of U-SQL scripts that read in and process data stored in Azure Data Lake. While the data processes successfully in…
Shahzad Badar
3 votes, 2 answers

How do I ignore invalid rows in U-SQL EXTRACT that don't fit schema?

I'm trying to extract some data from a CSV file using the following U-SQL EXTRACT statement: EXTRACT SessionId string, Latitude double, Longitude double, Timestamp int FROM…
outside2344
3 votes, 1 answer

Google's BigQuery vs Azure data lake U-SQL

I am trying to understand the differences, or the pros and cons, between Google's BigQuery and Azure Data Lake U-SQL. Which is better? I have searched exhaustively for the major differences but couldn't find them.
Kaushik
3 votes, 1 answer

Optimize for Max Degree of Parallelism in Azure Data Lake

What are the guidelines, or where can we find guidelines, for designing a system for optimal parallelism? I understand that the data is split out across the individual nodes and optimized to do so. The data that I have in files currently has…
3 votes, 1 answer

Querying Windows Azure Storage Table in Azure Data Lake Analytics U-SQL

I have found documentation for querying files from Azure Data Lake Storage or Azure Storage Blob with EXTRACT FROM as well as SQL, Azure SQL Database or Azure Data Warehouse with external tables in a data source location. However, I cannot find…
3 votes, 1 answer

How do I know when parallelism will be triggered in Azure data lake analytics?

I have an Azure Data Lake Analytics job that processes around 3.8 million records stored in Azure Data Lake Store using U-SQL user-defined operators. On the first run, I set parallelism to 10, and on the second run I set it to 1.…
Jamil