1

I am getting the following error while copying data from blob to Azure Data Warehouse:

"errorCode": "2200", "message": "ErrorCode=FailedDbOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error happened when loading data into SQL Data Warehouse.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: Too long string in column [-1]: Actual len = [4251]. MaxLEN=[3999],Source=.Net SqlClient Data Provider,SqlErrorNumber=107090,Class=16,ErrorCode=-2146232060,State=1,Errors=[{Class=16,Number=107090,State=1,Message=HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: Too long string in column [-1]: Actual len = [4251]. MaxLEN=[3999],},],'", "failureType": "UserError", "target": "CopyToADW"

How can trim data while copying to ADW without making any change in source data?

Pரதீப்
  • 91,748
  • 19
  • 131
  • 172
Megha Agarwal
  • 11
  • 1
  • 3
  • Can you detail the column definition of the external table and what kind of data is being copied from the data source to the external table? – Mike Ubezzi Jan 07 '19 at 21:59
  • @Mike Ubezzi MSFT - this is the column definition: [CRM_Note_Description] [nvarchar](3999) NULL. I didn't get the 2nd part of question. If you are asking which delimiter is used, then it is SOH delimited. – Megha Agarwal Jan 08 '19 at 05:15
  • In DWH, you import data to an external (staging) table (example: ext.crm_note_descption) from an external data source, and then run a second job to import to production table (dbo.crm_note_description). You have defined this external table as nvarchar. Does the contents of the CRM_Note_Description contain any double-byte characters or extra characters you no longer need? You can use the ELT method to load the data: https://learn.microsoft.com/en-us/azure/sql-data-warehouse/design-elt-data-loading – Mike Ubezzi Jan 08 '19 at 22:12
  • If you absolutely need all the data in the column, you can use nvarchar (max): https://learn.microsoft.com/en-us/sql/t-sql/data-types/nchar-and-nvarchar-transact-sql?view=sql-server-2017 (Use nvarchar(max) when the sizes of the column data entries vary considerably, and the string length might exceed 4,000 byte-pairs). – Mike Ubezzi Jan 08 '19 at 22:20
  • @ Mike Ubezzi MSFT - Can u please tell in more detail how should I do this? For more clarification, we are already using Polybase mechanism to load data from Azure blob to ADW in ADF. This is the series of activity being followed: Lookup Activity--> CopyToADW. – Megha Agarwal Jan 09 '19 at 06:14
  • Also, this is the internal table which I created in ADW and no external tables were created. So, considering these scenario how do u suggest to modify the existing activity and load it to ADW? – Megha Agarwal Jan 09 '19 at 06:16
  • That is why I explained the two step process, is to provide some education around best practice for importing data. You could create the external table(s) as nvarchar(max), run a clean-up process on the column data to remove unwanted stuff, and then migrate the data to a final (internal) table. That be said, please follow the guidance on this thread: https://stackoverflow.com/questions/10608310/changing-a-data-column-from-varcharmax-to-nvarcharmax-in-sql-server-2008 (ALTER TABLE some_table ALTER COLUMN some_column nvarchar(max)) – Mike Ubezzi Jan 09 '19 at 18:32

0 Answers0