While creating a PolyBase external file format definition for external data stored in Azure Blob Storage, I am struggling to specify the field terminator as a Unicode character. I want a Unicode character because, when I load data into Azure Blob Storage using Azure Data Factory, the copy activity doesn't support specifying more than one character as the column delimiter unless it's a Unicode character like \u0081.
- Can you provide some sample data so we can see your actual delimiter? You would make life easier for yourself if you used a conventional delimiter such as a comma, tab, or pipe. – wBob Jul 17 '17 at 10:57
- Hi wBob, thanks for your reply. The reason I cannot use a conventional single-character delimiter is that the data has a lot of text fields, and every combination I tried has messed up the data in some way, i.e. the data values contain pipe characters, tabs, new lines, commas, and semicolons. – avi Aug 12 '17 at 00:12
1 Answer
The documentation for CREATE EXTERNAL FILE FORMAT suggests custom delimiters are possible, specified by their hex codes:
STRING_DELIMITER = '0x22' -- Double quote hex
STRING_DELIMITER = '0x7E0x7E' -- Two tildes (e.g. ~~)
For your example you could try (untested):
STRING_DELIMITER = '0x81' -- Control character \u0081
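
Since the question is about the field terminator rather than the string delimiter, the same idea could be sketched as a full definition like the one below. This is also untested. The format name `UnicodeDelimitedFormat` is hypothetical, and it is an assumption that FIELD_TERMINATOR accepts the same hex notation that the documentation shows for STRING_DELIMITER. Note also that in a UTF-8 file the character \u0081 is stored as the two bytes 0xC2 0x81, so a single-byte '0x81' may not match depending on how the file is encoded.

```sql
-- Untested sketch: the format name is hypothetical.
CREATE EXTERNAL FILE FORMAT UnicodeDelimitedFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        -- Assumption: hex notation is accepted here as it is for STRING_DELIMITER.
        FIELD_TERMINATOR = '0x81',
        -- Double quote as the string delimiter, written in hex as in the documentation example.
        STRING_DELIMITER = '0x22',
        USE_TYPE_DEFAULT = FALSE
    )
);
```

If the hex form is rejected for the field terminator, a multi-character printable terminator such as '~|~' is another option the documentation lists for FIELD_TERMINATOR.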

wBob