While creating a PolyBase external file format definition for external data stored in Azure Blob Storage, I am struggling to specify the field terminator as a Unicode character. I want a Unicode character because, when I load data into Azure Blob Storage using Azure Data Factory, the copy activity doesn't support a column delimiter of more than one character unless it is a Unicode character like \u0081.

avi
  • Can you provide some sample data so we can see your actual delimiter? You would make life easier for yourself if you used a conventional delimiter such as a comma, tab, or pipe. – wBob Jul 17 '17 at 10:57
  • Hi wBob, thanks for your reply. The reason I cannot use a conventional single-character delimiter is that the data has a lot of text fields, and every combination I tried has messed up the data in some way, i.e. the data values contain pipe characters, tabs, new lines, commas, and semicolons. – avi Aug 12 '17 at 00:12

1 Answer

The documentation here suggests custom delimiters are possible, but they are specified using their hex codes:

STRING_DELIMITER = '0x22' -- Double quote hex
STRING_DELIMITER = '0x7E0x7E' -- Two tildes (e.g. ~~)

For your example you could try (untested):

STRING_DELIMITER = '0x81' -- Control character \u0081
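
Since the question is about the field terminator rather than the string delimiter, here is a minimal sketch of what a full definition might look like. It is untested and rests on assumptions: the format name `TextFileFormat_Hex81` is made up for illustration, and whether FIELD_TERMINATOR accepts the same hex notation shown above for STRING_DELIMITER is a guess, not something the documentation examples confirm.

```sql
-- Sketch only (untested). The format name is hypothetical.
CREATE EXTERNAL FILE FORMAT TextFileFormat_Hex81
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        -- Assumption: hex notation is accepted here as it is for STRING_DELIMITER.
        FIELD_TERMINATOR = '0x81',
        -- Double quote as the string delimiter, taken from the example above.
        STRING_DELIMITER = '0x22',
        USE_TYPE_DEFAULT = FALSE
    )
);
```

One caveat worth checking: if the files are written as UTF-8, U+0081 is stored as the two bytes 0xC2 0x81 rather than a single 0x81 byte, so the hex value may need to match the bytes actually present in the file.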
wBob