
I am having trouble opening a series of .xml files on macOS. They were forwarded to me, and their header declares a UTF-16 encoding.

The XML declaration in all these files looks clean and specifies encoding="utf-16". MS Excel cannot open the files, and text editors such as emacs and TextEdit display strange foreign characters. RStudio and Xcode, however, open them properly. If I open a file in either of those and manually change the declared encoding to 'utf-8', Excel and the text editors then parse it properly as well.

Interestingly, once the files are labeled 'utf-8', I can open them in emacs, manually edit the encoding back to 'utf-16', and save them. After that they open correctly even though they are labeled 'utf-16' again. I suspect the problem lies in how these files were generated in the first place, but I am puzzled.

  • The encoding specified in the XML declaration is only descriptive -- changing it doesn't actually change the physical encoding of the document. The behavior you observe indicates that the document is truly UTF-8 but is mislabeled in the XML declaration as UTF-16. (You might find some tools that disregard the XML declaration and so can open the file anyway.) See [this Q/A](https://stackoverflow.com/q/29915467/290085) for further details. – kjhughes Nov 30 '20 at 14:00
  • [I disagree with the closing of this question as lacking debugging details. It may be a duplicate of the linked Q/A in my previous comment, but without getting that question's error message, you may not have been able to find it. I've thus voted to reopen your question. Hopefully my comment above will help you in the meantime.] – kjhughes Nov 30 '20 at 14:06
  • Thanks for your reply! It appears that emacs does physically change the file. At first, emacs cannot open the file labeled 'utf-16'. However, I can open the file with Xcode and manually edit 'utf-16' to 'utf-8'. After that, I can open the file with emacs and manually edit 'utf-8' back to 'utf-16'. After this last step, I can open the file normally with emacs. – Hugues Mercier Dec 01 '20 at 15:11
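
One way to test the diagnosis from the first comment (the bytes are really UTF-8 while the declaration says UTF-16) is to compare the physical encoding with the declared one. The following is a minimal Python sketch, not a definitive tool: the helper names `sniff_encoding` and `declared_encoding` are hypothetical, the sample bytes are made up for illustration, and the detection is only a heuristic based on the byte-order mark, the byte pattern of `<?`, and a trial UTF-8 decode.

```python
import codecs
import re
from typing import Optional


def sniff_encoding(raw: bytes) -> Optional[str]:
    """Guess the physical encoding of an XML byte string (heuristic only)."""
    # UTF-32 BOMs are checked first because the UTF-32-LE BOM
    # begins with the same two bytes as the UTF-16-LE BOM.
    if raw.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # No BOM: "<?" stored as two-byte code units still reveals UTF-16.
    if raw[:4] == b"<\x00?\x00":
        return "utf-16-le"
    if raw[:4] == b"\x00<\x00?":
        return "utf-16-be"
    try:
        raw.decode("utf-8")  # plain ASCII also passes this test
        return "utf-8"
    except UnicodeDecodeError:
        return None


def declared_encoding(raw: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, lowercased."""
    physical = sniff_encoding(raw) or "latin-1"  # latin-1 never fails to decode
    head = raw[:200].decode(physical, errors="replace")
    match = re.search(r'<\?xml[^>]*encoding=["\']([A-Za-z][\w.\-]*)["\']', head)
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    # Made-up sample: UTF-8 bytes carrying a UTF-16 label, as the comment suspects.
    sample = '<?xml version="1.0" encoding="utf-16"?><a>é</a>'.encode("utf-8")
    print(sniff_encoding(sample), declared_encoding(sample))
```

If the two values disagree for one of the forwarded files (read with `open(path, "rb").read()`), the declaration is mislabeled. That would also be consistent with the emacs observation: an editor that honors the declared encoding when saving would rewrite the bytes as genuine UTF-16, though this is an assumption about the editor's behavior rather than something confirmed in this thread.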
