I needed to start dealing with foreign characters, and in doing so, I think I royally screwed up a file's encoding.

The error I'm getting is:

Lexical error at line 1, column 8.  Encountered: "" (0), after :  ""

The first line of the file is:

import xml.etree.cElementTree as ET

Also of note: when I pasted the line above into the textarea to ask this question and submitted, an unknown character appeared between every character.

I have been unable to fix this issue by adding an explicit coding definition:

# -*- coding: utf-8 -*-

I have also been unable to revert the file (using Hg) to a previous version, to copy/paste the code into a new file, or to replace the broken file with copied/pasted code.

Please help!

Jonathan Drake

1 Answer

If it is indeed a zero character in there, you may find you've injected some UTF-16/UCS-2 text. That particular Unicode encoding would have a zero byte in between every ASCII character.
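To see why UTF-16 text looks like "a zero between every character", here is a small illustrative sketch. The sample string and the variable names are made up for the demonstration, and it assumes the little-endian flavour of UTF-16:

```python
# Illustration only: the same ASCII text encoded two ways.
line = "import xml"

ascii_bytes = line.encode("ascii")
utf16_bytes = line.encode("utf-16-le")  # little-endian, no byte order mark

print(ascii_bytes)
print(utf16_bytes)

# For pure-ASCII input, every second byte of the UTF-16 form is zero.
high_bytes = utf16_bytes[1::2]
print(all(b == 0 for b in high_bytes))
```

The UTF-16 form is twice as long, and a tool that expects one byte per character will see a NUL after each letter.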

The best way to find out is to do a hex dump of your file with something like od -xcb myfile.py.
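If od is not at hand, a rough equivalent can be sketched in Python. The helper names `find_nul_offsets` and `hex_dump` are hypothetical, and the sample bytes are an assumed stand-in for a partly corrupted file (plain ASCII followed by UTF-16 text), not the real one:

```python
from typing import List


def find_nul_offsets(data: bytes, limit: int = 10) -> List[int]:
    """Return the offsets of the first `limit` zero bytes in `data`."""
    offsets = []
    for index, byte in enumerate(data):
        if byte == 0:
            offsets.append(index)
            if len(offsets) == limit:
                break
    return offsets


def hex_dump(data: bytes, width: int = 16) -> str:
    """Format `data` as offset, hex bytes, and printable characters."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes (including NUL) are shown as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{start:08x}  {hex_part.ljust(width * 3)} {text_part}")
    return "\n".join(lines)


# Assumed sample: ASCII "import" followed by UTF-16-LE " xml."
sample = b"import" + " xml.".encode("utf-16-le")
print(hex_dump(sample))
print(find_nul_offsets(sample))
```

For a real file, the bytes would come from something like `open("myfile.py", "rb").read()`; opening in binary mode matters, since text mode would try to decode the content.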

If that is the case, then you'll need to edit the file with something that's able to see those characters, and fix them up.

vi would be my first choice (since that's what I'm used to) but I don't want to start any holy wars with the Emacs illuminati. In vi, they'll most likely show up as ^@ characters.
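If an editor is not an option, the same cleanup can be scripted. This is a minimal sketch rather than a definitive fix: `strip_nul_bytes` and `repair_file` are hypothetical helpers, and the approach assumes the file should contain only ASCII. Genuine non-ASCII UTF-16 text would be damaged by it and should be decoded properly instead.

```python
import shutil
from pathlib import Path


def strip_nul_bytes(data: bytes) -> bytes:
    """Remove every zero byte; only safe when the text is meant to be ASCII."""
    return data.replace(b"\x00", b"")


def repair_file(path) -> Path:
    """Strip zero bytes from a file in place, keeping a .bak copy first."""
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # keep the original in case the guess is wrong
    path.write_bytes(strip_nul_bytes(path.read_bytes()))
    return backup


# Assumed sample: ASCII "import" followed by UTF-16-LE " xml."
damaged = b"import" + " xml.".encode("utf-16-le")
print(strip_nul_bytes(damaged))
```

Calling `repair_file("myfile.py")` would rewrite the file and leave `myfile.py.bak` beside it. It is worth checking the result with a hex dump afterwards, since a UTF-16 byte order mark (bytes ff fe or fe ff) would not be removed by this.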

paxdiablo
  • Thank you! The file indeed has zero bytes in between every ASCII character. `0000000 6d69 6f70 7472 0020 0078 006d 006c 002e i m p o r t \0 x \0 m \0 l \0 . \0 151 155 160 157 162 164 040 000 170 000 155 000 154 000 056 000 `. How do I remove these? I'm using Eclipse. – Jonathan Drake Jun 28 '11 at 03:20
  • Perfect. Used vi and had to search/replace on ^@ – Jonathan Drake Jun 28 '11 at 03:33