python - writing hex digits to csv

Question

I have the following string:

>>> line = '\x00\t\x007\x00\t\x00C\x00a\x00r\x00d\x00i\x00o\x00 \x00M\x00e\x00t\x00a\x00b\x00o\x00l\x00i\x00c\x00 \x00C\x00a\x00r\x00e\x00\t\x00\t\x00\t\x00\t\x00 \x001\x002\x00,\x007\x008\x008\x00,\x005\x002\x008\x00.\x000\x004\x00\r\x00\n'

When I type the variable line in the Python terminal, it shows the following:

>>> line
'\x00\t\x007\x00\t\x00C\x00a\x00r\x00d\x00i\x00o\x00 \x00M\x00e\x00t\x00a\x00b\x00o\x00l\x00i\x00c\x00 \x00C\x00a\x00r\x00e\x00\t\x00\t\x00\t\x00\t\x00 \x001\x002\x00,\x007\x008\x008\x00,\x005\x002\x008\x00.\x000\x004\x00\r\x00\n'

When I print it, it shows the following:

>>> print line
        7    Cardio Metabolic Care               12,788,528.04

In the variable line, each word is separated by \t, and I want to save it to a CSV file. So I tried the following code:

import csv
with open('test.csv', 'wb') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    spamwriter.writerow(line.split('\t'))

When I look at the test.csv file, I get only the following:

,,,,,,

Is there any way to get the words into the CSV file? Kindly help.

CSV doesn't have to mean comma-separated; tab is a valid delimiter as well. So you already have a CSV! — e4c5, May 16 '17 at 05:55
Actually, I am trying to convert a corrupted file to a CSV file. — Jeril, May 16 '17 at 05:59
This may help: http://stackoverflow.com/questions/29230943/importing-a-text-file-gives-error — DYZ, May 16 '17 at 06:01
@e4c5 it gives me the following: `['\x00', '\x007\x00', '\x00C\x00a\x00r\x00d\x00i\x00o\x00 \x00M\x00e\x00t\x00a\x00b\x00o\x00l\x00i\x00c\x00 \x00C\x00a\x00r\x00e\x00', '\x00', '\x00', '\x00', '\x00 \x001\x002\x00,\x007\x008\x008\x00,\x005\x002\x008\x00.\x000\x004\x00\r\x00\n']` — Jeril, May 16 '17 at 06:05
You are reading your file incorrectly. Open it with `codecs.open("source.csv", "r", "utf-16")` or `io.open("source.csv", "r", encoding="utf-16")`. — DYZ, May 16 '17 at 06:12
@DYZ I referred to your previous comment and it's working. I used the following: `import io; file1 = io.open(filename, "r", encoding="utf-16")` and it gives me the answer. With `utf-8` it gives me a `UnicodeDecodeError`. Thanks a lot. — Jeril, May 16 '17 at 06:19

Tomalak · Accepted Answer · 2017-05-16T06:43:40.017

Your input text is not corrupted; it is encoded as UTF-16 (big-endian in this case, which is why every character is preceded by a \x00 byte). And it is already CSV, just with tab as the delimiter.

You must decode it into a Unicode string; after that you can use it normally.

Ideally you declare the proper byte encoding when you read it from a source. For example, when you open a file you can state the encoding the file uses so that the file reader will decode the contents for you.
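As a minimal sketch of declaring the encoding at open time: `io.open` accepts an `encoding` argument in Python 2.6+ and is the same function as the built-in `open` in Python 3. The helper name `read_decoded_lines` and the default of `"utf-16"` are assumptions for illustration. The `"utf-16"` codec relies on a byte order mark, so a file without one may need `"utf-16-be"` or `"utf-16-le"` instead.

```python
import io


def read_decoded_lines(path, encoding="utf-16"):
    """Return the lines of `path`, decoded with the declared encoding.

    `read_decoded_lines` is a hypothetical helper, not part of any library.
    """
    # The file object decodes the bytes for us, so each line is already
    # a normal text string without embedded \x00 bytes.
    with io.open(path, "r", encoding=encoding) as handle:
        return [line.rstrip("\r\n") for line in handle]
```

Each returned line can then be split on `\t` or passed on to other text-processing code.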

If you have that byte string from a source where you can't declare an encoding while reading it, you can decode manually:

line = '\x00\t\x007\x00\t\x00C\x00a\x00r\x00d\x00i\x00o\x00 \x00M\x00e\x00t\x00a\x00b\x00o\x00l\x00i\x00c\x00 \x00C\x00a\x00r\x00e\x00\t\x00\t\x00\t\x00\t\x00 \x001\x002\x00,\x007\x008\x008\x00,\x005\x002\x008\x00.\x000\x004\x00\r\x00\n'
decoded = line.decode('utf_16_be')

print decoded
#   7   Cardio Metabolic Care                12,788,528.04
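The snippet above is Python 2. For comparison, here is a rough Python 3 sketch of the same manual decode, followed by the split and the CSV write from the question. It assumes the data arrives as a `bytes` object; the in-memory `io.StringIO` buffer is only a stand-in for a real output file.

```python
import csv
import io

# Python 3: raw encoded data is a bytes object, hence the b prefix.
raw = (b'\x00\t\x007\x00\t\x00C\x00a\x00r\x00d\x00i\x00o\x00 \x00M\x00e\x00t'
       b'\x00a\x00b\x00o\x00l\x00i\x00c\x00 \x00C\x00a\x00r\x00e\x00\t\x00\t'
       b'\x00\t\x00\t\x00 \x001\x002\x00,\x007\x008\x008\x00,\x005\x002\x008'
       b'\x00.\x000\x004\x00\r\x00\n')

# Decode the big-endian UTF-16 bytes into a normal str.
decoded = raw.decode('utf_16_be')

# Drop the trailing line ending, then split on the tab delimiter.
fields = decoded.rstrip('\r\n').split('\t')
# fields == ['', '7', 'Cardio Metabolic Care', '', '', '', ' 12,788,528.04']

# Write one CSV row to an in-memory buffer (stand-in for a file).
buffer = io.StringIO(newline='')
csv.writer(buffer).writerow(fields)
csv_text = buffer.getvalue()
```

Because the last field contains commas, the writer quotes it, so the amount survives as a single column.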

But since I suppose that you are actually reading it from a file:

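import csv
import codecs

# 'utf16' relies on the byte order mark to detect endianness;
# use 'utf_16_be' explicitly if the file has no BOM.
with codecs.open('input.txt', 'r', encoding='utf16') as in_file, \
     codecs.open('output.csv', 'w', encoding='utf8') as out_file:
    # Read tab-delimited rows and write them back comma-delimited.
    reader = csv.reader(in_file, delimiter='\t')
    writer = csv.writer(out_file, delimiter=',', quotechar='"')

    writer.writerows(reader)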

Note that I read the file as UTF-16, but write it as UTF-8, as this is the more common encoding. Pick the output encoding you need. — Tomalak, May 16 '17 at 06:40
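On Python 3 the same conversion can be sketched with the built-in `open`, which takes `encoding` directly. The function name `convert_tsv_to_csv` and its default encodings are assumptions for illustration, and the source file is assumed to start with a byte order mark. The `newline=''` argument is what the `csv` module documentation recommends for files passed to readers and writers.

```python
import csv


def convert_tsv_to_csv(src_path, dst_path,
                       src_encoding='utf-16', dst_encoding='utf-8'):
    """Re-encode a tab-delimited file and rewrite it comma-delimited.

    Hypothetical helper; adjust the encodings to match your data.
    """
    with open(src_path, 'r', encoding=src_encoding, newline='') as in_file, \
         open(dst_path, 'w', encoding=dst_encoding, newline='') as out_file:
        reader = csv.reader(in_file, delimiter='\t')
        writer = csv.writer(out_file, delimiter=',', quotechar='"')
        # Stream rows straight from the reader to the writer.
        writer.writerows(reader)
```

A call might look like `convert_tsv_to_csv('input.txt', 'output.csv')`, reusing the file names from the answer.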
