
Here is what I have tried:

>>> with open("symbols.raw") as f:
...     text=f.readlines()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "C:\Python35\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 1694: character maps to <undefined>
>>> with open("symbols.raw",encoding='utf-16') as f:
...     text=f.readlines()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "C:\Python35\lib\codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
  File "C:\Python35\lib\encodings\utf_16.py", line 61, in _buffer_decode
    codecs.utf_16_ex_decode(input, errors, 0, final)
UnicodeDecodeError: 'utf-16-le' codec can't decode bytes in position 7500-7501: illegal encoding
>>> with open("symbols.raw",encoding='utf-8') as f:
...     text=f.readlines()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "C:\Python35\lib\codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfe in position 7: invalid start byte

Opening the file in binary mode works, but I do not understand how to interpret the bytes or how to edit my own data into them.

>>> with open("symbols.raw",'rb') as f:
...     text=f.readlines()
...

Here is the file: symbols.raw

Please let me know how I can read it in a human-readable form and write my own data into it. Here is the format of the symbols.raw file.

Jaffer Wilson
  • why don't you read the file as binary and then decode it to utf-16? – Avenger789 Apr 01 '20 at 10:05
  • @Avenger789 won't that give the same problem as reading the file with the utf-16 encoding as shown in the second error? – incarnadine Apr 01 '20 at 10:08
  • As you see I have tried. But not working. Secondly, I have said I am not able to understand the binary. I have written in the question. – Jaffer Wilson Apr 01 '20 at 10:12
  • You can use 'rb' to read it and then use the `struct` module to interpret it; you can read the docs at https://docs.python.org/3/library/struct.html#module-struct – xkcdjerry Apr 01 '20 at 10:13
  • @Avenger789 tried your idea: Error: `UnicodeError: UTF-16 stream does not start with BOM` – Jaffer Wilson Apr 01 '20 at 10:15
  • is your .raw an image file? possible duplicate of https://stackoverflow.com/questions/32439831/open-raw-image-data-using-python – albusSimba Apr 01 '20 at 10:43
  • No it is not an image file. That is why I am facing a lot of trouble. I got one idea reading through the internet: https://www.forexfactory.com/showthread.php?p=11585927#post11585927 But that is not working since I do not have the license of WinHex. – Jaffer Wilson Apr 01 '20 at 10:46
  • One way is to use HxD to view the bytes and extract the data accordingly: read the file in as bytes and remove the headers. – albusSimba Apr 01 '20 at 10:48
  • Please can you guide me with the HxD? It is something new for me... – Jaffer Wilson Apr 01 '20 at 10:52

3 Answers

You can open the file with encoding="ISO-8859-1":

with open("symbols.raw", encoding="ISO-8859-1") as f:
    text=f.readlines()
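
A minimal sketch of why this does not raise, using made-up sample bytes rather than the real contents of symbols.raw: ISO-8859-1 maps each of the 256 byte values to exactly one character, so decoding cannot fail and encoding back reproduces the original bytes. The resulting text is only meaningful for fields that really are text.

```python
# Hypothetical sample bytes, including values that broke cp1252 and utf-8 in the question.
sample = bytes([0x41, 0x90, 0xFE, 0x00, 0x42])

text = sample.decode("ISO-8859-1")      # never raises: every byte value has a mapping
round_trip = text.encode("ISO-8859-1")  # encoding back restores the same bytes

print(len(text))             # one character per byte
print(round_trip == sample)  # True when the round trip is lossless
```

When going through open() in text mode instead of decode(), passing newline="" keeps line-ending bytes from being translated on the way in or out.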
kederrac
  • Wow, I can edit it. But please can you help me with editing the file as well? I am not able to understand the binary file. I tried, and I found something that is related to the format. Can you help me with that, please? – Jaffer Wilson Apr 01 '20 at 10:44
  • I have mentioned in the question that I am trying to read and edit the file. Already mentioned in the same question. Please can you help me? – Jaffer Wilson Apr 01 '20 at 10:47

You should be able to tell Python to ignore or replace the bytes it cannot decode by passing the errors parameter of the open function (errors="ignore" or errors="replace"), as shown in this answer.
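
As a rough illustration of the two settings, here is a sketch that uses a small temporary file with made-up contents in place of symbols.raw. Both modes discard information, so they suit inspection more than editing a file that has to be written back byte for byte.

```python
import os
import tempfile

# Hypothetical stand-in for symbols.raw: ASCII text with one byte (0xFE) that is invalid in UTF-8.
path = os.path.join(tempfile.mkdtemp(), "sample.raw")
with open(path, "wb") as f:
    f.write(b"EUR\xfeUSD")

with open(path, encoding="utf-8", errors="ignore") as f:
    ignored = f.read()    # the invalid byte is silently dropped

with open(path, encoding="utf-8", errors="replace") as f:
    replaced = f.read()   # the invalid byte becomes U+FFFD, the replacement character

print(repr(ignored), repr(replaced))
```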

incarnadine

One way is to read it as bytes first and then convert it into a list, because Python bytes objects are immutable and cannot be edited in place.

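def read_file_bytes(file_name):
    # Read the whole file as raw bytes; the with-block closes it even on error.
    with open(file_name, "rb") as in_file:
        return in_file.read()

file_name = "symbols.raw"
file_data = list(read_file_bytes(file_name))  # list of ints (0-255), editable in place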

Alternatively, you can slice the bytes according to the symbols file format that you have provided (assuming size is the number of bytes):

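file_data = read_file_bytes(file_name)
# Assumption: the name is NUL-padded single-byte text; adjust the codec to the real format.
name = file_data[:12].rstrip(b"\x00").decode("ISO-8859-1")
# Assumption: a 4-byte little-endian integer; int() on a bytes slice would not decode it.
unknown_2 = int.from_bytes(file_data[1628:1628 + 4], "little")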
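
The `struct` module suggested in the comments can do the same slicing and conversion in one step. The sketch below assumes a hypothetical record of a 12-byte NUL-padded name followed by a 4-byte little-endian unsigned integer; the real layout of symbols.raw may differ, and the sample bytes are made up.

```python
import struct

# Hypothetical record layout: 12-byte NUL-padded name, then a 4-byte
# little-endian unsigned integer. "<" disables alignment padding.
RECORD = struct.Struct("<12sI")

# Made-up data standing in for bytes read from the file.
sample = RECORD.pack(b"EURUSD", 5)

# Unpack one record starting at offset 0 of the buffer.
raw_name, digits = RECORD.unpack_from(sample, 0)
name = raw_name.rstrip(b"\x00").decode("ascii")

print(name, digits, RECORD.size)
```

For a file holding many records, the offset argument of unpack_from can be advanced by RECORD.size for each one.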

To write a new file you can just do the following:

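def write_bytes_to_file(file_name, data):
    # Write raw bytes to a new file; the with-block closes it even on error.
    with open(file_name, "wb") as out_file:
        out_file.write(data)

# Encode each field back to bytes (assumed: 12-byte NUL-padded name, 4-byte little-endian int).
all_bytes = name.encode("ISO-8859-1").ljust(12, b"\x00") + unknown_2.to_bytes(4, "little")
write_bytes_to_file('new_file_name.raw', all_bytes)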
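
To edit values rather than rebuild the file field by field, one option is to copy the bytes into a bytearray, overwrite the fields at their offsets, and write the result to a new file. This is a sketch under the same hypothetical layout (12-byte name, then a 4-byte little-endian integer), with made-up data and a temporary output path.

```python
import os
import struct
import tempfile

# Hypothetical layout: 12-byte name followed by a 4-byte little-endian integer.
original = struct.pack("<12sI", b"EURUSD", 5)

data = bytearray(original)                  # mutable copy of the file bytes
struct.pack_into("<I", data, 12, 3)         # overwrite the integer at offset 12
data[0:12] = b"GBPUSD".ljust(12, b"\x00")   # replace the name, keeping its width

# Write to a new file so the original stays untouched.
out_path = os.path.join(tempfile.mkdtemp(), "new_file_name.raw")
with open(out_path, "wb") as f:
    f.write(data)

with open(out_path, "rb") as f:
    written = f.read()
print(len(written), struct.unpack("<12sI", written))
```

Keeping every field at its original width means the offsets of all later fields stay valid.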
albusSimba
  • But how I can edit my data into it? Please can you add that part too? I have that in question as well that I am trying to read and edit the file – Jaffer Wilson Apr 01 '20 at 11:08
  • You can already edit the values accordingly. Or do you mean you want to write a new raw file? – albusSimba Apr 01 '20 at 11:11
  • I guess writing new file is always safe. I am trying to add my own values into the file. That's why I am very confused as I am dealing with binary data. – Jaffer Wilson Apr 01 '20 at 11:16
  • To be fair, this question has 2 parts, possibly more, and everything you need has already been provided. All that is left for you is to read the file according to the sequence that you have provided. – albusSimba Apr 01 '20 at 11:18