I am attempting to read a file and pass its contents through a data-redundancy and cryptography algorithm that takes a string. How can I properly read this file in as a string? I need an encoding that maps every possible byte value to a character, since the file contains raw binary bytes. So far I have tried the 'cp866' encoding, but with it the read is very, very slow.
How can I read from the file as a string just as the UNIX cat command or the Windows type command does?
This is my current code:
character_encoding = 'cp866'

# Decode the file with cp866 and read the whole thing into one string.
with open(r'Insert_Your_Large_Binary_File_Here',
          encoding=character_encoding) as file:
    text = file.read()
    print(text)
How can I speed up this read, or better replicate the string that the cat and type commands produce?
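For comparison, this is the binary-mode variant I have been experimenting with. It skips text decoding entirely, but it assumes my algorithm could be changed to accept bytes instead of str, which may not be possible:

# Minimal sketch: read the raw bytes with no text decoding at all.
# The placeholder path is the same one used above.
with open(r'Insert_Your_Large_Binary_File_Here', 'rb') as file:
    data = file.read()  # data is a bytes object, one item per byte in the file

print(len(data))  # sanity check that the whole file was read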
How do I print the data to stdout? Is print sufficient? Essentially, I am looking for a cross-platform Python script that replicates this behaviour.
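If raw bytes are acceptable for output, this is roughly what I have in mind for the printing step; my understanding is that sys.stdout.buffer is the underlying binary stream in Python 3, so writing to it should avoid any re-encoding:

import sys

with open(r'Insert_Your_Large_Binary_File_Here', 'rb') as file:
    # Push the raw bytes straight to the binary layer of stdout,
    # which I hope mimics what cat and type do.
    sys.stdout.buffer.write(file.read())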
This is an extension of my previous question. Any help, or a pointer to the proper Python package, would be greatly appreciated.
Update: When I don't specify an encoding, I get the following error:

Traceback (most recent call last):
  File "filename_redacted", line 13, in <module>
    text = file.read()
  File "C:\Python34\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 34: character maps to <undefined>
Based on this question, it looks like I should be using this ancient MS-DOS encoding. Is there really no better way to do this?
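For reference, this is the alternative I am currently leaning towards; my understanding is that 'latin-1' maps every byte value 0-255 to the code point with the same number, so no byte should be undecodable, but I would appreciate confirmation that this is the right approach:

# Sketch assuming 'latin-1' maps each of the 256 byte values to one character.
with open(r'Insert_Your_Large_Binary_File_Here', encoding='latin-1') as file:
    text = file.read()

# Encoding with the same codec should give back the original bytes unchanged.
raw_bytes = text.encode('latin-1')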