I am attempting to read a file and pass its contents through a data redundancy and cryptography algorithm that takes a string. How can I properly read this file in as a string? I need an encoding that maps every possible byte value to a character, since the file contains raw binary bytes. So far I have tried 'cp866', but with this encoding the file is read very, very slowly.

How can I read from the file as a string just as the UNIX cat command or the Windows type command does?

This is my code:

character_encoding = 'cp866'

with open(r'Insert_Your_Large_Binary_File_Here', 
          encoding=character_encoding) as file:
    text = file.read()
    print(text)

How can I speed up this code, or better replicate the output that the cat and type commands yield?

How do I print the data to STDOUT? Is print sufficient? Essentially, I am looking for a cross-platform Python script that reproduces this data.
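For reference, a minimal sketch of what cat and type effectively do: copy raw bytes to standard output without decoding them at all. The function name `cat` and the `chunk_size` parameter are made up for illustration, and the sketch assumes Python 3, where `sys.stdout.buffer` exposes the binary stream underneath the text-mode `sys.stdout`.

```python
import sys


def cat(path, out=None, chunk_size=64 * 1024):
    """Copy the raw bytes of `path` to a binary stream and return the byte count.

    `cat` and `chunk_size` are hypothetical names chosen for this sketch.
    """
    if out is None:
        # Binary layer under sys.stdout (assumes a regular Python 3 stdout).
        out = sys.stdout.buffer
    total = 0
    with open(path, "rb") as f:  # "rb": no decoding, no newline translation
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            out.write(chunk)
            total += len(chunk)
    out.flush()
    return total
```

Calling `cat(r'Insert_Your_Large_Binary_File_Here')` would write the file to STDOUT unchanged. Because nothing is decoded, no encoding has to be chosen, and reading in chunks keeps memory use flat for large files.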

This is an extension of my previous question

Any help, or a pointer to a suitable Python package, would be greatly appreciated.

Update: When I don't specify an encoding, I get the following error:

Traceback (most recent call last):
  File "filename_redacted", line 13, in <module>
    text = file.read()
  File "C:\Python34\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 34: character maps to <undefined>

Based on this question, it looks like I should be using this ancient MS-DOS encoding. Is there really no better way to do this?
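One commonly used alternative, sketched here under assumptions: the 'latin-1' (ISO-8859-1) codec maps each of the 256 byte values to the Unicode code point with the same number, so any binary file decodes without error and encodes back to the identical bytes. The helper names below are hypothetical, and whether this is faster than 'cp866' on a given file has not been measured here.

```python
def read_file_as_str(path):
    """Read a binary file and return it as a str with one character per byte.

    Hypothetical helper: latin-1 maps byte value N to code point U+00NN,
    so no byte can fail to decode.
    """
    with open(path, "rb") as f:
        raw = f.read()
    return raw.decode("latin-1")


def str_to_bytes(text):
    """Inverse of the decoding above: recover the original bytes."""
    return text.encode("latin-1")
```

A string produced this way can be handed to an algorithm that expects a str, and `str_to_bytes` restores the exact original bytes afterwards. If the algorithm can accept bytes directly, skipping the decode step entirely is simpler.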

Skylion