
I have strings of eight 1's and 0's (e.g. 11100011), and I need to convert each one to a single unique byte (or a character, object, etc. that takes up exactly one byte per string when saved to file). Each value must be unique per arrangement of 1's and 0's so that I can reverse the process later and get each string back.

I'll then take all the bytes (or objects) and save them to a file (I've been using pickle for this so far while experimenting). Critically, the final file size in bytes must equal the number of 8-character strings I have.

How would I convert the strings, and how would I reverse the process?

Thanks in advance, very much indeed!

Edit: I was asked to provide some examples of my input data. Basically, given an input file of text, e.g. 'Hello World', the first part of my code runs an algorithm and outputs a list of ints, each either 1 or 0:

>>> array = runAlgorithm()
>>> array
 [1, 0, 1, 0, ....]

The list can hold anything from a single entry to, typically, thousands of entries. After padding the list so that its number of entries is divisible by 8, I cast each entry to a string and concatenate them all together (so technically I do have access to them as 1 and 0 ints). I currently then slice the resulting long string into many 8-character strings.
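The padding, joining and slicing described above can be sketched roughly as follows. This is only an illustration: the function name `chunk_bits` is hypothetical, and padding with trailing zeros is an assumption, since the padding scheme is not stated here.

```python
def chunk_bits(bits, pad_bit=0):
    """Pad a list of 0/1 ints to a multiple of 8, then split it into 8-character strings.

    Assumption: padding is appended at the end using pad_bit (hypothetical choice).
    """
    padded = list(bits)
    remainder = len(padded) % 8
    if remainder:
        # Add just enough padding bits to reach the next multiple of 8.
        padded.extend([pad_bit] * (8 - remainder))
    joined = "".join(str(b) for b in padded)
    # Slice the long string into consecutive 8-character pieces.
    return [joined[i:i + 8] for i in range(0, len(joined), 8)]


chunks = chunk_bits([1, 0, 1, 0, 1, 1, 0, 0, 1, 1])
print(chunks)  # ['10101100', '11000000']
```

Note that to reverse the process exactly, the number of padding bits would also have to be recorded somewhere, which this sketch does not do.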


I'm unfortunately very new to Python and have been searching hard for how to do this, but I always run into a problem.

I first tried calling int(bit_string, 2) on each string of 1's and 0's to get a unique int between 0 and 255. But then I found (since I'm very much a beginner) that most integers don't take up just one byte each.

Then I tried casting each int to a string, since a regular single-character string saves as one byte. But then of course every int with more than one digit gave me a string two or three bytes in size.
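For reference, one possible way around both problems is Python's built-in `bytes` type, where each element is exactly one value in the range 0 to 255. The helper names `bits_to_byte` and `byte_to_bits` below are hypothetical; this is a minimal sketch, not the only approach.

```python
def bits_to_byte(bit_string):
    """Convert an 8-character string of 1's and 0's to a bytes object of length 1."""
    return bytes([int(bit_string, 2)])


def byte_to_bits(single_byte):
    """Convert a length-1 bytes object back to its 8-character bit string."""
    # Indexing a bytes object yields an int; "08b" zero-pads to 8 binary digits.
    return format(single_byte[0], "08b")


b = bits_to_byte("11100011")
print(b)                # b'\xe3'
print(len(b))           # 1
print(byte_to_bits(b))  # 11100011
```

The zero-padded `"08b"` format matters for the reverse step: without it, a string such as 00000101 would come back as 101.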

Edit:

My last-ditch attempt would be to create my own dictionary covering each of the 256 combinations of 8-character strings, matching a single character to each string, e.g. {a:00000000, b:00000001, ..... !:10100101,[:010101101,.....}, then saving each character to file and hoping that each one is 1 byte in size.

Would this method work, and are there at least 256 characters that take up exactly 1 byte each?
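As a rough illustration of the dictionary idea, the sketch below builds the 256-entry table programmatically instead of by hand. It assumes the Latin-1 (ISO-8859-1) encoding, in which each of the code points 0 to 255 encodes to exactly one byte; the names `table` and `reverse` are hypothetical.

```python
# Map every 8-character bit string to the character with the same code point.
table = {format(n, "08b"): chr(n) for n in range(256)}
# Inverse mapping, for getting the bit strings back.
reverse = {ch: bits for bits, ch in table.items()}

strings = ["11100011", "00000000"]
text = "".join(table[s] for s in strings)

# Assumption: Latin-1 is used so that each character becomes exactly one byte.
encoded = text.encode("latin-1")
print(len(encoded))  # 2

decoded = [reverse[ch] for ch in encoded.decode("latin-1")]
print(decoded)  # ['11100011', '00000000']
```

With a different encoding such as UTF-8, characters above code point 127 would take more than one byte, so the one-byte-per-string property would be lost.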

  • Just for clarity, can you give some examples of your input data. – Mark Dickinson Feb 24 '18 at 18:26
  • I believe your question is answered here: https://stackoverflow.com/questions/7213996/convert-binary-string-representation-of-a-byte-to-actual-binary-value-in-python – Hendrik Evert Feb 24 '18 at 18:29
  • *"must be unique per arrangement of 1's and 0's"*? So you're allowing other ways than the obvious interpretation as binary? I don't think other ways will be easier... – Stefan Pochmann Feb 24 '18 at 18:29
  • OK, I tried the link you provided @HendrikEvert, and only the suggestion to use a bytearray seems to have worked. I took some strings of 1's and 0's, used n = int(exampleString, 2) to convert each to a 0-255 int, then put those ints into a bytearray, e.g. b = bytearray([n, n1, n2, n3, n4]), and wrote b to file. Measuring the size of b, it seems one n makes b 36 bytes, and every additional n increases b's size by 1 byte. That's fine for me. However, how do I then turn each entry in b (a byte array) back into a 0-255 int, to then turn it back into a string of 1's and 0's? Many thanks! – TheRealPaulMcCartney Feb 24 '18 at 20:37
  • On second thoughts, bytearray() doesn't seem to be panning out either. Whenever I take an int, store it in a bytearray and pickle it to disk, the minimum file size is 2 bytes. For some ints it's more. But I need every int (or char, etc.) to be exactly 1 byte in file size, so that the file size in bytes equals the number of ints written to disk. Is this perhaps not possible, or am I doing something wrong? Many thanks. – TheRealPaulMcCartney Feb 25 '18 at 07:49
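Regarding the file-size issue raised in the last two comments: pickle stores extra framing data around an object, so the extra bytes likely come from pickle rather than from the bytearray itself. A minimal sketch that skips pickle and writes the raw bytes in binary mode is shown below. The function names and the temporary file are hypothetical, used only for illustration.

```python
import os
import tempfile


def save_bit_strings(path, bit_strings):
    """Write one raw byte per 8-character bit string (no pickle involved)."""
    data = bytes(int(s, 2) for s in bit_strings)
    with open(path, "wb") as f:
        f.write(data)


def load_bit_strings(path):
    """Read the raw bytes back and rebuild the 8-character bit strings."""
    with open(path, "rb") as f:
        data = f.read()
    # Iterating over a bytes object yields ints in the range 0-255.
    return [format(b, "08b") for b in data]


strings = ["11100011", "00000001", "10100101"]

# Hypothetical temporary file, used here only to demonstrate the round trip.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    save_bit_strings(path, strings)
    file_size = os.path.getsize(path)
    restored = load_bit_strings(path)
finally:
    os.remove(path)

print(file_size)  # 3
print(restored)   # ['11100011', '00000001', '10100101']
```

Here the file size equals the number of strings because nothing other than the byte values is written to disk.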
