For binary, quaternary, octal and hex it's clear to me how to convert a stream of digits to plain text. Can anybody help me understand how I should do it for ternary, quinary, senary and other bases? (Here's a ternary example; a Python script would be appreciated.) So far I've tried:

  • Decoding every digit to its binary counterpart: 0->00, 1->01 and 2->10
  • Splitting the stream into chunks of 3 characters and mapping them to letters of the English alphabet, 000->a, 001->b and so on up to 221->z, which didn't work either

Here's my code:

import binascii

base = 3
base_data = ''
# Read the digit stream from the file, dropping line breaks.
with open("./base%s" % base, 'r') as b3:
    for line in b3:
        base_data = base_data + line.strip('\r\n')

# Translate each ternary digit into binary digits.
output = []
for char in base_data:
    if char == '0':
        output += ['0']
    elif char == '1':
        output += ['1']
    elif char == '2':
        output += ['1', '0']
output = ''.join(output)

# Interpret the bit string as one big integer and dump its bytes (Python 2).
n = int('0b' + output, 2)
print binascii.unhexlify('%x' % n)

and my result is in this format:

JMN4,�J�j�T*2VYI�F�%��TjYCL���Y�E�&�
�I��̚dYCL�Z�
�K*�թ��-P��Qie�K"q�jL��5j�Y���K0�C�K2i�f�
Kennet Celeste
  • Just reverse however it was *encoded*? – Stefan Pochmann Jan 28 '18 at 01:25
  • @StefanPochmann but how should I know how it is encoded? I just have a stream of ternary numbers. Should I try to bruteforce different approaches? – Kennet Celeste Jan 28 '18 at 01:26
  • Binary, ternary, etc. are positional numeral systems. They encode numbers, not text. If you're expecting English, you're going to need some additional layer of encoding, and some way to separate the representations of adjacent numbers. – user2357112 Jan 28 '18 at 01:27
  • @Yugi If *you* didn't encode it, then ask the person who did. – Stefan Pochmann Jan 28 '18 at 01:29
  • @user2357112 that stream of numbers is the only thing I'm provided with. Is there any approach to find the separation element? or should I just try more ways ? – Kennet Celeste Jan 28 '18 at 01:29
  • It's not particularly hard to convert e.g. a ternary numeric stream to a binary or hex numeric stream. But to convert that "standard base" stream to text you'd need to know what encoding was used. You could probably guess ASCII, UTF-8, etc. – 101 Jan 28 '18 at 01:37

1 Answer

With your example data (abbreviated here):

> s = '''010020010202011000010200011010011001010202001012...'''
> ''.join(chr(int(s[i:i+6], 3)) for i in range(0, len(s), 6))
=> 'Welcome to base 3!\nLorem ipsum dolor sit amet, consectetur ...'

I guessed that it encodes each character in six ternary digits because your example data's length is a multiple of 6 and 3^6 = 729 is the smallest power of 3 larger than or equal to 2^8 = 256, the number of values one byte can take.
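
The same guess can be tried for other bases. What follows is only a sketch under that assumption: every character is stored in a fixed-width group of digits, and the width is the smallest k with base^k >= 256. The helper names digits_per_char, decode and encode are made up for this illustration, and a real stream may well use a different width or a different character encoding.

```python
DIGITS = '0123456789abcdefghijklmnopqrstuvwxyz'


def digits_per_char(base, alphabet_size=256):
    """Return the smallest k such that base**k >= alphabet_size."""
    k, capacity = 1, base
    while capacity < alphabet_size:
        capacity *= base
        k += 1
    return k


def decode(stream, base, width=None):
    """Read fixed-width groups of digits as character codes.

    Assumption: one character per group; the width is guessed if not given.
    """
    if width is None:
        width = digits_per_char(base)
    if len(stream) % width:
        raise ValueError('stream length is not a multiple of %d' % width)
    return ''.join(chr(int(stream[i:i + width], base))
                   for i in range(0, len(stream), width))


def encode(text, base, width=None):
    """Inverse of decode, handy for building test data."""
    if width is None:
        width = digits_per_char(base)
    groups = []
    for ch in text:
        n = ord(ch)
        digits = []
        for _ in range(width):
            n, r = divmod(n, base)
            digits.append(DIGITS[r])
        if n:
            raise ValueError('%r does not fit in %d digits' % (ch, width))
        groups.append(''.join(reversed(digits)))
    return ''.join(groups)


# First three groups of the example data from the answer above.
print(decode('010020010202011000', 3))
# Round trip in quinary and senary.
print(decode(encode('base 5', 5), 5))
print(decode(encode('base 6', 6), 6))
```

If the decoded text looks like garbage, passing other values for width (for example every divisor of the stream length) is a cheap way to test alternative guesses.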

Stefan Pochmann