
A few years ago I noticed "gibberish" signs on the Russian spaceship in The Simpsons Season 12 Episode 07, "The Great Money Caper". Today I decided to search for whether anyone had decoded them, but I couldn't find any results.

Screenshot of episode where two Russians argue in a spaceship. Two signs showing gibberish on walls can be seen.

I suspect that it is KOI8-R showing up as either Latin-1 or Windows-1252. The image I could grab is not very clear.

I have two interpretations of the mojibake as shown in this Python 3 code interpreter interaction:

>>> 'Ï‹ÏËÏÁ ¿Ä ÄÏÍ.†.'.encode('windows-1252').decode('koi8_r')
'о▀окоа ©д дом.├.'
>>> 'Ï<ÏËÏÁ ¿Ä ÄÏÍ.×.'.encode('latin1').decode('koi8_r')
'о<окоа ©д дом.в.'

Looking at the code charts on Wikipedia, I cannot figure out what the "<"-like and "+"-like symbols are. I thought about brute-forcing it and matching the results against some sort of spellcheck dictionary, but I would rather get some help first.
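The brute-force idea could be sketched roughly as follows. This is only a minimal illustration, not a full spellcheck match: the helper names `cyrillic_score` and `rank_decodings` are hypothetical, and the score is simply the fraction of non-space characters that are Cyrillic letters.

```python
import unicodedata


def cyrillic_score(text):
    """Return the fraction of non-space characters that are Cyrillic letters."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(1 for c in chars if 'CYRILLIC' in unicodedata.name(c, ''))
    return hits / len(chars)


def rank_decodings(mangled, wrong_codecs, right_codecs):
    """Try every (wrong, right) codec pair and sort results by score, best first."""
    results = []
    for wrong in wrong_codecs:
        for right in right_codecs:
            try:
                candidate = mangled.encode(wrong).decode(right)
            except (UnicodeEncodeError, UnicodeDecodeError):
                # This pair cannot round-trip the string; skip it.
                continue
            results.append((cyrillic_score(candidate), wrong, right, candidate))
    return sorted(results, key=lambda r: r[0], reverse=True)


ranked = rank_decodings(
    'Ï<ÏËÏÁ ¿Ä ÄÏÍ.×.',
    ['latin_1', 'cp1252'],
    ['koi8_r', 'cp1251', 'cp866', 'iso8859_5'],
)
for score, wrong, right, candidate in ranked:
    print(f'{score:.2f} {wrong} -> {right}: {candidate!r}')
```

A letter-fraction score like this can only narrow the field, because most single-byte Cyrillic codecs map the upper half of the byte range to Cyrillic letters. Choosing among the survivors would still need a word list or a human reader.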

Can the original text or meaning still be recovered? Or is it really gibberish?

(I would appreciate it if someone knows what it says, but I would like to see if it's possible to solve this through code.)

Edit: A naive script:

codec_list = ['ascii', 'big5', 'big5hkscs', 'cp037', 'cp424', 'cp437',
'cp500', 'cp720', 'cp737', 'cp775', 'cp850', 'cp852', 'cp855', 'cp856',
'cp857', 'cp858', 'cp860', 'cp861', 'cp862', 'cp863', 'cp864', 'cp865',
'cp866', 'cp869', 'cp874', 'cp875', 'cp932', 'cp949', 'cp950', 'cp1006',
'cp1026', 'cp1140', 'cp1250', 'cp1251', 'cp1252', 'cp1253', 'cp1254',
'cp1255', 'cp1256', 'cp1257', 'cp1258', 'euc_jp', 'euc_jis_2004',
'euc_jisx0213', 'euc_kr', 'gb2312', 'gbk', 'gb18030', 'hz', 'iso2022_jp',
'iso2022_jp_1', 'iso2022_jp_2', 'iso2022_jp_2004', 'iso2022_jp_3',
'iso2022_jp_ext', 'iso2022_kr', 'latin_1', 'iso8859_2', 'iso8859_3',
'iso8859_4', 'iso8859_5', 'iso8859_6', 'iso8859_7', 'iso8859_8',
'iso8859_9', 'iso8859_10', 'iso8859_13', 'iso8859_14', 'iso8859_15',
'iso8859_16', 'johab', 'koi8_r', 'koi8_u', 'mac_cyrillic', 'mac_greek',
'mac_iceland', 'mac_latin2', 'mac_roman', 'mac_turkish', 'ptcp154',
'shift_jis', 'shift_jis_2004', 'shift_jisx0213', 'utf_32', 'utf_32_be',
'utf_32_le', 'utf_16', 'utf_16_be', 'utf_16_le', 'utf_7', 'utf_8',
'utf_8_sig',]


source_str_list = ['Ï‹ÏËÏÁ ¿Ä ÄÏÍ.†.', 'Ï<ÏËÏÁ ¿Ä ÄÏÍ.×.']

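# Try every (mangled, correct) codec pair: re-encode the mojibake with the
# codec it was wrongly displayed in, then decode it with the candidate codec.
for mangled_codec in codec_list:
    for correct_codec in codec_list:
        decoded_str_list = []

        for s in source_str_list:
            try:
                raw = s.encode(mangled_codec)
                decoded_str_list.append(raw.decode(correct_codec))
            except (UnicodeEncodeError, UnicodeDecodeError):
                # This pair cannot round-trip this string; skip it.
                continue

        if decoded_str_list:
            print(mangled_codec, correct_codec, decoded_str_list)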
chfoo
  • Assuming windows-1252, the "<"-like symbol is `‹` = `\u2039`, and the "+"-like symbol is `†` = `\u2020`. Can't tell what the dots are, though. Also, try the other Cyrillic encodings: Windows-1251, ISO-8859-5, MacCyrillic, IBM855, IBM866, and PTCP154. – dan04 Aug 10 '12 at 01:50
  • Also it looks like the I letters could be Î (circum), Ì (grave) or Í (acute) above, and the A letter could be Å (overring) or Á (acute). – sleblanc Apr 11 '22 at 01:12
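The suggestion in the first comment can be tried directly. The sketch below assumes the windows-1252 reading of the sign and decodes the resulting bytes with each of the Cyrillic encodings named there, plus KOI8-R for comparison. The codec names are Python's standard aliases, and `CYRILLIC_CODECS` is just an illustrative name.

```python
# Python codec names for the encodings listed in the comment, plus KOI8-R.
CYRILLIC_CODECS = [
    'koi8_r', 'cp1251', 'iso8859_5', 'mac_cyrillic',
    'cp855', 'cp866', 'ptcp154',
]

# Reading of the sign that assumes windows-1252 (with '‹' and '†').
mangled = 'Ï‹ÏËÏÁ ¿Ä ÄÏÍ.†.'
raw = mangled.encode('cp1252')

decoded = {}
for codec in CYRILLIC_CODECS:
    # errors='replace' keeps going if a byte is undefined in some codec.
    decoded[codec] = raw.decode(codec, errors='replace')

for codec, text in decoded.items():
    print(f'{codec:>12}: {text!r}')
```

Printing the candidates side by side makes it easier to judge by eye whether any of them reads as Russian.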

1 Answer

Ï‹ÏËÏÁ¿ÄÄÏÍ.†.

                                      gbk  15  5 '蠇纤狭磕南'
                              cp932, sjis  31 11 'マ均ヒマチソトトマヘ.'
                                   cp1250  28 14 'Ď‹ĎËĎÁżÄÄĎÍ.†.'
                                   cp1251  28 14 'П‹ПЛПБїДДПН.†.'
                                   cp1256  28 14 'د‹دثدء؟ؤؤدح.†.'
                                   cp1257  28 14 'Ļ‹ĻĖĻĮæÄÄĻĶ.†.'
                                  geostd8  37 14 'ო‹ოლობ¿ეეონ.†.'

Ï<ÏËÏÁ ¿Ä ÄÏÍ

                                    euckr  31 11 '횕<횕횏횕횁쩔횆횆횕횒'
                                      gbk  31 11 '脧<脧脣脧脕驴脛脛脧脥'

Mojibake for Cyrillic usually involves 'eth' (Ð):

АБВГҐДЂ -> ÐБВГÒДЂ

Mojibake of Greek produces Î and Ï.
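One way to see where those characters come from, assuming the rule of thumb refers to UTF-8 text misread as Latin-1: every Cyrillic or Greek letter is two bytes in UTF-8, and the first (lead) byte is what shows up as Ð, Ñ, Î or Ï. The helper name `lead_chars` below is hypothetical.

```python
def lead_chars(text, wrong_codec='latin_1'):
    """Return the characters that UTF-8 lead bytes turn into under wrong_codec."""
    seen = set()
    for ch in text:
        encoded = ch.encode('utf-8')
        if len(encoded) > 1:
            # Only multi-byte characters have a lead byte worth inspecting.
            seen.add(encoded[:1].decode(wrong_codec))
    return seen


print(sorted(lead_chars('АБВГДЕЖ')))  # ['Ð']
print(sorted(lead_chars('αβγδοπρ')))  # ['Î', 'Ï']
```

By that reasoning, the many Ï characters on the sign look more like the lead bytes of Greek UTF-8 text than of Cyrillic, although a single-byte encoding such as KOI8-R would produce Ï for a different reason.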

Rick James