
Given a piece of text such as "abcdefg-foo" that is encoded with a codepage "xzc", is it possible in Python to decode those characters using that codepage?

More specifically, we have a known AFP codepage, T1V10500. The font we extract comes from an AFP file that references this codepage. We can extract the reference and build the path to the codepage file.

codepage = "/path/to/codepage/T1V10500"
ascii_encoded_extracted_afp_text = extract_afp_text().decode(codepage).encode("ascii")

This is an oversimplification of what I want to achieve, but I hope to understand whether the concept is available in Python specifically.

jrlmx2

1 Answer


You should try ICU.

There seems to be a Python binding, PyICU (http://pypi.python.org/pypi/PyICU/0.8.1).

If the codepage used in your AFP file is a standard one (and not a custom one), you can easily build a converter with ICU from the encoding the codepage specifies (T1V10500 should be CP500, i.e. IBM EBCDIC International) to ASCII or whatever encoding you need. ICU is a great library and is used in most of IBM's tools for AFP files.
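As a side note, if the codepage really is plain CP500, you may not need ICU at all: Python's standard library ships a `cp500` codec. The sketch below uses that codec instead of ICU. The mapping from the AFP name T1V10500 to `cp500` is my assumption from above, and `AFP_TO_PYTHON_CODEC` and `decode_afp_text` are hypothetical names for illustration, not part of any library.

```python
# Assumption: AFP codepage T1V10500 corresponds to CP500
# (IBM EBCDIC International). Verify this against your own files.
AFP_TO_PYTHON_CODEC = {"T1V10500": "cp500"}  # hypothetical lookup table


def decode_afp_text(raw: bytes, codepage_name: str) -> str:
    """Decode bytes extracted from an AFP file using the mapped Python codec."""
    codec = AFP_TO_PYTHON_CODEC[codepage_name.upper()]
    return raw.decode(codec)


# Stand-in for the extracted AFP bytes: the sample text encoded as CP500.
sample = "abcdefg-foo".encode("cp500")

text = decode_afp_text(sample, "T1V10500")
# Characters with no ASCII equivalent become "?" instead of raising an error.
ascii_bytes = text.encode("ascii", errors="replace")
```

Note that the codec is selected by name, not by the path to the codepage file, so you would take the last path component and look it up in the table.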

If you find ICU too cumbersome and do not need to handle other codepages, you can build a simple 256-entry conversion table from CP500 to ASCII.
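A minimal sketch of such a table, assuming the data is plain CP500: rather than typing 256 entries by hand, it derives them from Python's built-in `cp500` codec and applies them with `bytes.translate`. The names `build_cp500_to_ascii_table` and `cp500_to_ascii` are made up for this example, and the `?` placeholder for characters without an ASCII equivalent is an arbitrary choice.

```python
def build_cp500_to_ascii_table(placeholder: bytes = b"?") -> bytes:
    """Build a 256-byte table mapping each CP500 byte to an ASCII byte.

    `placeholder` must be exactly one byte; it stands in for any
    CP500 character that has no ASCII equivalent.
    """
    table = bytearray()
    for byte in range(256):
        char = bytes([byte]).decode("cp500", errors="replace")
        encoded = char.encode("ascii", errors="ignore")
        # `encoded` is empty when the character is not representable in ASCII.
        table += encoded if encoded else placeholder
    return bytes(table)


CP500_TO_ASCII = build_cp500_to_ascii_table()


def cp500_to_ascii(raw: bytes) -> bytes:
    """Translate CP500-encoded bytes to ASCII bytes with one table lookup per byte."""
    return raw.translate(CP500_TO_ASCII)


converted = cp500_to_ascii("abcdefg-foo".encode("cp500"))
```

If your codepage turns out to be a custom one, you would fill the table from the codepage definition instead of from the codec, and the rest stays the same.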

user18428