Numpy array from characters in BDF file

Question

I have a file, font_file.bdf, and need to get the characters contained in it as numpy arrays where each element is one pixel.

Here's the snippet of that file which defines the '?' character:

STARTCHAR question
ENCODING 63
SWIDTH 1000 0
DWIDTH 6 0
BBX 5 7 0 0
BITMAP
70
88
08
10
20
00
20
ENDCHAR

I researched .bdf files to understand how they encode data. Basically, each glyph is a bitmap with a bit depth of 1. I found a Pillow module, PIL.BdfFontFile, which can interpret BDF files. After experimenting with this module for a while, I was able to get a PIL image for each character in the font and save it to confirm that it works, like so:

from PIL.BdfFontFile import BdfFontFile

fp = open("font_file.bdf", "r")
bdf_file = BdfFontFile(fp)
bdf_file.compile()
char = '?'
_, __, bounding_box, image = bdf_file[ord(char)]
image.save(char + ".png")

The saved image looks like a question mark (attached image: Question Mark), and its properties show a bit depth of 1, which makes sense. (I'm not sure why it seems inverted, but I could fix that with numpy if still needed.)

Once I had that, I tried to convert to a numpy array:

print numpy.array(image, dtype=numpy.int)

which gave me an array that no longer seems to represent the corresponding character:

[[1 1 1 1 1]
 [0 1 0 1 1]
 [1 1 1 1 1]
 [1 1 1 1 0]
 [1 0 1 0 1]
 [1 0 1 1 1]
 [0 1 1 1 1]]

I was hoping for something that looked more like this:

[[0 1 1 1 0]
 [1 0 0 0 1]
 [0 0 0 0 1]
 [0 0 0 1 0]
 [0 0 1 0 0]
 [0 0 0 0 0]
 [0 0 1 0 0]]

Worst-case scenario, I could write an algorithm myself that converts the data in the PIL image to a numpy array, but I feel there has to be an easier way, given my past experience converting between PIL images and numpy arrays (it's usually quite straightforward).

Any ideas about how to get the PIL image to convert to a numpy array properly or another solution to my problem would be appreciated.

What does `image` look like before you try converting it to a numpy array? — pretzlstyle, Aug 12 '16 at 18:19
@jphollowed, It looks like a question mark (although the pixels in the character show up white and the background is black). I have the image attached, although since it is an extremely small image it is hard to see. — Drake Mossman, Aug 12 '16 at 20:17
No, I mean what does your variable `image` look like if you print it? — pretzlstyle, Aug 12 '16 at 20:19
I've also tried printing out the data contained in the image using a BitArray: print BitArray(bytes=image.tobytes()).hex, which gives me this: 70880810200020, exactly matching the bytes in the bitmap. — Drake Mossman, Aug 12 '16 at 20:37
can you confirm that the dimensions of the resultant numpy array match that of the image? — pretzlstyle, Aug 12 '16 at 20:51

score 0 · Accepted Answer · edited May 23 '17 at 11:51

It turns out the unexpected behavior I was seeing was due to a bug in PIL as described in this SO question: Error Converting PIL B&W images to Numpy Arrays.

So the key to solving my problem was to convert the image to grayscale before creating the numpy array.
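As a rough illustration of that step: a grayscale ('L') conversion of a 1-bit image holds 0 for black and 255 for white, so it still needs scaling to get 0/1 pixels. The sketch below assumes Python 3 and uses a hand-written stand-in array (the first two rows of the '?' glyph) rather than a real PIL image; the names gray, pixels and mask are illustrative only.

```python
import numpy as np

# Stand-in for numpy.array(image.convert('L')): a hand-written uint8 array
# holding the first two rows of the '?' glyph (assumed values: 0 or 255).
gray = np.array([[0, 255, 255, 255, 0],
                 [255, 0, 0, 0, 255]], dtype=np.uint8)

# On Python 3, "/" is true division and returns floats, so use "//"
# (or a comparison) to keep integer 0/1 pixels.
pixels = gray // 255
mask = (gray > 0).astype(int)

print(pixels)
```

Either form gives the same 0/1 layout; the comparison variant does not depend on white being exactly 255.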

My final solution, including a small numpy step to get the format described above, was as follows:

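import numpy
from PIL.BdfFontFile import BdfFontFile

fp = open("font_file.bdf", "r")
bdf_file = BdfFontFile(fp)
bdf_file.compile()
char = '?'
_, __, bounding_box, image = bdf_file[ord(char)]
# Convert the 1-bit image to grayscale (0 or 255), then scale to 0/1
# (Python 2 syntax: print statement and integer division).
print numpy.array(image.convert('L')) / 255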

which gave me this:

[[0 1 1 1 0]
 [1 0 0 0 1]
 [0 0 0 0 1]
 [0 0 0 1 0]
 [0 0 1 0 0]
 [0 0 0 0 0]
 [0 0 1 0 0]]
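For comparison, the BITMAP rows from the question can also be decoded directly, without going through PIL. This is only a sketch under the assumption that each hex row is padded to whole bytes with the leftmost pixel in the most significant bit, which is consistent with the '?' glyph shown in the question. The helper name bitmap_rows_to_array is hypothetical, and the width comes from the BBX line (5 for this glyph).

```python
import numpy as np

def bitmap_rows_to_array(hex_rows, width):
    """Decode BDF BITMAP hex rows into a 2-D array of 0/1 pixels.

    Hypothetical helper: assumes every row has the same number of hex
    digits and that the leftmost pixel is the most significant bit.
    """
    # One uint8 row per hex string, e.g. "70" -> [0x70].
    rows = [np.frombuffer(bytes.fromhex(r), dtype=np.uint8) for r in hex_rows]
    packed = np.stack(rows)               # shape: (height, bytes_per_row)
    bits = np.unpackbits(packed, axis=1)  # 8 pixels per byte, MSB first
    return bits[:, :width]                # drop the padding bits

# BITMAP rows and BBX width of the '?' glyph from the question.
question_mark = bitmap_rows_to_array(
    ["70", "88", "08", "10", "20", "00", "20"], width=5
)
print(question_mark)
```

The same unpacking could in principle be applied to the bytes that image.tobytes() returns for a 1-bit image, since the comment above reports that they match the BITMAP rows exactly.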

score 0 · Answer 2 · edited Jun 05 '21 at 08:21

For me to get @drake-mossman's answer to work, I had to modify the first line to read the file in byte format:

fp = open("font_file.bdf", "rb")

Unfortunately, this also means that the BdfFontFile script currently doesn't seem to support Unicode characters (or any code points past 255).

answered Jun 05 '21 at 05:10 by Steven Bhardwaj · edited by StupidWolf
