I expect each wood material to be converted to a different integer, but I get 0 for every material.

%%writefile wood.txt
item,material,number
100,oak,33
110,maple,14
120,oak,7
145,birch,3


tree_to_int = dict(oak=1,maple=2,birch=3)

def convert(s):
    return tree_to_int.get(s,0)

data = np.genfromtxt('wood.txt', delimiter=',', dtype=int,
                     names=True, converters={1: convert})
data

[output]:

array([(100, 0, 33), (110, 0, 14), (120, 0,  7), (145, 0,  3)],dtype=[('item', '<i4'), ('material', '<i4'), ('number', '<i4')])
Kevin Ji
Peca
  • In `convert()`, add `print(s)` so you can see the exact value being searched for. I bet this will reveal the true problem. – John Gordon Jan 25 '19 at 05:33

1 Answer

It turns out that the strings read from "wood.txt" reach the converter as bytestrings (for example b'oak'), so they are never found in the dictionary, whose keys are ordinary strings. To fix it, decode the bytestrings before the lookup:

def convert(s):
    return tree_to_int.get(s.decode("utf-8"), 0)
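
To see why the lookup fails, here is a minimal sketch in plain Python, with no file or NumPy involved. It reuses the question's `tree_to_int` dictionary. The `isinstance` check is my own addition, not part of the fix above: it is an assumption that the converter may receive either `bytes` or `str`, depending on the NumPy version and the `encoding` argument passed to `genfromtxt`.

```python
# Same mapping as in the question: keys are ordinary str objects.
tree_to_int = dict(oak=1, maple=2, birch=3)

def convert(s):
    """Map a material name to its integer code; unknown names give 0.

    Accepts bytes or str (assumption: the caller may pass either).
    """
    if isinstance(s, bytes):
        s = s.decode("utf-8")
    return tree_to_int.get(s, 0)

# A bytes key never compares equal to a str key, so the raw lookup misses.
raw_lookup = tree_to_int.get(b"oak", 0)

# After decoding, the lookup finds the expected codes.
codes = [convert(b"oak"), convert("maple"), convert(b"birch"), convert(b"pine")]
print(raw_lookup, codes)
```

This version of `convert` can be passed to `np.genfromtxt` through `converters={1: convert}` exactly as in the question.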

Alternatively, you can use pandas and merge the data with a lookup table:

import pandas

tree_to_int = pandas.DataFrame([{'material': 'oak', 'material_int': 1}, {'material': 'maple', 'material_int': 2}, {'material': 'birch', 'material_int': 3}])

df = pandas.read_csv('wood.txt')

data = pandas.merge(df, tree_to_int, how='left', on='material')
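
With a left merge, a material that is missing from the lookup table ends up with NaN in `material_int` rather than 0. If you want the same "unknown gives 0" behaviour as the dictionary version, one possible sketch uses `Series.map` with the plain dictionary. The inline CSV text and the `material_int` column name are only for illustration, and the "pine" row is a made-up unknown material.

```python
import io

import pandas as pd

# Inline copy of the sample data, plus one hypothetical unknown material.
csv_text = """item,material,number
100,oak,33
110,maple,14
120,oak,7
145,birch,3
150,pine,5
"""

tree_to_int = {"oak": 1, "maple": 2, "birch": 3}

df = pd.read_csv(io.StringIO(csv_text))

# map() yields NaN for names not in the dict; fill those with 0 and
# cast back to int so the column matches the dictionary-based convert().
df["material_int"] = df["material"].map(tree_to_int).fillna(0).astype(int)

material_codes = df["material_int"].tolist()
print(df)
```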

Tim
  • I think that was the actual question -- why aren't oak/maple/etc. being converted to their values in the tree_to_int dict? – John Gordon Jan 25 '19 at 05:37