
I need to create lookup tables in Python from a CSV, split by the unique values in my columns. An example is attached. I have a name column that holds the name of the model. For each model, I need a dictionary with the title from the variable column, the key from the level column and the value from the value column. I'm thinking the best structure is a dictionary of dictionaries. I will use this lookup table in the future to multiply the values together based on the keys.

Here is code to generate a sample data set:

import pandas as pd

Name = ['model1', 'model1', 'model1', 'model2', 'model2', 'model2',
        'model1', 'model1', 'model1', 'model1',
        'model2', 'model2', 'model2', 'model2']
Variable = ['channel_model', 'channel_model', 'channel_model',
            'channel_model', 'channel_model', 'channel_model',
            'driver_age', 'driver_age', 'driver_age', 'driver_age',
            'driver_age', 'driver_age', 'driver_age', 'driver_age']
channel_Level = ['Dir', 'IA', 'EA', 'Dir', 'IA', 'EA',
                 '21', '22', '23', '24', '21', '22', '23', '24']
Value = [1.11, 1.18, 1.002, 2.2, 2.5, 2.56,
         1.1, 1.2, 1.3, 1.4, 2.1, 2.2, 2.3, 2.4]
df = {'Name': Name, 'Variable': Variable, 'Level': channel_Level, 'Value': Value}
factor_table = pd.DataFrame(df)

I have read the following but it hasn't yielded great results: Python Creating Dictionary from excel data

I've also tried:

import pandas as pd
factor_table = pd.read_excel('...\\factor_table_example.xlsx')

#define function to be used multiple times
def factor_tables(file, model_column, variable_column, level_column, value_column):
    for i in file[model_column]:
        for row in file[variable_column]:
            lookup = {}
            lookup = dict(zip(file[level_column], file[value,column]))

This yields the error: `dict expected at most 1 arguments, got 2`

What I would ultimately like is: {{'model2':{'channel':{'EA':1.002, 'IA': 1.18, 'DIR': 1.11}}}, {'model1':{'channel':{'EA':1.86, 'IA': 1.66, 'DIR': 1.64}}}}

Jordan
  • It may be a bit easier to use a list of dictionaries. I haven't tested it yet, but that error could be from the fact that you aren't supplying a `key: value` pair, thus, it's only getting a key, and no value. A structure like `[{'model2':{'channel':{'EA':1.002, 'IA': 1.18, 'DIR': 1.11}}}, {'model1':{'channel':{'EA':1.86, 'IA': 1.66, 'DIR': 1.64}}}]` may suit your needs better – C.Nivs Jun 25 '18 at 14:11
  • typo fixed. When I run the code now set to a variable, I don't get an error, I just get `none`. Is my for loop sequence off? Hi @C.Nivs, I understand what you're communicating but can't conceptualize how the loop should run...I'm a python novice. – Jordan Jun 25 '18 at 14:14
  • you have to return something from your function for starters – Jean-François Fabre Jun 25 '18 at 14:21
  • then don't create a dictionary, update it with the values. and [edit] your question because "dict expected at most 1 arguments, got 2" doesn't make any sense with the corrected code either. – Jean-François Fabre Jun 25 '18 at 14:22
  • @Jean-FrançoisFabre, don't I need a dictionary for the lookup table? – Jordan Jun 25 '18 at 14:22

2 Answers


It looks like your error could be coming from this line:

lookup = dict(zip(file[level_column], file[value,column]))

where file expects a single column name, yet you give it value,column, so it receives two names instead of the one name value_column. The loop you might be looking for is like so:

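def factor_tables(file, model_column, variable_column, level_column, value_column):
    lookup = {}

    # One entry per unique model, built only from that model's rows
    for model in file[model_column].unique():
        rows = file[file[model_column] == model]
        lookup[model] = dict(zip(rows[level_column], rows[value_column]))

    return lookup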

This will return to you a single dictionary with keys corresponding to individual (and unique) models:

{'model_1':{'level_col': 'val_col'}, 'model_2':...}

Allowing you to use:

lookups.get('model_1')
# {'level_col': 'val_col'}

If you need the variable_column, you can wrap it one level deeper:

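def factor_tables(file, model_column, variable_column, level_column, value_column):
    lookup = {}

    # Group by (model, variable) so each variable gets its own level -> value dict
    for (model, variable), rows in file.groupby([model_column, variable_column]):
        lookup.setdefault(model, {})[variable] = dict(zip(rows[level_column], rows[value_column]))

    return lookup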
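
Since the question starts from a CSV, the same model -> variable -> level nesting can also be sketched without pandas. This is only a minimal illustration, assuming the file has the Name, Variable, Level and Value columns from the question; `build_lookup` and the inline `SAMPLE_CSV` text are hypothetical names made up for the example.

```python
import csv
import io

# Hypothetical inline sample; a real file would be opened with open(path, newline="")
SAMPLE_CSV = """Name,Variable,Level,Value
model1,channel_model,Dir,1.11
model1,channel_model,IA,1.18
model2,channel_model,Dir,2.2
model1,driver_age,21,1.1
model2,driver_age,21,2.1
"""

def build_lookup(rows):
    """Return {model: {variable: {level: value}}} from an iterable of row dicts."""
    lookup = {}
    for row in rows:
        # setdefault creates the model and variable levels the first time they appear
        levels = lookup.setdefault(row["Name"], {}).setdefault(row["Variable"], {})
        levels[row["Level"]] = float(row["Value"])
    return lookup

lookup = build_lookup(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(lookup["model1"]["channel_model"]["IA"])
```

Because the result is made of plain dicts, a misspelled model or level raises a KeyError instead of silently returning something else.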
C.Nivs
  • Thank you. I'm getting a `Nonetype object is not callable` error when trying to run the file on a data frame. – Jordan Jun 25 '18 at 14:39
  • I have now added some code to generate a sample data set like the one I'm working with. @C.Nivs – Jordan Jun 25 '18 at 14:53

Using collections.defaultdict, you can create a nested dictionary while iterating your dataframe. Then realign into a list of dictionaries via a list comprehension.

from collections import defaultdict

tree = lambda: defaultdict(tree)

d = tree()
for row in factor_table.itertuples(index=False):
    d[(row.Name, row.Variable)].update({row.Level: row.Value})

res = [{k[0]: {k[1]: dict(v)}} for k, v in d.items()]

print(res)

[{'model1': {'channel_model': {'Dir': 1.110, 'EA': 1.002, 'IA': 1.180}}},
 {'model2': {'channel_model': {'Dir': 2.200, 'EA': 2.560, 'IA': 2.500}}},
 {'model1': {'driver_age': {'21': 1.100, '22': 1.200, '23': 1.300, '24': 1.400}}},
 {'model2': {'driver_age': {'21': 2.100, '22': 2.200, '23': 2.300, '24': 2.400}}}]
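
If a single dictionary of dictionaries is preferred over a list, the same recursive defaultdict can be indexed one level at a time. The sketch below is an assumption-laden illustration rather than part of the tested answer: it uses a handful of rows copied from the question's sample data instead of the dataframe, and `combined_factor` is a hypothetical helper showing the "multiply the values together based on the keys" step. `math.prod` needs Python 3.8 or later.

```python
from collections import defaultdict
from math import prod

# Recursive defaultdict: accessing a missing key creates another nested level
tree = lambda: defaultdict(tree)

# A few (Name, Variable, Level, Value) rows taken from the question's sample data
rows = [
    ("model1", "channel_model", "Dir", 1.11),
    ("model1", "channel_model", "IA", 1.18),
    ("model1", "driver_age", "21", 1.1),
    ("model2", "channel_model", "Dir", 2.2),
    ("model2", "driver_age", "21", 2.1),
]

nested = tree()
for name, variable, level, value in rows:
    nested[name][variable][level] = value

def combined_factor(lookup, model, levels):
    """Multiply one model's factors; `levels` maps variable -> level (hypothetical helper)."""
    return prod(lookup[model][variable][level] for variable, level in levels.items())

factor = combined_factor(nested, "model1", {"channel_model": "IA", "driver_age": "21"})
print(round(factor, 3))
```

One caveat with this design: looking up a key that does not exist quietly creates an empty branch in a defaultdict, so converting the finished structure to plain dicts before using it as a lookup table may be the safer choice.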
jpp
  • Thanks @jpp. For some reason, I'm still getting a `TypeError: 'NoneType' object is not callable` – Jordan Jun 25 '18 at 15:09
  • @Jordan, Can't replicate. I ran this code straight after `factor_table = pd.DataFrame(df)` as you've defined it. – jpp Jun 25 '18 at 15:10
  • I don't know what happened, but I restarted the kernel and it worked! Thank you. – Jordan Jun 25 '18 at 15:16