
I have a tree structure in which every node stores idf values for a large number of words. Each node's dictionary maps a word to its idf value.

I want to collect all the idf values stored in the tree into a single dictionary, but when I do so, only one value is kept for each word.

For example: A has two children, B and C. A, B and C each have idf values stored on them. I want to build a dictionary that combines all of these idf values.

Given A = {'a':10, 'b': 11}, B = {'a':5, 'c': 8} and C = {'b':21, 'd': 20}, I want to store them as dic = {'a':10,'a':5,'b':11,'b':21,'c':8,'d':20}

Below is the code that I am using:

def idf_total(node):
    dic={}
    next_node=[]
    for child in node.children:
        next_node.append(child)
        idf = child.idf
        dic.update(idf)
    if next_node:
        for i in next_node:
            idf_total(i)
    return dic

How can this be done?

Latest code:

def idf_total_updated(node):
    dic=defaultdict(list)
    next_node=[]
    for child in node.children:
        next_node.append(child)
        for k,v in child.idf.items():
            dic[k].append(v)
    if next_node:
        for i in next_node:
            idf_total_updated(i)
    return dic

The latest code above stores multiple values per key, but it repeats the same value again and again. Where am I going wrong? Please help.

adi5257

2 Answers


Python dictionaries cannot have duplicate keys.

This means you cannot have (for example):

C = {'a': 5, 'a': 10}  # key 'a' is duplicate here.

One way to solve this issue is to have a list of values for a key.

For example:

A = {'a': 5}
B = {'a': 10}

This can be combined into

C = {'a': [5, 10]}

defaultdict from the collections module is appropriate here:

from collections import defaultdict

A = {'a': 10, 'b': 11} 
B = {'a': 5, 'c': 8}
C = {'b': 21, 'd': 20}

dic = defaultdict(list)

for d in A, B, C:
    for k, v in d.items():  # d.iteritems() in Python 2
        dic[k].append(v)
print(dic)

# defaultdict(<class 'list'>, {'a': [10, 5], 
#                              'b': [11, 21],
#                              'c': [8],
#                              'd': [20]})                                        
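The same idea can be applied to the tree itself. In the question's function, the dictionary returned by each recursive call is discarded and the node passed in never contributes its own idf values. Below is a minimal sketch of one way around that; the `Node` class is a hypothetical stand-in (assuming each node has an `idf` dict and a `children` list, as in the question), and `collect_idf` is a name made up for illustration.

```python
from collections import defaultdict


class Node:
    """Hypothetical stand-in for the asker's tree node."""

    def __init__(self, idf, children=None):
        self.idf = idf
        self.children = children or []


def collect_idf(node):
    """Return {word: [idf, ...]} for node and all of its descendants."""
    merged = defaultdict(list)
    # Include the values stored on this node itself.
    for word, value in node.idf.items():
        merged[word].append(value)
    # Merge in what each subtree returns instead of discarding it.
    for child in node.children:
        for word, values in collect_idf(child).items():
            merged[word].extend(values)
    return merged


B = Node({'a': 5, 'c': 8})
C = Node({'b': 21, 'd': 20})
A = Node({'a': 10, 'b': 11}, [B, C])

result = collect_idf(A)
print(dict(result))
# {'a': [10, 5], 'b': [11, 21], 'c': [8], 'd': [20]}
```

Each call builds its own dictionary, so calling the function twice on the same tree gives the same result rather than accumulating values.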
Austin
  • I am using the following code, but it stores the same value multiple times in the list; for example, it stores 'a': [10, 10] for me. Can you please tell me where I am going wrong? Below is my code: def idf_total_updated(node): dic=defaultdict(list) next_node=[] for child in node.children: next_node.append(child) for k,v in child.idf.items(): dic[k].append(v) if next_node: for i in next_node: idf_total_updated(i) return dic – adi5257 Apr 05 '18 at 12:25
  • Could you add code to question with proper formatting? It's hard to judge where indentation starts and ends. – Austin Apr 05 '18 at 12:27
  • I assume you don't need to store `10` again if you already have `10` in list. Is that right? – Austin Apr 05 '18 at 12:35
  • I want to store all the values even if they are repeated. – adi5257 Apr 05 '18 at 12:37
  • So, what's the issue with `'a': [10, 10]` ? It should be like that if you have `10` as value for key `a` in two dicts. – Austin Apr 05 '18 at 12:42
  • My issue is that I am not able to store all the values together as a list. If a should store 10, 15, 20..., I am capturing only one value multiple times with the latest code I shared. Can you please tell me where I am going wrong with that code? – adi5257 Apr 05 '18 at 12:49
  • What do you pass to this function? – Austin Apr 05 '18 at 12:54
  • I am passing the parent node, which contains multiple children. All the nodes have idf values stored on them as a dictionary. For example, A is the parent and has B and C as children, so I am passing A to the function. – adi5257 Apr 05 '18 at 12:58
  • I don't know if this solves it, because I couldn't recreate your issue. Try placing `dic=defaultdict(list)` outside the function. – Austin Apr 05 '18 at 13:06
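As a variation on keeping the dictionary outside the function, a single accumulator can be passed down through the recursive calls, so every level appends to the same `defaultdict`. This is only a sketch under assumptions: `Node` is a hypothetical class with `idf` and `children` attributes like the ones described in the question, and `collect_into` is an illustrative name.

```python
from collections import defaultdict


class Node:
    """Hypothetical stand-in for the asker's tree node."""

    def __init__(self, idf, children=None):
        self.idf = idf
        self.children = children or []


def collect_into(node, acc=None):
    """Append every idf value in the subtree rooted at node to acc."""
    if acc is None:
        # A fresh accumulator per top-level call avoids leftover
        # values from earlier calls, unlike a module-level dict.
        acc = defaultdict(list)
    for word, value in node.idf.items():
        acc[word].append(value)
    for child in node.children:
        # The same acc object is shared by all recursive calls.
        collect_into(child, acc)
    return acc


D = Node({'a': 20})
B = Node({'a': 15}, [D])
A = Node({'a': 10, 'b': 1}, [B])

print(dict(collect_into(A)))
# {'a': [10, 15, 20], 'b': [1]}
```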

You can use update:

A = {'a': 10, 'b': 11}
B = {'a': 5, 'c': 8}
B.update(A)  # values from A overwrite matching keys in B
print(B)
# {'a': 10, 'c': 8, 'b': 11}  (key order as printed by Python 3.7+)

Since a dict cannot hold a duplicate key with different values, update overwrites the values of keys already in B with the values from A. See the documentation for dict.update.

Or, if you have to keep both values, you can create a list of dictionaries like this:

l = [A, B, C]
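With a list of dictionaries, all values for a given key can still be looked up together. A minimal sketch, where `values_for` is a helper name made up for this example:

```python
A = {'a': 10, 'b': 11}
B = {'a': 5, 'c': 8}
C = {'b': 21, 'd': 20}

l = [A, B, C]


def values_for(key, dicts):
    """Collect the value of key from every dict that contains it."""
    return [d[key] for d in dicts if key in d]


print(values_for('a', l))  # [10, 5]
print(values_for('d', l))  # [20]
```

Nothing is overwritten this way, at the cost of scanning every dictionary on each lookup.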
Mehrdad Pedramfar
  • I don't want to update the values. I just want to store all the possible values for a key together. Is there any other way in which this can be done? – adi5257 Apr 05 '18 at 11:31
  • If `A={'a':1}` and `B={'a':5}`, what do you expect the final dict to be? Tell me and I will figure it out for you. @adi5257 – Mehrdad Pedramfar Apr 05 '18 at 11:33
  • You may keep a list of dicts like this too, `[A,B,C]`, if desired. @adi5257 – Mehrdad Pedramfar Apr 05 '18 at 11:42
  • I want dic={'a':1,'a':5}... I know that a dictionary can't hold multiple values for the same key, but is there any other way in which this can be done, such as by creating a list or a dataframe? – adi5257 Apr 05 '18 at 12:22
  • Yes, I updated my answer; this is the only way left to do it. Make them a list of dicts. – Mehrdad Pedramfar Apr 05 '18 at 12:24