Python txt file into key value pair

Question

race, football, badminton
yellow, 10, 20
white, 50, 30
red, 80, 100

I have data similar to the above (the real data is more complex). I want to form key-value pairs so that the result looks like this:

{'yellow': {'football': 10, 'badminton': 20}, 'white': {'football': 50, 'badminton': 30}, 'red': {'football': 80, 'badminton': 100}}

where the numbers are integers instead of strings, and I can look up the values by key to plot them in a bar or pie chart.

first try:

country_activity_time = []

file = open('time-used', 'r')

for data in file:
    data = data.rstrip('\n').split(',')
    if data[0] != 'Country':
        country = data[0]
        time_spend = data[1:]
        country_time = {country: time_spend}
        country_activity_time.append(country_time)

print(country_activity_time)

first outcome:

[{'Australia': ['6', '45', '89', '27', '132', '76', '58', '211', '56', '40', '29', '512',    '19', '140']}, {'Austria': ['9', '34', '79', '27', '125', '59', .............etc

The values are not integers, and I am unable to look up the data by country name.

second try:

country_activity_time = []
catogory_all = []
country_all = []
time_spend_all = []

file = open('time-used', 'r')

for data in file:
    data = data.rstrip('\n').split(',')
    if data[0] == 'Country':
        catogory = data[1:]
        catogory_all.append(catogory)
    else:
        country = data[0]
        country_all.append(country)
        for time_spend in data[1:]:
            time_spend = int(time_spend)
            time_spend_all.append(time_spend)

print(catogory_all)
print(country_all)
print(time_spend_all)

second outcome:

[['Attending events', 'Care for household members', 'Eating and drinking', 'Education',   'Housework', 'Other leisure activities',.........etc
['Australia', 'Austria', 'Belgium', 'Canada', 'China', 'Denmark', 'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary', 'India', '.............etc
[6, 45, 89, 27, 132, 76, 58, 211, 56, 40, 29, 512, 19, 140, 9, 34, 79, 27, 125, 59, 32, 280, 55, 82, 21, 498, 32, 109, 15, 22, 99, 41, 121,..............etc

The second attempt successfully separates all the keys and values, but I am not sure why the categories come out in that form (where len(catogory_all) = 1), and the values are not grouped by country.

third try:

country_activity_time = []

file = open('time-used', 'r')

for data in file:
    data = data.rstrip('\n').split(',')
    if data[0] != 'Country':
        country = data[0]
        for time_spend in data[1:]:
            time_spend = int(time_spend)
            country_time = {country: time_spend}
            country_activity_time.append(country_time)

print(country_activity_time)

third outcome:

[{'Australia': 6}, {'Australia': 45}, {'Australia': 89}, {'Australia': 27}, {'Australia': 132}, {'Australia': 76},.............etc

This successfully links the country and the value, but not the category, and the values are split into separate dictionaries like that.

You might want to try [`csv.DictReader`](https://docs.python.org/3/library/csv.html#csv.DictReader) from Python's standard library. — Matthias, May 08 '23 at 12:42
Could you add a few rows of the actual data you are reading in? — James, May 08 '23 at 12:56
Country,Attending events,Care for household members,Eating and drinking,Education,Housework,Other leisure activities,Other unpaid work & volunteering,Paid work,Personal care,Seeing friends,Shopping,Sleep,Sports,TV and Radio Australia,6,45,89,27,132,76,58,211,56,40,29,512,19,140 Austria,9,34,79,27,125,59,32,280,55,82,21,498,32,109 Belgium,15,22,99,41,121,122,30,194,53,50,29,513,21,131 Canada,6,29,65,36,115,89,52,269,52,53,24,520,21,109 China,2,23,100,25,103,53,33,315,52,23,20,542,23,127 — Joe, May 09 '23 at 15:42

Jan Hein de Jong · Answer 1 · 2023-05-08T20:22:23.440

Your data has the format of a standard CSV table. You can use csv.DictReader for that.

import csv
import io 

s = """col1, col2, col3
entry1, 1, 2
entry2, 3, 4
"""

file_stream = io.StringIO(s)

reader = csv.DictReader(file_stream)
for row in reader:
    print(row)

Returns:

{'col1': 'entry1', ' col2': ' 1', ' col3': ' 2'}
{'col1': 'entry2', ' col2': ' 3', ' col3': ' 4'}

Getting your data into the shape you want (i.e. a dictionary keyed by the value of the first column) is a bit tricky, I'd say. The reason is that, based on your description of the data alone, there is no uniqueness constraint on the first column. This means that if two entries share the same first-column value, you'll need some logic to deal with that. You could do something like this:

data = {}
for row in reader:
    col1 = row.pop("col1") 
    if col1 in data: 
        raise Exception("Uh-oh, duplicate entry!")
    data[col1] = row
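
To get the nested dictionary with integer values that the question asks for, the two snippets above can be combined roughly as follows. This is only a sketch: the helper name `load_nested` is made up for illustration, the sample text is the small table from the question, and it assumes every column after the first holds whole numbers. Passing `skipinitialspace=True` makes the reader drop the space that follows each comma, so the keys come out as 'football' rather than ' football'.

```python
import csv
import io

# Assumption: the small sample table from the question stands in for the real file.
sample = """race, football, badminton
yellow, 10, 20
white, 50, 30
red, 80, 100
"""


def load_nested(stream):
    """Return {first_column_value: {heading: int_value}} (hypothetical helper)."""
    # skipinitialspace=True ignores the space after each comma
    reader = csv.DictReader(stream, skipinitialspace=True)
    key_column = reader.fieldnames[0]  # 'race' in the sample
    data = {}
    for row in reader:
        key = row.pop(key_column)
        if key in data:
            raise ValueError(f"duplicate entry: {key}")
        # the remaining columns are assumed to hold integers
        data[key] = {name: int(value) for name, value in row.items()}
    return data


result = load_nested(io.StringIO(sample))
print(result["yellow"]["football"])
```

With a real file, the `io.StringIO(sample)` argument would be replaced by a file object opened with `open(..., newline='')`.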

I know it is tricky... but too bad that my assignment doesn't allow the csv and io imports. Thanks — Joe, May 09 '23 at 15:39
The import of io is just for example purposes, to create a runnable example. But yes, if you can't use the csv module from the standard library, that makes things more complicated. — Jan Hein de Jong, May 10 '23 at 10:01

score 0 · Answer 2 · answered May 08 '23 at 14:42

Sure, you can use modules that easily convert the data to dictionaries, but in case you cannot use or depend on them, you can write a function that splits the text into lines and then splits each line into individual elements. Based on those elements, we can convert the numbers to integers and extract the headings. Finally, we can add each row to a dictionary in the format you desire.

So here is the final code:

txt='''race, football, badminton
yellow, 10, 20
white, 50, 30
red, 80, 100'''

import re
txt = re.sub(r' *, *', ',', txt)  # drop spaces around commas, keep spaces inside names


def dictonirize(txt):
    d = {}  # initialize dictionary
    lst = txt.strip().split('\n')  # split text into lines
    headings = lst[0].split(',')  # get the headings from the first line
    for i in range(1, len(lst)):  # traverse the data lines
        xLst = lst[i].split(',')  # split line into sub elements
        top = xLst[0]  # first element is the key for this line
        # pair each heading with its value, converted to int
        d[top] = {headings[j]: int(xLst[j]) for j in range(1, len(xLst))}
    return d

dct=dictonirize(txt)
print(dct)
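
The same idea can be applied line by line, which fits reading from a file and needs no imports at all. The following is a minimal sketch, not a definitive solution: `parse_lines` is a hypothetical helper name, and the header and two rows are copied from the data posted in the comments on the question. Each field is stripped individually, so headings that contain spaces (such as 'Paid work') stay intact.

```python
def parse_lines(lines):
    """Build {row_key: {heading: int_value}} from comma-separated lines."""
    # drop surrounding whitespace/newlines and skip empty lines
    rows = [line.strip() for line in lines if line.strip()]
    headings = [h.strip() for h in rows[0].split(',')]
    result = {}
    for row in rows[1:]:
        fields = [f.strip() for f in row.split(',')]
        # first field is the key; the rest are paired with the headings
        result[fields[0]] = {h: int(v) for h, v in zip(headings[1:], fields[1:])}
    return result


# Assumption: these lines stand in for the file; they are taken from the
# data posted in the comments on the question.
sample_lines = [
    "Country,Attending events,Care for household members,Eating and drinking,"
    "Education,Housework,Other leisure activities,Other unpaid work & volunteering,"
    "Paid work,Personal care,Seeing friends,Shopping,Sleep,Sports,TV and Radio\n",
    "Australia,6,45,89,27,132,76,58,211,56,40,29,512,19,140\n",
    "Austria,9,34,79,27,125,59,32,280,55,82,21,498,32,109\n",
]

data = parse_lines(sample_lines)
print(data['Australia']['Sleep'])
```

Because a file object yields its lines when iterated, the same function should also work as `parse_lines(file)` inside a `with open('time-used') as file:` block, using the file name from the question.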
