So far, I have this code (from cs50/pset6/DNA):

import csv
from sys import argv

data_dict = {}
with open(argv[1]) as data_file:
    reader = csv.DictReader(data_file)
    for record in reader:
        # `record` is a dictionary of column-name & value
        name = record["name"]
        data = {
            "AGATC": record["AGATC"],
            "AATG": record["AATG"],
            "TATC": record["TATC"],
        }

        data_dict[name] = data

print(data_dict)

Output

{'Alice': {'AATG': '8', 'AGATC': '2', 'TATC': '3'},
 'Bob': {'AATG': '1', 'AGATC': '4', 'TATC': '5'},
 'Charlie': {'AATG': '2', 'AGATC': '3', 'TATC': '5'}}

Here is the csv file:

name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5

My goal is to get exactly the same result without hardcoding the keys (AATG, etc.). I'll be using a much bigger database with many more columns, so I want to loop through the data instead of writing this:

data = {
    "AGATC": record["AGATC"],
    "AATG": record["AATG"],
    "TATC": record["TATC"],
}

Could you please help me? Thanks

Nicolas F
  • It looks like you want the `data` to contain everything that `record` (which is already a dictionary) does *except* for the `'name'` entry, correct? – Karl Knechtel Jul 12 '20 at 18:48
  • No, I want a dictionary that contains dictionaries, exactly like this: `{'Alice': {'AATG': '8', 'AGATC': '2', 'TATC': '3'}, 'Bob': {'AATG': '1', 'AGATC': '4', 'TATC': '5'}, 'Charlie': {'AATG': '2', 'AGATC': '3', 'TATC': '5'}}`. However, I want to loop through the data rather than hardcode the contents of each individual sub-dictionary. Do you understand, @KarlKnechtel? – Nicolas F Jul 13 '20 at 16:46

4 Answers

You can loop through a dictionary in Python like this:

for key in dictionary:
    print(key, dictionary[key])
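
Applied to the question, the same kind of loop can run over each `record` to build the inner dictionary without naming any column. This is only a minimal sketch: it assumes the first column is called `name`, as in the sample file, and it uses `io.StringIO` as a stand-in for the real file so the snippet runs on its own.

```python
import csv
import io

# Stand-in for open(argv[1]); same contents as the sample CSV in the question.
data_file = io.StringIO(
    "name,AGATC,AATG,TATC\n"
    "Alice,2,8,3\n"
    "Bob,4,1,5\n"
    "Charlie,3,2,5\n"
)

data_dict = {}
reader = csv.DictReader(data_file)
for record in reader:
    # Copy every column except "name", whatever the columns are called.
    data = {}
    for key in record:
        if key != "name":
            data[key] = record[key]
    data_dict[record["name"]] = data

print(data_dict["Bob"])
```

The values are still strings at this point, because the `csv` module does not convert types.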
Dharman
zjb
  • Hey, but will this convert the csv file into an organized dictionary? I'll later have to access these values and compare them... – Nicolas F Jul 11 '20 at 23:03

You could also try using pandas.

Using your example data as a .csv file:

import pandas

pandas.read_csv('example.csv', index_col=0).transpose().to_dict()

Outputs:

{'Alice': {'AGATC': 2, 'AATG': 8, 'TATC': 3},
 'Bob': {'AGATC': 4, 'AATG': 1, 'TATC': 5},
 'Charlie': {'AGATC': 3, 'AATG': 2, 'TATC': 5}}

index_col=0 makes the first column (name) the index, so the names later become the top-level keys of the dictionary.

.transpose() swaps rows and columns, so the top-level keys are the names rather than the features (AGATC, AATG, etc.).

.to_dict() converts the pandas.DataFrame into a Python dictionary.
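
As a possible shortcut, `to_dict` also accepts `orient="index"`, which keys the result by the index and so should make the transpose unnecessary. A minimal sketch, assuming pandas is installed; the `io.StringIO` object stands in for `example.csv` so the snippet is self-contained:

```python
import io

import pandas as pd

# Stand-in for 'example.csv', with the sample data from the question.
csv_text = (
    "name,AGATC,AATG,TATC\n"
    "Alice,2,8,3\n"
    "Bob,4,1,5\n"
    "Charlie,3,2,5\n"
)

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# orient="index" gives {index value -> {column -> value}}.
by_name = df.to_dict(orient="index")

print(by_name["Alice"])
```

Unlike the `csv` module, `read_csv` parses the counts as numbers rather than strings.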

dm2

You can simply use pandas:

import csv
from sys import argv

import pandas as pd

data_dict = {}
with open(argv[1]) as data_file:
    reader = csv.DictReader(data_file)
    df = pd.DataFrame(reader)
    df = df.set_index('name')  # set the name column as the index
    data_dict = df.transpose().to_dict()  # transpose so the names become the top-level keys

print(data_dict)
mjrezaee

You are on the right track using csv.DictReader.

import csv
from pprint import pprint

data_dict = {}

with open('fasta.csv', 'r') as f:
    reader = csv.DictReader(f)

    for record in reader:
        name = record.pop('name')
        data_dict[name] = record

pprint(data_dict)

Prints

{'Alice': {'AATG': '8', 'AGATC': '2', 'TATC': '3'},
 'Bob': {'AATG': '1', 'AGATC': '4', 'TATC': '5'},
 'Charlie': {'AATG': '2', 'AGATC': '3', 'TATC': '5'}}
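
Line by line, the loop can be read as in the commented sketch below. Two things here are not part of the code above and are only assumptions for illustration: the `io.StringIO` stand-in for `fasta.csv`, and the extra `int()` conversion, added because the values will be compared later.

```python
import csv
import io

# Stand-in for the CSV file so the snippet is self-contained (an assumption).
f = io.StringIO("name,AGATC,AATG,TATC\nAlice,2,8,3\nBob,4,1,5\nCharlie,3,2,5\n")

data_dict = {}

# DictReader uses the header row as the keys of every record it yields.
reader = csv.DictReader(f)

for record in reader:
    # pop() removes the "name" entry from the record and returns its value,
    # so only the remaining columns are left in `record`.
    name = record.pop("name")
    # Extra step: turn the count strings ('8') into integers (8).
    data_dict[name] = {key: int(value) for key, value in record.items()}

print(data_dict["Alice"])
```

Because nothing in the loop names a column other than `name`, the same code should work unchanged on a file with more columns.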
Chris Charley
  • Hey @ChrisCharley! Could you please explain in more detail what each line is doing? I would appreciate it!! – Nicolas F Jul 12 '20 at 15:07