
(Not allowed to use pandas)

I am very new to Python and struggling with this question. I am trying to write a function. I have a CSV file (called personal_info.csv) with a bunch of different columns (full_name, weight_b, height_c, etc.). I am trying to loop through the column called height_c and return the most frequent number.

Some more info: the values in that column range from 0 to 10, though some numbers might not appear. The numbers are stored as strings (ex: '4'), and I want to return the value as a string as well. If there is a tie for the most frequent number, I just want to return the one that shows up first.

Here is some of the data from the csv file:

file_name weight_b height_c
john smith 74 2
rachel lamb 32 5
adam lee 12 2
mackenzie tre 26 2
abby wallace 79 1
karen brown 46 7
harry wright 73 9
madi bear 53 4

So I'm trying to go through the height_c column and find the most common value (which in this case would be 2), but the file is a lot longer than this.

(edited this to get rid of useless code)

Emppy

1 Answer


I would suggest loading the CSV file into a pandas DataFrame:

import pandas as pd
df = pd.read_csv('file.csv')

Then you can easily get the most common value in the 'height_c' column:

df['height_c'].value_counts().idxmax()

EDIT: Without using pandas, I would read all the 'height_c' values into a list and then calculate the most common value:

import csv

height_c = []
with open('file.csv', mode='r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        height_c.append(row['height_c'])
        
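# max() returns the first maximal item it sees, so iterating the list itself
# (rather than a set) resolves ties in favour of the value that appears first.
most_frequent_height_c = max(height_c, key=height_c.count)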

print(most_frequent_height_c)
2

Where the file.csv contains

file_name,weight_b,height_c
john smith,74,2
rachel lamb,32,5
adam lee,12,2
mackenzie tre,26,2
abby wallace,79,1
karen brown,46,7
harry wright,73,9
madi bear,53,4
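
Since the real file is a lot longer, calling list.count once per row gets slow (it rescans the whole list each time). A minimal sketch of a single-pass alternative using collections.Counter from the standard library is below. The function name most_frequent_value and its path and column parameters are my own placeholders, not anything from the question, and it assumes the file has a header row containing the column name:

```python
import csv
from collections import Counter


def most_frequent_value(path, column="height_c"):
    """Return the most frequent string in `column`, or None if the file has no data rows."""
    with open(path, newline="") as csv_file:
        reader = csv.DictReader(csv_file)
        # Tally every value of the column in a single pass over the file.
        counts = Counter(row[column] for row in reader)
    if not counts:
        return None
    # Counter is a dict, so it iterates in first-seen order (Python 3.7+),
    # and max() returns the first maximal key: ties go to the earliest value.
    return max(counts, key=counts.get)
```

Calling most_frequent_value('file.csv') on the sample data above returns '2'. The csv module never converts the fields, so the result is already a string.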
Treeco