
I'm trying to read a CSV file from standard input on the command line and do a few calculations on its columns. However, I'm struggling to skip the first row (the header row) when the file is read in.

For example, here is a screenshot of the CSV file: [screenshot not available]

Here is the code I'm using currently:

#!/usr/bin/env python
import sys
import re
import csv

def main(argv):
    for row in csv.reader(iter(sys.stdin.readline, "")):
        quantity = int(row[3])
        price_per_unit = int(row[5])
        cum_sum = quantity*price_per_unit
        print(row[0]+" "+str(cum_sum)+" "+row[6]+"\t"+"1")

#Note there are two underscores around name and main
if __name__ == "__main__":
    main(sys.argv)

From the command line I'm running this:

python problem1.py < orders.csv
Cameron Erwin
  • Why not use the pandas framework to read the CSV? It is very convenient for this kind of work. – Atul Shanbhag Sep 28 '18 at 03:37
  • Ultimately this code is going to be used in Hadoop Mapreduce, but I'm trying to make sure it works in python first. I'm also unsure that pandas is available in MapReduce at this time. – Cameron Erwin Sep 28 '18 at 03:38
  • Check out https://stackoverflow.com/questions/11349333/when-processing-csv-data-how-do-i-ignore-the-first-line-of-data – jesseWUT Sep 28 '18 at 03:42
  • Possible duplicate of [When processing CSV data, how do I ignore the first line of data?](https://stackoverflow.com/questions/11349333/when-processing-csv-data-how-do-i-ignore-the-first-line-of-data) – tripleee Jan 11 '19 at 12:05

1 Answer


You need to advance the iterator once before the loop starts, so that the header line is consumed and never reaches your calculations. This is a fairly common pattern.

import csv
import sys

my_iterator = iter(sys.stdin.readline, "")
# Read the header line and discard it; the None default avoids StopIteration on empty input.
next(my_iterator, None)
for row in csv.reader(my_iterator):
    quantity = int(row[3])
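
A variation on the same idea, as a minimal sketch rather than a definitive implementation: call next() on the csv.reader itself instead of on the raw line iterator, so that a header containing a quoted newline would still be skipped as one row. The function name order_totals and the sample header and rows below are hypothetical, made up only for illustration; the column positions (3 for quantity, 5 for price per unit, 6 for the last field) are taken from the question's code.

```python
import csv
import io


def order_totals(lines):
    """Yield (first_field, quantity * price, last_field) for each data row.

    `lines` is any iterable of text lines, e.g. sys.stdin or an open file.
    """
    reader = csv.reader(lines)
    # Discard the header row; the None default avoids StopIteration on empty input.
    next(reader, None)
    for row in reader:
        quantity = int(row[3])
        price_per_unit = int(row[5])
        yield row[0], quantity * price_per_unit, row[6]


# Hypothetical sample data standing in for orders.csv (the real columns are not shown).
sample = io.StringIO(
    "id,a,b,quantity,c,price,region\n"
    "1,x,y,2,z,5,east\n"
    "2,x,y,3,z,4,west\n"
)
for first, total, last in order_totals(sample):
    print(first + " " + str(total) + " " + last + "\t" + "1")
```

In the script from the question, the same function could be driven with order_totals(sys.stdin) (after import sys), which keeps the `python problem1.py < orders.csv` invocation unchanged.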
Alex Weavers