3

I am attempting to write to append values to a csv. I can search for and find where there are missing fields, but want to know if there is a way to insert 0 as a default value for these fields.

I have the following code.

def fillBlanks():

    HEADERS =  ['ST','Year','PCT_SHORT','PCT_V_SHORT','Year','PCT_SHORT',
                'PCT_V_SHORT','Year','PCT_SHORT','PCT_V_SHORT' ]
    fileH = open(outputDir+"PCT_SHORT_V_SHORT.csv", 'rb')
    reader = csv.DictReader(fileH, HEADERS)

    for row in reader:
        if any(row[key] in (None, "") for key in row):
           print "bad"+ str(row)

    fileH.close()

This gives me the rows and columns that are missing as follows.

bad{'PCT_SHORT': None, 'Year': None, 'PCT_V_SHORT': None, 'ST': 'NV'}
bad{'PCT_SHORT': None, 'Year': None, 'PCT_V_SHORT': None, 'ST': 'CA'}
bad{'PCT_SHORT': None, 'Year': None, 'PCT_V_SHORT': None, 'ST': 'AZ'}
bad{'PCT_SHORT': None, 'Year': None, 'PCT_V_SHORT': None, 'ST': 'US'}

Is there a way to code in default values so no fields are missing? I would like to make the fields either 999 or 0. I hope this is clear, I am new to python.

EDIT: Here is a sample of the data ('OK', '2015', '14', '3', '2014', '28', '17', '2013', '19', '17', '2012', '36', '12') ('AZ', '2015', '14', '2', '2014', '36', '2') ('ID', '2015', '12', '0', '2014', '28', '4', '2013', '24', '2', '2012', '14', '1')

J.R.W
  • 65
  • 1
  • 9
  • checkout pandas and its fillna function – Ezer K Jan 13 '16 at 21:43
  • Please added a (short) example of your input so we can see in what way the fields are missing. – martineau Jan 13 '16 at 22:35
  • Here is an example. So the rows aren't missing values, but rather fields. Every 3 or so rows are values appended to a final csv from 2 different CSV, so example the year 2015 and the 2 values that follow are in 1 file and the Year 2014 is in another. For some of the years there is no data for a particular state. AZ is an example of this for the year 2013. ('OK', '2015', '14', '3', '2014', '28', '17', '2013', '19', '17', '2012', '36', '12') ('AZ', '2015', '14', '2', '2014', '36', '2') ('ID', '2015', '12', '0', '2014', '28', '4', '2013', '24', '2', '2012', '14', '1') – J.R.W Jan 14 '16 at 13:07

1 Answers1

4

DictReader has a restval parameter to set the default value for missing fields.

But for missing fields ... at the end of the read values, i.e: the line has fewer fields than expected.

It is unclear in your question if your missing fields will be at the end or anywhere in the chain. Or if missing means there will be two consecutive separators.

Additionally, DictReader may not be your best friend: some of the fields in your HEADERS definition have the same name and will therefore be written to the same key. Should that not be your intention, you should assign a unique name to each of the headers.

EDIT: Following the comment

The header in the code below feature unique ids for each csv field and missing fields will be filled with passed value missing or '0'.

With it the DictReader can assign each read csv field to individual keys or fill in with restval.

A DictWriter undoes the action and writes back to outpath with the parsed (and possibly filled) csv data

def fillBlanks(inpath, outpath, missing=None, writeheaders=False):
    # Assign unique identifiers in HEADERS to each csv field
    HEADERS = [
        'ST',
        'Year_1', 'PCT_SHORT_1', 'PCT_V_SHORT_1',
        'Year_2', 'PCT_SHORT_2', 'PCT_V_SHORT_2',
        'Year_3', 'PCT_SHORT_3', 'PCT_V_SHORT_3'
    ]

    # Open and parse file with default value for missing
    rfile = open(inpath, 'r')
    reader = csv.DictReader(rfile, fieldnames=HEADERS, restval=missing or '0')

    # Open and write to file with writer
    wfile = open(outpath, 'w')
    writer = csv.DictWriter(wfile, fieldnames=HEADERS)

    if writeheaders:
        writer.writeheaders()

    # Write each row from reader to writer
    for row in reader:
        writer.writerow(row)
mementum
  • 3,153
  • 13
  • 20
  • Yes, this is my issue. Some of the lines have fewer fields than expected. I am essentially appending multiple columns to a new list and then writing to a CSV. I need to be able to fill blank cells with 0 when the file with missing field is appended. I can also change the field names as needed. – J.R.W Jan 14 '16 at 13:11
  • So I have been able to use DictReader to add the restval and I can print this. How do I go about writing this to a file? Also is it possible to do this as I am appending a list or tuple? – J.R.W Jan 14 '16 at 16:03