
I am collecting data that is saved into CSV files. For presentation purposes, I want to reorder the columns in each CSV file according to its related "order".

I was using this question (write CSV columns out in a different order in Python) as a guide but I'm not sure why I'm getting the error

writeindices = [name2index[name] for name in writenames]
KeyError: % Processor Time

when I run it. Note that this error doesn't seem to be limited to the string '% Processor Time'.

Where am I going wrong?

Here is my code:

CPU_order=["%"+" Processor Time", "%"+" User Time", "Other"]
Memory_order=["Available Bytes", "Pages/sec", "Pages Output/sec", "Pages Input/sec", "Page Faults/sec"]

def reorder_csv(path,title,input_file):
    if title == 'CPU':
        order=CPU_order
    elif title == 'Memory':
        order=Memory_order

    output_file=path+'/'+title+'_reorder'+'.csv'

    writenames = order

    reader = csv.reader(input_file)
    writer = csv.writer(open(output_file, 'wb'))

    readnames = reader.next()
    name2index = dict((name, index) for index, name in enumerate(readnames))
    writeindices = [name2index[name] for name in writenames]
    reorderfunc = operator.itemgetter(*writeindices)
    writer.writerow(writenames)

    for row in reader:
        writer.writerow(reorderfunc(row))

Here is a sample of what the input CSV file looks like:

,CPU\% User Time,CPU\% Processor Time,CPU\Other
05/23/2016 06:01:51.552,0,0,0
05/23/2016 06:02:01.567,0.038940741537158409,0.62259056657940626,0.077882481554869071
05/23/2016 06:02:11.566,0.03900149141703179,0.77956981074955856,0
05/23/2016 06:02:21.566,0,0,0
05/23/2016 06:02:31.566,0,1.1695867249963632,0
Catherine

1 Answer

Your code works. It is your data that does not have a column named "% Processor Time". Here is the sample data I used:

Other,% User Time,% Processor Time
o1,u1,p1
o2,u2,p2

And here is the code which I call:

reorder_csv('.', 'CPU', open('data.csv'))

With these settings, everything works fine. Please check your data.
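One way to check the data is to compare the requested column names against the header row that is actually read from the file. The following is a minimal sketch, assuming Python 3; the helper name `missing_columns` is hypothetical, and the headers are held in memory rather than read from disk:

```python
import csv
import io

def missing_columns(input_file, wanted):
    """Return the names in `wanted` that are absent from the CSV header row."""
    header = next(csv.reader(input_file))
    # Membership is an exact string comparison, just like the dict lookup
    # that raises KeyError in the question's code.
    return [name for name in wanted if name not in header]

wanted = ["% Processor Time", "% User Time", "Other"]

# Header from the sample data in this answer: every name is present.
plain = io.StringIO("Other,% User Time,% Processor Time\n")
print(missing_columns(plain, wanted))      # prints []

# Header in the format shown in the question: no name matches exactly.
prefixed = io.StringIO(",CPU\\% User Time,CPU\\% Processor Time,CPU\\Other\n")
print(missing_columns(prefixed, wanted))   # prints all three names
```

A non-empty result lists exactly the names that would trigger the KeyError.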

Update

Now that I see your data, it looks like you have column names such as "CPU\% Processor Time" and want to translate them to "% Processor Time" before writing out. All you need to do is create your name2index this way:

name2index = dict((name.replace('CPU\\', ''), index) for index, name in enumerate(readnames))

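The difference is that instead of name, you use name.replace('CPU\\', ''), which gets rid of the CPU\ part. The dictionary lookup needs an exact match on the whole column name, so the prefix has to be removed before the names are used as keys.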
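To see the effect end to end, here is a small sketch, assuming Python 3 and an in-memory file built from the question's sample. The function name `reorder_rows` and its `prefix` parameter are hypothetical, and a plain list comprehension stands in for `operator.itemgetter`:

```python
import csv
import io

def reorder_rows(input_file, order, prefix="CPU\\"):
    """Yield the header, then each row, with columns arranged as in `order`."""
    reader = csv.reader(input_file)
    readnames = next(reader)
    # Strip the prefix so the header names match the entries in `order` exactly.
    name2index = {name.replace(prefix, ""): i for i, name in enumerate(readnames)}
    indices = [name2index[name] for name in order]
    yield list(order)
    for row in reader:
        yield [row[i] for i in indices]

sample = io.StringIO(
    ",CPU\\% User Time,CPU\\% Processor Time,CPU\\Other\n"
    "05/23/2016 06:02:31.566,0,1.1695867249963632,0\n"
)
for row in reorder_rows(sample, ["% Processor Time", "% User Time", "Other"]):
    print(row)
```

As in the original code, the unnamed timestamp column is dropped because it is not listed in the requested order.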

Update 2

I reworked your code to use csv.DictReader and csv.DictWriter. I also assume that "CPU\% Privileged Time" will be transformed into "Other". If that is not the case, you can fix it in the transformer dictionary.

import csv
import os

def rename_columns(row):
    """ Take a row (dictionary) of data and return a new row with columns renamed """
    transformer = {
        'CPU\\% User Time': '% User Time',
        'CPU\\% Processor Time': '% Processor Time',
        'CPU\\% Privileged Time': 'Other',
        }
    new_row = {transformer.get(k, k): v for k, v in row.items()}
    return new_row

def reorder_csv(path, title, input_file):
    header = dict(
        CPU=["% Processor Time", "% User Time", "Other"],
        Memory=["Available Bytes", "Pages/sec", "Pages Output/sec", "Pages Input/sec", "Page Faults/sec"],
        )

    reader = csv.DictReader(input_file)
    output_filename = os.path.join(path, '{}_reorder2.csv'.format(title))

    with open(output_filename, 'wb') as outfile:
        # Create a new writer where each row is a dictionary.
        # If the row contains extra keys, ignore them
        writer = csv.DictWriter(outfile, header[title], extrasaction='ignore')
        writer.writeheader()
        for row in reader:
            # Each row is a dictionary, not list
            print row
            row = rename_columns(row)
            print row
            print
            writer.writerow(row)
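
The version above targets Python 2 (print statements, 'wb' mode). A rough Python 3 equivalent might look like the sketch below. It makes two assumptions that are not in the code above: the hypothetical helper `strip_prefix` drops everything up to the last backslash instead of using a fixed transformer dictionary, and the function takes already-open file objects so it can be tried on in-memory data.

```python
import csv
import io

def strip_prefix(row):
    """Return a copy of `row` whose keys drop everything up to the last backslash."""
    return {key.rsplit("\\", 1)[-1]: value for key, value in row.items()}

def reorder_csv(infile, outfile, fieldnames):
    """Copy `infile` to `outfile`, keeping only `fieldnames`, in that order."""
    reader = csv.DictReader(infile)
    # extrasaction='ignore' drops keys (such as the unnamed timestamp column)
    # that are not listed in `fieldnames`.
    writer = csv.DictWriter(outfile, fieldnames, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(strip_prefix(row))

source = io.StringIO(
    ",CPU\\% User Time,CPU\\% Processor Time,CPU\\Other\n"
    "05/23/2016 06:02:31.566,0,1.1695867249963632,0\n"
)
target = io.StringIO()
reorder_csv(source, target, ["% Processor Time", "% User Time", "Other"])
print(target.getvalue())
```

With real files in Python 3, the output file would be opened in text mode with newline='' (for example open(output_filename, 'w', newline='')) rather than 'wb'.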
Hai Vu
  • Thank you, my data has text before a backslash (I've updated my question above) but I thought since I'm looking for the given string "in" the line it should still work? – Catherine May 24 '16 at 13:27
  • Using your new name2index which replaces the 'CPU\\' with ' ' I still get `KeyError: '% Processor Time'` – Catherine May 24 '16 at 13:52
  • I noticed that your csv is lacking the header for the timestamp (first column). Is that the problem? It would help if you posted the csv sample in its raw form. – Hai Vu May 24 '16 at 13:57
  • That is correct there is no header over the first column (timestamp). I do not know how to post the csv in raw form – Catherine May 24 '16 at 13:59
  • Just pretend your csv is code (i.e. indent 4 spaces). I don't see any comma in your csv, which means it is not in raw form. – Hai Vu May 24 '16 at 14:04
  • Updated csv into raw form – Catherine May 24 '16 at 14:07
  • With your data file, I got: **KeyError: 'Other'**, not the same error as yours. It is understandable because the input csv does not have any **Other** column. – Hai Vu May 24 '16 at 14:11
  • Apologies I used the incorrect csv column, there is now an Other column but I still get the same error – Catherine May 24 '16 at 14:14
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/112805/discussion-between-hai-vu-and-catherine). – Hai Vu May 24 '16 at 14:20