How to code the exception for a column in excel using pandas?

Question

Sample data:

|   | Status                  | Failed | In Progress | Passed | Untested |
|---|-------------------------|--------|-------------|--------|----------|
| 2 | P0 Dry Run - 13/02/18   | 2.0    |             | 143.0  | 5.0      |
| 3 | P1 Test Plan - 06/02/18 | 4.0    |             | 247.0  | 367.0    |
| 4 | P2 Test plan - 03/01/18 | 22.0   | 2.0         | 496.0  | 54.0     |

Code:

import pandas as pd

msft = pd.read_csv("C:\\Users\\gomathis\\Downloads\\week_071.csv")
msft = msft[['Passed', 'Failed', 'Blocked', 'In Progress', 'Not_Implemented', 'Not Applicable', 'Clarification Opened', 'Untested']]
msft.to_csv("C:\\Users\\gomathis\\Downloads\\week_072.csv")

Error:

KeyError: "['Blocked'] not in index"

Expected result:

I need to handle a column that may not be present in the file now but may appear in the future. Please help me solve this.

What do you want to do with the columns that are not in your CSV file? – Pyd, Feb 14 '18 at 04:06
Yes, exactly. The column may not be available now, but it will be in the future. – gomathi subramanian, Feb 14 '18 at 04:07
What do you want to do if your CSV file does not have a `Blocked` column? Do you want the output DataFrame to have the same columns as the uploaded CSV file? – Pyd, Feb 14 '18 at 04:09
Yes, I need the Blocked column even if it is not in the uploaded CSV file. – gomathi subramanian, Feb 14 '18 at 04:11

cs95 · Accepted Answer · 2018-02-14T04:33:55.870

Read the header row with csv.DictReader and its fieldnames attribute to find out which columns are present in your CSV, then take the intersection of those with the columns you want.

First, specify the columns you want.

columns = ['Passed', 
           'Failed', 
           'Blocked', 
           'In Progress', 
           'Not_Implemented', 
           'Not Applicable', 
           'Clarification Opened', 
           'Untested']

path = "C:\\Users\\gomathis\\Downloads\\week_071.csv"   # we'll use this later

Next, use the csv.DictReader to read the CSV's headers (this does NOT read the entire file!).

import csv

import numpy as np
import pandas as pd

with open(path, 'r', newline='') as f:
    reader = csv.DictReader(f)
    df_columns = reader.fieldnames  # header row only; data rows are not read
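To check the claim that only the header is consumed, here is a minimal sketch. It assumes a small in-memory CSV (the `sample` string is a hypothetical stand-in, not the real file) so it can be run anywhere.

```python
import csv
import io

# Hypothetical in-memory CSV standing in for the real file.
sample = io.StringIO("Status,Failed,Passed\nP0,2.0,143.0\nP1,4.0,247.0\n")

reader = csv.DictReader(sample)
header = reader.fieldnames   # accessing fieldnames consumes only the header row
remaining = list(reader)     # the data rows are still unread at this point

print(header)
print(len(remaining))
```

Both data rows are still available after `fieldnames` has been read, which suggests the header lookup stays cheap even for large files.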

Now, find the set intersection, and pass it to usecols in pd.read_csv:

df = pd.read_csv(path, usecols=set(columns).intersection(df_columns))

Finally, to fill in missing columns, take the set difference and call df.assign:

df = df.assign(**dict.fromkeys(set(columns).difference(df_columns), np.nan))
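Because sets are unordered, the steps above do not guarantee the column order asked for in the question. As a possible alternative (not part of the original answer), the sketch below uses a callable `usecols` and `DataFrame.reindex`. The `sample_csv` buffer is a hypothetical stand-in for week_071.csv built from the sample data.

```python
import io

import pandas as pd

# Desired output columns, in display order (taken from the question).
columns = ['Passed', 'Failed', 'Blocked', 'In Progress', 'Not_Implemented',
           'Not Applicable', 'Clarification Opened', 'Untested']

# Hypothetical in-memory stand-in for week_071.csv, based on the sample data.
sample_csv = io.StringIO(
    "Status,Failed,In Progress,Passed,Untested\n"
    "P0 Dry Run - 13/02/18,2.0,,143.0,5.0\n"
    "P1 Test Plan - 06/02/18,4.0,,247.0,367.0\n"
    "P2 Test plan - 03/01/18,22.0,2.0,496.0,54.0\n"
)

# A callable usecols keeps only the wanted columns that exist in the file.
df = pd.read_csv(sample_csv, usecols=lambda name: name in columns)

# reindex adds the absent columns as NaN and applies the requested order.
df = df.reindex(columns=columns)

print(df.columns.tolist())
```

Columns that are present in the file keep their values, so if Blocked shows up in a later export it is read as-is. Only the absent columns are filled with NaN.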

edited Feb 14 '18 at 04:33

answered Feb 14 '18 at 04:14

`TypeError: parser_f() got an unexpected keyword argument 'use_cols' ` – gomathi subramanian Feb 14 '18 at 04:21
@gomathisubramanian Sorry, it's "usecols", without the underscore. – cs95 Feb 14 '18 at 04:22
@pyd Does it make sense to keep a column that does not exist? – cs95 Feb 14 '18 at 04:25
I think he wants to fill the columns that do not exist with some other values – Pyd Feb 14 '18 at 04:26
@pyd Maybe, maybe not. OP's question isn't very clear on that aspect, so I'll wait for their response :) – cs95 Feb 14 '18 at 04:27
@cᴏʟᴅsᴘᴇᴇᴅ I need a Blocked column, which doesn't exist in my data now but may be needed in the future, and I need to align the columns in the particular order I want to display – gomathi subramanian Feb 14 '18 at 04:31
Yes, I don't have a value for Blocked. – gomathi subramanian Feb 14 '18 at 04:33
@gomathisubramanian Okay, thank you, that was helpful. I've added an edit at the bottom of my answer. – cs95 Feb 14 '18 at 04:34
@pyd You were right in the end, so I've fixed it :) – cs95 Feb 14 '18 at 04:34
What does `**dict.fromkeys()` do? Please explain. – Pyd Feb 14 '18 at 04:47
@pyd Try running this: `dict.fromkeys(['a', 'b', 'c'], np.nan)`. It generates a dictionary in which every key has the same value (np.nan). Then, just unpack the dictionary when passing it to `df.assign`. – cs95 Feb 14 '18 at 04:47
Will it work when I have a value in the Blocked column? Say `Blocked` has the value `2`: will it print 2, or will it show `nan`? – gomathi subramanian Feb 14 '18 at 05:43
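The last two comments can be tried out in isolation. This is a minimal sketch, assuming a hypothetical one-row frame in which Blocked is already present. It illustrates that only absent columns receive NaN, since a column that exists is never in the set of missing names.

```python
import numpy as np
import pandas as pd

columns = ['Passed', 'Failed', 'Blocked']

# Hypothetical frame in which 'Blocked' is already present with a value of 2.
df = pd.DataFrame({'Passed': [143.0], 'Blocked': [2.0]})

# Only wanted columns that are absent from the frame get a NaN placeholder.
missing = [name for name in columns if name not in df.columns]
fill = dict.fromkeys(missing, np.nan)  # every key maps to the same value

# ** unpacks the dictionary into keyword arguments for assign.
df = df.assign(**fill)

print(df)
```

Here `Blocked` keeps its value and only `Failed` is added as NaN.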
