
I am trying to build the union of the unique values in each column of two DataFrames, but the code raises a KeyError for the 'Survived' column. It should be fairly simple, but I don't know what's causing the error.

train_df has 12 columns, while test_df has 11: the same columns except 'Survived'.

Here are the column labels of train_df

Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
       'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

Here are those of test_df

test_df.columns

Index(['PassengerId', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch',
       'Ticket', 'Fare', 'Cabin', 'Embarked'],
      dtype='object')

The code

cols = train_df.columns
labels = []
for i in range(0, 12):
    train = train_df[cols[i]].unique()
    test = test_df[cols[i]].unique()
    labels.append(list(set(train) | set(test)))

The output should contain, for each column, the merged unique values from both DataFrames, but instead I get a KeyError on 'Survived'.

Bill Bell
Sereph
  • Is this a Pandas `Index` object? – Matthew Cole May 04 '17 at 22:19
  • No, I didn't set a column as the index. It's the whole DataFrame with all the data in it, i.e. train_df = pd.read_csv("filepath") – Sereph May 05 '17 at 00:00
  • You are iterating through cols, which contains 'Survived'. When the loop reaches 'Survived', you try to look up 'Survived' in test_df, where it doesn't exist. That is why you get the KeyError. – Scott Boston Aug 04 '17 at 20:44
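
Following the comment above, one possible fix is to loop only over the columns that both DataFrames share. The sketch below is not from the original post: the two small frames are made-up stand-ins for the CSV data, and `shared_cols` is a name chosen here for illustration. It also stores the results in a dict keyed by column name rather than in a list, which is a design choice and not something the question requires.

```python
import pandas as pd

# Hypothetical stand-in data; in the question both frames come from pd.read_csv.
train_df = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Survived": [0, 1, 1],
    "Sex": ["male", "female", "female"],
    "Embarked": ["S", "C", "S"],
})
test_df = pd.DataFrame({
    "PassengerId": [4, 5],
    "Sex": ["male", "male"],
    "Embarked": ["Q", "S"],
})

# Index.intersection keeps only the column names present in both frames,
# so 'Survived' is never looked up in test_df.
shared_cols = train_df.columns.intersection(test_df.columns)

# Map each shared column to the union of its unique values in both frames.
labels = {}
for col in shared_cols:
    train_values = set(train_df[col].unique())
    test_values = set(test_df[col].unique())
    labels[col] = train_values | test_values
```

If a list of lists is preferred, as in the question, `[list(values) for values in labels.values()]` converts the result. Columns that exist in only one frame, such as 'Survived', would need to be handled separately.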
