Where do these parentheses come from?

Question

I'm new to pandas and still learning the fundamentals.

I tried to encode some data and store the result in `data_enc`, using the same column names with an `_enc` suffix.

import pandas as pd
from sklearn.preprocessing import LabelEncoder


labelencoder = LabelEncoder()
new_data = data[['HeatingQC']][:35].copy()

data_enc = pd.DataFrame(labelencoder.fit_transform(new_data),
                        columns = [new_data.columns + '_enc'],
                        index = new_data.index)
print(data_enc.columns[0])
print(new_data.columns[0])

But the output is not what I expected:

('HeatingQC_enc',) 
HeatingQC

My question is: where do the parentheses come from, and how can I remove them?

It's a tuple, try: `print(data_enc.columns[0][0])` – Keri, Apr 17 '20 at 17:33

score 1 · Answer 1 · answered Apr 17 '20 at 17:56

The problem is how you created the columns of `data_enc`. You passed a list that contains an Index object. Because of this nesting, pandas treats the list as a list of levels and creates a broken MultiIndex. (It's broken because it's a MultiIndex with only a single level, so it really shouldn't exist.) Each column label in a MultiIndex is a tuple, which is where the parentheses come from.

Example:

import pandas as pd

df = pd.DataFrame(columns=list('abc'))

# Placing the Index in a list incorrectly leads to a MultiIndex
pd.DataFrame(columns=[df.columns+'_suffix']).columns
#MultiIndex([('a_suffix',),
#            ('b_suffix',),
#            ('c_suffix',)],)

# Instead get rid of the list, just add the suffix:
pd.DataFrame(columns=df.columns+'_suffix').columns
#Index(['a_suffix', 'b_suffix', 'c_suffix'], dtype='object')
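Applied to the code in the question, the same change might look like the sketch below. The sample values and the hard-coded `encoded` array are assumptions standing in for the asker's `data` and for the output of `LabelEncoder.fit_transform`; only the `columns=` argument differs between the two frames.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data; the values are made up.
new_data = pd.DataFrame({'HeatingQC': ['Ex', 'Gd', 'TA', 'Ex']})

# Placeholder for the encoder output (hard-coded, not produced by LabelEncoder).
encoded = np.array([0, 1, 2, 0])

# Pattern from the question: the Index is wrapped in a list.
nested = pd.DataFrame(encoded,
                      columns=[new_data.columns + '_enc'],
                      index=new_data.index)

# Fix from this answer: pass the Index directly.
flat = pd.DataFrame(encoded,
                    columns=new_data.columns + '_enc',
                    index=new_data.index)

print(nested.columns[0])  # ('HeatingQC_enc',)
print(flat.columns[0])    # HeatingQC_enc
```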

score 0 · Answer 2 · answered Apr 17 '20 at 17:22

How about `new_data = data['HeatingQC'][:35].copy()` instead of indexing the DataFrame with a list? That way you should get a single Series.

AKX

But then `new_data` becomes a Series, so it won't have any columns. – David kim, Apr 18 '20 at 01:55
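If the Series route is taken anyway, the label is still available through the Series' `name` attribute, and `to_frame()` gives back a one-column DataFrame. A minimal sketch, assuming made-up sample values in place of the asker's `data`:

```python
import pandas as pd

# Hypothetical stand-in for the asker's data; the values are made up.
data = pd.DataFrame({'HeatingQC': ['Ex', 'Gd', 'TA', 'Ex']})

# Single brackets select a Series, which has no .columns attribute...
series = data['HeatingQC'][:35].copy()
has_columns = hasattr(series, 'columns')

# ...but the original column label is kept in .name.
enc_name = series.name + '_enc'

# to_frame() converts the Series back into a one-column DataFrame.
frame = series.to_frame()

print(has_columns, enc_name, list(frame.columns))
```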

score 0 · Answer 3 · answered Apr 17 '20 at 17:24

The parentheses are there because your code returned a tuple. To get rid of them, run:

`print(data_enc.columns[0][0])` instead of `print(data_enc.columns[0])`
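Indexing the tuple only changes what is printed; the columns themselves remain a MultiIndex. One way to flatten an already-built frame is `MultiIndex.get_level_values`, sketched here on a small made-up frame (the name `nested` and its values are assumptions, not the asker's data):

```python
import pandas as pd

# Hypothetical frame reproducing the single-level MultiIndex columns.
nested = pd.DataFrame([[0], [1]], columns=[pd.Index(['HeatingQC_enc'])])

# The workaround above: take the first element of the 1-tuple label.
label = nested.columns[0]
print(label[0])  # HeatingQC_enc

# Alternative: replace the MultiIndex with its only level, a flat Index.
nested.columns = nested.columns.get_level_values(0)
print(nested.columns[0])  # HeatingQC_enc
```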

Jack Cummins
