I have a DataFrame with 20 columns and one index.

Its shape is something like (100, 20).

I want to select the 3rd column of this DataFrame, but keep the result as a DataFrame of shape (100, 1).

  1. If I do v = df['col3'], I get a Series (which I do not want).
  2. If I do v = df[df['col3'] != 0] and then v.drop(labels=[list of 19 columns], axis=1), I get what I want, that is, a DataFrame of shape (100, 1), but I have to:

     (a) write an unnecessary != condition (which I want to avoid), and

     (b) write out a long list of 19 column names.

There should be a better and cleaner way of doing what I want to achieve.

Ami Tavory
Prana

2 Answers

If I do a v = df['col3'], I get a Series (which I do not want)

If you use df[cols], where cols is a list, you'll get a DataFrame (not a Series). This includes the case where it is a list consisting of a single item. So, you can use df[['col3']].

For example:

In [32]: import pandas as pd

In [33]: df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

This gives a Series:

In [35]: df['a']
Out[35]: 
0    1
1    2
Name: a, dtype: int64

This gives a DataFrame:

In [36]: df[['a']]
Out[36]: 
   a
0  1
1  2
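
One quick way to confirm the difference is to compare types and shapes. The following is a minimal sketch that reuses the toy frame from above; the variable names s and sub are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

s = df['a']      # a single label selects a Series
sub = df[['a']]  # a list with one label selects a DataFrame

print(type(s).__name__, s.shape)      # Series (2,)
print(type(sub).__name__, sub.shape)  # DataFrame (2, 1)
```

With the 100-row frame from the question, df[['col3']] should likewise have shape (100, 1).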

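Finally, note that you can also turn a Series into a DataFrame with reset_index. Be aware that this moves the index into a regular column, so the result has an extra column (shape (2, 2) here rather than (2, 1)):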

In [44]: df['a'].reset_index()
Out[44]: 
   index  a
0      0  1
1      1  2
Ami Tavory

Another handy trick is to_frame(), which converts the Series into a one-column DataFrame and keeps the original index:

df['col3'].to_frame()
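
As a rough illustration, here is a small self-contained sketch. The three-column frame and the names col1, col2, and third are made up for the example; only col3 comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for the 20-column frame in the question.
df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4], 'col3': [5, 6]})

v = df['col3'].to_frame()  # Series -> one-column DataFrame, index preserved
print(v.shape)             # (2, 1)
print(list(v.columns))     # ['col3']

# Optionally give the resulting column a different name.
w = df['col3'].to_frame(name='third')
print(list(w.columns))     # ['third']
```

Unlike reset_index, this does not add the index as an extra column.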
piRSquared