24

How can I multiply all the numeric values in a data frame by a constant without specifying column names explicitly? Example:

In [13]: df = pd.DataFrame({'col1': ['A','B','C'], 'col2':[1,2,3], 'col3': [30, 10,20]})

In [14]: df
Out[14]: 
  col1  col2  col3
0    A     1    30
1    B     2    10
2    C     3    20

I tried df.multiply, but it affects the string values as well, repeating each one several times.

In [15]: df.multiply(3)
Out[15]: 
  col1  col2  col3
0  AAA     3    90
1  BBB     6    30
2  CCC     9    60

Is there a way to preserve the string values intact while multiplying only the numeric values by a constant?

CentAu

5 Answers

34

You can use select_dtypes(), either including the number dtype or excluding all columns of the object and datetime64 dtypes:

Demo:

In [162]: df
Out[162]:
  col1  col2  col3       date
0    A     1    30 2016-01-01
1    B     2    10 2016-01-02
2    C     3    20 2016-01-03

In [163]: df.dtypes
Out[163]:
col1            object
col2             int64
col3             int64
date    datetime64[ns]
dtype: object

In [164]: df.select_dtypes(exclude=['object', 'datetime']) * 3
Out[164]:
   col2  col3
0     3    90
1     6    30
2     9    60

Or a much better solution (credit: ayhan), which modifies the original DataFrame in place:

df[df.select_dtypes(include=['number']).columns] *= 3

From docs:

To select all numeric types use the numpy dtype numpy.number
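For readers who want to try the in-place variant end to end, here is a minimal, self-contained sketch. It assumes a frame shaped like the demo above; the `date` column is rebuilt with `pd.date_range`, and the name `numeric_cols` is only illustrative.

```python
import pandas as pd

# Rebuild a frame like the one in the demo (string, integer and datetime columns).
df = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [1, 2, 3],
    "col3": [30, 10, 20],
    "date": pd.date_range("2016-01-01", periods=3),
})

# Column labels whose dtype is numeric; strings and datetimes are skipped.
numeric_cols = df.select_dtypes(include="number").columns

# Multiply only those columns and write the result back into df.
df[numeric_cols] *= 3

print(df)
```

The string and datetime columns are left untouched because they are never selected in the first place.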

MaxU - stand with Ukraine
  • To operate on the original dataframe, you can modify that to: `df[df.select_dtypes(include=['number']).columns] *= 3` – ayhan Jul 23 '16 at 15:27
  • @ayhan, thank you! I've added your solution to my answer, as it might help those who don't read comments... :) – MaxU - stand with Ukraine Jul 23 '16 at 15:47
  • A random thought I've been having lately: is it fair to answer pandas questions with NumPy funcs? I don't mind, and I am guessing the OPs won't either, since they would have access to NumPy too, at least as far as I remember from installing the pandas library ;) Also, since pandas uses NumPy internally, does this look like hacking/cheating? BTW, I am referring to my posts on pandas. – Divakar Jul 23 '16 at 15:52
  • @Divakar, I do love your NumPy solutions for pandas questions, especially when they are orders of magnitude faster ;) – MaxU - stand with Ukraine Jul 23 '16 at 15:56
  • @Divakar OP can select whatever answer they want. After that, we, the community, vote as we please. I feel that your numpy-ness is always welcome. – piRSquared Jul 23 '16 at 15:57
  • @piRSquared, MaxU Thanks guys, really appreciate the feedback and positive assurance there! – Divakar Jul 23 '16 at 16:04
7

The other answer shows how to multiply only the numeric columns. Here's how to write the result back into the original DataFrame:

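import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['A','B','C'], 'col2':[1,2,3], 'col3': [30, 10,20]})

# Multiply only the numeric columns, then assign them back by column name
s = df.select_dtypes(include=[np.number]) * 3
df[s.columns] = s

print(df)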

  col1  col2  col3
0    A     3    90
1    B     6    30
2    C     9    60
Jossie Calderon
4

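One way would be to get the dtypes, match them against the object and datetime dtypes, and exclude those columns with a mask, like so (this was written for older pandas; .ix was removed in pandas 1.0, where .loc accepts a boolean column mask) -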

df.ix[:,~np.in1d(df.dtypes,['object','datetime'])] *= 3

Sample run -

In [273]: df
Out[273]: 
  col1  col2  col3
0    A     1    30
1    B     2    10
2    C     3    20

In [274]: df.ix[:,~np.in1d(df.dtypes,['object','datetime'])] *= 3

In [275]: df
Out[275]: 
  col1  col2  col3
0    A     3    90
1    B     6    30
2    C     9    60
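As a hedged sketch of the same dtype-mask idea on current pandas, the version below builds the mask with `pandas.api.types.is_numeric_dtype` instead of matching dtype names. This is a swapped-in technique, not the answer's original code, and the names `mask` and `numeric_cols` are illustrative. Note that `is_numeric_dtype` also treats boolean columns as numeric, which may or may not be what you want.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [1, 2, 3],
    "col3": [30, 10, 20],
})

# One boolean per column: True where the column's dtype is numeric.
mask = df.dtypes.apply(is_numeric_dtype).to_numpy()

# Turn the mask into column labels, then multiply and assign back.
numeric_cols = df.columns[mask]
df[numeric_cols] = df[numeric_cols] * 3

print(df)
```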
Divakar
3

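This should work even with mixed types within a column, but it is likely slow on large dataframes because the function is applied element by element.

def mul(x, y):
    # Try to interpret the cell as a number; leave it unchanged if that fails
    try:
        return pd.to_numeric(x) * y
    except (ValueError, TypeError):
        return x

# Note: applymap is deprecated since pandas 2.1 in favor of DataFrame.map
df.applymap(lambda x: mul(x, 3))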
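To illustrate what "mixed types within columns" means in practice, here is a small sketch with a hypothetical `mixed` column that holds a letter, an integer and a numeric-looking string. It uses `Series.map` per column rather than `applymap` so that it runs on both older and newer pandas; the helper name `scale_cell` is an assumption for this example.

```python
import pandas as pd

def scale_cell(x, factor):
    """Multiply x by factor if it can be read as a number, else return it as is."""
    try:
        return pd.to_numeric(x) * factor
    except (ValueError, TypeError):
        return x

df = pd.DataFrame({
    "mixed": ["A", 2, "3.5"],   # object column with strings and a number
    "col2": [1, 2, 3],
})

# Apply the helper to every cell, one column at a time.
result = df.apply(lambda col: col.map(lambda x: scale_cell(x, 3)))

print(result)
```

One consequence worth knowing: strings that parse as numbers (such as "3.5" here) are converted and multiplied too, which differs from the dtype-based answers that skip object columns entirely.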
piRSquared
0

A simple solution using assign() and select_dtypes():

df.assign(**df.select_dtypes('number')*3)
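A short sketch of how this behaves, assuming the question's example frame: `assign()` returns a new DataFrame and leaves the original alone, so the result has to be captured in a variable (here the illustrative name `scaled`). The extra parentheses are only for readability, and the unpacking relies on the column names being strings.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [1, 2, 3],
    "col3": [30, 10, 20],
})

# Unpack the scaled numeric columns as keyword arguments to assign();
# existing columns with the same names are replaced in the returned copy.
scaled = df.assign(**(df.select_dtypes("number") * 3))

print(scaled)
print(df)  # the original frame is not modified
```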

R.W.