def test_lprun():
    data = {'Name':['Tom', 'Brad', 'Kyle', 'Jerry'],
        'Age':[20, 21, 19, 18],
        'Height' : [6.1, 5.9, 6.0, 6.1]
        }
    df = pd.DataFrame(data)
    df=df.assign(A=123,
                 B=lambda x:x.Age+x.Height,
                 C=lambda x:x.Name.str.upper(),
                 D=lambda x:x.Name.str.lower()
                 )
    return df



In [8]: %lprun -f test_lprun test_lprun()
Timer unit: 1e-07 s

Total time: 0.0044901 s
File: <ipython-input-7-eaf21639fb5f>
Function: test_lprun at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def test_lprun():
     2         1         21.0     21.0      0.0      data = {'Name':['Tom', 'Brad', 'Kyle', 'Jerry'],
     3         1         13.0     13.0      0.0          'Age':[20, 21, 19, 18],
     4         1         15.0     15.0      0.0          'Height' : [6.1, 5.9, 6.0, 6.1]
     5                                                   }
     6         1       8651.0   8651.0     19.3      df = pd.DataFrame(data)
     7         1         19.0     19.0      0.0      df=df.assign(A=123,
     8         1         11.0     11.0      0.0                   B=lambda x:x.Age+x.Height,
     9         1         10.0     10.0      0.0                   C=lambda x:x.Name.str.upper(),
    10         1      36147.0  36147.0     80.5                   D=lambda x:x.Name.str.lower()
    11                                                            )
    12         1         14.0     14.0      0.0      return df

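When profiling a call to pandas assign, line_profiler cannot tell which of the assign arguments takes the most time. The whole assign call is a single statement, so nearly all of its time (80.5 % here) is reported on one line, line 10 in the output above.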

Goal: have line_profiler report a separate result for each line of the pandas assign call, for example Line 6 with a % Time of 10, Line 7 with a % Time of 30, and so on.
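
To make the goal concrete, here is a rough sketch of the kind of per-argument breakdown I am after. It does not use line_profiler at all. It applies each assign argument in its own assign call and times it with time.perf_counter. The helper name time_assign_steps is made up for illustration and is not part of pandas or line_profiler. Each timing also includes the overhead of one extra assign call (copying the DataFrame), so the numbers are only approximate.

```python
import time

import pandas as pd


def time_assign_steps(df, **columns):
    """Apply each assign() argument on its own and time it.

    Hypothetical helper (not a pandas or line_profiler API).
    Returns the final DataFrame and a dict of column name -> seconds.
    """
    timings = {}
    for name, value in columns.items():
        start = time.perf_counter()
        df = df.assign(**{name: value})  # one new column per call
        timings[name] = time.perf_counter() - start
    return df, timings


data = {
    "Name": ["Tom", "Brad", "Kyle", "Jerry"],
    "Age": [20, 21, 19, 18],
    "Height": [6.1, 5.9, 6.0, 6.1],
}
df, timings = time_assign_steps(
    pd.DataFrame(data),
    A=123,
    B=lambda x: x.Age + x.Height,
    C=lambda x: x.Name.str.upper(),
    D=lambda x: x.Name.str.lower(),
)

# Report each column's share of the total measured time.
total = sum(timings.values())
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s ({100 * seconds / total:.1f} %)")
```

The resulting DataFrame should match the one from the single assign call, since assign evaluates its keyword arguments in order either way. What I would still like is for line_profiler itself to produce this kind of breakdown.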

Jack