def test_lprun():
data = {'Name':['Tom', 'Brad', 'Kyle', 'Jerry'],
'Age':[20, 21, 19, 18],
'Height' : [6.1, 5.9, 6.0, 6.1]
}
df = pd.DataFrame(data)
df=df.assign(A=123,
B=lambda x:x.Age+x.Height,
C=lambda x:x.Name.str.upper(),
D=lambda x:x.Name.str.lower()
)
return df
In [8]: %lprun -f test_lprun test_lprun()
Timer unit: 1e-07 s
Total time: 0.0044901 s
File: <ipython-input-7-eaf21639fb5f>
Function: test_lprun at line 1
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1 def test_lprun():
2 1 21.0 21.0 0.0 data = {'Name':['Tom', 'Brad', 'Kyle', 'Jerry'],
3 1 13.0 13.0 0.0 'Age':[20, 21, 19, 18],
4 1 15.0 15.0 0.0 'Height' : [6.1, 5.9, 6.0, 6.1]
5 }
6 1 8651.0 8651.0 19.3 df = pd.DataFrame(data)
7 1 19.0 19.0 0.0 df=df.assign(A=123,
8 1 11.0 11.0 0.0 B=lambda x:x.Age+x.Height,
9 1 10.0 10.0 0.0 C=lambda x:x.Name.str.upper(),
10 1 36147.0 36147.0 80.5 D=lambda x:x.Name.str.lower()
11 )
12 1 14.0 14.0 0.0 return df
When using pandas assign, it could not tell which rows occupies most time but tell the whole result for assign function.
Goal: line_profile could tell each row result for pandas assign function like Line 6 %Time is 10, Line7 %Time is 30 and so on.