I have a DataFrame with four columns and around 20,000 rows, like this:
import pandas as pd
import numpy as np
d = {'x': [1,1,0,1,0,0,1],'BPM':[70,55,45,np.nan,35,25,np.nan],'AGE': [50, 47,21, 50,24,47,16], 'WEIGHT': [50,100,50,np.nan,np.nan,100,27]}
df = pd.DataFrame(data=d)
x  BPM  AGE  WEIGHT
1  70   50   50
1  55   47   100
0  45   21   50
1  NaN  50   NaN
0  35   24   NaN
0  25   47   100
1  NaN  16   27
Is there a statistically significant difference in BPM between class 1 and class 0 (column x) after matching on AGE and WEIGHT?
There are two classes, 0 and 1, and the number of samples differs between them. I understand that I first have to match samples on AGE and WEIGHT, and only then can I apply a t-test. I am new to this field, so I do not know how to proceed.
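To make the match-then-test idea concrete, here is a rough sketch of one possible approach, not a definitive method. It rests on several assumptions that are not part of the question: rows with a missing BPM, AGE, or WEIGHT are dropped, each class-1 row is paired with its nearest class-0 row on standardized AGE and WEIGHT (matching with replacement), and the pairs are compared with a paired t-test. The function name `match_and_test` and its parameters are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.spatial import cKDTree
from scipy.stats import ttest_rel

d = {'x': [1, 1, 0, 1, 0, 0, 1],
     'BPM': [70, 55, 45, np.nan, 35, 25, np.nan],
     'AGE': [50, 47, 21, 50, 24, 47, 16],
     'WEIGHT': [50, 100, 50, np.nan, np.nan, 100, 27]}
df = pd.DataFrame(data=d)


def match_and_test(df, group='x', outcome='BPM', covariates=('AGE', 'WEIGHT')):
    """Hypothetical helper: nearest-neighbour matching, then a paired t-test."""
    covariates = list(covariates)
    # Assumption: rows with a missing outcome or covariate are simply dropped.
    complete = df.dropna(subset=[outcome] + covariates)
    # Standardize the covariates so AGE and WEIGHT count on a comparable scale.
    # (This breaks down if a covariate is constant, since its std is zero.)
    z = (complete[covariates] - complete[covariates].mean()) / complete[covariates].std()
    ones = complete[complete[group] == 1]
    zeros = complete[complete[group] == 0]
    # For each class-1 row, look up the closest class-0 row (with replacement).
    tree = cKDTree(z.loc[zeros.index].to_numpy())
    _, nearest = tree.query(z.loc[ones.index].to_numpy(), k=1)
    matched = pd.DataFrame({
        'bpm_1': ones[outcome].to_numpy(),
        'bpm_0': zeros[outcome].to_numpy()[nearest],
    })
    # Paired t-test on the matched BPM values.
    result = ttest_rel(matched['bpm_1'], matched['bpm_0'])
    return matched, result.statistic, result.pvalue


matched, t_stat, p_value = match_and_test(df)
print(matched)
print(t_stat, p_value)
```

On the seven-row sample only two complete class-1 rows remain, so the result there means little. With around 20,000 rows it may also be worth discarding pairs whose distance is too large, and keeping in mind that reusing the same class-0 row for several pairs weakens the independence the t-test assumes.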