Is there a way to check for linear dependence among the columns of a pandas DataFrame? For example:
import pandas as pd

columns = ['A', 'B', 'C']
df = pd.DataFrame(columns=columns)
df['A'] = [0, 2, 3, 4]
df['B'] = df['A'] * 2  # B is an exact multiple of A
df['C'] = [8, 3, 5, 4]
print(df)
   A  B  C
0  0  0  8
1  2  4  3
2  3  6  5
3  4  8  4
Is there a way to show that column B is a linear combination of A, but C is an independent column? My ultimate goal is to run a Poisson regression on a dataset, but I keep getting a "LinAlgError: Singular matrix" error. As I understand it, this means the matrix built from my dataframe has no inverse, so it must contain linearly dependent columns.
I would like to come up with a programmatic way to check each feature and ensure there are no dependent columns.
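
To make the kind of check I have in mind concrete, here is a minimal sketch, not a definitive solution. It assumes all columns are numeric and walks through them left to right, flagging any column that does not raise the rank of the columns kept so far. The helper name find_dependent_columns and its tol parameter are my own, for illustration only. The rank comes from numpy.linalg.matrix_rank.

```python
import numpy as np
import pandas as pd


def find_dependent_columns(df, tol=None):
    """Split columns into (kept, dependent), scanning left to right.

    A column is 'dependent' if adding it to the columns kept so far
    does not increase the matrix rank. Hypothetical helper; `tol` is
    passed straight to numpy.linalg.matrix_rank.
    """
    kept, dependent = [], []
    rank = 0
    for col in df.columns:
        candidate = df[kept + [col]].to_numpy(dtype=float)
        new_rank = np.linalg.matrix_rank(candidate, tol=tol)
        if new_rank > rank:
            kept.append(col)       # column adds new information
            rank = new_rank
        else:
            dependent.append(col)  # combination of the kept columns
    return kept, dependent


# Same data as in the example above.
df = pd.DataFrame({'A': [0, 2, 3, 4], 'C': [8, 3, 5, 4]})
df.insert(1, 'B', df['A'] * 2)

kept, dependent = find_dependent_columns(df)
print(kept)       # ['A', 'C']
print(dependent)  # ['B']
```

Which column gets flagged depends on the column order: with B placed before A, A would be reported as dependent instead. Columns that are only nearly dependent may or may not be caught, depending on the tolerance. If the regression adds an intercept, I assume a constant column would also need to be included in the frame before running the check.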