I must be making a very basic mistake. I am trying to select only certain columns from a DataFrame and drop the rows that contain NA values. I am also supposed to reset the row index after removing those rows.
This is what my dataset looks like:
      CRIM    ZN  INDUS  CHAS    NOX  ...  TAX  PTRATIO       B  LSTAT  MEDV
0  0.00632  18.0   2.31   0.0  0.538  ...  296     15.3  396.90   4.98  24.0
1  0.02731   0.0   7.07   0.0  0.469  ...  242     17.8  396.90   9.14  21.6
2  0.02729   0.0   7.07   0.0  0.469  ...  242     17.8  392.83   4.03  34.7
This is what I have tried so far:
F = HousingData.dropna(subset = ['CRIM', 'ZN', 'INDUS'])
This first attempt gives no output.
HousingData.select("CRIM").show("CRIM")
This one gives the error message: AttributeError: 'DataFrame' object has no attribute 'select'
cheers!
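
A minimal sketch of the three steps described above (select columns, drop NA rows, reset the index), assuming HousingData is a pandas DataFrame, which the wording of the AttributeError suggests. The small frame below is a hypothetical stand-in for the real dataset, and its NaN values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for HousingData; the NaN entries are made up
# so that there is something for dropna() to remove.
HousingData = pd.DataFrame({
    "CRIM": [0.00632, np.nan, 0.02729, 0.03237],
    "ZN": [18.0, 0.0, 0.0, np.nan],
    "INDUS": [2.31, 7.07, 7.07, 2.18],
    "MEDV": [24.0, 21.6, 34.7, 33.4],
})

cols = ["CRIM", "ZN", "INDUS"]

# 1. Keep only the wanted columns.
# 2. Drop every row that has a missing value in any of them.
# 3. Renumber the rows from 0 and discard the old index.
F = HousingData[cols].dropna().reset_index(drop=True)

# An assignment displays nothing by itself, so print the result explicitly.
print(F)
```

Two notes on the attempts in the question. The first line most likely produces no output simply because the result is assigned to F; printing F (or evaluating F on its own in a notebook cell) would display it. It also keeps every column, since subset only controls which columns are checked for missing values. In the second attempt, select and show are methods of PySpark DataFrames, not pandas ones; in pandas, columns are selected with HousingData["CRIM"] or HousingData[["CRIM", "ZN"]].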