I built a data frame in Python from a user-supplied SQL query. After this I name the columns, so that it is easy to isolate the columns containing NaN values:

cursor.execute(raw_input("Enter your SQL query: "))
records = cursor.fetchall()
import pandas as pd
dframesql = pd.DataFrame(records)
dframesql.columns = [i[0] for i in cursor.description]

The problem comes afterwards, when I want to compare the number of non-null rows in each column with the total number of rows in the data frame:

dframelines = len(dframesql)
dframedesc = pd.DataFrame(dframesql.count())

When I try to compare dframedesc with dframelines, I get an error:

nancol = []
for line in dframedesc:
    if dframedesc < dframelines:
        nancol.append(line)

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Thanks in advance!

user4634131

1 Answer

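The error arises because dframedesc < dframelines compares the whole DataFrame at once and returns a DataFrame of booleans, which an if statement cannot reduce to a single True or False. Note also that iterating over a DataFrame directly yields its column labels, not its rows. If you want to do it with a for loop, loop through the DataFrame's index: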

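nancol = []
# 'a_column' is a placeholder for the column of dframedesc that holds the counts
for index in dframedesc.index:
    # .loc[index, column] returns a single value, so the comparison is a plain bool
    if dframedesc.loc[index,'a_column'] < dframelines:
        # appends the whole matching row; use nancol.append(index) to keep only the label
        nancol.append(dframedesc.loc[index,:])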

But why not just use boolean indexing:

dframedesc[dframedesc['col_to_compare'] < dframelines]
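
As a minimal sketch of the same boolean-indexing idea applied to the question, the example below builds a small data frame in place of the SQL result (the sample data and its column names are made up for illustration) and keeps the counts as a Series instead of wrapping them in a DataFrame, so no column label is needed:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the rows returned by the SQL query
dframesql = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "name": ["a", None, "c", "d"],
    "score": [1.5, 2.0, np.nan, np.nan],
})

dframelines = len(dframesql)   # total number of rows
counts = dframesql.count()     # Series: non-null count per column

# Boolean indexing: keep only the columns whose non-null count is below the row count
nancol = counts[counts < dframelines].index.tolist()
print(nancol)  # ['name', 'score']
```

If only the column names are needed, dframesql.isna().any() should give an equivalent per-column mask without computing the counts at all.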
Liam Foley