How do I check whether a pandas DataFrame is empty? In my case, I want to print a message in the terminal if the DataFrame is empty.

- len() doesn't work? It should return 0 for an empty dataframe. – VIKASH JAISWAL Nov 07 '13 at 05:55
5 Answers
You can use the attribute df.empty to check whether it's empty or not:
if df.empty:
    print('DataFrame is empty!')
Source: Pandas Documentation
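As a minimal sketch of how df.empty behaves (the frame names no_rows and only_nan are made up for illustration): a frame with columns but zero rows counts as empty, a frame whose rows hold only NaN does not, and truth-testing a DataFrame directly raises a ValueError instead of answering the question.

```python
import numpy as np
import pandas as pd

# Hypothetical example frames, not taken from the question.
no_rows = pd.DataFrame(columns=["A", "B"])        # columns, but zero rows
only_nan = pd.DataFrame({"A": [np.nan, np.nan]})  # two rows, all values NaN

print(no_rows.empty)   # True
print(only_nan.empty)  # False: rows of NaN still count as rows

# `if df:` is ambiguous for a DataFrame, so pandas raises ValueError.
try:
    if no_rows:
        pass
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

If NaN-only frames should also count as empty for your purposes, df.empty alone is not enough.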

- This seems like a shame, since you need to know that df is a pd.DataFrame. I'd like to know the motivation for not implementing bool() on pd.DataFrame. – Quant Feb 14 '14 at 16:55
- @Quant - The documentation has a discussion on why __bool__ raises an error for a dataframe here: [link](http://pandas.pydata.org/pandas-docs/dev/gotchas.html#gotchas-truth). Quote: "Should it be True because it’s not zero-length? False because there are False values? It is unclear, so instead, pandas raises a ValueError" – Bij Apr 18 '14 at 14:04
- A much faster approach is `df.shape[0] == 0` to check if the dataframe is empty. You can test it. – highlytrainedbadger Nov 10 '20 at 13:40
- This method would not work in all cases, as in some cases an empty dataframe might be of NoneType. – Anish Jain Mar 24 '21 at 07:51
- @AnishJain To be clear, we are dealing with emptiness here, not nullity; if we want to find out whether a data frame is empty, we need to have a data frame object first; testing nullity is a different matter. If your data frame is NoneType to start with, you are not testing emptiness, you want to know whether you have an object or not. – stucash Jun 08 '22 at 06:05
I use the len function. It's much faster than empty. len(df.index) is even faster.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10000, 4), columns=list('ABCD'))
def empty(df):
    return df.empty
def lenz(df):
    return len(df) == 0
def lenzi(df):
    return len(df.index) == 0
'''
%timeit empty(df)
%timeit lenz(df)
%timeit lenzi(df)
10000 loops, best of 3: 13.9 µs per loop
100000 loops, best of 3: 2.34 µs per loop
1000000 loops, best of 3: 695 ns per loop
len on index seems to be faster
'''
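The %timeit lines above only run inside IPython. A rough equivalent with the standard-library timeit module might look like the sketch below; the checks dictionary and the repetition count are my own choices, and the `df.shape[0] == 0` variant comes from a comment elsewhere on this page. Absolute numbers will differ by machine and pandas version, so treat the output as a relative comparison only.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10000, 4), columns=list("ABCD"))

# Each entry is one way of asking "does this frame have zero rows?".
checks = {
    "df.empty": lambda: df.empty,
    "len(df) == 0": lambda: len(df) == 0,
    "len(df.index) == 0": lambda: len(df.index) == 0,
    "df.shape[0] == 0": lambda: df.shape[0] == 0,
}

results = {}
calls = 10_000  # arbitrary repetition count for this sketch
for label, check in checks.items():
    seconds = timeit.timeit(check, number=calls)
    results[label] = check()  # keep the answer so the variants can be compared
    print(f"{label:<20} {seconds / calls * 1e9:8.0f} ns per call")
```

Note that df.empty also checks the columns axis, so it is not strictly doing the same work as the row-only checks.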

- A DataFrame can be empty due to either len(df.index) == 0 or len(df.columns) == 0 as well. – Mark Horvath Nov 04 '16 at 09:53
- No, a data frame can contain columns but still be empty. len(df.index) == 0 is the best solution – salRad Jun 30 '21 at 07:03
To see if a dataframe is empty, I argue that one should test for the length of a dataframe's columns index:
if len(df.columns) == 0: 1
Reason:
According to the Pandas Reference API, there is a distinction between:
- an empty dataframe with 0 rows and 0 columns
- an empty dataframe with rows containing NaN, hence at least 1 column
Arguably, they are not the same. The other answers are imprecise in that df.empty, len(df), and len(df.index) make no distinction: the index length is 0 and empty is True in both cases.
Examples
Example 1: An empty dataframe with 0 rows and 0 columns
In [1]: import pandas as pd
df1 = pd.DataFrame()
df1
Out[1]: Empty DataFrame
Columns: []
Index: []
In [2]: len(df1.index) # or len(df1)
Out[2]: 0
In [3]: df1.empty
Out[3]: True
Example 2: A dataframe which is emptied to 0 rows but still retains n columns
In [4]: df2 = pd.DataFrame({'AA' : [1, 2, 3], 'BB' : [11, 22, 33]})
        df2
Out[4]:    AA  BB
        0   1  11
        1   2  22
        2   3  33
In [5]: df2 = df2[df2['AA'] == 5]
        df2
Out[5]: Empty DataFrame
        Columns: [AA, BB]
        Index: []
In [6]: len(df2.index) # or len(df2)
Out[6]: 0
In [7]: df2.empty
Out[7]: True
Now, building on the previous examples, in which the index length is 0 and empty is True: reading the length of the columns index for the first dataframe, df1, returns 0 columns, which shows that it is indeed empty.
In [8]: len(df1.columns)
Out[8]: 0
In [9]: len(df2.columns)
Out[9]: 2
Critically, while the second dataframe df2 contains no data, it is not completely empty, because its columns index still reports the empty columns that persist.
Why it matters
Let's add a new column to these dataframes to understand the implications:
# As expected, the previously empty df1 now displays 1 series
In [10]: df1['CC'] = [111, 222, 333]
         df1
Out[10]:     CC
         0  111
         1  222
         2  333
In [11]: len(df1.columns)
Out[11]: 1
# Note the persisting series with rows containing `NaN` values in df2
In [12]: df2['CC'] = [111, 222, 333]
         df2
Out[12]:     AA   BB   CC
         0  NaN  NaN  111
         1  NaN  NaN  222
         2  NaN  NaN  333
In [13]: len(df2.columns)
Out[13]: 3
It is evident that the original columns in df2 have re-surfaced. Therefore, it is prudent to instead read the length of the columns index with len(df.columns) to see if a dataframe is empty.
Practical solution
# New dataframe df
In [1]: df = pd.DataFrame({'AA' : [1, 2, 3], 'BB' : [11, 22, 33]})
        df
Out[1]:    AA  BB
        0   1  11
        1   2  22
        2   3  33
# This filter matches no rows, so it results in an empty df
In [2]: df = df[df['AA'] == 5]
        df
Out[2]: Empty DataFrame
        Columns: [AA, BB]
        Index: []
# NOTE: the df is empty, BUT the columns are persistent
In [3]: len(df.columns)
Out[3]: 2
# And accordingly, the other answers on this page
In [4]: len(df.index) # or len(df)
Out[4]: 0
In [5]: df.empty
Out[5]: True
# SOLUTION: conditionally check for empty columns
In [6]: if len(df.columns) != 0: # <--- here
            # Do something, e.g.
            # drop any columns containing rows with `NaN`
            # to make the df really empty
            df = df.dropna(how='all', axis=1)
        df
Out[6]: Empty DataFrame
        Columns: []
        Index: []
# Testing shows it is indeed empty now
In [7]: len(df.columns)
Out[7]: 0
Adding a new data series works as expected without the re-surfacing of empty columns (factually, without any series that were containing rows with only NaN):
In [8]: df['CC'] = [111, 222, 333]
        df
Out[8]:     CC
        0  111
        1  222
        2  333
In [9]: len(df.columns)
Out[9]: 1
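The distinction drawn in this answer can be condensed into one small function. This is only a sketch: emptiness is a hypothetical helper name, not a pandas API, and it simply reports which axis (if any) has length zero for the two example frames used above.

```python
import pandas as pd


def emptiness(df: pd.DataFrame) -> str:
    """Hypothetical helper: describe which axis of df, if any, is empty."""
    no_rows = len(df.index) == 0
    no_cols = len(df.columns) == 0
    if no_rows and no_cols:
        return "no rows, no columns"
    if no_cols:
        return "rows but no columns"
    if no_rows:
        return "columns but no rows"
    return "has rows and columns"


df1 = pd.DataFrame()                                       # Example 1
df2 = pd.DataFrame({"AA": [1, 2, 3], "BB": [11, 22, 33]})
df2 = df2[df2["AA"] == 5]                                  # Example 2: filtered to 0 rows

print(emptiness(df1))
print(emptiness(df2))
print(df1.empty, df2.empty)  # df.empty does not tell the two cases apart
```

Whether the two cases need to be told apart depends on what happens to the frame afterwards, for example whether new columns are assigned to it.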

I prefer going the long route. These are the checks I follow to avoid using a try-except clause:
- check if the variable is not None
- then check if it's a dataframe, and
- make sure it's not empty
Here, DATA is the suspect variable:
DATA is not None and isinstance(DATA, pd.DataFrame) and not DATA.empty
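Wrapped in a function, the three checks might look like the sketch below. The name has_data is hypothetical. The explicit None check is kept to mirror the list above, although isinstance already returns False for None.

```python
import pandas as pd


def has_data(obj) -> bool:
    """Hypothetical helper: True only for a DataFrame with at least one row and column."""
    return (
        obj is not None                    # 1. the variable holds something
        and isinstance(obj, pd.DataFrame)  # 2. it is a DataFrame
        and not obj.empty                  # 3. it is not empty
    )


print(has_data(None))                        # False
print(has_data([1, 2, 3]))                   # False: not a DataFrame
print(has_data(pd.DataFrame()))              # False: empty
print(has_data(pd.DataFrame({"a": [1]})))    # True
```

Because `and` short-circuits, obj.empty is never evaluated on something that is not a DataFrame.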

- This is redundant and bad practice if it's expected that the variable will be a DataFrame (which is what the OP implies) that is either empty or has rows. If it's not a DF (or if it's none), an exception should be thrown since something went wrong somewhere. – fgblomqvist Sep 19 '19 at 22:06
- In Python, `try/except` is cheap and `if` is expensive. Python is neither Java nor C; here it's [Easier to Ask Forgiveness than Permission](https://docs.python.org/3/glossary.html#term-eafp) – Nick Marinakis Apr 14 '20 at 01:26
If a DataFrame may contain both NaN and non-null values and you want to find out whether it holds any data at all, try this code.
When can this situation happen? It happens when a single function is used to plot more than one DataFrame, each passed as a parameter. In such a situation the function tries to plot the data even when a DataFrame is empty, and thus plots an empty figure. It makes more sense to simply display a 'DataFrame has no data' message.
Why? If a DataFrame is empty (i.e. contains no data at all; mind you, a DataFrame with NaN values is considered non-empty), it is desirable not to plot but to put out a message. Suppose we have two DataFrames, df1 and df2. The function myfunc takes any DataFrame (df1 and df2 in this case) and prints a message if the DataFrame is empty (instead of plotting):
df1             df2
col1  col2      col1  col2
NaN   2         NaN   NaN
2     NaN       NaN   NaN
and the function:
def myfunc(df):
    # Count the total number of non-NaN values; equal to 0 if the DataFrame holds no data
    if df.count().sum() > 0:
        print('not empty')
        df.plot(kind='barh')
    else:
        # Display a message instead of plotting if it is empty
        print('empty')
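To try the count-based check without plotting, here is a small sketch that rebuilds df1 and df2 from the table above; has_values is a hypothetical name for the test used inside myfunc. It also shows where this check and df.empty disagree.

```python
import numpy as np
import pandas as pd


def has_values(df: pd.DataFrame) -> bool:
    """Hypothetical helper: True if df holds at least one non-NaN cell."""
    # DataFrame.count() returns the number of non-NA cells per column.
    return int(df.count().sum()) > 0


df1 = pd.DataFrame({"col1": [np.nan, 2], "col2": [2, np.nan]})
df2 = pd.DataFrame({"col1": [np.nan, np.nan], "col2": [np.nan, np.nan]})

print(has_values(df1))  # True: two real values
print(has_values(df2))  # False: every cell is NaN
print(df2.empty)        # False: pandas still sees two rows
```

So df2 is "empty" for plotting purposes even though df2.empty is False.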
