How to check if pandas Series is empty?
I have tried this:
How to check whether a pandas DataFrame is empty?
but it seems that Series has no property 'isempty'.
I use the len function. It's much faster than the empty property, and len(df.index) is even faster.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10000, 4), columns=list('ABCD'))
def empty(df):
    return df.empty
def lenz(df):
    return len(df) == 0
def lenzi(df):
    return len(df.index) == 0
'''
%timeit empty(df)
%timeit lenz(df)
%timeit lenzi(df)
10000 loops, best of 3: 13.9 µs per loop
100000 loops, best of 3: 2.34 µs per loop
1000000 loops, best of 3: 695 ns per loop
len on index seems to be faster
'''
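The %timeit lines above only run inside IPython. As a rough sketch, the same comparison can be reproduced in a plain script with the standard timeit module. The repeat count n_calls and the timings dictionary are choices made for this example, and the absolute numbers will differ by pandas version and machine.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10000, 4), columns=list("ABCD"))

# The three checks compared above, wrapped as zero-argument callables.
checks = {
    "df.empty": lambda: df.empty,
    "len(df) == 0": lambda: len(df) == 0,
    "len(df.index) == 0": lambda: len(df.index) == 0,
}

n_calls = 10_000  # arbitrary repeat count chosen for this sketch
timings = {}
for label, check in checks.items():
    seconds = timeit.timeit(check, number=n_calls)
    timings[label] = seconds / n_calls * 1e9  # average nanoseconds per call
    print(f"{label:<20} {timings[label]:10.0f} ns per call")
```

All three checks take on the order of microseconds or less, so readability is usually a better reason to pick one than speed.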
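I use this to check whether a particular column in a DataFrame has any values. The expression is True when the column holds at least one non-NaN value, so a False result means the column is empty or contains only NaN:
len(df.col_name.value_counts()) > 0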
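A small sketch of how that check behaves, using a made-up DataFrame with hypothetical column names filled and all_nan. It relies on value_counts() dropping NaN by default, and also shows notna().any() as a more direct way to ask the same question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one column with values, one with only NaN.
df = pd.DataFrame({
    "filled": [1.0, np.nan, 3.0],
    "all_nan": [np.nan, np.nan, np.nan],
})

# value_counts() drops NaN by default, so an all-NaN column yields no counts.
filled_has_values = len(df["filled"].value_counts()) > 0
all_nan_has_values = len(df["all_nan"].value_counts()) > 0

# The same question asked directly, without building the counts.
filled_direct = bool(df["filled"].notna().any())
all_nan_direct = bool(df["all_nan"].notna().any())

print(filled_has_values, all_nan_has_values)
print(filled_direct, all_nan_direct)
```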
According to the pandas documentation, you need to use the empty property, not isempty.
E.g.
In [12]: df.empty
Out[12]: False
If NDFrame contains only NaNs, it is still not considered empty. See the example below.
Examples
An example of an actual empty DataFrame. Notice the index is empty:
>>> df_empty = pd.DataFrame({'A' : []})
>>> df_empty
Empty DataFrame
Columns: [A]
Index: []
>>> df_empty.empty
True
If we only have NaNs in our DataFrame, it is not considered empty! We will need to drop the NaNs to make the DataFrame empty:
>>> df = pd.DataFrame({'A' : [np.nan]})
>>> df
    A
0 NaN
>>> df.empty
False
>>> df.dropna().empty
True
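The documentation examples above use a DataFrame, but the question is about a Series. A minimal sketch of the same behaviour on a Series follows; the variable names s_empty and s_nan are invented for this example.

```python
import numpy as np
import pandas as pd

s_empty = pd.Series([], dtype="float64")  # no elements at all
s_nan = pd.Series([np.nan, np.nan])       # two elements, both missing

print(s_empty.empty)         # True
print(s_nan.empty)           # False: NaN still counts as an element
print(s_nan.dropna().empty)  # True once the NaNs are dropped
```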
Depending on your definition of empty, your answer may vary a lot, as indicated by the various other answers. I try to summarize, but first have some test DataFrames:
no_rows = pd.DataFrame([], columns=list('ABCD'))
no_cols = pd.DataFrame([], index=range(3))
only_na = pd.DataFrame(float('nan'), index=range(3), columns=list('ABCD'))
The currently most popular answer takes this approach: a DataFrame with 0 rows is empty:
def empty_no_rows(df):
    return len(df.index) == 0
Not mentioned yet, but equally valid would be the transposed definition:
def empty_no_cols(df):
    return len(df.columns) == 0
No, actually, what you care about are values! If you prefer a definition that can deal with either an empty index or empty columns, the following definition would work:
def empty_no_vals(df):
    return df.values.size == 0
Why not live with pandas' own definition of emptiness? For these test cases it leads to the same results as the no-values definition:
def empty_native(df):
    return df.empty
Pandas' own implementation basically just checks if len(df.columns) == 0 or len(df.index) == 0, and never looks at values directly.
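To illustrate that description, here is a rough equivalent written against the public shape attribute. This is a sketch of the idea, not pandas' actual source, and the helper name empty_by_shape and the sample frames are made up for this example.

```python
import pandas as pd

def empty_by_shape(df):
    # True if either axis has length zero; cell values are never inspected.
    return 0 in df.shape

# Hypothetical frames: no rows, no columns, and a single NaN cell.
samples = {
    "no_rows": pd.DataFrame(columns=["A"]),
    "no_cols": pd.DataFrame(index=[0, 1]),
    "one_nan": pd.DataFrame({"A": [float("nan")]}),
}

for name, frame in samples.items():
    print(name, empty_by_shape(frame), frame.empty)
```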
Finally, you might want to ignore NaN in your considerations:
def empty_nans(df):
    return df.dropna(how='all').empty
But actually, this opens the next can of worms, as you must now decide how and along which axis you want to discard values. I stick to the more conservative how='all' here. Once those values are dropped, you can apply any of the above definitions to the result.
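A short sketch of why the choice of how and axis matters. The two-column frame below is invented for this example: it has one real value and three NaNs, and the four dropna variants disagree on whether anything is left.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: a single real value, everything else missing.
df = pd.DataFrame({"A": [1.0, np.nan], "B": [np.nan, np.nan]})

results = {
    "rows, how='all'": df.dropna(axis=0, how="all").empty,
    "rows, how='any'": df.dropna(axis=0, how="any").empty,
    "cols, how='all'": df.dropna(axis=1, how="all").empty,
    "cols, how='any'": df.dropna(axis=1, how="any").empty,
}

for label, is_empty in results.items():
    print(f"{label:<16} -> empty: {is_empty}")
```

With how='all' the single value in column A survives along either axis. With how='any' every row and every column contains at least one NaN, so nothing survives.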
DataFrame | empty_no_rows | empty_no_cols | empty_no_vals | empty_native | empty_nans |
---|---|---|---|---|---|
no_rows | ✅ True | ❌ False | ✅ True | ✅ True | ✅ True |
no_cols | ❌ False | ✅ True | ✅ True | ✅ True | ✅ True |
only_na | ❌ False | ❌ False | ❌ False | ❌ False | ✅ True |
Editorial remark: I would call all those functions is_empty_..., but that leads to a comparison table that is too wide.
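For anyone who wants to re-check the comparison against their own pandas version, the following sketch gathers the test frames and the five definitions from this answer into one script. The results dictionary and the final table layout are additions made for this example.

```python
import pandas as pd

# Test frames as defined in the answer.
no_rows = pd.DataFrame([], columns=list("ABCD"))
no_cols = pd.DataFrame([], index=range(3))
only_na = pd.DataFrame(float("nan"), index=range(3), columns=list("ABCD"))

def empty_no_rows(df):
    return len(df.index) == 0

def empty_no_cols(df):
    return len(df.columns) == 0

def empty_no_vals(df):
    return df.values.size == 0

def empty_native(df):
    return df.empty

def empty_nans(df):
    return df.dropna(how="all").empty

checks = [empty_no_rows, empty_no_cols, empty_no_vals, empty_native, empty_nans]
frames = {"no_rows": no_rows, "no_cols": no_cols, "only_na": only_na}

# One list of booleans per test frame, in the order of `checks`.
results = {
    name: [bool(check(frame)) for check in checks]
    for name, frame in frames.items()
}

table = pd.DataFrame.from_dict(
    results, orient="index", columns=[check.__name__ for check in checks]
)
print(table)
```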
Thanks @sparrow, I used this to test for datetime columns:
df_dt = df.select_dtypes(include='datetime')
if len(df_dt.iloc[0].value_counts()) == 0:
    print('DF DATETIME COLUMNS: ', len(df_dt.iloc[0].value_counts()))
None of the other methods (a.any(), a.empty()...) worked. select_dtypes returns a result with a non-empty index but with empty columns, so I think that's the reason. I think iloc[0] actually returns a Series, hence the iloc[0] before value_counts.
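A sketch of what select_dtypes returns in this situation, using two invented frames (plain and dated). It assumes the goal is simply to detect whether any datetime columns exist, in which case checking the number of columns may be a simpler test than going through iloc[0] and value_counts().

```python
import pandas as pd

# Hypothetical frames: one without and one with a datetime column.
plain = pd.DataFrame({"n": [1, 2], "label": ["a", "b"]})
dated = plain.assign(when=pd.to_datetime(["2020-01-01", "2020-01-02"]))

plain_dt = plain.select_dtypes(include="datetime")
dated_dt = dated.select_dtypes(include="datetime")

# The rows are kept, but no columns are selected from `plain`.
print(plain_dt.shape)

# iloc[0] on that result is a Series with no entries.
first_row = plain_dt.iloc[0]
print(len(first_row.value_counts()))

# Assumed simpler check: count the selected columns directly.
plain_has_datetime = plain_dt.shape[1] > 0
dated_has_datetime = dated_dt.shape[1] > 0
print(plain_has_datetime, dated_has_datetime)
```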
I will explain my experience:
I had code like the following:
matched_this = pd.Series(matched_groups)
Sometimes matched_this was [], but matched_this.empty was still False. To solve this problem I used the following:
if matched_this[0]:
    # do ...
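The answer does not show matched_groups, so the following is only one possible explanation, sketched under the assumption that matched_groups was a list containing a single empty list. In that case the Series has one element (which happens to be empty), so empty is False. The helper has_matches is hypothetical.

```python
import pandas as pd

matched_groups = [[]]  # assumed input: one element that is itself empty
matched_this = pd.Series(matched_groups)

print(len(matched_this))           # one element in the Series
print(matched_this.empty)          # so the Series is not empty
print(bool(matched_this.iloc[0]))  # but that element is an empty list

def has_matches(series):
    # Hypothetical helper: True only if at least one element is non-empty.
    return (not series.empty) and any(len(group) > 0 for group in series)

print(has_matches(matched_this))
print(has_matches(pd.Series([["a", "b"]])))
print(has_matches(pd.Series([], dtype="object")))
```

Checking the Series with a helper like this also covers the case where the Series has no elements at all, where indexing with [0] would fail.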
Many answers here deal with measuring an empty pandas DataFrame. A pandas DataFrame is not the same as a pandas Series. An individual pandas Series may change its length during a data manipulation process. It may be useful to verify the length of a Series directly, using any of
Series.empty
len(Series)
len(Series.array)
Let's create 3 dataframes to compare the output when measuring the length of a Series
>>> df0 = pd.DataFrame({'X' : []})
>>> df0
Empty DataFrame
Columns: [X]
Index: []
>>> df1 = pd.DataFrame({'A' : [np.nan]})
>>> df1
    A
0 NaN
>>> df2 = pd.DataFrame({'B' : ['b']})
>>> df2
   B
0  b
The pandas Series df0.X is empty. Therefore,
1. >>> df0.X.empty
True
2. >>> len(df0.X)
0
3. >>> len(df0.X.array)
0
Each of the other pandas Series, df1.A and df2.B, contains 1 value. Therefore,
df1.A | df2.B
----------------------------|----------------------------
1. >>> df1.A.empty | 1. >>> df2.B.empty
False | False
2. >>> len(df1.A) | 2. >>> len(df2.B)
1 | 1
3. >>> len(df1.A.array) | 3. >>> len(df2.B.array)
1 | 1
Hence, to verify whether a pandas Series is empty, e.g. for df['A'], one may use any of
if df.A.empty:
if len(df.A) == 0:
if len(df.A.array) == 0:
If you want to check whether the id column contains no NaN values, you can try
df[df["id"].isna()].shape[0] == 0
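A brief sketch of that expression on two made-up frames, complete and with_gap. It also shows a shorter form that, as far as I can tell, gives the same answer without building the filtered frame. Both helper names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frames: one with every id present, one with a missing id.
complete = pd.DataFrame({"id": [1, 2, 3]})
with_gap = pd.DataFrame({"id": [1, np.nan, 3]})

def no_missing_ids(df):
    # The expression from the answer: no rows where id is NaN.
    return df[df["id"].isna()].shape[0] == 0

def no_missing_ids_short(df):
    # Same question, asked on the boolean mask directly.
    return not df["id"].isna().any()

print(no_missing_ids(complete), no_missing_ids_short(complete))
print(no_missing_ids(with_gap), no_missing_ids_short(with_gap))
```

Note that this tests for missing values in the column, not for a column with zero rows: an id column with no rows at all also passes the check.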