How do I implement the syntax for filtering dataframes in Pandas (df[df.column1 > someValue])?
I am trying to make a class that has the same syntax as Pandas when filtering dataframes.
Given a dataframe df = DataFrame(someData), how do I replicate syntax like this:
df[df.column1 > someValue]
I implemented the methods __getattr__ and __getitem__ to support the syntaxes
df.column1
df['column1']
But I don't know how to link the two together. I also could not find the corresponding function in the Pandas source code to copy from.
Either an implementation that solves this problem or a reference to the relevant function in Pandas would be of great help.
Edit (Solution):
Following the hints in the answers, I implemented the __getitem__ function as follows:
from itertools import compress

def __getitem__(self, name):
    """Get items with [ and ]."""
    # If the key is a column name, return that column
    if isinstance(name, str):
        return self.data[name]
    # If the key is a boolean list (the result of an expression),
    # return the filtered dataframe
    elif isinstance(name, list):
        # Positions of the rows where the mask is True
        ind = list(compress(range(len(name)), name))
        temp = DataFrame([[self.data[c].values[i]
                           for i in ind]
                          for c in self.columns],
                         columns=self.columns)
        return temp
Note that I also had to implement the comparison methods for my column class (Series). The full code can be seen here.
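For readers who cannot follow the link, here is a minimal sketch of how the pieces might fit together. It is not the full code referred to above: the Series and DataFrame classes below are simplified, hypothetical stand-ins, and the column-major constructor is an assumption inferred from how __getitem__ builds the filtered frame. The key point is that df.column1 > someValue is evaluated first and produces a list of booleans, which is then passed to __getitem__ as the key.

```python
from itertools import compress


class Series:
    """Hypothetical minimal column class (a stand-in, not the full code)."""

    def __init__(self, values):
        self.values = list(values)

    # Each comparison returns a plain list of booleans (a mask),
    # which is the type that DataFrame.__getitem__ checks for.
    def __gt__(self, other):
        return [v > other for v in self.values]

    def __lt__(self, other):
        return [v < other for v in self.values]

    def __eq__(self, other):
        return [v == other for v in self.values]


class DataFrame:
    """Hypothetical minimal frame storing its columns as a dict of Series."""

    def __init__(self, data, columns):
        # Assumption: data is column-major, one inner list per column.
        self.columns = list(columns)
        self.data = {c: Series(v) for c, v in zip(self.columns, data)}

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so df.column1
        # ends up here. Reading __dict__ directly avoids recursion if
        # self.data has not been set yet.
        data = self.__dict__.get("data", {})
        if name in data:
            return data[name]
        raise AttributeError(name)

    def __getitem__(self, name):
        # A string key selects a column
        if isinstance(name, str):
            return self.data[name]
        # A boolean list selects the rows where the mask is True
        if isinstance(name, list):
            ind = list(compress(range(len(name)), name))
            return DataFrame([[self.data[c].values[i] for i in ind]
                              for c in self.columns],
                             columns=self.columns)
        raise TypeError("unsupported key type: %r" % type(name))


df = DataFrame([[1, 5, 3], ["a", "b", "c"]], columns=["column1", "column2"])
filtered = df[df.column1 > 2]    # the mask is [False, True, True]
print(filtered.column1.values)   # [5, 3]
print(filtered.column2.values)   # ['b', 'c']
```

Returning a plain list from the comparison methods keeps the sketch consistent with the isinstance(name, list) check above. Pandas itself returns a boolean Series instead, which is what allows masks to be combined with & and |.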