pandera provides a flexible and expressive API for performing data validation on dataframes to make data processing pipelines more readable and robust.
Questions tagged [pandera]
38 questions
0 votes, 0 answers
Pandera print pa.errors.SchemaErrors with pa.check_inputs
Is there a way to print SchemaErrors when using pa.check_inputs? Say I have the df below:
import pandera as pa
import pandas as pd
df = pd.DataFrame.from_dict({
    'a': [1, 2, 2, 4, 5],
    'b': [1, 2, 3, 4, 'dogs'],
})
schema = pa.DataFrameSchema({
…

manny
0 votes, 1 answer
Pandera columns joint uniqueness
I need to check a data frame for joint uniqueness of similar columns.
In the documentation I found this code snippet, but it applies only to DataFrameSchema.
import pandas as pd
import pandera as pa
schema = pa.DataFrameSchema(
…

MariaMadalina
0 votes, 2 answers
How would you decorate without modifying an inherited method?
I've seen and tried the answer given at "How would one decorate an inherited method in the child class?" to no avail.
Sample data:
import pandas as pd
df = pd.DataFrame([('Tom', 'M'), ('Sarah', 'X')], columns=['PersonName', 'PersonSex'])
I am using…

TomNash
0 votes, 2 answers
How to validate a dataframe index using SchemaModel in Pandera
I can validate a DataFrame index using the DataFrameSchema like this:
import pandera as pa
from pandera import Column, DataFrameSchema, Check, Index
schema = DataFrameSchema(
    columns={
        "column1": pa.Column(int),
    },
…

Nilo Araujo
0 votes, 1 answer
How do I create a column in pandera that contains a valid datetime as a string?
I'm working with a schema that has a column which contains string representations of datetimes. I'd like to make sure that the string is a valid datetime. It looks like doing this with regex and str_matches is not sufficient (and challenging to…

jhnatr
0 votes, 0 answers
Percent change using Pandera for Pandas DataFrame
I have the following DataFrame. I need to validate the balance and other numeric measures over a date range. I want to check whether, for any group and date, the balance or another measure has changed by more than 25%.
I can filter numerically using…

user3376169
0 votes, 1 answer
Pandera: Is cell based dataframe data validation possible?
Every row of my dataframe contains a record with a unique key combination. The data validation will be based on the columns and on the key combination. For example, in a single column, cells may have a different min/max requirement based on the key…

Walter Kelt
0 votes, 1 answer
How do I validate a value in a dataframe that depends on another value in the same row?
Suppose I have a .csv which follows this format:
Name, Salary, Department, Mandatory
Rob, 5500, Aviation, Yes
Bob, 1000, Facilities, No
Tom, 6000, IT, Yes
After exporting this to pandas/modin, I'd like to perform row-differentiated checks,…

Somebody Out There