Questions tagged [pandera]

pandera provides a flexible and expressive API for performing data validation on dataframes to make data processing pipelines more readable and robust.

38 questions
0 votes, 0 answers

Pandera print pa.errors.SchemaErrors with pa.check_inputs

Is there a way to print SchemaErrors when using pa.check_inputs? Say I have the df below: import pandera as pa import pandas as pd df = pd.DataFrame.from_dict({ 'a' : [1,2,2,4,5], 'b' : [1,2,3,4,'dogs'], }) schema = pa.DataFrameSchema({ …
manny
0 votes, 1 answer

Pandera columns joint uniqueness

I need to check a data frame for joint uniqueness of similar columns. In the documentation I have found this code snippet but it is applicable only to DataFrameSchema. import pandas as pd import pandera as pa schema = pa.DataFrameSchema( …
MariaMadalina
0 votes, 2 answers

How would you decorate without modifying an inherited method?

I've seen and tried the answer given at How would one decorate an inherited method in the child class? to no avail. Sample data: import pandas as pd df = pd.DataFrame([('Tom', 'M'), ('Sarah', 'X')], columns=['PersonName', 'PersonSex']) I am using…
TomNash
0 votes, 2 answers

How to validate a dataframe index using SchemaModel in Pandera

I can validate a DataFrame index using the DataFrameSchema like this: import pandera as pa from pandera import Column, DataFrameSchema, Check, Index schema = DataFrameSchema( columns={ "column1": pa.Column(int), }, …
Nilo Araujo
0 votes, 1 answer

How do I create a column in pandera that contains a valid datetime as a string?

I'm working with a schema that has a column which contains string representations of datetimes. I'd like to make sure that the string is a valid datetime. It looks like doing this with regex and str_matches is not sufficient (and challenging to…
jhnatr
0 votes, 0 answers

Percent change using Pandera for Pandas DataFrame

I have the following DataFrame. I need to validate balance and other numeric measures over a date range. I want to check whether, for any group and date, the balance or other measures have changed by more than 25%. I can filter numerically using…
user3376169
0 votes, 1 answer

Pandera: Is cell based dataframe data validation possible?

Every row of my dataframe contains a record with a unique key combination. The data validation will be based on the columns and on the key combination. For example, in a single column, cells may have a different min/max requirement based on the key…
Walter Kelt
0 votes, 1 answer

How do I validate a value in a dataframe which is dependent on other value in that specific row?

Suppose I have a .csv which follows this format:

Name, Salary, Department, Mandatory
Rob, 5500, Aviation, Yes
Bob, 1000, Facilities, No
Tom, 6000, IT, Yes

After exporting this to pandas/modin, I'd like to perform row-differentiated checks,…
Somebody Out There