I'm using pydantic and want to create model classes that contain pandas DataFrames. I searched online for quite a while and did not find anything. My code for the custom types looks as follows. I annotated the DataFrame fields as pd.DataFrame, but that is evidently not correct. Does anyone know how to declare a pandas DataFrame type?

import pandas as pd
from pydantic import BaseModel


class SubModelInput(BaseModel):
    a: pd.DataFrame
    b: pd.DataFrame

class ModelInput(BaseModel):
    SubModelInput: SubModelInput
    a: pd.DataFrame
    b: pd.DataFrame
    c: pd.DataFrame

Thanks for any help!

sophros
mafehx

3 Answers

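If I understand correctly, you want a Pythonic type hint for pd.DataFrame. You could use the implementation below. Note that pydantic treats an unconstrained TypeVar like Any, so this works as an annotation but does not check that the value is actually a DataFrame: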

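from typing import TypeVar

import pandas as pd
from pydantic import BaseModel

# Readable alias standing in for pandas.core.frame.DataFrame.
# The TypeVar name matches the variable name, as type checkers expect.
PandasDataFrame = TypeVar('PandasDataFrame')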


class SubModelInput(BaseModel):
    a: PandasDataFrame
    b: PandasDataFrame


class ModelInput(BaseModel):
    SubModelInput: SubModelInput
    a: PandasDataFrame
    b: PandasDataFrame
    c: PandasDataFrame


data_frame = pd.DataFrame([{"a": "foo", "b": "bar"}])

sub_model = SubModelInput(a=data_frame, b=data_frame)

model = ModelInput(a=data_frame, b=data_frame, c=data_frame, SubModelInput=sub_model)

model.dict()  # pydantic V1 API; in V2 the equivalent is model.model_dump()

# {'SubModelInput': {'a':      a    b
# 0  foo  bar, 'b':      a    b
# 0  foo  bar}, 'a':      a    b
# 0  foo  bar, 'b':      a    b
# 0  foo  bar, 'c':      a    b
# 0  foo  bar}
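
A caveat worth checking before relying on this: as far as I know, pydantic treats an unconstrained TypeVar like Any, so the annotation documents intent without validating the value. The sketch below illustrates this by passing a plain string. LooseInput is a hypothetical model name used only for illustration, not part of the answer.

```python
from typing import TypeVar

import pandas as pd
from pydantic import BaseModel

# Unconstrained TypeVar, as in the approach above.
PandasDataFrame = TypeVar("PandasDataFrame")


class LooseInput(BaseModel):  # hypothetical model, for illustration only
    a: PandasDataFrame


# A real DataFrame is accepted, as expected.
frame_model = LooseInput(a=pd.DataFrame([{"a": "foo", "b": "bar"}]))

# A plain string is accepted too, because no type check is performed.
text_model = LooseInput(a="not a dataframe")
accepted_non_frame = isinstance(text_model.a, str)
```

If the field should reject anything that is not a DataFrame, this approach is probably not sufficient on its own.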
mustafasencer

You can enable arbitrary_types_allowed in the model config (shown here with the nested Config class from pydantic V1):

import pandas as pd
from pydantic import BaseModel


class SubModelInput(BaseModel):
    a: pd.DataFrame
    b: pd.DataFrame

    class Config:
        arbitrary_types_allowed = True

class ModelInput(BaseModel):
    SubModelInput: SubModelInput
    a: pd.DataFrame
    b: pd.DataFrame
    c: pd.DataFrame

    class Config:
        arbitrary_types_allowed = True
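
With arbitrary types allowed, pydantic validates such a field with an isinstance check, so values that are not DataFrames are rejected. A minimal sketch of that behavior follows. FrameHolder and accepts are hypothetical names introduced here for illustration. On pydantic V2 the nested Config class still works but emits a deprecation warning.

```python
import pandas as pd
from pydantic import BaseModel, ValidationError


class FrameHolder(BaseModel):  # hypothetical model, for illustration only
    frame: pd.DataFrame

    class Config:
        arbitrary_types_allowed = True


def accepts(value) -> bool:
    """Return True if pydantic accepts `value` for the `frame` field."""
    try:
        FrameHolder(frame=value)
    except ValidationError:
        return False
    return True


ok = accepts(pd.DataFrame({"a": [1, 2]}))  # a real DataFrame
rejected = accepts({"a": [1, 2]})          # a dict is not coerced to a DataFrame
```

Here ok should be True and rejected should be False: pydantic does not try to build a DataFrame from the dict, it only checks the type.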
ST7

In pydantic V2 the configuration syntax changed slightly: set model_config to a ConfigDict instead of defining a nested Config class. The right way to do it is:

import pandas as pd
from pydantic import BaseModel, ConfigDict


class SubModelInput(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    a: pd.DataFrame
    b: pd.DataFrame

class ModelInput(BaseModel):
    model_config = ConfigDict(arbitrary_types_allowed=True)
    SubModelInput: SubModelInput
    a: pd.DataFrame
    b: pd.DataFrame
    c: pd.DataFrame

More information can be found in the pydantic documentation on model configuration.
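
Since arbitrary_types_allowed only checks that the value is a DataFrame, checks on the contents have to be added separately. One possible sketch, assuming pydantic V2, uses field_validator to require certain columns. CheckedInput, REQUIRED_COLUMNS, and the column names are hypothetical and chosen only for illustration.

```python
import pandas as pd
from pydantic import BaseModel, ConfigDict, ValidationError, field_validator

REQUIRED_COLUMNS = {"a", "b"}  # hypothetical requirement, for illustration


class CheckedInput(BaseModel):  # hypothetical model name
    model_config = ConfigDict(arbitrary_types_allowed=True)
    a: pd.DataFrame

    @field_validator("a")
    @classmethod
    def has_required_columns(cls, frame: pd.DataFrame) -> pd.DataFrame:
        # Runs after the isinstance check, so `frame` is a DataFrame here.
        missing = REQUIRED_COLUMNS - set(frame.columns)
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        return frame


# Both required columns are present, so validation passes.
good = CheckedInput(a=pd.DataFrame([{"a": "foo", "b": "bar"}]))

# Column "b" is missing, so the validator's ValueError becomes a ValidationError.
try:
    CheckedInput(a=pd.DataFrame([{"a": "foo"}]))
    column_check_failed = False
except ValidationError:
    column_check_failed = True
```

Whether such content checks belong in the model or in a dedicated DataFrame validation library is a design choice; this only shows that the hook exists.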

Nikos Korovesis