24

How can I change the date format in pydantic for validation and serialization? For validation I am using @validator. Is there a solution that covers both cases?

jonsbox
  • Perhaps this would be helpful: https://stackoverflow.com/questions/65230604/can-i-create-a-unix-time-type-which-automatically-converts-to-datetime-in-pydant/65231106#65231106 – alex_noname Mar 09 '21 at 14:26
  • @alex_noname It works in data.json(), but it does not work in the fastapi serializer. – jonsbox Mar 09 '21 at 16:04

5 Answers

24

You can implement a custom json serializer by using pydantic's custom json encoders. Then, together with pydantic's custom validator, you can have both functionalities.


from datetime import datetime, timezone
from pydantic import BaseModel, validator


def convert_datetime_to_iso_8601_with_z_suffix(dt: datetime) -> str:
    return dt.strftime('%Y-%m-%dT%H:%M:%SZ')


def transform_to_utc_datetime(dt: datetime) -> datetime:
    return dt.astimezone(tz=timezone.utc)


class DateTimeSpecial(BaseModel):
    datetime_in_utc_with_z_suffix: datetime

    # custom input conversion for that field
    _normalize_datetimes = validator(
        "datetime_in_utc_with_z_suffix",
        allow_reuse=True)(transform_to_utc_datetime)

    class Config:
        json_encoders = {
            # custom output conversion for datetime
            datetime: convert_datetime_to_iso_8601_with_z_suffix
        }


if __name__ == "__main__":
    special_datetime = DateTimeSpecial(datetime_in_utc_with_z_suffix="2042-3-15T12:45+01:00")  # note the different timezone

    # input conversion
    print(special_datetime.datetime_in_utc_with_z_suffix)  # 2042-03-15 11:45:00+00:00

    # output conversion
    print(special_datetime.json())  # {"datetime_in_utc_with_z_suffix": "2042-03-15T11:45:00Z"}

This variant also works with fastapi's serializer, which is where I actually use it.
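One caveat with the encoder above: `json_encoders` applies to every `datetime` field on the model, and `strftime` with a literal `Z` does not convert anything, so a field that did not pass through the UTC validator would be labelled as UTC incorrectly. A minimal sketch of a more defensive encoder, using only the standard library (the name `to_utc_z_string` and the decision to treat naive values as UTC are assumptions made for illustration, not part of the answer):

```python
from datetime import datetime, timedelta, timezone


def to_utc_z_string(dt: datetime) -> str:
    """Format a datetime as ISO 8601 with a Z suffix, normalizing to UTC first."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    plus_one = timezone(timedelta(hours=1))
    local = datetime(2042, 3, 15, 12, 45, tzinfo=plus_one)
    print(to_utc_z_string(local))  # 2042-03-15T11:45:00Z
```

It could be registered in `json_encoders` in place of `convert_datetime_to_iso_8601_with_z_suffix`.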

Fabian
  • I know about validators and encoders but I think the definition of what formats are accepted should be in the schema. Somehow the `date` type seems to be assumed and is not output in the schema `definitions`. How do we add the acceptable date format to the schema definitions? – NeilG Mar 04 '23 at 05:03
  • @NeilG In general, as shown in the examples of [Pydantic's custom validators](https://docs.pydantic.dev/usage/validators/), you can just raise an error in the custom validator. In my specific case (if I remember correctly), the acceptable date format was one of [Pydantic's datetime formats](https://docs.pydantic.dev/usage/types/#datetime-types). If you would like to have another datetime format, I suggest you could either try out [Pydantic's pre-validator](https://docs.pydantic.dev/usage/validators/#pre-and-per-item-validators) or you really need to use a plain `str` as class of your datetime. – Fabian Mar 08 '23 at 21:47
  • Thanks @Fabian, I'm happy to accept the default Pydantic formats. ISO format is fine by me, and possibly best practice as an acceptable input format. Front ends should handle specific user formatting. But it's just that the *default* format is *not* defined in the (default) schema that's published, and my perception is that there is not enough consensus to be able to assume users will know to use ISO. So I guess I'm hearing from you now that if you want the published schema to include the ISO format in type definitions you'll have to use a custom validator to force it to be published. – NeilG Mar 09 '23 at 00:30
  • To be honest, I haven't used pydantic for quite some time so I'm not 100% sure but if I remember correctly, any format that is not as defined in [pydantic's datetime format](https://docs.pydantic.dev/usage/types/#datetime-types) would in this case throw an error. But it's best if you try that out by yourself. – Fabian Mar 09 '23 at 18:48
  • Thanks, @Fabian. Wrong format (in this case ISO) *will* get an error, of course. I suppose they are unlikely to submit something that parses as ISO without knowing they're using ISO (how can you put American date order into an ISO format without realising it). But I've got a lot of Windows users here and I don't expect them to be very competent at standards based exchanges. And suppose someone sends `YYYY-DD-MM`? To avoid perceived bugs and support calls it would make sense to include the required format in the published OpenAPI spec. Date format is not as obvious as floats, for instance. – NeilG Mar 10 '23 at 05:28
17

I think a pre validator can help here.

from datetime import datetime, date

from pydantic import BaseModel, validator


class OddDate(BaseModel):
    birthdate: date

    @validator("birthdate", pre=True)
    def parse_birthdate(cls, value):
        return datetime.strptime(
            value,
            "%d/%m/%Y"
        ).date()


if __name__ == "__main__":
    odd_date = OddDate(birthdate="12/04/1992")
    print(odd_date.json()) #{"birthdate": "1992-04-12"}
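
The validator above calls `strptime` on whatever it receives, so passing an existing `date` object raises a `TypeError`, and it only covers the input side. A small stdlib-only sketch of both directions (the helper names and the `DATE_FORMAT` constant are made up for this example): `parse_odd_date` could serve as the body of the pre validator, and `format_odd_date` could be registered under `date` in `json_encoders`, as in the answer above.

```python
from datetime import date, datetime

DATE_FORMAT = "%d/%m/%Y"  # hypothetical constant shared by both directions


def parse_odd_date(value):
    """Accept a date/datetime object or a DD/MM/YYYY string and return a date."""
    if isinstance(value, datetime):  # check datetime first: it subclasses date
        return value.date()
    if isinstance(value, date):
        return value
    return datetime.strptime(value, DATE_FORMAT).date()


def format_odd_date(value: date) -> str:
    """Render a date back into the same DD/MM/YYYY format."""
    return value.strftime(DATE_FORMAT)


if __name__ == "__main__":
    parsed = parse_odd_date("12/04/1992")
    print(parsed)                   # 1992-04-12
    print(format_odd_date(parsed))  # 12/04/1992
```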
Omer Shacham
  • This is how I implemented the validator. And how to make a serializer? – jonsbox Apr 13 '21 at 12:57
  • this worked for me, just make sure the type of the model and the type returned from the validator are the same and can be run through the validator. ie. the param must be able to take that type. – Zaffer Jul 30 '22 at 19:59
14

In case you don't necessarily want to apply this behavior to all datetimes, you can create a custom type extending datetime. For example, to make a custom type that always ensures we have a datetime with tzinfo set to UTC:

from datetime import datetime, timezone

from pydantic.datetime_parse import parse_datetime


class utc_datetime(datetime):
    @classmethod
    def __get_validators__(cls):
        yield parse_datetime  # default pydantic behavior
        yield cls.ensure_tzinfo

    @classmethod
    def ensure_tzinfo(cls, v):
        # if TZ isn't provided, we assume UTC, but you can do w/e you need
        if v.tzinfo is None:
            return v.replace(tzinfo=timezone.utc)
        # else we convert to utc
        return v.astimezone(timezone.utc)
    
    @staticmethod
    def to_str(dt:datetime) -> str:
        return dt.isoformat() # replace with w/e format you want

Then your pydantic models would look like:

from pydantic import BaseModel

class SomeObject(BaseModel):
    some_datetime_in_utc: utc_datetime

    class Config:
        json_encoders = {
            utc_datetime: utc_datetime.to_str
        }

Going this route helps with reusability and separation of concerns :)

aiguofer
  • any way to declare that Config anywhere else, this is such a bad practice ... –  Dec 16 '21 at 13:02
  • I suppose one could implement a custom `json` function on the object itself overriding the default.. I wish there was a better way, especially because the encoder isn't taken into account when calling `dict` instead. – aiguofer Dec 16 '21 at 15:08
  • @user7111260 that's how Pydantic configures models, see – bluesmonk Apr 12 '22 at 17:13
  • Why is that a bad practice? – Phyo Arkar Lwin Sep 23 '22 at 12:12
  • I don't see this is "bad practice", nor "such bad practice". I suppose peeps would like to move the encoder into the custom object so they don't have to have a `Config` class attribute, but that's the way Pydantic has chosen to do it. I was thinking there may be a way to move the encoder into the object by using a dunder method that Pydantic might call when encoding but then I realised it's going to be down to the JSON encoder. A fully packed solution may then provide Pydantic BaseModel with an alternative JSON encoder and implement changes there. – NeilG Feb 22 '23 at 05:03
  • The `BaseModel` subclass should also implement `__modify_schema__`, @aiguofer, to present the valid / acceptable formats in the OpenAPI spec. I think the `date` type seems special as Pydantic doesn't include `date` in the schema `definitions`, but with this custom model there's no problem just adding `__modify_schema__`. There's plenty of examples of how to do that in this question, if you want a demo: https://stackoverflow.com/questions/75587442/validate-pydantic-dynamic-float-enum-by-name-with-openapi-description – NeilG Mar 04 '23 at 05:07
  • pydantic v2 has removed the `parse_datetime` function as per [migration docs](https://docs.pydantic.dev/2.3/migration/#removed-in-pydantic-v2) – dh762 Aug 25 '23 at 15:20
2

As of pydantic 2.0, we can use the @field_serializer decorator for serialization, and @field_validator for validation.

Taken from pydantic docs:

from datetime import datetime, timezone

from pydantic import BaseModel, field_serializer


class WithCustomEncoders(BaseModel):

    dt: datetime

    @field_serializer('dt')
    def serialize_dt(self, dt: datetime, _info):
        return dt.timestamp()


m = WithCustomEncoders(
    dt=datetime(2032, 6, 1, tzinfo=timezone.utc)
)
print(m.model_dump_json())
#> {"dt":1969660800.0}

And for validation:

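# Note: newer Pydantic v2 releases deprecate FieldValidationInfo in favour of
# ValidationInfo (importable from pydantic); the import below matches 2.0-era docs.
from pydantic_core.core_schema import FieldValidationInfo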

from pydantic import BaseModel, ValidationError, field_validator


class UserModel(BaseModel):
    name: str
    username: str
    password1: str
    password2: str

    @field_validator('name')
    @classmethod
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @field_validator('password2')
    @classmethod
    def passwords_match(cls, v, info: FieldValidationInfo):
        # info.data holds the fields that have already been validated
        if 'password1' in info.data and v != info.data['password1']:
            raise ValueError('passwords do not match')
        return v

    @field_validator('username')
    @classmethod
    def username_alphanumeric(cls, v):
        assert v.isalnum(), 'must be alphanumeric'
        return v


user = UserModel(
    name='samuel colvin',
    username='scolvin',
    password1='zxcvbn',
    password2='zxcvbn',
)
print(user)
"""
name='Samuel Colvin' username='scolvin' password1='zxcvbn' password2='zxcvbn'
"""
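
The validation example above is not date-specific. Applied to the original question, a rough Pydantic v2 sketch might look like the following, assuming dates arrive and leave as `DD/MM/YYYY` strings (the model name `OddDateV2`, the `DATE_FORMAT` constant, and the format itself are illustrative, not taken from the docs):

```python
from datetime import date, datetime

from pydantic import BaseModel, field_serializer, field_validator

DATE_FORMAT = "%d/%m/%Y"  # assumed wire format for this sketch


class OddDateV2(BaseModel):
    birthdate: date

    @field_validator("birthdate", mode="before")
    @classmethod
    def parse_birthdate(cls, value):
        # mode="before" runs ahead of Pydantic's own date parsing
        if isinstance(value, str):
            return datetime.strptime(value, DATE_FORMAT).date()
        return value  # let Pydantic handle date objects and other inputs

    @field_serializer("birthdate")
    def serialize_birthdate(self, value: date) -> str:
        return value.strftime(DATE_FORMAT)


if __name__ == "__main__":
    odd_date = OddDateV2(birthdate="12/04/1992")
    print(odd_date.birthdate)          # 1992-04-12
    print(odd_date.model_dump_json())  # {"birthdate":"12/04/1992"}
```

By default the serializer applies to both `model_dump()` and `model_dump_json()`, so the string form also shows up in the dict output.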
Trapsilo Bumi
  • 910
  • 8
  • 11
0

To make sure that a datetime field is timezone-aware and set to UTC, we can use Annotated validators in Pydantic v2.

To quote from the docs:

You should use Annotated validators whenever you want to bind validation to a type instead of model or field.

from datetime import timezone, datetime
from typing import Annotated

from pydantic import BaseModel, AwareDatetime, AfterValidator, ValidationError


def validate_utc(dt: AwareDatetime) -> AwareDatetime:
    """Validate that the pydantic.AwareDatetime is in UTC."""
    if dt.tzinfo.utcoffset(dt) != timezone.utc.utcoffset(dt):
        raise ValueError("Timezone must be UTC")
    return dt


DatetimeUTC = Annotated[AwareDatetime, AfterValidator(validate_utc)]


class Datapoint(BaseModel):
    timestamp: DatetimeUTC


# valid
d0 = Datapoint(timestamp=datetime(2021, 1, 1, 0, 0, 0, tzinfo=timezone.utc))
print(f"d0: {d0.timestamp}, timezone: {d0.timestamp.tzinfo}")

# valid
d1 = Datapoint(timestamp='2021-01-01T00:00:00+00:00')
print(f"d1: {d1.timestamp}, timezone: {d1.timestamp.tzinfo}")

# valid
d2 = Datapoint(timestamp='2021-01-01T00:00:00Z')
print(f"d2: {d2.timestamp}, timezone: {d2.timestamp.tzinfo}")

# invalid, missing timezone
try:
    d3 = Datapoint(timestamp='2021-01-01T00:00:00')
except ValidationError as e:
    print(f"d3: {e}")

# invalid, non-UTC timezone
try:
    d4 = Datapoint(timestamp='2021-01-01T00:00:00+02:00')
except ValidationError as e:
    print(f"d4: {e}")

If we run this we see d0, d1, d2 are valid while d3 and d4 are not:

d0: 2021-01-01 00:00:00+00:00, timezone: UTC

d1: 2021-01-01 00:00:00+00:00, timezone: UTC

d2: 2021-01-01 00:00:00+00:00, timezone: UTC

d3: 1 validation error for Datapoint
timestamp
  Input should have timezone info [type=timezone_aware, input_value='2021-01-01T00:00:00', input_type=str]
    For further information visit https://errors.pydantic.dev/2.3/v/timezone_aware

d4: 1 validation error for Datapoint
timestamp
  Value error, Timezone must be UTC [type=value_error, input_value='2021-01-01T00:00:00+02:00', input_type=str]
    For further information visit https://errors.pydantic.dev/2.3/v/value_error
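
The Annotated type above only covers validation. If the output format should travel with the type as well, Pydantic v2's `PlainSerializer` can be added to the same `Annotated` declaration. A sketch, assuming a `Z`-suffixed string is wanted on output and that other offsets are converted to UTC rather than rejected (both are choices made for this example; `DatetimeUTCZ`, `Event`, `to_utc` and `to_z_string` are hypothetical names):

```python
from datetime import datetime, timezone
from typing import Annotated

from pydantic import AfterValidator, AwareDatetime, BaseModel, PlainSerializer


def to_utc(dt: datetime) -> datetime:
    """Convert any timezone-aware datetime to UTC."""
    return dt.astimezone(timezone.utc)


def to_z_string(dt: datetime) -> str:
    """Serialize as ISO 8601 with a Z suffix (value is UTC after validation)."""
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


DatetimeUTCZ = Annotated[
    AwareDatetime,
    AfterValidator(to_utc),
    PlainSerializer(to_z_string, return_type=str),
]


class Event(BaseModel):
    timestamp: DatetimeUTCZ


if __name__ == "__main__":
    event = Event(timestamp="2021-01-01T02:00:00+02:00")
    print(event.timestamp)          # 2021-01-01 00:00:00+00:00
    print(event.model_dump_json())  # {"timestamp":"2021-01-01T00:00:00Z"}
```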
dh762