I found some examples on how to use ObjectId within BaseModel classes. Basically, this can be achieved by creating a Pydantic-friendly class as follows:

class PyObjectId(ObjectId):
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        if not ObjectId.is_valid(v):
            raise ValueError("Invalid objectid")
        return ObjectId(v)

    @classmethod
    def __modify_schema__(cls, field_schema):
        field_schema.update(type="string")

However, this seems to be for Pydantic v1, as these mechanisms have been superseded by the __get_pydantic_core_schema__ classmethod. I have been unable to achieve an equivalent solution with Pydantic v2. Is it possible? What validators do I need? I tried to refactor things but was unable to get anything usable.

MariusSiuram

3 Answers

To migrate the old-fashioned PyObjectId to the newest pydantic-v2 version, the easiest way is to use an annotated validator.

from typing import Annotated, Any, Union
from bson import ObjectId
from pydantic import PlainSerializer, AfterValidator, WithJsonSchema

def validate_object_id(v: Any) -> ObjectId:
    if isinstance(v, ObjectId):
        return v
    if ObjectId.is_valid(v):
        return ObjectId(v)
    raise ValueError("Invalid ObjectId")

PyObjectId = Annotated[
    Union[str, ObjectId],
    AfterValidator(validate_object_id),
    PlainSerializer(lambda x: str(x), return_type=str),
    WithJsonSchema({"type": "string"}, mode="serialization"),
]

You can then use it in your model in this way:

from pydantic import BaseModel
from pydantic import ConfigDict, Field

class MyCustomModel(BaseModel):
    id: PyObjectId = Field(alias="_id")

    model_config = ConfigDict(arbitrary_types_allowed=True)

Test it out using TypeAdapter:

import pytest
from bson import ObjectId
from pydantic import TypeAdapter, ConfigDict
    
@pytest.mark.parametrize("obj", ["64b7992ba8f08069073f1055", ObjectId("64b7992ba8f08069073f1055")])
def test_pyobjectid_validation(obj):
    ta = TypeAdapter(PyObjectId, config=ConfigDict(arbitrary_types_allowed=True))
    ta.validate_python(obj)

@pytest.mark.parametrize("obj", ["64b7992ba8f08069073f1055", ObjectId("64b7992ba8f08069073f1055")])
def test_pyobjectid_serialization(obj):
    ta = TypeAdapter(PyObjectId, config=ConfigDict(arbitrary_types_allowed=True))
    ta.dump_json(obj)

This solution works well even with the newest FastAPI v0.100.0+

Marcel
A.B.
  • However, this still does not work when using in the FastApi route directly like this: @report_router.delete("/presets/{_id}") def delete_preset(request: Request, _id: PyObjectId): giving the error: Invalid args for response field! Hint: check that typing.Union[str, bson.objectid.ObjectId] is a valid Pydantic field type. – gosuer1921 Aug 03 '23 at 12:28
  • running 0.100.1 version of FastAPI – gosuer1921 Aug 03 '23 at 12:55
  • This approach didn't work for me with Pydantic 2.1.1 and FastAPI 0.100.1. See my answer below. – Bojan Bogdanovic Aug 04 '23 at 16:16
  • @A.B. is there a way to use it as a class (not as a type), like before? I need it, because I also used `PyObjectId` class instead `ObjectId` as well everywhere. – Bi0max Aug 14 '23 at 14:50
  • @Bi0max please see the answer of Bojan Bogdanovic (https://stackoverflow.com/a/76837550/9204009), it's the correct way to migrate the old PyObjectId from v1 to v2 without any change. – A.B. Sep 01 '23 at 09:42

None of the above worked for me. I've followed Pydantic documentation to come up with this solution:

from typing import Annotated, Any, Callable

from bson import ObjectId
from fastapi import FastAPI
from pydantic import BaseModel, ConfigDict, Field, GetJsonSchemaHandler
from pydantic.json_schema import JsonSchemaValue
from pydantic_core import core_schema


# Based on https://docs.pydantic.dev/latest/usage/types/custom/#handling-third-party-types
class _ObjectIdPydanticAnnotation:
    @classmethod
    def __get_pydantic_core_schema__(
            cls,
            _source_type: Any,
            _handler: Callable[[Any], core_schema.CoreSchema],
    ) -> core_schema.CoreSchema:

        def validate_from_str(id_: str) -> ObjectId:
            return ObjectId(id_)

        from_str_schema = core_schema.chain_schema(
            [
                core_schema.str_schema(),
                core_schema.no_info_plain_validator_function(validate_from_str),
            ]
        )

        return core_schema.json_or_python_schema(
            json_schema=from_str_schema,
            python_schema=core_schema.union_schema(
                [
                    # check if it's an instance first before doing any further work
                    core_schema.is_instance_schema(ObjectId),
                    from_str_schema,
                ]
            ),
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda instance: str(instance)
            ),
        )

    @classmethod
    def __get_pydantic_json_schema__(
            cls, _core_schema: core_schema.CoreSchema, handler: GetJsonSchemaHandler
    ) -> JsonSchemaValue:
        # Use the same schema that would be used for `str`
        return handler(core_schema.str_schema())


PydanticObjectId = Annotated[
    ObjectId, _ObjectIdPydanticAnnotation
]


class User(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    id: PydanticObjectId = Field(alias='_id')
    name: str


app = FastAPI()


@app.get("/user/{id}")
def get_usr(id: str) -> User:
    # Here we would connect to MongoDB and return the user.
    # Method is here just to test that FastAPI will not complain about the "User" return type
    pass


# Some usage examples

user1 = User(_id=ObjectId('64cca8a68efc81fc425aa864'), name='John Doe')
user2 = User(_id='64cca8a68efc81fc425aa864', name='John Doe')
assert user1 == user2  # Can use str and ObjectId interchangeably

# Serialization
assert repr(user1) == "User(id=ObjectId('64cca8a68efc81fc425aa864'), name='John Doe')"
assert user1.model_dump() == {'id': '64cca8a68efc81fc425aa864', 'name': 'John Doe'}
assert user1.model_dump_json() == '{"id":"64cca8a68efc81fc425aa864","name":"John Doe"}'

# Deserialization
user2 = User.model_validate_json('{"id":"64cca8a68efc81fc425aa864","name":"John Doe"}')
user3 = User.model_validate_json('{"_id":"64cca8a68efc81fc425aa864","name":"John Doe"}')
assert user1 == user2 == user3

user4 = User(_id=ObjectId(), name='Jane Doe')  # Default ObjectId constructor

# Validation
user5 = User(_id=ObjectId('qwe'), name='Jack Failure')  # ObjectId('qwe') itself raises bson.errors.InvalidId, before Pydantic validation runs




Bojan Bogdanovic

The easiest approach to solve this in Pydantic v2 is to create an annotated validator:

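from typing import Any, Union

from bson import ObjectId
from pydantic.functional_validators import AfterValidator
from typing_extensions import Annotated


def check_object_id(value: Any) -> Any:
    # ObjectId.is_valid only returns a bool; used directly as a validator it
    # would replace the value with True/False, so raise on failure instead.
    if not ObjectId.is_valid(value):
        raise ValueError("Invalid ObjectId")
    return value


PyObjectId = Annotated[
    ObjectId,
    AfterValidator(check_object_id),
]

As for representing the value as a string, you probably want to update the annotation, perhaps using a Union with str, and then add a second validator that converts the result to a string:

PyObjectId = Annotated[
    Union[ObjectId, str],
    AfterValidator(check_object_id),
    AfterValidator(str),
]

Test it out using TypeAdapter:

from pydantic import ConfigDict, TypeAdapter

# ObjectId is not a type Pydantic knows, so arbitrary types must be allowed.
ta = TypeAdapter(PyObjectId, config=ConfigDict(arbitrary_types_allowed=True))
ta.validate_python("SOME-STRING")  # raises ValidationError: not a valid ObjectId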
Yaakov Bressler
  • So this article seems to be the "go to" for understanding how to use Pydantic 1 and MongoDB: https://www.mongodb.com/developer/languages/python/python-quickstart-fastapi/ - is this idea meant to be a drop-in replacement for PyObjectId? Going to try and do so. – dixon1e Jul 17 '23 at 20:18