
I want to dynamically generate a Pydantic model at runtime. I can do this by calling create_model. For example,

from pydantic import create_model

create_model("MyModel", i=(int,...), s=(str...))

does the same thing as

from pydantic import BaseModel

class MyModel(BaseModel):
    i: int
    s: str

I want to serialize these Pydantic schemas as JSON. It's easy to write code that parses JSON into create_model arguments, and it would make sense to use the output of BaseModel.schema_json(), since that already defines a serialization format. That makes me think there should already be some sort of BaseModel.from_json_schema classmethod that could dynamically create a model like so:

from pydantic import BaseModel

class MyModel(BaseModel):
    i: int
    s: str

my_model = BaseModel.from_json_schema(MyModel.schema_json())
my_model(i=5, s="s") # returns MyModel(i=5, s="s")

I can't find any such function in the documentation. Am I overlooking something, or do I have to write my own JSON schema deserialization code?

W.P. McNeill

1 Answer


This came up some time ago in a GitHub discussion, and Samuel Colvin said he didn't want to pursue it as a feature for Pydantic.

If you are fine with code generation instead of actual runtime creation of models, you can use the datamodel-code-generator library.

To be honest, I struggle to see the use case for generating complex models at runtime, seeing as their main purpose is validation, which implies that you think about the correct schema before running your program. But that is just my view.

For simple models I guess you can throw together your own logic for this fairly quickly.
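
If you go that route, here is a minimal sketch of such logic for the flat case: it reads the output of BaseModel.schema_json() (Pydantic v1) and feeds the fields back into create_model. The model_from_json_schema helper and the JSON_TYPE_TO_PYTHON map are names I made up for illustration, and the sketch deliberately ignores nested models, $refs, and constraints.

import json
from typing import Optional, Type

from pydantic import BaseModel, create_model

# Maps JSON Schema primitive type names to Python types.
JSON_TYPE_TO_PYTHON = {"string": str, "integer": int, "number": float, "boolean": bool}

def model_from_json_schema(schema_json: str) -> Type[BaseModel]:
    # Rebuild a flat model from the schema produced by BaseModel.schema_json().
    schema = json.loads(schema_json)
    required = set(schema.get("required", []))
    fields = {}
    for name, prop in schema.get("properties", {}).items():
        python_type = JSON_TYPE_TO_PYTHON[prop["type"]]
        if name in required:
            fields[name] = (python_type, ...)
        else:
            fields[name] = (Optional[python_type], None)
    return create_model(schema.get("title", "DynamicModel"), **fields)

class MyModel(BaseModel):
    i: int
    s: str

my_model = model_from_json_schema(MyModel.schema_json())
print(my_model(i=5, s="s"))  # i=5 s='s'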

If you do need something more sophisticated, the aforementioned library does offer some extensibility. You should be able to import and inherit from some of their classes, such as the JsonSchemaParser. Maybe that will get you somewhere.
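
If you want to experiment with that, here is a rough sketch of what using the parser at runtime might look like. The JsonSchemaParser constructor arguments, the return value of parse(), and the name of the generated class are assumptions on my part, so double-check them against the project's documentation.

from datamodel_code_generator.parser.jsonschema import JsonSchemaParser
from pydantic import BaseModel

class MyModel(BaseModel):
    i: int
    s: str

# Parse the JSON schema into Python source code for the model
# (assumes the parser accepts the raw schema text directly).
parser = JsonSchemaParser(MyModel.schema_json())
generated_source = parser.parse()

# Execute the generated module in a scratch namespace and fetch the
# class, assuming it is named after the schema title.
namespace: dict = {}
exec(generated_source, namespace)
my_model = namespace["MyModel"]
print(my_model(i=5, s="s"))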

Ultimately I think this becomes non-trivial very quickly, which is why Pydantic's maintainer didn't want to deal with it and why there is a whole separate project for this.

Daniil Fajnberg
  • I'm trying to create an ETL application that does data ingestion and transformation in a configurable manner. Users write their own configurations. An aspect of the data transformation is type validation and coercion, for which Pydantic seems a good choice. I could just have users write this part as Pydantic classes in Python source code, but I don't want to because (1) writing configuration as source code gets tricky fast and (2) there are going to be other configurable aspects that require JSON. – W.P. McNeill Sep 25 '22 at 19:04
  • The GitHub discussion is really helpful. I'll probably just write my own schema deserialization, since if this grows into a full-blown ETL application I'll end up using someone else's implementation of that anyway. – W.P. McNeill Sep 25 '22 at 19:15
  • I have the use case: the Architect writes the schema in a repository, and the Dev generates the classes, validation and so on on the fly... – gdm Jun 01 '23 at 10:02