
So I have a large JSON dataset with a huge number of properties (as well as nested objects). I have used datamodel-code-generator to build Pydantic models, and the JSON data has been loaded into those models (with nested models).

Now I need to push them into the database.

How can I automatically generate SQLAlchemy/SQLModel (or any other ORM) model definition classes from pydantic models?

Note: SQLModel and Pydantic seem to have similar syntax for model definition, so I guess I could copy-paste the pydantic models and edit as necessary. But I'm not that familiar with SQLModel yet. I was wondering if there was an automated way to do this.

masroore
  • An automated way to generate database schemas with a "huge number of properties" sounds like a terrible idea, to be honest. Yes, `SQLModel` inherits from `BaseModel`, so you could just mindlessly replace that and add the `table=True` argument, but I would suggest taking a closer look at them first. Especially relationships usually require some attention. In general I would say, if you care so little about the schemas that you are willing to automatically generate them, maybe you should just use a schemaless database to begin with. – Daniil Fajnberg Mar 21 '23 at 23:10
  • @DaniilFajnberg I know that. Obviously the "automated technique" will do 80% of the work, and the rest will have to be taken care of manually. Yes, the relationships will have their own tables (Pydantic extracts those related objects into separate models, which helps a lot). NoSQL is not an option, unfortunately. The raw JSON data I'm dealing with can go up to 330 KB per entity and has a lot of invalid attribute names like "TAX RATE (1)" and "ADDL_PROP-ID". Each entity has hundreds of such properties, and most contain a Russian doll of dependent entities. – masroore Mar 22 '23 at 05:22
  • @DaniilFajnberg With Pydantic I was able to convert the mess into neat little discrete Python objects (with some manual tweaking, of course). Now I want to import this data into the database without recreating the SQL model classes by hand. Obviously the table relationships will have to be defined manually, but the 95% of simple properties I need to automate. For the record, the consumer end of this data will be built with ASP.NET and Entity Framework Core, so the DDL will be a one-time job. – masroore Mar 22 '23 at 05:22
  • I see, this was not clear to me from the way you phrased the question initially. Would you mind providing a highly reduced/simplified example (ideally including a relationship) of two or three Pydantic models you ended up with, which you wish to turn into database schemas? Also, how many models have you ended up with actually (approximately)? – Daniil Fajnberg Mar 22 '23 at 08:45

1 Answer


There are several Python packages that add an ORM layer on top of Pydantic:

  • Pynocular – supports PostgreSQL only

  • pydbantic – a lightweight ORM on top of Pydantic models

  • SQLModel – from the author of FastAPI

I chose to go with SQLModel, as it provides the power of SQLAlchemy under the hood.
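Whichever ORM you pick, the mechanical bulk of the conversion can be scripted. Below is a rough, stdlib-only sketch (the helper names are my own, not from any library) that uses type-hint introspection to emit SQLModel source text for the flat scalar fields of any annotated class, which includes Pydantic models; relationships and nested models still need manual attention, as discussed in the comments.

```python
from typing import Optional, Union, get_args, get_origin, get_type_hints


def _render(hint) -> str:
    """Render a type hint back to source text, unwrapping Optional[...]."""
    if get_origin(hint) is Union and type(None) in get_args(hint):
        inner = next(a for a in get_args(hint) if a is not type(None))
        return f"Optional[{_render(inner)}]"
    return getattr(hint, "__name__", repr(hint))


def sqlmodel_source(model: type) -> str:
    """Emit a SQLModel class definition for an annotated class.

    Only flat scalar fields are handled; relationships and nested
    models must still be defined by hand.
    """
    lines = [
        f"class {model.__name__}(SQLModel, table=True):",
        "    id: Optional[int] = Field(default=None, primary_key=True)",
    ]
    for name, hint in get_type_hints(model).items():
        if name == "id":
            continue  # primary key already emitted above
        rendered = _render(hint)
        default = " = None" if rendered.startswith("Optional[") else ""
        lines.append(f"    {name}: {rendered}{default}")
    return "\n".join(lines)


# A stand-in for one of the generated Pydantic models (hypothetical)
class Customer:
    name: str
    tax_rate: Optional[float]


print(sqlmodel_source(Customer))
```

Running the script over every generated model gives you definitions you can paste into a module and then hand-edit for relationships, column types, and the awkward attribute names.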

masroore