Is there a way to insert foreign keys when using pandas to_sql function?
I am processing uploaded Consultations (n=40k) with pandas in Django before adding them to the database (Postgres). I got this working row by row, but that takes 15 to 20 minutes. This is longer than I want my users to wait, so I am looking for a more efficient solution.
I tried pandas to_sql, but I cannot figure out how to add the two foreign key relations as columns to my consultations dataframe before calling the to_sql function. Is there a way to add the Patient and Praktijk foreign keys as columns in the consultations dataframe?
More specifically, when inserting row by row, I use objects of type Patient or Praktijk when creating new consultations in the database. In a dataframe, however, I cannot use these types, and therefore don't know how to add the foreign keys correctly. Is there a value of type object or int (a patient's id?) that can substitute for a value of type Patient, and thereby set the foreign key?
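For context, Django stores a ForeignKey field named patient in a database column named patient_id by default (likewise praktijk_id), holding the primary key of the related row. A minimal sketch of adding such integer columns to the dataframe follows. The lookup dictionary, the primary key values, and the idea that patients can be matched on patient_nr are assumptions made for illustration, not something confirmed by the models above.

```python
import pandas as pd

# Hypothetical lookup: patient_nr -> primary key of an existing Patient row.
# In Django this might be built with something like
# dict(Patient.objects.values_list("patient_nr", "id")), ASSUMING Patient
# has a patient_nr field (not shown in the question).
patient_pk_by_nr = {1001: 1, 1002: 2}

consultations = pd.DataFrame({"patient_nr": [1001, 1002, 1001, 9999]})

# Map each row to the related primary key. Unknown patients become <NA>,
# which matches null=True on the patient foreign key.
consultations["patient_id"] = (
    consultations["patient_nr"].map(patient_pk_by_nr).astype("Int64")
)

# Hypothetical: every row in one upload belongs to the same Praktijk (pk 7).
consultations["praktijk_id"] = 7
```

The nullable "Int64" dtype keeps the ids as integers even when some rows have no matching patient; a plain int column would be silently converted to float by the missing values.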
The Consultation model:
class Consultation(models.Model):
    # the foreign keys
    patient = models.ForeignKey(Patient, on_delete=models.CASCADE, null=True, blank=True)
    praktijk = models.ForeignKey(Praktijk, on_delete=models.CASCADE, default='')
    # other fields which do not give trouble with to_sql
    patient_nr = models.IntegerField(blank=True, null=True)
    # etc
The to_sql call:
consultations.to_sql(Consultation._meta.db_table, engine, if_exists='append', index=False, chunksize=10000)
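To make the question concrete, here is a self-contained sketch of the same call shape with the foreign keys supplied as plain integer columns. An in-memory SQLite database stands in for Postgres, and the table names, column names, and ids are hypothetical; a real Django table would be named after the app and model.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for the Postgres database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce the references
conn.executescript(
    """
    CREATE TABLE patient (id INTEGER PRIMARY KEY);
    CREATE TABLE praktijk (id INTEGER PRIMARY KEY);
    CREATE TABLE consultation (
        id INTEGER PRIMARY KEY,
        patient_id INTEGER REFERENCES patient(id),
        praktijk_id INTEGER NOT NULL REFERENCES praktijk(id),
        patient_nr INTEGER
    );
    INSERT INTO patient (id) VALUES (1), (2);
    INSERT INTO praktijk (id) VALUES (7);
    """
)

# The dataframe carries the related primary keys in *_id columns.
consultations = pd.DataFrame(
    {
        "patient_id": [1, 2, 1],
        "praktijk_id": [7, 7, 7],
        "patient_nr": [1001, 1002, 1001],
    }
)

# Same arguments as the call in the question, against the stand-in table.
consultations.to_sql(
    "consultation", conn, if_exists="append", index=False, chunksize=10000
)

rows = conn.execute(
    "SELECT patient_id, praktijk_id, patient_nr FROM consultation ORDER BY id"
).fetchall()
```

Because the database enforces the references, a row whose patient_id or praktijk_id does not exist in the parent table is rejected at insert time rather than stored silently.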
If the above is not possible, are there any hints toward another, more efficient solution?